もっと詳しく

What is the best compression software in the world? WeChat.

Many people must have heard this phrase. A picture of several megabytes is immediately reduced to several hundred kilobytes by WeChat.

△ If it is lossy compression, the image quality will be degraded (the sky on the right has ripples)

Although this is a rant, but u1s1, picture and video compression is actually a very necessary technology.

For example, when a video call or a large number of pictures is transmitted, if it is not compressed, either the picture cannot be transmitted at all, or it is just waiting.

Therefore, in the past few decades in the digital age, many related technologies have emerged, such as JPEG and H.26X.

But you may not know that these technologies can be traced back to 47 years ago.

There are three little-known Indian engineers who “go their own way” and use the summer vacation time to tinker with a technology without applying for research funding, which later directly became the industry standard for image and video compression.

It is DCT.

The full name is Discrete Cosine Transform, which is discrete cosine transform.

What’s interesting is that when DCT was born, even the author himself did not expect that it would have such a huge influence later.

Without DCT, there is no JPEG/MPEG

Directly speaking, many people may not know what DCT is, but everyone must have heard of JPEG.

In addition to being a common image file suffix, it is also a lossy compression standard, which can change a picture from the left to the right:

ps. The difference between lossy and lossless: lossless compression can restore 100% of the image; lossy cannot, but the size of the image after lossy compression will be greatly reduced.

DCT is a basic technology to realize this process.

It is a kind of Fourier transform, which can convert the image from the spatial domain to the frequency domain, that is, to convert the image from a pixel matrix to a function with information such as frequency.

For the specific transformation process, we take a 3×3 pixel block in an image as an example:

△ Picture source blog garden blogger @Silent Back X-Pacific

Doing DTC transformation on this pixel block is equivalent to extracting part of the information of other pixels except the first pixel into the first grid.

In this way, the pixel value of the first grid represents the overall appearance of a picture, which is called low-frequency information; the remaining grids represent the details of people or objects in the image, which is called high-frequency information.

After DCT conversion, each 3×3 pixel block produces 1 DC (direct current) coefficient (in the first grid) and 8 AC (alternating current) coefficients (remaining grid), the former being the most important output of the DCT.

Since most of the image energy will be concentrated in the low frequency part, the output DC coefficient value after conversion is relatively large, while the output AC coefficient value is relatively small.

Using the principle that “the human eye is more sensitive to images of low-frequency components than images of high-frequency components”, and then saves the low-frequency components through quantization, discards the high-frequency components (changes most of the AC coefficient values ​​to 0), and discards those pairs. Visual effects affect little information, so as to achieve the purpose of compression.

From the 3D projections of the two images below, we can see the changes brought about by the DCT transform:

(Top: original image; bottom: after DCT transformation)

In the actual JPEG compression standard, an image is divided into several 8×8 pixel blocks (not enough to be filled with blanks).

After converting the color space from RGB to YUV, DCT transform each block from left to right and top to bottom.

The transformed coefficients of each block are then quantized. During this process, some important components are removed and cannot be recovered.

Therefore, this is an irreversible lossy compression technique.

Then encode the AC coefficients and DC coefficients obtained after quantization, and get the following large series of numbers after Huffman encoding.

The complete image can be reconstructed by performing inverse DCT transform (IDCT) on each image block during decompression.

The specific calculation process is as follows:

First, the original grayscale and brightness values ​​of each pixel in the picture are represented by 8bit, which is the range of (0, 255).

Since most values ​​will be distributed around 128, we will subtract 128 from these values, so that there will be more values ​​0, which is good for compression, and the range becomes (-128, 127).

Then use the DCT transformation formula to transform, two-dimensional use this:

After the transformation, quantization is performed according to the quantization table, and most of the coefficients are changed to 0 to complete the compression.

ps. The quantization table is determined according to the visual threshold of the quantization error of the human eye, and there is a fixed table.

The latter is a series of encoding processes mentioned above.

The technique was first published in IEEE Transactions on Computers in January 1974.

Since then, the industry standard in image and video compression has been born.

The world’s first video compression standard H.261 in 1998, JPEG and MPEG in 1992, WebP in 2010, HEIF in 2013, AV1 jointly created by companies such as Google Amazon in 2018, etc. Compression standards are based on this technology , and has been used to this day.

An unknown inventor for over 40 years

There are three authors of DCT, namely Nasir Ahmed (Nasir Ahmed), KR Rao (KR Rao) and T. Natarajan (T. Natarajan).

Nasir is Professor Emeritus in the Department of Electrical and Computer Engineering at the University of New Mexico.

Born in Bangalore, India in 1940, he received his Ph.D. from the University of New Mexico in 1966.

He was chief engineer at Honeywell Corporation from 1966-1968 and a professor at Kansas State University from 1968-1983.

From 1983-2001, he returned to the University of New Mexico as Chair Professor of Electrical and Computer Engineering. During this period, he successively held the positions of dean of the department and dean of the graduate school.

This year, Nasir is 82 years old.

Another lead author is KR Rao.

He is also an Indian-American scholar.

In 1960, he received a doctorate in nuclear engineering from the University of Florida. In 1966, he received a doctorate in electrical and computer engineering from the University of New Mexico.

For the next 50 years, he worked at the University of Texas at Arlington as a professor of electrical engineering.

At the same time, he is an IEEE Fellow.

On January 15, 2021, Professor Rao passed away at the age of 89.

T. Natarajan was a doctoral student under Nasir at the time, and now there is not much information about him on the Internet.

It can be said that compared to the famous DCT, several inventors can be called “unknown”.

In fact, for more than 40 years, the behind-the-scenes story of the invention of the DCT has remained largely unheard of.

Even Nasir’s son said, “I never imagined the influence of my father would be so great.”

And what pushed Nasir from behind the scenes to the front of the stage was also thanks to a wave of tributes in an American drama.

In 2020, there is a plot in “Our Lives” in which Nasir told the story of his love with his wife through a video call.

The filmmakers said that the original intention of designing this bridge segment is to hope that more people will realize that now we can quickly send pictures and videos through the Internet, which is inseparable from Nasir’s work.

After the plot was broadcast, many media defined DCT as an “algorithm that changes the world”, also saying that Nasir, a little-known engineer, was finally pushed to the front of the stage from behind the scenes.

However, Nasir said in his recollection video that he really didn’t expect DCT to have such a big impact.

I also can’t predict the speed of technology development, and I am very surprised by the emergence of applications such as FaceTime.

△ Nasir when he was young (left)

You know, DCT may have been nearly killed in the cradle at first.

In 1972, Nasir, who had already conceived the idea of ​​DCT at that time, submitted an application to the National Science Foundation (NSF), hoping that NSF would provide financial support for his research on DCT.

To Nasir’s surprise, however, the application was immediately killed, and the reviewer’s opinion was “it’s too simple.”

Fortunately, Nasir did not give up, he always felt that this idea was very innovative.

The only thing that worries him is that he may only be able to use vacation time to complete DCT related work, and he may not have any income during this period.

So Nasir went home and said to his wife:

I have a gut feeling that this is worth doing. It’s just that we need to plan how to spend an unpaid summer vacation.

His wife supported him without any hesitation.

So, in the summer of 1973, the research work of DCT officially started.

Also involved in the study were Nasir’s friend Rao and doctoral student Natarajan.

Rao was also one of the key figures supporting Nasir’s research on DCT.

After Nasir’s application was rejected, he immediately told his friend Rao what he thought.

Rao gave this reply:

You are to publish these results in short form immediately.

This is how “How I Came Up with the Discrete Cosine Transform” was born.

Later, this article is almost a must-read in the field of image and video compression.

The story after that is what we know.

In 1974, “Discrete Cosine Transform” was published in IEEE Transactions on Computers.

So far, this article has been cited 5,878 times.

Nasir once said in an interview that the greatest gift in his life is people’s recognition of DCT.

Reference link:

[1]https://spectrum.ieee.org/krao-tribute

https://www.islamicity.org/80703/nasir-ahmeds-algorithm-that-transformed-the-world/

[2]https://cloud.tencent.com/developer/article/1862531

[3]https://mp.weixin.qq.com/s?__biz=MzU1NTEzOTM5Mw==&mid=2247512538&idx=1&sn=57f46386002cf5554681f8ef9f61a3e0&chksm =fbda19f4ccad90e219bf224db522e9999086dff886bae09562e1aeba4450d4ba0247a73c3138&scene=21#wechat_redirect

[4]https://blog.csdn.net/freee12/article/details/109953732

[5]https://blog.csdn.net/weixin_52779958/article/details/124413405

[6]https://www.youtube.com/watch?v=I9VXaVVs7W

.
[related_posts_by_tax taxonomies=”post_tag”]

The post Three Indians changed the compression algorithm and insisted on going the whole summer vacation, but they couldn’t apply for funding because of “too simple” – Programmer Sought appeared first on Gamingsym.