What is the best compression software in the world?
WeChat.
Many people must have heard this phrase.
A picture of several megabytes is immediately reduced to several hundred kilobytes by WeChat.
△ With lossy compression, image quality drops (note the ripples in the sky on the right)
It may be a joke but, to be honest, image and video compression really is an essential technology.
For example, during a video call or when sending a large number of pictures, uncompressed data either cannot be transmitted at all or leaves you waiting indefinitely.
That is why, over the past few decades of the digital age, many related technologies have emerged, such as JPEG and H.26x.
But you may not know that these technologies can all be traced back to work done 47 years ago.
Three little-known Indian engineers went their own way: without any research funding, they spent a summer vacation working out a technique that later became the basis of the industry standards for image and video compression.
It is DCT.
The full name is Discrete Cosine Transform.
Interestingly, when DCT was born, even its author did not expect it to become so influential.
Without DCT, there is no JPEG/MPEG
Many people may not know what DCT is, but everyone has heard of JPEG.
Besides being a common image file extension, JPEG is also a lossy compression standard, which can turn the picture on the left into the one on the right:
P.S. The difference between lossy and lossless: lossless compression can restore 100% of the image; lossy compression cannot, but it reduces the size of the image far more.
DCT is the basic technology that makes this process possible.
It is a Fourier-related transform that converts an image from the spatial domain to the frequency domain, that is, from a matrix of pixel values to a set of coefficients describing how much of each frequency is present.
To see how the transform works, take a 3×3 pixel block in an image as an example:
△ Image source: Cnblogs blogger @Silent Back X-Pacific
Applying the DCT to this pixel block is equivalent to gathering part of the information from all the other pixels into the first cell.
As a result, the value in the first cell represents the overall appearance of the block, which is called low-frequency information; the remaining cells represent the details of people or objects in the image, which is called high-frequency information.
After the DCT, each 3×3 pixel block yields 1 DC (direct current) coefficient (in the first cell) and 8 AC (alternating current) coefficients (in the remaining cells). The former is the most important output of the DCT.
Since most of the image energy is concentrated in the low-frequency part, the DC coefficient is relatively large after the transform, while the AC coefficients are relatively small.
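The energy-concentration claim can be checked numerically. The following is a minimal sketch in plain Python, assuming an orthonormal DCT-II; the function name `dct2` and the 3×3 pixel values are made up for illustration and are not taken from the figure above.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square N x N block (list of lists)."""
    n = len(block)

    def scale(k):
        # Normalization that makes the transform energy-preserving
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out

# Hypothetical 3x3 block of similar neighboring pixel values
pixels = [[52, 55, 61],
          [59, 62, 64],
          [63, 65, 70]]

coeffs = dct2(pixels)
dc = coeffs[0][0]                                   # first cell: DC coefficient
energy_total = sum(c * c for row in coeffs for c in row)
dc_share = dc * dc / energy_total                   # fraction of energy in DC
print(f"DC = {dc:.2f}, share of energy = {dc_share:.4f}")
```

For this block the DC coefficient is simply the pixel sum divided by 3, and it holds more than 99% of the total energy, which is why the AC coefficients can be stored so coarsely.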
Compression then exploits the fact that the human eye is more sensitive to low-frequency components than to high-frequency ones: quantization keeps the low-frequency components and discards the high-frequency ones (most AC coefficients become 0), dropping information that has little effect on how the image looks.
The 3D plots of the two images below show the change brought about by the DCT:
(Top: original image; bottom: after DCT transformation)
In the actual JPEG compression standard, an image is divided into 8×8 pixel blocks (blocks that fall short at the edges are padded).
After the color space is converted from RGB to YUV, the DCT is applied to each block, from left to right and top to bottom.
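As a rough illustration of the color-space step: JPEG files in the common JFIF format use YCbCr, a close relative of the YUV mentioned here. The sketch below assumes the full-range JFIF conversion constants; the function name `rgb_to_ycbcr` is illustrative only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF-style, full range)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue difference
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red difference

    def clamp(v):
        # Round to the nearest integer and keep the result in [0, 255]
        return max(0, min(255, round(v)))

    return clamp(y), clamp(cb), clamp(cr)

print(rgb_to_ycbcr(128, 128, 128))  # a gray pixel has neutral chroma
```

Separating brightness from color matters because each channel is then transformed and quantized on its own.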
The transformed coefficients of each block are then quantized. During this process, some components are discarded and cannot be recovered.
Therefore, this is an irreversible, lossy compression technique.
Then the quantized AC and DC coefficients are coded separately and, after Huffman coding, become a long series of numbers.
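Before the Huffman stage, JPEG typically codes the DC coefficient as a difference from the previous block's DC value and reads the AC coefficients in zigzag order so that zeros cluster into runs. The sketch below is a simplified illustration: the block values and `prev_dc` are made up, the helper names are hypothetical, and real JPEG additionally caps zero runs at 15 and applies Huffman codes to the result.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zigzag scan order."""
    # Cells on one anti-diagonal share row + col; the direction alternates.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def run_length_ac(ac_values):
    """Encode AC values as (zero_run, value) pairs plus an end marker."""
    pairs, run = [], 0
    for v in ac_values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # all trailing zeros collapse into one marker
    return pairs

# Hypothetical quantized block: a few non-zero values in the top-left corner
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[0][2] = 39, -3, 2
block[1][0], block[2][0] = -2, 1

scan = [block[r][c] for r, c in zigzag_order()]
prev_dc = 36                      # assumed DC value of the previous block
dc_diff = scan[0] - prev_dc       # DC is coded as a difference
ac_pairs = run_length_ac(scan[1:])
print(dc_diff, ac_pairs)
```

Here 63 AC values shrink to four pairs and one end-of-block marker, which is what makes the subsequent entropy coding effective.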
During decompression, the image is reconstructed by applying the inverse DCT (IDCT) to each block.
The specific calculation process is as follows:
First, the original grayscale (brightness) value of each pixel is represented with 8 bits, giving a range of [0, 255].
Since most values are distributed around 128, we subtract 128 from each of them so that more values end up close to 0, which is good for compression; the range becomes [-128, 127].
Then the values are transformed with the DCT formula; for images, the two-dimensional form is used.
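For reference, the two-dimensional DCT of an 8×8 block is usually written as follows. This is the standard textbook form used in JPEG, with f(x, y) the level-shifted pixel value and F(u, v) the coefficient; the symbol names are a common convention and not necessarily the notation of the original figure.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7}
         f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16},
\qquad
C(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\[4pt] 1, & k > 0 \end{cases}
```

Setting u = v = 0 makes both cosines equal to 1, so F(0, 0) is one eighth of the sum of all 64 values: this is the DC coefficient.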
After the transformation, the coefficients are quantized according to a quantization table, and most of them become 0, which completes the compression.
P.S. The quantization table is derived from the human eye's visual threshold for quantization error, and fixed tables are available.
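Quantization itself is just an element-wise division followed by rounding. The sketch below uses the example luminance table from Annex K of the JPEG standard; the coefficient values are generated by a made-up decaying formula purely for illustration, and the helper names are hypothetical.

```python
# Example luminance quantization table (JPEG standard, Annex K)
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Approximate the coefficients again; the rounding error is lost."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

# Made-up DCT coefficients: a large DC value and quickly decaying AC values
coeffs = [[620.0 / (1 + u + v) ** 3 * (-1) ** (u + v) for v in range(8)]
          for u in range(8)]

levels = quantize(coeffs, LUMA_Q)
zeros = sum(v == 0 for row in levels for v in row)
restored = dequantize(levels, LUMA_Q)
print(zeros, "of 64 coefficients are zero after quantization")
print("DC before:", coeffs[0][0], "after round trip:", restored[0][0])
```

With these illustrative inputs, 54 of the 64 coefficients become zero, and the DC value comes back as 624 instead of 620, a small example of the information that quantization throws away for good.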
What follows is the series of encoding steps mentioned above.
The technique was first published in IEEE Transactions on Computers in January 1974.
With that, the foundation of the industry standards for image and video compression was born.
H.261, the world’s first video compression standard, in 1988; JPEG and MPEG in 1992; WebP in 2010; HEIF in 2013; AV1, jointly created by companies such as Google and Amazon, in 2018: these and other compression standards are all based on this technique, which is still in use today.
The inventors who stayed unknown for more than 40 years
DCT has three authors: Nasir Ahmed, K. R. Rao, and T. Natarajan.
Nasir is Professor Emeritus in the Department of Electrical and Computer Engineering at the University of New Mexico.
Born in Bangalore, India in 1940, he received his Ph.D. from the University of New Mexico in 1966.
He was chief engineer at Honeywell Corporation from 1966 to 1968 and a professor at Kansas State University from 1968 to 1983.
From 1983 to 2001, he was back at the University of New Mexico as Chair Professor of Electrical and Computer Engineering. During this period, he also served as department chair and as dean of the graduate school.
This year, Nasir is 82 years old.
Another lead author is K. R. Rao.
He was also an Indian-American scholar.
In 1960, he received a master’s degree in nuclear engineering from the University of Florida. In 1966, he received a Ph.D. in electrical engineering from the University of New Mexico.
For the next 50 years, he worked at the University of Texas at Arlington as a professor of electrical engineering.
He was also an IEEE Fellow.
On January 15, 2021, Professor Rao passed away at the age of 89.
T. Natarajan was Nasir’s doctoral student at the time, and little information about him can be found online today.
Compared with the fame of DCT itself, its inventors can fairly be called “unknown”.
In fact, for over 40 years, the story behind the invention of DCT went largely unnoticed.
Even Nasir’s son said, “I never imagined the influence of my father would be so great.”
What finally brought Nasir out from behind the scenes was a tribute in an American TV drama.
In 2020, the series “This Is Us” included a storyline in which Nasir told the story of his love with his wife through a video call.
The filmmakers said they wrote this storyline in the hope that more people would realize that being able to send pictures and videos quickly over the Internet today is inseparable from Nasir’s work.
After the episode aired, many media outlets described DCT as an “algorithm that changed the world” and noted that Nasir, a little-known engineer, had finally been brought from behind the scenes into the spotlight.
However, Nasir said in a video of his recollections that he really had not expected DCT to have such a big impact.
He also said he could not have predicted how fast technology would evolve, and that he is amazed by the emergence of apps like FaceTime.
△ Nasir when he was young (pictured left)
In fact, DCT came close to being smothered in the cradle.
In 1972, Nasir, who had already conceived of DCT, submitted a proposal to the National Science Foundation (NSF), hoping that NSF would fund his research on it.
To Nasir’s surprise, however, the proposal was rejected outright; the reviewer’s opinion was that “it’s too simple.”
Fortunately, Nasir did not give up. He remained convinced that the idea was genuinely innovative.
His only worry was that he would have to do the DCT work during vacation time, with no income for that period.
So Nasir went home and said to his wife:
I have a gut feeling that this is worth doing. It’s just that we need to plan how to spend an unpaid summer vacation.
His wife supported him without any hesitation.
So, in the summer of 1973, the research on DCT officially began.
Also involved in the study were Nasir’s friend Rao and his doctoral student Natarajan.
Rao was one of the key figures who supported Nasir’s research on DCT.
After Nasir’s application was rejected, he immediately told his friend Rao about his idea.
Rao gave this reply:
You are to publish these results in short form immediately.
This is how “How I Came Up with the Discrete Cosine Transform” was born.
Later, this article became almost a must-read in the field of image and video compression.
The rest of the story is well known.
In 1974, “Discrete Cosine Transform” was published in IEEE Transactions on Computers.
So far, this article has been cited 5,878 times.
Nasir once said in an interview that the greatest gift in his life is people’s recognition of DCT.