Huffman compressionAlso known as Huffman encoding, an algorithm for the lossless compression of files based on the frequency of occurrence of a symbol in the file that is being compressed. The Huffman algorithm is based on statistical coding, which means that the probability of a symbol has a direct bearing on the length of its representation. The more probable the occurrence of a symbol is, the shorter will be its bit-size representation. In any file, certain characters are used more than others. Using binary representation, the number of bits required to represent each character depends upon the number of characters that have to be represented. Using one bit we can represent two characters, i.e., 0 represents the first character and 1 represents the second character. Using two bits we can represent four characters, and so on.
Unlike ASCII code, which is a fixed-length code using seven bits per character, Huffman compression is a variable-length coding system that assigns smaller codes for more frequently used characters and larger codes for less frequently used characters in order to reduce the size of files being compressed and transferred.
For example, in a file with the following data:
the frequency of "X" is 6, the frequency of "Y" is 4, and the frequency of "Z" is 2. If each character is represented using a fixed-length code of two bits, then the number of bits required to store this file would be 24, i.e., (2 x 6) + (2x 4) + (2x 2) = 24.
If the above data were compressed using Huffman compression, the more frequently occurring numbers would be represented by smaller bits, such as:
X by the code 0 (1 bit)
Y by the code 10 (2 bits)
Z by the code 11 (2 bits)
therefore the size of the file becomes 18, i.e., (1x 6) + (2 x 4) + (2 x 2) = 18.
In the above example, more frequently occurring characters are assigned smaller codes, resulting in a smaller number of bits in the final compressed file.
Huffman compression was named after its discoverer, David Huffman.
IT Solutions Builder TOP IT RESOURCES TO MOVE YOUR BUSINESS FORWARD
Which topic are you interested in?
What is your company size?
What is your job title?
What is your job function?
Searching our resource database to find your matches...
Stay up to date on the latest developments in Internet terminology with a free weekly newsletter from Webopedia. Join to subscribe now.
The following facts and statistics capture the changing landscape of cloud computing and how service providers and customers are keeping up with... Read More »SEO Dictionary
From keyword analysis to backlinks and Google search engine algorithm updates, our search engine optimization glossary lists 85 SEO terms you need... Read More »Texting & Chat Abbreviations
From A3 to ZZZ this guide lists 1,500 text message and online chat abbreviations to help you translate and understand today's texting lingo. Read More »
Java is a high-level programming language. This guide describes the basics of Java, providing an overview of syntax, variables, data types and... Read More »Java Basics, Part 2
This second Study Guide describes the basics of Java, providing an overview of operators, modifiers and control Structures. Read More »Network Fundamentals Study Guide
Networking fundamentals teaches the building blocks of modern network design. Learn different types of networks, concepts, architecture and... Read More »