Huffman Coding

Method:

The Huffman coding which implies the greedy algorithm has its theory:

If you have a set of characters , and called it $C$ , has several characters which may or may not overlap each other.

Then lets say $C$ contains these characters: f,e,d,c,b,a

each character has different frequency which we will called it $F$ . Now we have a sequence like: f(5) ,e(9), d(16), c(12),b(13),a(45)

Now if we want to code these characters in order to get a better compression of the space we stored it , we have two choices

  1. we just simply code it into same length code
  2. we use the greedy method, which means we do our best to put the higher frequency character into short codes and lower frequency characters into the longer codes As these kind of coding can be describe as a tree , and denote as follows:

$B(T)=\sum_{c\in C }{f(c)d_t(c)}$

$B(T)$: the final coding length

$f(c)$: the frequency of the character

$d_T(c)$: the depth of character in the tree, which also means the code length

For case 1 and 2, we will show in graph a and b

img

apparently case 1 has larger $B(T)$ than case 2.

Prove of greedy is the optimized solution for huffman

# ,
Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×