1,4 Department of Computer Science & Engineering, International Islamic University Chittagong, Bangladesh
Keywords: Tree Compression; Static Huffman Code; Data Structures; Complete Binary Tree
Compression is accomplished using algorithms or formulas that describe how the size of data is reduced [10, 11]. The prime performance metric of data compression is the compression ratio [12]. Data compression can be lossless or lossy. Lossless compression requires the restoration of a file to its exact original state, without the loss of a single bit of data when the file is uncompressed. Text and spreadsheet files, for example, must be compressed losslessly, because the loss of a single character, number, or word may give a message a completely different meaning.
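To make the metric concrete, the following Python sketch computes the compression ratio (here taken as original size divided by compressed size, one common convention) and the percentage space savings. The sizes used are illustrative placeholders, not measurements from this paper.

```python
def compression_ratio(original_bits, compressed_bits):
    """Compression ratio: original size divided by compressed size."""
    return original_bits / compressed_bits

def space_savings(original_bits, compressed_bits):
    """Space savings as a percentage of the original size."""
    return 100.0 * (original_bits - compressed_bits) / original_bits

# Illustrative sizes: a 1000-bit input compressed to 400 bits.
print(compression_ratio(1000, 400))  # 2.5
print(space_savings(1000, 400))      # 60.0
```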
The key objective of this research is to devise a technique that compresses the tree overhead of Huffman compression through a memory-efficient representation of the Huffman tree [8, 9]. A significant improvement in the tree compression ratio will in turn enhance the overall data compression ratio. A further objective is to design encoding and decoding algorithms for the new technique so that encoding and decoding are accomplished with minimal effort.
In this paper, we propose a new encoding-decoding technique that compresses the Huffman tree size efficiently, and we compare its performance (with respect to compression ratio, bits saved, etc.) with previous works. We also investigate the limitations of some previous works. The proposed method is likewise a bit-representation technique.
The remainder of the paper is organized as follows. The literature review is presented in section 4. The methodology is described in section 5. Comparisons and results are discussed in sections 6 and 7. Limitations and the scope of future work are presented in the next section. Finally, we conclude the paper in section 9.
In reference [2], Choudhury and Kaykobad proposed a new data structure for Huffman tree representation in which, in addition to the symbol codes, the codewords for all the circular leaf nodes are sent. They decoded the text using the memory-efficient data structure proposed by Chen et al.
In reference [5], Bei Chen, Hong-Cai Zhang, Wen-Lun Cao, and Jian-Hu Feng introduced a new Huffman coding method based on number characters: the traditional 256-entry code table is replaced by the characters 0-9, the space character, and the enter character. In reference [4], the authors Wang S.-W., Wu J.-L., Chuang S.-C., Hsiao C.-C., and Tung Y.-S. proposed a Lagrangian-multiplier-based penalty resource metric as the target cost function and addressed the optimization problem of minimizing the number of memory accesses subject to a rate constraint for arbitrary Huffman decoding.
To reduce the memory size and speed up the process of searching for a symbol in a Huffman tree, Pi-Chung Wang et al. proposed a memory-efficient data structure to represent the Huffman tree that exploits properties of the encoded symbols and uses nd bits of memory, where n is the number of source symbols and d is the depth of the Huffman tree. Based on the proposed data structure and a single-side growing Huffman tree, an O(log n)-time Huffman decoding algorithm is introduced [3, 6, 7].
In our proposed method the whole Huffman tree is traversed in a depth-first fashion. When a leaf containing a character is to be saved, we save 7 bits of that character, omitting the MSB. When returning from a child node to its parent, the relative position of the child is saved; this takes one bit for each edge of the tree.
b. Rule 2: When returning from a child node to its parent, whether the child is a leaf or an internal node, a single bit is saved: 0 if the child is the left child of the parent, or 1 if it is the right child.
2. Repeat steps i and ii until the traversal is finished:
i. If a leaf is encountered, save the character at that position according to Rule 1.
ii. When returning from a node to its parent node, save its relative position according to Rule 2.
Output: Huffman tree
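The encoding procedure above can be sketched as follows (the paper reports a C implementation; this Python version is a simplified illustration, and Node is an assumed minimal tree record, not from the paper). A depth-first walk emits 7 bits per leaf character (Rule 1) and one position bit per edge on the way back to the parent (Rule 2), for a total of 7n + 2(n - 1) = 9n - 2 bits on a tree with n leaves.

```python
class Node:
    # Assumed minimal Huffman-tree node: leaves carry a character.
    def __init__(self, char=None, left=None, right=None):
        self.char, self.left, self.right = char, left, right

def encode_tree(node, bits, position=None):
    """Depth-first encoding: 7 bits per leaf character (Rule 1),
    one bit per edge when returning to the parent (Rule 2)."""
    if node.char is not None:                        # leaf: Rule 1
        bits.append(format(ord(node.char) & 0x7F, '07b'))
    else:                                            # internal node: recurse
        encode_tree(node.left, bits, '0')
        encode_tree(node.right, bits, '1')
    if position is not None:                         # Rule 2: edge bit
        bits.append(position)

# Example: root with left leaf 'a' and right subtree ('b', 'c'), i.e. n = 3.
tree = Node(left=Node('a'), right=Node(left=Node('b'), right=Node('c')))
bits = []
encode_tree(tree, bits)
stream = ''.join(bits)
print(len(stream))  # 25 bits = 9*3 - 2
```

The resulting stream is 'a', 0, 'b', 0, 'c', 1, 1: each character is followed by the bit recorded while climbing back toward the root.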
Decoding starts from the left of the bit stream.
Initialize Current_subtree and the subtree stack.
Read_character() reads 7 consecutive bits and converts them into an 8-bit ASCII code by adding a 0 as the MSB.
Read_position() reads a single bit.
Decode_and_build_tree ()
1. Read_character()
2. Current_subtree:= character read
3. while scanned bit ≠ NULL do
4. if Current_subtree ≠ NULL then
5. Read_position()
6. if position = 0 then
7. Current_subtree is a left child. Push it into the stack.
8. Current_subtree := NULL
9. else
10. Pop topmost subtree from the stack
11. Create a new node with left child := popped out subtree
12. right child:=Current_subtree
13. Current_subtree := newly created node
14. end if
15. else
16. Read_character()
17. Current_subtree := character read
18. end if
19. end while
20. Current_subtree is the required Huffman tree.
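The pseudocode above can be sketched in Python as follows (an illustrative reconstruction, not the authors' C code; Node is an assumed minimal tree record). A 0 position bit means the subtree just built is a left child, so it is pushed and deferred; a 1 bit means it is a right child, so it is combined with the popped left sibling under a new parent.

```python
class Node:
    # Assumed minimal Huffman-tree node: leaves carry a character.
    def __init__(self, char=None, left=None, right=None):
        self.char, self.left, self.right = char, left, right

def decode_and_build_tree(stream):
    """Rebuild the Huffman tree from the encoded bit string."""
    pos = 0

    def read_character():
        nonlocal pos
        ch = chr(int(stream[pos:pos + 7], 2))  # 7 bits; MSB is implicitly 0
        pos += 7
        return Node(char=ch)

    stack = []
    current = read_character()
    while pos < len(stream):
        if current is not None:
            bit = stream[pos]; pos += 1           # Read_position()
            if bit == '0':                        # left child: defer it
                stack.append(current)
                current = None
            else:                                 # right child: combine
                left = stack.pop()
                current = Node(left=left, right=current)
        else:
            current = read_character()
    return current  # the required Huffman tree

# Bit stream for the 3-leaf tree (a, (b, c)): 'a' 0 'b' 0 'c' 1 1.
stream = '1100001' + '0' + '1100010' + '0' + '1100011' + '1' + '1'
root = decode_and_build_tree(stream)
print(root.left.char, root.right.left.char, root.right.right.char)  # a b c
```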
Here we depict an example implementation for clearer understanding.
This tree (Figures 1, 2) will be encoded into the following bit sequence, in which every character occupies 7 bits. This representation has been chosen for ease of understanding.
Moreover, every ASCII-coded character comprises 8 bits, of which the MSB is 0 for every character. We ignore this bit while encoding, so only 7 bits are required to represent each character in the generated encoded bit sequence (explained in Figures 3, 4).
**In reference [1], the best case occurs only when the rightmost branch of the tree is expanded, which is not a frequent occurrence. In the other cases, redundant bits are required to save the external limb and each upper leaf node.
1. file1.txt="Huffman Coding Huffman Coding"
2. file2.txt="A new approach of memory efficient Huffman tree
Methods | Memory Requirement | Space savings of proposed method compared with existing method
Original Huffman Coding | 2n bytes | (2n × 8) - (9n - 2) = 7n + 2 bits
CM1 | 2n - 3 bytes | (2n - 3) × 8 - (9n - 2) = 7n - 22 bits
Method of Pi-Chung Wang, Yuan-Rung Yang, Chun-Liang Lee, Hung-Yi Chang | nd bits | n(n - 1) - (9n - 2) = n² - 10n + 2 bits (approximated)
Choudhury and Kaykobad Method | 3n/2 bytes at worst case* | (3n/2) × 8 - (9n - 2) = 3n + 2 bits (approximated)
Zinnia et al. Method | 10.75n - 3 bits at worst case** | (10.75n - 3) - (9n - 2) = 1.75n - 1 bits (approximated)
Proposed Method | 9n - 2 bits at worst case |
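As a consistency check, the proposed method's worst-case cost of 9n - 2 bits (7 bits per leaf character plus one bit for each of the 2(n - 1) tree edges) reproduces the per-file tree sizes reported for the proposed method, e.g. n = 13 symbols gives 115 bits:

```python
def proposed_tree_bits(n):
    """Worst-case tree size of the proposed method: 7 bits per leaf
    character plus one bit for each of the 2(n - 1) edges = 9n - 2."""
    return 7 * n + 2 * (n - 1)

for n in (13, 20, 25):
    print(n, proposed_tree_bits(n))  # 13 115, 20 178, 25 223
```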

4. file4.txt ="I’ve implemented my proposed algorithm using programming language C because I like it most among all programming languages"
5. file5.txt="Best case complexity occurs when only 1 circular leaf node is considered or only one external limb is considered and all symbols except right brother leaf are the left leafs of the external limb"
The following tables compare the different methods on the above files (Tables 2, 3).
Number of bits needed to save the Huffman tree:

File Name | Number of Symbols | Original Huffman | Proposed Method | Zinnia et al. | CM1 method | Pi-Chung Wang et al. method | Choudhury and Kaykobad method
file1.txt | 13 | 208 | 115 | 134 | 184 | 156 | 152
file2.txt | 20 | 320 | 178 | 207 | 296 | 280 | 224
file3.txt | 25 | 400 | 223 | 264 | 376 | 375 | 272
file4.txt | 25 | 400 | 223 | 248 | 376 | 375 | 280
file5.txt | 25 | 400 | 223 | 269 | 376 | 400 | 264
Number of bits required (a negative savings value denotes an increase over the original file size):

File name | Original file size | Original Huffman | Savings | Proposed method | Savings | Zinnia et al. method | Savings | Choudhury & Kaykobad method | Savings | Pi-Chung Wang et al. method | Savings | CM1 method | Savings
file1.txt | 248 | 321 | -73 | 228 | 20 | 247 | 1 | 265 | -17 | 269 | -21 | 297 | -49
Compression | | 0% | | 28.97% | | 23.05% | | 17.45% | | 16.20% | | 7.48% |
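The compression percentages in the last row are consistent with savings measured relative to the original Huffman output for file1.txt (321 bits); a quick check:

```python
def compression_vs_huffman(huffman_bits, method_bits):
    """Savings relative to the original Huffman output, in percent."""
    return round(100.0 * (huffman_bits - method_bits) / huffman_bits, 2)

# file1.txt: the original Huffman output is 321 bits.
for bits in (321, 228, 247, 265, 269, 297):
    print(compression_vs_huffman(321, bits))
# 0.0, 28.97, 23.05, 17.45, 16.2, 7.48
```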

[1] Sultana Z, Akter S. A New Approach of a Memory Efficient Huffman Tree Representation Technique. IEEE International Conference (ICIEV). 2012. DOI: 10.1109/ICIEV.2012.6317482
[2] Rezaul C, Kaykobad M, Irwin K. An Efficient Decoding Technique for Huffman Codes. Information Processing Letters. 2002;81(6):305-308.
[3] Pi-Chung W, Yuan-Rung Y, Chun-Liang L, Hung-Yi C. A Memory-Efficient Huffman Decoding Algorithm. Advanced Information Networking and Applications. 2005;2:475-479. DOI: 10.1109/AINA.2005.33
[4] Sung-Wen W, Shang-Chih C, Chih-Chieh H, Yi-Shin T, Ja-Ling W. An Efficient Memory Construction Scheme for an Arbitrary Side Growing Huffman Table. IEEE International Conference on Multimedia and Expo. 2006:141-144. DOI: 10.1109/ICME.2006.262589
[5] Bei C, Hong-Cai Z, Wen-Lun C, Jian-Hu F. Huffman Coding Method Based on Number Character. International Conference on Machine Learning and Cybernetics. 2007;4:2296-2299. DOI: 10.1109/ICMLC.2007.4370528
[6] Sung-Wen W, Ja-Ling W, Shang-Chih C, Chih-Chieh H, Yi-Shin T. Memory-Efficient Hierarchical Lookup Tables for Mass Arbitrary-Side Growing Huffman Trees Decoding. IEEE Transactions on Circuits and Systems for Video Technology. 2008;18(10):1335-1346. DOI: 10.1109/TCSVT.2008.920968
[7] Seymour L. Theory and Problems of Data Structures. McGraw-Hill Book Company. 2002:249-255.
[8] Hashemian R. Memory Efficient and High-Speed Search Huffman Coding. IEEE Transactions on Communications. 1995;43(10):2576-2581. DOI: 10.1109/26.469442
[9] Yuhhe C, Ja-Ling W. An SGH-Tree Based Huffman Decoding. Fourth International Conference on Information, Communications and Signal Processing. 2003;3:1483-1487. DOI: 10.1109/ICICS.2003.1292713
[10] Huffman DA. A Method for the Construction of Minimum-Redundancy Codes. Proceedings of the IRE. 1952;40(9):1098-1101.
[11] Schack R. The Length of a Typical Huffman Codeword. IEEE Transactions on Information Theory. 1994;40(4):1246-1247. DOI: 10.1109/18.335944
[12] Hong-Chung C, Yue-Li W, Yu-Feng L. A Memory-Efficient and Fast Huffman Decoding Algorithm. Information Processing Letters. 1999;69(3):119-122.