Code and parse trees for lossless source encoding
This paper surveys the theoretical literature on fixed-to-variable-length lossless source code trees, called code trees, and on variable-length-to-fixed lossless source code trees, called parse trees. In particular, the following code tree topics are outlined in this survey: characteristics of the Huffman (1952) code tree; Huffman-type coding for infinite source alphabets and universal coding; the Huffman problem subject to a lexicographic constraint, or the Hu-Tucker (1982) problem; the Huffman problem subject to maximum codeword length constraints; code trees which minimize other functions besides average codeword length; coding for unequal cost code symbols, or the Karp problem, and finite state channels; and variants of Huffman coding in which the assignment of 0s and 1s within codewords is significant such as bidirectionality and synchronization. The literature on parse tree topics is less extensive. Treated here are: variants of Tunstall (1968) parsing; dualities between parsing and coding; dual tree coding in which parsing and coding are combined to yield variable-length-to-variable-length codes; and parsing and random number generation. Finally, questions related to counting and representing code and parse trees are also discussed.
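The distinction between a code tree (fixed-to-variable) and a parse tree (variable-to-fixed) can be illustrated with a minimal Python sketch. It is not taken from the survey: the function names `huffman_code` and `tunstall_parse_words`, the memoryless source model, and the restriction to single-character source symbols are all assumptions made for illustration.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Binary Huffman code tree: one source symbol -> variable-length codeword."""
    tiebreak = count()  # unique counter so the heap never compares dict payloads
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least likely subtrees under a new internal node.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def tunstall_parse_words(probs, codeword_bits):
    """Tunstall parse tree: variable-length source word -> fixed-length codeword.

    Assumes a memoryless source and single-character symbols (hypothetical
    simplifications). Returns the parse words, i.e. the leaves of the tree.
    """
    max_leaves = 2 ** codeword_bits
    leaves = dict(probs)  # the root's children: one leaf per source symbol
    # Each expansion removes one leaf and adds len(probs) new ones.
    while len(leaves) + len(probs) - 1 <= max_leaves:
        word = max(leaves, key=leaves.get)  # split the most likely leaf
        p = leaves.pop(word)
        for s, ps in probs.items():
            leaves[word + s] = p * ps
    return sorted(leaves)


if __name__ == "__main__":
    print(huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
    print(tunstall_parse_words({"a": 0.7, "b": 0.3}, codeword_bits=2))
```

In the Huffman case the tree's leaves are source symbols and the root-to-leaf paths are the codewords; in the Tunstall case the leaves are source words, each of which would be assigned one of the `2 ** codeword_bits` fixed-length codewords.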
Synchronization of binary source codes (Corresp.)
Bruce L. Montgomery, Julia Abrahams|IEEE Transactions on Information Theory|1986
The problem of achieving synchronization for variable-length source codes is addressed through the use of self-synchronizing binary prefix-condition codes. Although our codes are suboptimal in the sense of minimum average codeword length, they have the advantages of being generated by an explicit constructive algorithm, having minimal additional redundancy compared with optimal codes (as little as one additional bit introduced into the least likely codeword for a large class of sources), and having statistical synchronizing performance that improves on that of the optimal code in many cases.
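To make the notion of synchronization concrete, the following Python sketch shows a decoder that starts at the wrong bit position and checks where, if anywhere, it realigns with the true codeword boundaries. It does not implement the construction from the paper; the helper names `codeword_boundaries` and `resync_position` and both example codes are hypothetical.

```python
def codeword_boundaries(bits, code, start=0):
    """Greedily decode a prefix code from `start`; return positions where codewords end."""
    words = set(code.values())
    ends, current = [], ""
    for i in range(start, len(bits)):
        current += bits[i]
        if current in words:
            ends.append(i + 1)
            current = ""
    return ends


def resync_position(bits, code, start):
    """First bit position where a decoder started at `start` hits a true boundary.

    Returns None if the misaligned decoder never realigns within `bits`.
    """
    true_ends = set(codeword_boundaries(bits, code, 0))
    for end in codeword_boundaries(bits, code, start):
        if end in true_ends:
            return end
    return None


if __name__ == "__main__":
    variable = {"a": "0", "b": "10", "c": "110", "d": "111"}
    fixed = {"a": "00", "b": "01", "c": "10", "d": "11"}
    print(resync_position("110010111", variable, start=1))  # message "cabd"
    print(resync_position("00011011", fixed, start=1))      # message "abcd"
```

Because greedy prefix decoding is deterministic, a decoder that lands on one true boundary stays aligned from then on. The fixed-length code in the second call never realigns from an odd offset, which is the kind of behavior a self-synchronizing code is meant to avoid.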