2024年3月16日发(作者:马斯克减持特斯拉股份)
EFFICIENT HUFFMAN DECODING WITH TABLE LOOKUP
Mohamed F. Mansour
DSPS R&D Center, Texas Instruments Inc., USA
mfmansour@
ABSTRACT
We describe an efficient algorithm for Huffman decoding
using table lookup. The algorithm is optimized for ROM-
based Huffman decoding. It is a two-step process of prefix
template matching followed by a direct table access. We
propose an efficient algorithm for choosing the prefix
templates according to different optimization criteria. Also,
we propose different implementations for the prefix template
procedure.
Index Terms— Huffman decoding, Table lookup
1. INTRODUCTION
Huffman codes [1] have been widely used for source coding
and have shown high efficiency in exploiting the source
redundancy. Huffman codes along with run-length codes
have been widely used in most international multimedia
standards (e.g., MPEG and ISO standards [5], [7]).
Huffman decoding can be implemented with a lookup-table
(LUT) [2] or multiple lookup-tables [3]. If a single LUT is
used, the decoder throughput can be one codeword per cycle
whereas the throughput for multiple lookup tables is not
deterministic and in the worst case it equals the number of
lookup tables (assuming each LUT is processed in a single
cycle). The single LUT approach is usually adopted in high
efficiency Huffman decoder while the multiple LUTs
approach is usually used in low-power systems.
In this work, we propose a novel LUT-based approach for
Huffman decoding. The decoder has an LUT for a set of
prefix templates for the table codewords. Each prefix
template is associated with a direct access table for the
children codewords. During decoding, the input bits after the
prefix template are used to directly address the associated
codeword table to retrieve the correct codeword and its
length. We propose a novel approach for designing the
prefix templates which depends on a generic optimization
criterion that can be adjusted to the system. We propose
different criteria that can be employed in typical systems.
2. DECODING PROCEDURE
2.1. Background
Any Huffman code can be represented by a non-balanced
binary tree. The tree leaves represent the codewords of the
code. Any codeword has three attributes: the length, the
value, and the corresponding source symbol. An example of
a Huffman table of size 8 is shown in table 1 and the
corresponding tree representation is shown in Fig. 1. The
value of each internal node in Fig. 1 is the sum of its
children values and it is a measure of the internal node
probability.
Symbol Codeword Length Symbol Codeword Length
1 00111 5 5 010 3
2 00110 5 6 000 3
3 0010 4 7 11 2
4 011 3 8 10 2
Table 1. Example of Huffman Table of size 8.
36
21
12
6
3
2
6
3
1
5
9
4
8
15
7
Figure 1. Huffman Tree of the code in table 1
In general, the length of each codeword in the Huffman table
is inversely proportional to the probability of the
corresponding source symbol.
In our implementation, we use a set of prefix templates that
represent some internal nodes in the Huffman tree. Each
prefix template is parameterized by three attributes:
1. length (L): the length of the prefix value
(V): the bit value of this prefix
m child length (M): the maximum length of the
template children codewords.
For example, the internal node with label 12 in the Huffman
tree of Fig. 1, has the following attributes: L = 2, V = “00”,
1?4244?0728?1/07/$20.00 ?2007 IEEEII ? 53ICASSP 2007
M = 5 (which is equivalent to codewords 1 and 2). The
choice of the prefix templates is discussed in section 4.
r structure
Each prefix template is associated with a sub-table that
contains all children codewords. The size of the sub-table is
2
M
?
L
, where M and L are the attributes of the prefix template
as defined earlier. The indexing within the sub-table is done
using the last M-L bits of the input word that follow the L
bits of the prefix template. The sub-table is filled with the
children codewords of the templates with possible repetition
of certain codewords. For example, if the node with
frequency 12 in the Huffman Tree of Fig. 1 is selected as a
prefix template, then the size of its sub-table will be 8 and it
is organized as:
Sub-table
Address
No. of
symbols
Sub-table
Address
No. of
symbols
The input module is responsible for aligning the input
bitstream so that decoding starts at the correct word
boundary. The alignment is controlled by the length of the
last decoded codeword. The alignment procedure is similar
to previous algorithms (e.g., [2], [4]). The input to the prefix
LUT module has a length L
max
which is the maximum
template length. The inputs to the sub-table index generator
are the attributes L and M
?
L of the matched template and
M
max
bits of the input bitstream which is the maximum
codeword length in the Huffman table. The output is the
M
?
L bits from the bit stream starting from the (L+1)
st
bit.
The prefix LUT module is the most energy-demanding
module in the decoder. The objective of this work is to
propose efficient implementation of this module as will be
discussed in the following two sections.
Input bitstream
Input
Module
Prefix
LUT
Subtable Index
Generator
Template
index
Address
Generator
000 6 100 3
001 6 101 3
010 6 110 2
011 6 1111
Table 2. Memory map of the sub-table example
In this example we have only four codewords while the
overall memory is eight, i.e., we have a redundancy factor of
two. This redundancy is minimized by proper choice of the
prefix templates as will be discussed in section 4.
Note that, each symbol in the prefix sub-table has two
attributes: the value of the corresponding source symbol and
the codeword length.
The prefix templates are chosen such that, no template is a
prefix of another template. Therefore when we match the
input bitstream with the prefix templates, one and only one
template will be matched. This is also a design criterion that
is considered while generating the prefix templates.
ng Procedure
The decoding process consists of three basic steps:
ng the prefix templates
g the codeword symbol from the sub-table of the
selected prefix template using the bits that follow the
template for indexing within the sub-table.
ssing in the input bitstream by a number of bits
equals the codeword length to decode the following
symbol.
In step 1, to match a certain prefix template of attributes
(L,V,M), the first L bits of the bitstream should equal V. Two
attributes are associated with each prefix template, which
are, the number of indexing bits in its subtable, and the
starting address of its subtable. The overall decoding
procedure is illustrated in Fig. 2.
Template
Side information
M
?
L bits
Decoded symbol length
Decoded symbol
ROM
Figure 2. Huffman Decoding Procedure
3. PREFIX LUT IMPLEMENTATION
The prefix LUT module can be implemented in different
ways that depend on the structure of the Huffman table and
the target application.
The first choice is to use a programmable logic array (PLA)
as suggested in [4]. The cost of the PLA is proportional to
the number of templates which is significantly less than the
size of the Huffman table (which is used in [4]). In this case,
the prefix template matching can be performed in a single
cycle regardless of the matched template.
The second choice is to use a single comparator for
matching the prefix templates one at a time. This would
require a number of registers equals the number of
templates. To minimize the matching time, the templates are
arranged in descending order according to their
probabilities. The template probability equals the sum of the
probabilities of all its children (assuming source symbols are
independent). In Huffman codes the probability of each
source symbol is inversely proportional to the length of the
corresponding codeword. Therefore the probability of each
template is inversely proportional to the sum of the lengths
of its children codewords. The more accurate probability for
each template is obtained by scaling all individual
probabilities in (1) such that they sum to one. In the worst
case the number of cycles for prefix LUT equals the number
II ? 54
更多推荐
减持,股份,作者
发布评论