2024年3月16日发(作者:马斯克减持特斯拉股份)

EFFICIENT HUFFMAN DECODING WITH TABLE LOOKUP

Mohamed F. Mansour

DSPS R&D Center, Texas Instruments Inc., USA

mfmansour@

ABSTRACT

We describe an efficient algorithm for Huffman decoding

using table lookup. The algorithm is optimized for ROM-

based Huffman decoding. It is a two-step process of prefix

template matching followed by a direct table access. We

propose an efficient algorithm for choosing the prefix

templates according to different optimization criteria. Also,

we propose different implementations for the prefix template

procedure.

Index Terms— Huffman decoding, Table lookup

1. INTRODUCTION

Huffman codes [1] have been widely used for source coding

and have shown high efficiency in exploiting the source

redundancy. Huffman codes along with run-length codes

have been widely used in most international multimedia

standards (e.g., MPEG and ISO standards [5], [7]).

Huffman decoding can be implemented with a lookup-table

(LUT) [2] or multiple lookup-tables [3]. If a single LUT is

used, the decoder throughput can be one codeword per cycle

whereas the throughput for multiple lookup tables is not

deterministic and in the worst case it equals the number of

lookup tables (assuming each LUT is processed in a single

cycle). The single LUT approach is usually adopted in high

efficiency Huffman decoder while the multiple LUTs

approach is usually used in low-power systems.

In this work, we propose a novel LUT-based approach for

Huffman decoding. The decoder has an LUT for a set of

prefix templates for the table codewords. Each prefix

template is associated with a direct access table for the

children codewords. During decoding, the input bits after the

prefix template are used to directly address the associated

codeword table to retrieve the correct codeword and its

length. We propose a novel approach for designing the

prefix templates which depends on a generic optimization

criterion that can be adjusted to the system. We propose

different criteria that can be employed in typical systems.

2. DECODING PROCEDURE

2.1. Background

Any Huffman code can be represented by a non-balanced

binary tree. The tree leaves represent the codewords of the

code. Any codeword has three attributes: the length, the

value, and the corresponding source symbol. An example of

a Huffman table of size 8 is shown in table 1 and the

corresponding tree representation is shown in Fig. 1. The

value of each internal node in Fig. 1 is the sum of its

children values and it is a measure of the internal node

probability.

Symbol Codeword Length Symbol Codeword Length

1 00111 5 5 010 3

2 00110 5 6 000 3

3 0010 4 7 11 2

4 011 3 8 10 2

Table 1. Example of Huffman Table of size 8.

36

21

12

6

3

2

6

3

1

5

9

4

8

15

7

Figure 1. Huffman Tree of the code in table 1

In general, the length of each codeword in the Huffman table

is inversely proportional to the probability of the

corresponding source symbol.

In our implementation, we use a set of prefix templates that

represent some internal nodes in the Huffman tree. Each

prefix template is parameterized by three attributes:

1. length (L): the length of the prefix value

(V): the bit value of this prefix

m child length (M): the maximum length of the

template children codewords.

For example, the internal node with label 12 in the Huffman

tree of Fig. 1, has the following attributes: L = 2, V = “00”,

1?4244?0728?1/07/$20.00 ?2007 IEEEII ? 53ICASSP 2007

M = 5 (which is equivalent to codewords 1 and 2). The

choice of the prefix templates is discussed in section 4.

r structure

Each prefix template is associated with a sub-table that

contains all children codewords. The size of the sub-table is

2

M

?

L

, where M and L are the attributes of the prefix template

as defined earlier. The indexing within the sub-table is done

using the last M-L bits of the input word that follow the L

bits of the prefix template. The sub-table is filled with the

children codewords of the templates with possible repetition

of certain codewords. For example, if the node with

frequency 12 in the Huffman Tree of Fig. 1 is selected as a

prefix template, then the size of its sub-table will be 8 and it

is organized as:

Sub-table

Address

No. of

symbols

Sub-table

Address

No. of

symbols

The input module is responsible for aligning the input

bitstream so that decoding starts at the correct word

boundary. The alignment is controlled by the length of the

last decoded codeword. The alignment procedure is similar

to previous algorithms (e.g., [2], [4]). The input to the prefix

LUT module has a length L

max

which is the maximum

template length. The inputs to the sub-table index generator

are the attributes L and M

?

L of the matched template and

M

max

bits of the input bitstream which is the maximum

codeword length in the Huffman table. The output is the

M

?

L bits from the bit stream starting from the (L+1)

st

bit.

The prefix LUT module is the most energy-demanding

module in the decoder. The objective of this work is to

propose efficient implementation of this module as will be

discussed in the following two sections.

Input bitstream

Input

Module

Prefix

LUT

Subtable Index

Generator

Template

index

Address

Generator

000 6 100 3

001 6 101 3

010 6 110 2

011 6 1111

Table 2. Memory map of the sub-table example

In this example we have only four codewords while the

overall memory is eight, i.e., we have a redundancy factor of

two. This redundancy is minimized by proper choice of the

prefix templates as will be discussed in section 4.

Note that, each symbol in the prefix sub-table has two

attributes: the value of the corresponding source symbol and

the codeword length.

The prefix templates are chosen such that, no template is a

prefix of another template. Therefore when we match the

input bitstream with the prefix templates, one and only one

template will be matched. This is also a design criterion that

is considered while generating the prefix templates.

ng Procedure

The decoding process consists of three basic steps:

ng the prefix templates

g the codeword symbol from the sub-table of the

selected prefix template using the bits that follow the

template for indexing within the sub-table.

ssing in the input bitstream by a number of bits

equals the codeword length to decode the following

symbol.

In step 1, to match a certain prefix template of attributes

(L,V,M), the first L bits of the bitstream should equal V. Two

attributes are associated with each prefix template, which

are, the number of indexing bits in its subtable, and the

starting address of its subtable. The overall decoding

procedure is illustrated in Fig. 2.

Template

Side information

M

?

L bits

Decoded symbol length

Decoded symbol

ROM

Figure 2. Huffman Decoding Procedure

3. PREFIX LUT IMPLEMENTATION

The prefix LUT module can be implemented in different

ways that depend on the structure of the Huffman table and

the target application.

The first choice is to use a programmable logic array (PLA)

as suggested in [4]. The cost of the PLA is proportional to

the number of templates which is significantly less than the

size of the Huffman table (which is used in [4]). In this case,

the prefix template matching can be performed in a single

cycle regardless of the matched template.

The second choice is to use a single comparator for

matching the prefix templates one at a time. This would

require a number of registers equals the number of

templates. To minimize the matching time, the templates are

arranged in descending order according to their

probabilities. The template probability equals the sum of the

probabilities of all its children (assuming source symbols are

independent). In Huffman codes the probability of each

source symbol is inversely proportional to the length of the

corresponding codeword. Therefore the probability of each

template is inversely proportional to the sum of the lengths

of its children codewords. The more accurate probability for

each template is obtained by scaling all individual

probabilities in (1) such that they sum to one. In the worst

case the number of cycles for prefix LUT equals the number

II ? 54

更多推荐

减持,股份,作者