Programming with Judy

About Judy
Chapter 110
because Judy stores the indexes in sorted order allowing compression of
the common digits in the index word.
With binary-tree structures, you expect the speed to be close to a log
2
N
function of the number of elements stored. Judy's speed is closer to
log
256
N. Unlike many data structures, Judy only slows a little when the
population of indexes increases by a factor of 200. With this gentle slope,
Judy performs well into the tera-element populations if that much
memory is available. And excellent memory efficiency makes
multi-dimensional Judy arrays possible (probably best-in-class).
Scalability is Judy's greatest strength. However, the Judy technology has
many more benefits, including fast search and retrieval, and built-in
sorting and counting functions.
Design goals Judy is designed to replace many common data structures, such as
arrays, sparse arrays, hash tables, B-trees, binary trees, linear lists,
skiplists, and counting functions. Judy functions combine sparse arrays
with innovative memory management and retrieval.
Judy is implemented as a tree and dynamically optimizes each node
based on the population under that node. Judy also optimizes memory
access to be more efficient than other algorithms that ignore memory
caching. As a result, Judy exhibits better than O(log
256
N) retrieval
behavior.
1
Judy has been used successfully for arrays with as little as one
element (actually zero) to arrays with billions of elements. Since the
technology is self-adapting, its design can support future machines with
much larger memories than currently available.
Like many other algorithms, such as B-trees but notably excluding
hashing, Judy keeps indexes in sorted order.
2
Judy also includes a
powerful counting function that returns the number of data elements
between any two keys or counts to any ordinal (position) within an
array.
3
For example, Judy can store the first million prime numbers.
1. Big O notation is used to compare two algorithms that accomplish
the same task. It is a theoretical measure of the execution of an
algorithm, usually in terms of the time or memory needed. For
more information, visit this web site:
www.wiu.edu/users/mflll/cs350/onot.htm .
2. Because Judy necessarily defines the sorted order, the order
cannot be modified to include user-defined sorting criteria (see
page 26).