Programming with Judy
Caching and Memory Management
Appendix B62
What happened here? The software requires access to data that is not in
cache. Most of the time this means having to access an index that is not
in cache memory and having to fill cache from main memory. The more
frequently cache-line fills are required the more performance degrades
over time. If this gets bad enough, the computer spends most of its time
thrashing by repeatedly accessing main memory. Hence the saying,
“Thrashing is virtual crashing.”
Caching and
memory efficiency
Judy uses minimal memory to achieve efficient access to stored data.
Instead of caching data on local disk or network server, Judy arrays are
memory resident.
Judy is cache-line optimized software: able to retrieve lines of cached
data directly from the computer’s cache with maximum speed. Judy
minimizes cache-line fills, accessing an average of 4 to 6 cache-line fills to
get to any value in an array. That means you can store one, one hundred,
or 4 billion values in a Judy array. On average, Judy requires three
cache-line fills to reach the data in a four gigabyte (4GB) element array.
A one terabyte (1TB) element array would add one cache line fill.
Judy will never take more than 6 levels of indirection to reach the data.
Judy also minimizes the amount of memory required for array indexes.
The amount of memory used is proportional to the array population, the
density of the data, and the clustering of the data. JudyL was designed
to allocate not more than 12 bytes per index for 32-bit machines and 24
bytes per index for 64-bit machines. For JudyL, eight bytes of memory or
less per index is easily achievable on a 32-bit machine (16-bytes or less
on a 64-bit machine), a ratio that is better than most other in-memory
search and retrieval methods.