Programming with Judy

Caching and Memory Management

Appendix B62

What happened here? The software requires access to data that is not in

cache. Most of the time this means having to access an index that is not

in cache memory and having to fill cache from main memory. The more

frequently cache-line fills are required the more performance degrades

over time. If this gets bad enough, the computer spends most of its time

thrashing by repeatedly accessing main memory. Hence the saying,

“Thrashing is virtual crashing.”

Caching and

memory efficiency

Judy uses minimal memory to achieve efficient access to stored data.

Instead of caching data on local disk or network server, Judy arrays are

memory resident.

Judy is cache-line optimized software: able to retrieve lines of cached

data directly from the computer’s cache with maximum speed. Judy

minimizes cache-line fills, accessing an average of 4 to 6 cache-line fills to

get to any value in an array. That means you can store one, one hundred,

or 4 billion values in a Judy array. On average, Judy requires three

cache-line fills to reach the data in a four gigabyte (4GB) element array.

A one terabyte (1TB) element array would add one cache line fill.

Judy will never take more than 6 levels of indirection to reach the data.

Judy also minimizes the amount of memory required for array indexes.

The amount of memory used is proportional to the array population, the

density of the data, and the clustering of the data. JudyL was designed

to allocate not more than 12 bytes per index for 32-bit machines and 24

bytes per index for 64-bit machines. For JudyL, eight bytes of memory or

less per index is easily achievable on a 32-bit machine (16-bytes or less

on a 64-bit machine), a ratio that is better than most other in-memory

search and retrieval methods.