HP-UX 11i March 2002 Release Notes

Programming Libraries

Chapter 13

speed performance for some kernel-threaded applications, by reducing mutex contention
among threads and by deferring coalescence of blocks.

The thread-private cache is only available for kernel-threaded applications, i.e. those
linked with the pthread library. The installed shared pthread library version must be
PHCO_19666 or later, or the application must be statically linked with an archive
pthread library that is version PHCO_19666 or later; otherwise the cache is not available.

By default the cache is not active; it must be activated by setting _M_CACHE_OPTS to a
legal value. If _M_CACHE_OPTS is set to any out-of-range value, it is ignored and the cache
remains disabled.

There are two portions to the thread-private cache: one for ordinary blocks and one for
small blocks. Small blocks are blocks that are allocated by the small block allocator
(SBA), which is configured with the environment variable _M_SBA_OPTS or by calls to
mallopt(3C). The small block cache is automatically active whenever both the ordinary
block cache and the SBA are active. The ordinary block cache is active only when it is
configured by setting _M_CACHE_OPTS. There are no mallopt() options to configure the
thread-private cache.

The following shows the subparameters of _M_CACHE_OPTS and their meanings:

_M_CACHE_OPTS=<bucket_size>:<buckets>:<retirement_age>

<bucket_size> is (roughly) the number of cached ordinary blocks per bucket that will be
held in the ordinary block cache. The allowable values range from 0 through 8*4096 =
32768. If <bucket_size> is set to 0, the cache is disabled.

<buckets> is the number of power-of-2 buckets that will be maintained per thread. The
allowable values range from 8 through 32. This value controls the size of the largest
ordinary block that can be cached. For example, if <buckets> is 8, the largest ordinary
block that can be cached will be 2^8 or 256 bytes. If <buckets> is 16, the largest
ordinary block that can be cached will be 2^16 or 65536 bytes, etc.

<bucket_size>*<buckets> is (exactly) the maximum number of ordinary blocks that
will be cached per thread. There is no maximum number of small blocks that will be
cached per thread if the small block cache is active.
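
The arithmetic behind the first two subparameters can be checked with a short sketch. It assumes the second subparameter (called buckets here) is the number of power-of-2 buckets, and it takes the per-thread maximum of cached ordinary blocks to be the product of the two values; the function name cache_limits is purely illustrative and is not part of any HP-UX interface.

```python
def cache_limits(bucket_size, buckets):
    """Return (largest cacheable ordinary block in bytes,
    maximum number of cached ordinary blocks per thread).

    Assumption: the per-thread maximum is bucket_size * buckets.
    """
    largest_block = 2 ** buckets        # size served by the largest bucket
    max_blocks = bucket_size * buckets  # ceiling across all buckets
    return largest_block, max_blocks


if __name__ == "__main__":
    for buckets in (8, 16, 20):
        largest, total = cache_limits(100, buckets)
        print(f"buckets={buckets}: largest block {largest} bytes, "
              f"at most {total} cached ordinary blocks per thread")
```

Raising the bucket count by one doubles the largest cacheable block but adds only one more bucket of bucket_size blocks to the per-thread ceiling.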

<retirement_age> controls what happens to unused caches. It may happen that an
application has more threads initially than it does later on. In that case, there will be
unused caches, because caches are not automatically freed on thread exit; by default
they are kept and assigned to newly created threads. But for some applications, this could
result in some caches being kept indefinitely and never reused. <retirement_age> sets
the maximum amount of time in minutes that a cache may be unused by any thread
before it is considered due for retirement. As threads are created and exit, caches due for
retirement are freed back to their arena. The allowable values of <retirement_age>
range from 0 to 1440 minutes (=24*60, i.e. one day). If <retirement_age> is 0, retirement
is disabled and unused caches will be kept indefinitely. It is recommended that
<retirement_age> be configured to 0 unless space efficiency is important and it is
known that an application will stabilize to a smaller number of threads than its initial
number.
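
Because an out-of-range _M_CACHE_OPTS is silently ignored, it can help to check a candidate value before deploying it. The following is a minimal sketch of such a check using the ranges stated above. The function name parse_cache_opts, the field name buckets, and the assumption that the three values are colon-separated decimal integers (as in _M_CACHE_OPTS=100:20:0) are illustrative; the sketch does not claim to reproduce how the library itself parses the variable.

```python
# (name, lowest legal value, highest legal value) for each subparameter.
LIMITS = (
    ("bucket_size", 0, 32768),
    ("buckets", 8, 32),
    ("retirement_age", 0, 1440),
)


def parse_cache_opts(value):
    """Return a dict of the three subparameters, or None if the value
    is malformed or any field is outside its documented range."""
    fields = value.split(":")
    if len(fields) != len(LIMITS):
        return None
    parsed = {}
    for (name, low, high), text in zip(LIMITS, fields):
        # Accept plain ASCII decimal digits only.
        if not (text.isascii() and text.isdigit()):
            return None
        number = int(text)
        if not low <= number <= high:
            return None
        parsed[name] = number
    return parsed


if __name__ == "__main__":
    for candidate in ("100:20:0", "100:4:0", "100:20:2000"):
        print(candidate, "->", parse_cache_opts(candidate))
```

A result of None corresponds to the case in which the setting would be ignored and the cache would remain disabled. A parsed bucket_size of 0 is in range but also leaves the cache disabled.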

In general, kernel-threaded applications that benefit in performance from activating
the small block allocator may benefit further from activating a modest-sized ordinary
block cache, which also activates caching of small blocks (from which most of the benefit is
derived). For example, a setting to try first would be:

_M_SBA_OPTS=256:100:8
_M_CACHE_OPTS=100:20:0
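
One way to apply these settings to a single application, rather than to a whole login session, is to launch it with a modified copy of the environment. The sketch below assumes the allocator picks the variables up from the environment of the process it runs in. The helper name with_malloc_cache is hypothetical, and the child command is only a stand-in that echoes the variable; substitute the real threaded application.

```python
import os
import subprocess
import sys


def with_malloc_cache(env, sba_opts="256:100:8", cache_opts="100:20:0"):
    """Return a copy of env with the two allocator variables set.

    The caller's mapping is left unmodified.
    """
    new_env = dict(env)
    new_env["_M_SBA_OPTS"] = sba_opts
    new_env["_M_CACHE_OPTS"] = cache_opts
    return new_env


if __name__ == "__main__":
    # Stand-in child process; replace with the application to be tuned.
    child = [sys.executable, "-c",
             "import os; print(os.environ['_M_CACHE_OPTS'])"]
    subprocess.run(child, env=with_malloc_cache(os.environ), check=True)
```

Keeping the settings in the launcher makes it easy to compare runs with and without the cache by changing only the arguments.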