HP-UX 11i March 2002 Release Notes

Programming

Libraries

Chapter 13

• nftw()

nftw() was rewritten in the same way as ftw(), with the same benefits. nftw() now fully
conforms to the UNIX95 definition, including the requirement that files are reported
only once when FTW_PHYS is not set.

Threaded applications can obtain greater concurrency by specifying an absolute path
name for the starting path and leaving FTW_CHDIR unset. In addition, an internal
unbalanced binary tree was replaced with a much more efficient splay tree. The effect
of this change becomes significant as the number of tracked object inodes increases.
Directory inodes are always tracked; when executing in UNIX95 mode with the FTW_PHYS
option not set, all files and directories are tracked. When the number of tracked
objects reaches about 20,000, user CPU time with the splay tree is about half that of
the old nftw(). At 100,000 tracked inodes, user CPU time is about 90% lower with the
splay tree.
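To make the idea of "tracked" objects concrete, the following is a minimal sketch of how a walker might remember which objects it has already seen, keyed on the (st_dev, st_ino) pair from stat(). It is not the libc implementation: it uses the standard tsearch() routine rather than a splay tree, and the names obj_id, cmp_obj, and track_object are hypothetical.

```c
#include <search.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical key: a file system object is identified by device + inode. */
struct obj_id {
    dev_t dev;
    ino_t ino;
};

/* Ordering function for the search tree. */
static int cmp_obj(const void *a, const void *b)
{
    const struct obj_id *x = a;
    const struct obj_id *y = b;

    if (x->dev != y->dev)
        return x->dev < y->dev ? -1 : 1;
    if (x->ino != y->ino)
        return x->ino < y->ino ? -1 : 1;
    return 0;
}

/*
 * Record an object in the tree rooted at *root.
 * Returns 1 if the object is new, 0 if it was already tracked,
 * and -1 if memory could not be allocated.
 * Assumption: keys are never removed, so the tree is not freed here.
 */
static int track_object(void **root, dev_t dev, ino_t ino)
{
    struct obj_id *key = malloc(sizeof *key);
    void *node;

    if (key == NULL)
        return -1;
    key->dev = dev;
    key->ino = ino;

    node = tsearch(key, root, cmp_obj);
    if (node == NULL) {               /* tsearch could not add a node */
        free(key);
        return -1;
    }
    if (*(struct obj_id **)node != key) {
        free(key);                    /* an equal key was already present */
        return 0;
    }
    return 1;
}
```

A walker built this way would report an object to the user function only when track_object() returns 1. The cost of each lookup depends on the shape of the tree, which is why the choice of tree matters once tens of thousands of inodes are tracked.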

Another performance improvement to nftw() eliminated calls to access() by

checking the mode bits in the stat() buffer. This decreased system CPU time by

approximately 4%.

Two defects were fixed in nftw():

— When the FTW_CHDIR option is set, directories are considered unreadable unless

they have both read and execute permissions. (The old nftw() would try to
chdir() into a directory without execute permission and then abort the walk
with an error.)

— When the FTW_CHDIR option is set, a directory object is reported to the user

function before it is chdir()'ed into.

nftw() improvements vary depending on the options provided. The most significant
improvements are seen in UNIX95 standard mode with the FTW_PHYS option not set, or
when the file tree being traversed contains a very large number of directories.
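As an illustration of the usage described above, here is a minimal sketch of a walk that starts from an absolute path and does not set FTW_CHDIR. The names walk_totals, tally, and walk_tree are hypothetical. The counters are kept in a file-scope variable because nftw() passes no user data to the callback; a threaded program would need per-thread storage instead.

```c
#define _XOPEN_SOURCE 500   /* commonly needed to expose nftw() */
#include <ftw.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical result record for one walk. */
struct walk_totals {
    long files;        /* objects reported as FTW_F */
    long dirs;         /* objects reported as FTW_D */
    long unreadable;   /* directories reported as FTW_DNR */
};

/* File-scope state: not thread-safe, kept simple for the sketch. */
static struct walk_totals totals;

/* User function called by nftw() once for each reported object. */
static int tally(const char *path, const struct stat *sb,
                 int type, struct FTW *ftwbuf)
{
    (void)path;
    (void)sb;
    (void)ftwbuf;

    switch (type) {
    case FTW_F:   totals.files++;      break;
    case FTW_D:   totals.dirs++;       break;
    case FTW_DNR: totals.unreadable++; break;
    default:                           break;
    }
    return 0;   /* zero tells nftw() to continue the walk */
}

/*
 * Walk the tree rooted at 'start', which must be an absolute path.
 * Returns 0 and fills *out on success; returns -1 if the path is not
 * absolute or if nftw() reports an error.
 */
static int walk_tree(const char *start, struct walk_totals *out)
{
    if (start == NULL || start[0] != '/')
        return -1;

    totals = (struct walk_totals){0, 0, 0};

    /* flags = 0: neither FTW_PHYS nor FTW_CHDIR is set. */
    if (nftw(start, tally, 16, 0) != 0)
        return -1;

    *out = totals;
    return 0;
}
```

With FTW_CHDIR unset, the walk does not change the working directory of the process, which is shared by all threads. A directory that cannot be read is reported to the user function as FTW_DNR rather than being descended into.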

Impact

The code size of ftw() and nftw() has increased by about 40%, but the heap

requirements are reduced by 50% or more.

At a minimum, you should find that ftw() operates about 6% faster and nftw() about 4%
faster. On very large file trees, where the number of tracked inodes is in the tens of
thousands or more, the performance gain of nftw() could be 30% to 40% or more.

If your application relied on either of the FTW_CHDIR defects described above, it may
need to be changed.

Documentation

The ftw(3C) and nftw(3C) manpages have been updated, particularly with respect to
the two defect fixes and the means of achieving the best concurrency in threaded
applications.

Performance Improvements to libc’s malloc()

A new environment variable, _M_CACHE_OPTS, is available to help tune malloc()
performance in kernel-threaded applications. This environment variable configures a
thread-private cache for malloc’ed blocks. If a cache is configured, malloc’ed blocks
are placed into a thread's private cache when free() is called, and may thereafter be
allocated from the cache when malloc() is called. Having such a cache potentially improves