HP-UX 11i Release Notes (December 2000)

Programming

Libraries

Chapter 13

More APIs in libc may make use of the fastcall technology in future releases. Appropriate changes to the header files will be delivered to track these changes.

Performance Improvements to libc’s ftw(3C) and nftw(3C)

The libc functions ftw() and nftw() have been rewritten to operate faster, avoid stack overflow conditions, reduce data space usage, and improve parallelism in multi-threaded applications.

libc and the commands that call ftw() and nftw() are affected.

ftw()

ftw() was rewritten to eliminate internal recursion, thus avoiding the possibility of a stack overflow on deep file trees. A single fixed-size data structure is allocated on the stack instead of using malloc() to allocate separate buffers for each depth of the tree. Use of strlen() was eliminated, as were trivial comparisons such as strcmp(buf,"."). The file descriptor re-use algorithm was changed from most-recently-opened to least-recently-opened, which can show significant performance gains on very deep file trees.

ftw() will typically show an 8% reduction in elapsed time and a reduction of 50% or more in heap space used.
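For readers unfamiliar with the interface, the following is a minimal sketch of a caller of ftw(), not part of the libc change itself. The names tree_size, add_size, and total_bytes are hypothetical. The third argument to ftw() is the caller's limit on how many file descriptors the walk may hold open at once, which is the resource the descriptor re-use algorithm described above manages.

```c
#include <ftw.h>
#include <sys/stat.h>

/* ftw() passes no user pointer to the callback, so this sketch keeps
 * its accumulator at file scope. That makes it unsafe to call from
 * several threads at once. */
static long long total_bytes;

/* Callback: add the size of each regular file that could be stat()ed. */
static int add_size(const char *path, const struct stat *sb, int type)
{
    (void)path;
    if (type == FTW_F && S_ISREG(sb->st_mode))
        total_bytes += (long long)sb->st_size;
    return 0;               /* a nonzero return would stop the walk */
}

/* Hypothetical helper: total size in bytes of the regular files under
 * root, or -1 if ftw() reports an error. max_fds caps the number of
 * descriptors ftw() may keep open while descending. */
long long tree_size(const char *root, int max_fds)
{
    total_bytes = 0;
    if (ftw(root, add_size, max_fds) != 0)
        return -1;
    return total_bytes;
}
```

A call such as tree_size("/var/adm", 8) would walk that tree using at most eight descriptors. On a tree deeper than the limit, ftw() must close and later reopen directories, which is where the choice of re-use policy matters.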

nftw()

nftw() was rewritten similarly to ftw(), with the same benefits. nftw() now fully conforms to the UNIX95 definition, including the requirement that files are reported only once when FTW_PHYS is not set.
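The effect of the FTW_PHYS flag can be explored with a small caller such as the sketch below. It is an illustration only. The names count_entries, count_entry, and entries_seen are hypothetical, and the _XOPEN_SOURCE definition is an assumption about what the build environment needs in order to expose nftw().

```c
#define _XOPEN_SOURCE 500   /* assumption: requests the nftw() declaration */
#include <ftw.h>
#include <sys/stat.h>

/* File-scope counter, because nftw() passes no user pointer either.
 * Not safe for concurrent use from several threads. */
static long entries_seen;

/* Callback: count every object nftw() reports. */
static int count_entry(const char *path, const struct stat *sb,
                       int type, struct FTW *info)
{
    (void)path; (void)sb; (void)type; (void)info;
    entries_seen++;
    return 0;               /* keep walking */
}

/* Hypothetical helper: number of objects reported under root, or -1 on
 * error. With physical != 0 the walk sets FTW_PHYS and does not follow
 * symbolic links; with physical == 0 links are followed. */
long count_entries(const char *root, int physical)
{
    int flags = physical ? FTW_PHYS : 0;

    entries_seen = 0;
    if (nftw(root, count_entry, 16, flags) != 0)
        return -1;
    return entries_seen;
}
```

Comparing count_entries(path, 1) with count_entries(path, 0) on a tree that contains symbolic links shows how the two modes differ in what they report.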

Threaded applications can obtain greater concurrency when they specify an absolute path name for the starting path and FTW_CHDIR is not set. In addition, an internal unbalanced binary tree was replaced with a much more efficient splay tree. The effect of this tree change becomes significant as the number of object inodes being tracked increases. Directory inodes are always tracked, and when executing in UNIX95 mode with the FTW_PHYS option not set, all files and directories are tracked. When the number of tracked objects reaches about 20,000, the user CPU time with the splay tree is about half the user CPU time for the old nftw(). At 100,000 tracked inodes, the user CPU time is about 90% less for the splay tree.
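The idea of tracking visited inodes in a splay tree can be sketched as follows. This is not the libc implementation, whose internals are not described here. It is a generic top-down splay tree keyed on the (device, inode) pair, and the names inode_node, tracked_root, key_cmp, splay, and seen_before are all hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>

/* One tracked object, keyed by (device, inode). */
struct inode_node {
    dev_t dev;
    ino_t ino;
    struct inode_node *left, *right;
};

static struct inode_node *tracked_root;   /* hypothetical global set */

/* Order keys by device first, then by inode number. */
static int key_cmp(dev_t dev, ino_t ino, const struct inode_node *n)
{
    if (dev != n->dev) return dev < n->dev ? -1 : 1;
    if (ino != n->ino) return ino < n->ino ? -1 : 1;
    return 0;
}

/* Top-down splay: bring the matching node, or the last node visited,
 * to the root of the tree t. */
static struct inode_node *splay(dev_t dev, ino_t ino, struct inode_node *t)
{
    struct inode_node header, *l, *r, *y;
    int c;

    if (t == NULL) return NULL;
    header.left = header.right = NULL;
    l = r = &header;
    for (;;) {
        c = key_cmp(dev, ino, t);
        if (c < 0) {
            if (t->left == NULL) break;
            if (key_cmp(dev, ino, t->left) < 0) {       /* rotate right */
                y = t->left; t->left = y->right; y->right = t; t = y;
                if (t->left == NULL) break;
            }
            r->left = t; r = t; t = t->left;            /* link right */
        } else if (c > 0) {
            if (t->right == NULL) break;
            if (key_cmp(dev, ino, t->right) > 0) {      /* rotate left */
                y = t->right; t->right = y->left; y->left = t; t = y;
                if (t->right == NULL) break;
            }
            l->right = t; l = t; t = t->right;          /* link left */
        } else {
            break;
        }
    }
    l->right = t->left;                                 /* reassemble */
    r->left = t->right;
    t->left = header.right;
    t->right = header.left;
    return t;
}

/* Returns 1 if (dev, ino) was already tracked, 0 if it was newly
 * recorded, and -1 if memory could not be allocated. */
int seen_before(dev_t dev, ino_t ino)
{
    struct inode_node *n;
    int c = 0;

    if (tracked_root != NULL) {
        tracked_root = splay(dev, ino, tracked_root);
        c = key_cmp(dev, ino, tracked_root);
        if (c == 0) return 1;
    }
    n = malloc(sizeof *n);
    if (n == NULL) return -1;
    n->dev = dev;
    n->ino = ino;
    if (tracked_root == NULL) {
        n->left = n->right = NULL;
    } else if (c < 0) {             /* new key sorts before the root */
        n->left = tracked_root->left;
        n->right = tracked_root;
        tracked_root->left = NULL;
    } else {                        /* new key sorts after the root */
        n->right = tracked_root->right;
        n->left = tracked_root;
        tracked_root->right = NULL;
    }
    tracked_root = n;
    return 0;
}
```

Inode numbers tend to arrive in nearly sorted runs during a walk, which degrades an unbalanced binary tree toward a linked list. A splay tree restructures itself on every access and keeps the amortized cost logarithmic, which is consistent with the gap widening as the tracked count grows.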

Another performance improvement to nftw() eliminated calls to access() by checking the mode bits in the stat() buffer. This decreased system CPU time by approximately 4%.
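A simplified sketch of answering a readability question from information already in the stat() buffer, rather than making a separate access() system call, might look like this. The function mode_allows_read is hypothetical and rests on assumptions that real code would have to revisit: it ignores supplementary groups and access control lists, and it treats user ID 0 as always permitted.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Decide from st_mode, st_uid, and st_gid whether a process with the
 * given uid and gid may read the object. Only the first matching
 * permission class applies, as in the traditional UNIX check.
 * Assumptions: no supplementary groups, no ACLs, uid 0 may read. */
int mode_allows_read(mode_t mode, uid_t owner, gid_t group,
                     uid_t uid, gid_t gid)
{
    if (uid == 0)
        return 1;                       /* superuser */
    if (uid == owner)
        return (mode & S_IRUSR) != 0;   /* owner class */
    if (gid == group)
        return (mode & S_IRGRP) != 0;   /* group class */
    return (mode & S_IROTH) != 0;       /* everyone else */
}
```

A walker that has already called stat() on an object could pass sb.st_mode, sb.st_uid, and sb.st_gid together with the process's effective IDs, saving one system call per object.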