Common Misconfigured HP-UX Resources (April 2006)

Flushing the buffer cache: the syncer program
The syncer program is the process that flushes delayed write buffers to the physical disk.
Naturally, the larger the buffer cache, the more work that must be done by the syncer
program. The HP-UX 11.0 syncer is single threaded. It wakes up periodically and
sequentially scans the hash table for blocks that need to be written to the physical device. The
default syncer interval is 30 seconds, which means that every 30 seconds the entire buffer
cache hash table is scanned for delayed write blocks. The syncer program runs five times
during the syncer interval, scanning one-fifth of the buffer cache each time.
HP-UX 11i v1 and HP-UX 11i v2 handle this more efficiently: the syncer program is
multithreaded, with one thread per CPU. Each CPU has its own dirty list, and each syncer
thread is responsible for flushing buffers from its own dirty list. This improves buffer cache
scaling, since only dirty buffers are scanned and each thread works from its own list, which
prevents contention around a single shared list.
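As a hedged illustration (paths and syntax are typical for HP-UX 11.x but should be verified
against syncer(1M) on your system), the syncer interval is simply the optional argument given
to the syncer program when it is started, normally from its boot-time init script:

   # show the running syncer and the interval it was started with
   ps -ef | grep syncer

   # start syncer with a 60-second interval instead of the default 30
   # (60 is an illustrative value, not a recommendation)
   /usr/sbin/syncer 60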
Other sync operations
Various system operations require that dirty blocks in the buffer cache be written to disk
before the operation completes. Examples of such operations include the last close of a file, the
unmount of a file system, and a sync system call. These operations are independent of the
syncer program and must traverse the entire buffer cache looking for blocks that need to be
written to the device. Once the operation completes, it must traverse the buffer cache hash
table again, invalidating the buffers that were flushed. These traversals of the buffer cache
hash chains can take time, particularly if there is contention around the hash lock.
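For example, both of the following trigger this kind of full flush (the mount point is a
placeholder used only for illustration):

   sync               # sync system call: flush all dirty buffers in the cache
   umount /mnt/data   # unmount: flush, then invalidate, the file system's buffers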
I/O throttling
Beyond the cost of walking the hash chains and locking/unlocking the hash locks, a larger buffer
cache is likely to contain a larger number of dirty buffers that need to be flushed to disk. This
can cause large amounts of write I/O to be queued to the disk
during sync operations. A read request could get delayed behind the writes and cause an
application delay. Flushes of the buffer cache can be throttled, limiting the number of buffers
that can be enqueued to a disk at one time. By default, throttling is turned off. For JFS 3.1,
you can enable throttling, provided the PHKL_27070 patch is installed, by setting vx_nothrottle to
0. This alleviates read starvation at the cost of slower sync operations, such as unmounting a file
system. For JFS 3.3 and later, you can control the amount of data flushed to a disk during
sync operations via the vxtunefs max_diskq parameter.
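As a sketch, assuming a VxFS file system mounted at the placeholder /mnt/data (verify the
syntax and units against vxtunefs(1M)), the tunable can be displayed and changed online:

   # display the current VxFS tunables for the file system
   vxtunefs /mnt/data

   # limit the data queued to the disk during sync flushes to 1 MB
   # (max_diskq is given in bytes; 1048576 is an illustrative value)
   vxtunefs -o max_diskq=1048576 /mnt/data

To make such a setting persist across mounts, it can also be recorded in /etc/vx/tunefstab.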
Write throttling
Setting max_diskq to throttle the flushing of dirty buffers has a disadvantage. Processes that
perform sync operations, such as umount or bdf, can stall since the writes are throttled.
Setting max_diskq does not prevent applications from continuing to perform asynchronous
writes. If large files are being written, it is possible for dirty buffers to exhaust the entire buffer
cache, which can delay reads or writes from other critical applications.
With JFS 3.5, a new tunable was introduced: write_throttle. This controls the number of
dirty buffers a single file can have outstanding. If an application attempts to write faster than
the data can be written to disk and the write_throttle limit has been reached, the
application waits until enough of the data has been written to disk for the file's outstanding
dirty data to fall back below the write_throttle amount.
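A minimal sketch, again assuming a VxFS (JFS 3.5) file system mounted at the placeholder
/mnt/data; the value is illustrative only, and the units and default are described in vxtunefs(1M):

   # cap the amount of outstanding dirty data a single file may have;
   # the default of 0 places no per-file limit
   vxtunefs -o write_throttle=16384 /mnt/data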
Large I/O
The maximum size of a buffer page is 64 KB. For I/O requests larger than 64 KB, the request
must be broken down into multiple 64-KB I/O requests. Therefore, reading 256 KB from disk
results in four separate 64-KB I/O requests.