
pthread(3T)
(Pthread Library)
When the application is executed, it produces a per-thread file of pthread events. This is used as input to
the ttv thread trace visualizer facility available in the HP/PAK performance application kit.
The following environment variables control the trace data files:
THR_TRACE_DIR
Where to place the trace data files. If this is not defined, the files go to the current working directory.
THR_TRACE_ASYNC
By default, trace records are buffered and only written to the file when the buffer is full. If this variable is set to any non-NULL value, data is immediately written to the trace file.
THR_TRACE_EVENTS
By default, all pthread events are traced. If this variable is defined, only the categories listed will be traced. Categories are separated by a ':'. The possible trace categories are:
thread:cond:mutex:rwlock
For example, to trace only thread and mutex operations, set the THR_TRACE_EVENTS variable to:
thread:mutex
Details of the trace file record format can be found in /usr/include/sys/trace_thread.h.
See the ttv(1) manpage and its built-in graphical help system for more information on using the trace information.
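The following minimal C sketch (not part of this manual) shows the kind of program these variables act upon. The shell invocation in the leading comment and the /tmp/traces directory are illustrative assumptions, and whatever compile or link options tracing requires are assumed to be in effect.

      /*
       * Minimal sketch of a program to be run under pthread tracing.
       * The environment variables are assumed to be exported from the
       * shell before the program starts, for example (paths illustrative):
       *
       *     THR_TRACE_DIR=/tmp/traces THR_TRACE_EVENTS=thread:mutex ./a.out
       *
       * The resulting per-thread trace files can then be examined with ttv(1).
       */
      #include <pthread.h>
      #include <stdio.h>

      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      static int counter;

      static void *worker(void *arg)
      {
          /* Mutex operations here generate "mutex" trace events; thread
             creation, termination, and join generate "thread" events. */
          pthread_mutex_lock(&lock);
          counter++;
          pthread_mutex_unlock(&lock);
          return NULL;
      }

      int main(void)
      {
          pthread_t tid[4];
          int i;

          for (i = 0; i < 4; i++)
              if (pthread_create(&tid[i], NULL, worker, NULL) != 0)
                  return 1;
          for (i = 0; i < 4; i++)
              pthread_join(tid[i], NULL);

          printf("counter = %d\n", counter);
          return 0;
      }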
PERFORMANCE CONSIDERATIONS
Often, an application is designed to be multithreaded to improve performance over its single-threaded
counterparts. However, the multithreaded approach requires some attention to issues not always of concern in the single-threaded case. These are issues traditionally associated with the programming of multiprocessor systems.
The design must employ a lock granularity appropriate to the data structures and access patterns.
Coarse-grained locks, which protect relatively large amounts of data, can lead to undesired lock contention, reducing the potential parallelism of the application. On the other hand, employing very fine-grained locks, which protect very small amounts of data, can consume processor cycles with too much locking activity.
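As an illustration of this trade-off, the sketch below protects a small counter table first with a single coarse-grained mutex and then with one mutex per bucket; the table size and all names are illustrative only.

      #include <pthread.h>

      #define NBUCKETS 64

      /* Coarse-grained: one mutex protects the whole table.  Simple,
         but every update serializes against every other update. */
      static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
      static long coarse_table[NBUCKETS];

      void coarse_increment(int i)
      {
          pthread_mutex_lock(&table_lock);
          coarse_table[i]++;
          pthread_mutex_unlock(&table_lock);
      }

      /* Fine-grained: one mutex per bucket.  Updates to different
         buckets proceed in parallel, at the cost of more locking
         overhead and more locks to initialize and maintain. */
      static struct {
          pthread_mutex_t lock;
          long            value;
      } fine_table[NBUCKETS];

      void fine_table_init(void)
      {
          int i;

          for (i = 0; i < NBUCKETS; i++)
              pthread_mutex_init(&fine_table[i].lock, NULL);
      }

      void fine_increment(int i)
      {
          pthread_mutex_lock(&fine_table[i].lock);
          fine_table[i].value++;
          pthread_mutex_unlock(&fine_table[i].lock);
      }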
The use of thread-specific data (TSD) or thread-local storage (TLS) must be traded off, as described above
(see THREAD-SPECIFIC DATA).
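For example, a minimal TSD sketch using the standard pthread_key_create(), pthread_getspecific(), and pthread_setspecific() interfaces might look as follows; the buffer size and function names are illustrative only.

      #include <pthread.h>
      #include <stdlib.h>

      static pthread_key_t  buf_key;
      static pthread_once_t buf_once = PTHREAD_ONCE_INIT;

      static void buf_destroy(void *p)
      {
          free(p);              /* runs at thread exit, once per thread */
      }

      static void buf_key_create(void)
      {
          pthread_key_create(&buf_key, buf_destroy);
      }

      /* Return this thread's private 256-byte buffer, creating it on
         first use.  Each lookup costs a pthread_getspecific() call,
         but no locking is needed on the buffer itself because no
         other thread can reach it. */
      char *thread_buffer(void)
      {
          char *buf;

          pthread_once(&buf_once, buf_key_create);
          buf = pthread_getspecific(buf_key);
          if (buf == NULL) {
              buf = malloc(256);
              if (buf != NULL)
                  pthread_setspecific(buf_key, buf);
          }
          return buf;
      }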
Mutex spin and yield frequency attributes can be used to tune mutex behavior to the application. See
pthread_mutexattr_setspin_np(3T) and pthread_mutex_setyieldfreq_np(3T) for more information.
The default stacksize attribute can be set to improve system thread caching behavior. See
pthread_default_stacksize_np(3T) for more information.
Because multiple threads are actually running simultaneously, they can be accessing the same data from multiple processors. The hardware processors coordinate their caching of data such that no processor is using stale data. When one processor accesses the data (especially for write operations), the other processors must flush the stale data from their caches. If multiple processors repeatedly read/write the same data, this can lead to cache thrashing, which slows execution of the instruction stream. This can also occur when threads access separate data items which just happen to reside in the same hardware-cachable unit (called a cache line). This latter situation is called false sharing, which can be avoided by spacing data such that popular items are not stored close together.
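One common way to avoid false sharing is to pad frequently updated per-thread items out to a full cache line, as in the sketch below. The 64-byte line size and all names are assumptions; the actual line size is processor-dependent.

      #define CACHE_LINE_SIZE 64
      #define NTHREADS        8

      /* Packed layout: counters for different threads share cache
         lines, so frequent updates from different processors keep
         invalidating each other's cached copies. */
      static long packed_counters[NTHREADS];

      void count_event_packed(int thread_index)
      {
          packed_counters[thread_index]++;
      }

      /* Padded layout: each counter occupies its own cache line, so
         updates from different processors stay in separate lines.
         (For simplicity this sketch does not force the array itself
         onto a cache-line boundary.) */
      static union {
          long value;
          char pad[CACHE_LINE_SIZE];
      } padded_counters[NTHREADS];

      void count_event_padded(int thread_index)
      {
          padded_counters[thread_index].value++;
      }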
GLOSSARY
The following definitions were extracted from the text ThreadTime by Scott J. Norton and Mark D.
DiPasquale, Prentice-Hall, ISBN 0-13-190067-6, 1996.
Application Programming Interface (API)
An interface is the conduit that provides access to an entity or communication between entities. In the programming world, an interface describes how access to (or communication with) a function should take place. Specifically, the number of parameters and their names and purposes describe how to access a function. An API is the facility that provides access to a function.