HP SVA V2.1 Parallel Compositing Reference Guide


Relating this back to GLUT, the master uses GLUT very much as you would expect in a non-distributed application. The slaves use GLUT differently: they run in full-screen mode, declare a display function, and set their idle function to post a redisplay.

Techniques to Maximize Performance

• Best Pixel Format

For depth compositing, the best-performing pixel format in this release is PC_PF_BGR|PC_PF_Z32I.

For alpha compositing, the best-performing pixel format is PC_PF_BGRA8.

• Avoid Transferring Unnecessary Data

The Library lets you control what information is transmitted to hosts with outputs: color only, or both color and depth. Setting the PC_OUTPUT_DEPTH context property to 0 (the default value) reduces network traffic to the output node by 50% compared to setting this property to 1.

Do not ask for output on nodes that do not display output.

Minimizing the size of the framelets generated by a host is another good way to reduce network traffic. One technique is to calculate a bounding rectangle around the area of the frame in which the host has rendered non-background pixels, and to send only that rectangle. The boundingbox and multiple-framelets samples both do this.
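The bounding-rectangle idea can be sketched in plain C. This is only an illustration, not the code used by the boundingbox or multiple-framelets samples: it assumes the rendered frame is available as a tightly packed array of 32-bit pixels, and the Rect type and bounding_rect function are hypothetical names that are not part of the Library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rectangle type; not part of the Library's API. */
typedef struct {
    int x, y;          /* origin of the rectangle in frame coordinates */
    int width, height; /* both zero when the host rendered nothing */
} Rect;

/*
 * Scan a tightly packed frame of 32-bit pixels and return the smallest
 * rectangle enclosing every pixel that differs from `background`.
 */
Rect bounding_rect(const uint32_t *pixels, int frame_w, int frame_h,
                   uint32_t background)
{
    assert(pixels != NULL && frame_w > 0 && frame_h > 0);

    int min_x = frame_w, min_y = frame_h, max_x = -1, max_y = -1;

    for (int y = 0; y < frame_h; ++y) {
        for (int x = 0; x < frame_w; ++x) {
            if (pixels[(size_t)y * frame_w + x] != background) {
                if (x < min_x) min_x = x;
                if (x > max_x) max_x = x;
                if (y < min_y) min_y = y;
                if (y > max_y) max_y = y;
            }
        }
    }

    Rect r = {0, 0, 0, 0};
    if (max_x >= 0) { /* at least one non-background pixel was found */
        r.x = min_x;
        r.y = min_y;
        r.width = max_x - min_x + 1;
        r.height = max_y - min_y + 1;
    }
    return r;
}
```

Scanning the pixels costs one pass over the frame. An application that already knows the screen-space extent of its geometry could derive the rectangle from that instead and skip the scan.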

• Use the PC_HP_frame_output Extension Function If Possible

The HP Library provides two functions for returning pixel data:

— pcFrameResultChannel

— pcFrameWaitOutputHP

If you use pcFrameResultChannel to get composited data, the call blocks until the pixels for the specified rectangle become available. If the Library processes the blocks in a different order from the one in which you request the results, your code stays blocked even though other blocks may already be complete. Use this function only if you really need the pixels in a specific order (for example, when writing them to a file).

In most cases you only want to display the pixels with glDrawPixels, which is a costly call, so any time spent waiting in pcFrameResultChannel adds directly to the latency of the application. If you use pcFrameWaitOutputHP instead, you receive blocks of composited pixels as they become available. The glDrawPixels call for one block then proceeds in parallel with the Library processing the next block, which reduces the overall latency of the application.

Additionally, you may choose to use bounding boxes to minimize data transfer. In this case, the framelets drawn by the hosts might not cover the entire area of the frame (bounded by PC_FRAME_WIDTH and PC_FRAME_HEIGHT), and pcFrameWaitOutputHP returns pixels only for the regions defined by the framelets. The undefined regions are not returned. One way to handle the undefined regions is to clear the display with the background color before drawing the pixels returned by pcFrameWaitOutputHP.
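The clear-then-draw approach can be sketched with a CPU-side buffer standing in for the display. This is a sketch under stated assumptions: the Block type and both functions are hypothetical and do not reflect the actual data returned by pcFrameWaitOutputHP. In a real application the clear would typically be a glClear call and each block would be drawn with glDrawPixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one returned block of composited pixels. */
typedef struct {
    int x, y, width, height; /* position and size within the frame */
    const uint32_t *pixels;  /* width * height tightly packed pixels */
} Block;

/* Fill the whole frame with the background color. */
void clear_frame(uint32_t *frame, int frame_w, int frame_h,
                 uint32_t background)
{
    assert(frame != NULL && frame_w > 0 && frame_h > 0);
    for (size_t i = 0; i < (size_t)frame_w * (size_t)frame_h; ++i)
        frame[i] = background;
}

/* Copy one block into the frame; pixels outside the block are untouched. */
void draw_block(uint32_t *frame, int frame_w, int frame_h, const Block *b)
{
    assert(frame != NULL && b != NULL && b->pixels != NULL);
    assert(b->x >= 0 && b->y >= 0);
    assert(b->x + b->width <= frame_w && b->y + b->height <= frame_h);

    for (int row = 0; row < b->height; ++row)
        for (int col = 0; col < b->width; ++col)
            frame[(size_t)(b->y + row) * frame_w + (b->x + col)] =
                b->pixels[(size_t)row * b->width + col];
}
```

Because the frame is cleared once before any block is drawn, regions that no framelet covers keep the background color regardless of the order in which blocks arrive.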

• Use Sockets Optimized for Latency Rather Than Throughput

If you use a socket to communicate with the other hosts, disable nagling on the socket to get the best performance. Nagling (Nagle's algorithm) is usually enabled by default on TCP sockets: a small message is held back until enough data has accumulated to fill a packet or until previously sent data has been acknowledged. If your application sends short packets of data, they can be held until that acknowledgment arrives, which causes large and unnecessary delays in the data transfer of your application.

Setting the TCP_NODELAY option on your socket disables nagling and gives better performance. For example,
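A minimal sketch of setting that option, assuming a POSIX sockets environment. The helper names disable_nagle and open_low_latency_socket are illustrative only and are not part of the Library; setsockopt and TCP_NODELAY are the standard sockets interface.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Disable Nagle's algorithm on an existing TCP socket.
 * Returns 0 on success, or -1 with errno set on failure.
 */
int disable_nagle(int sock)
{
    assert(sock >= 0);
    int flag = 1; /* non-zero turns TCP_NODELAY on */
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/*
 * Create a TCP socket with nagling already disabled.
 * Returns the socket descriptor, or -1 on failure.
 */
int open_low_latency_socket(void)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;
    if (disable_nagle(sock) != 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

The option is set per socket, so it has to be applied to every socket that carries short, latency-sensitive messages.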
