User guide
The view model editor
Data Integration with Sybase Avaki Studio
Performance of sort-based operators
Several operators must sort their inputs. These include Order By, Join (only when
using the Sort Merge algorithm), Group By, and Intersection, as well as any operator
that uses the “Distinct” option to eliminate duplicate rows. You should be aware of two
factors that can affect sort performance: sort chunk size and the location of temporary
files for sorts. These are discussed below.
Note: This section is for advanced users who want to improve the performance of
data services deployed from view models. Many users will not need to adjust
these parameters.
Sort chunk size
To enable the sort-based operators to handle arbitrarily large result sets, Avaki breaks
the inputs into chunks and uses the local disk as a temporary backing store to hold
intermediate results. These intermediate results are cleaned up when the operation
completes.
There is a tradeoff between the amount of memory that an Avaki grid server uses for
its computation and the amount of I/O that it must perform to sort or join large result
sets. Avaki Studio can break the inputs into a large number of small chunks or into a
smaller number of larger chunks. A large number of small chunks requires less memory
at any given time, but results in more disk I/O. A small number of large chunks
results in higher peak memory usage, but less disk I/O overall.
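The tradeoff described above can be illustrated with a minimal sketch of a chunked sort. This is not Avaki's implementation. The function name chunkedSort and its return shape are hypothetical, rows are plain numbers, and in-memory arrays stand in for the temporary files that a real sort would write to disk.

```javascript
// Illustrative sketch only; not Avaki's implementation.
// Each "run" stands in for one temporary file written to the backing store.
function chunkedSort(rows, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }

  // Phase 1: sort one chunk at a time, so at most chunkSize rows
  // need to be held in memory while sorting.
  const runs = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    runs.push(rows.slice(i, i + chunkSize).sort((a, b) => a - b));
  }

  // Phase 2: merge the sorted runs by repeatedly taking the smallest
  // row at the head of any run. More runs means more reads per merge.
  const positions = new Array(runs.length).fill(0);
  const sorted = [];
  while (sorted.length < rows.length) {
    let best = -1;
    for (let r = 0; r < runs.length; r++) {
      if (positions[r] < runs[r].length &&
          (best === -1 || runs[r][positions[r]] < runs[best][positions[best]])) {
        best = r;
      }
    }
    sorted.push(runs[best][positions[best]]);
    positions[best] += 1;
  }

  return { sorted: sorted, runCount: runs.length };
}
```

In this sketch, sorting six rows with a chunk size of 2 produces three runs, while a chunk size of 6 produces a single run: fewer intermediate results to write and merge, at the cost of holding more rows in memory at once.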
By default, a view model operates with a chunk size of 10,000 rows. You can override
this default by declaring the following in your model’s .jsi file:
var overrideSortChunkSize = <new_chunk_size_value>;
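As an illustration, the declaration with a concrete value might look like the following. The value 50000 is a hypothetical example, not a recommendation. An appropriate value depends on the memory available on your grid server.

```javascript
// Hypothetical value: sort in chunks of 50,000 rows instead of the default 10,000.
var overrideSortChunkSize = 50000;
```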
Providing enough space for temporary sort files
When a data service sorts data, it writes temporary files to the temp directory specified
by the java.io.tmpdir system property on the grid server where the data service is
running. The default location is <Avaki-install-dir>/jboss/server/grid-server/tmp. (See the
Sybase Avaki EII Administration Guide for information on setting system properties.)
The temporary sort files can be quite large. If you create a data service that sorts large
result sets, be sure that the temp directory has enough disk space to write large sort
files. If there isn’t enough disk space, the data service execution will fail because the
sort operation is unable to finish.