User guide
Data Integration with Sybase Avaki Studio 121
Group By
The Group By operator is very similar to the Aggregate operator. Its output includes
columns generated with the same aggregate functions. The difference is that the
Aggregate operator applies these functions to the entire set of input rows, whereas
the Group By operator divides the input rows into groups based on criteria you specify
and then applies the aggregate functions to the rows in each group. The result is
one output row for each group, rather than the Aggregate operator’s single row of
output for the entire result set.
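As a rough illustration of this difference outside Avaki Studio, the following Python
sketch applies a sum first to all rows and then per group. The row data, the SALARY and
TOTAL_SALARY column names, and the function names are assumptions made for the example;
they are not part of the product.

```python
from collections import defaultdict

# Hypothetical input rows; the column names are illustrative only.
rows = [
    {"DEPT_NO": 10, "SALARY": 50000},
    {"DEPT_NO": 10, "SALARY": 70000},
    {"DEPT_NO": 20, "SALARY": 40000},
]

def aggregate(rows):
    """Aggregate-style: a single output row for the whole input."""
    return [{"TOTAL_SALARY": sum(r["SALARY"] for r in rows)}]

def group_by(rows, key):
    """Group By-style: one output row per distinct value of `key`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return [
        {"TOTAL_SALARY": sum(r["SALARY"] for r in members)}
        for members in groups.values()
    ]

print(aggregate(rows))            # one row for all input
print(group_by(rows, "DEPT_NO"))  # one row per department
```

With this input, the first call yields one row and the second yields two, one for each
distinct grouping value.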
The aggregate columns in the Group By operator are defined identically to those in
the Aggregate operator. See “Aggregate” on page 108 for details. To define the grouping,
pick one or more columns from the input source. A group consists of all input rows
whose values for these columns are identical. For example, if your input describes
employees and contains a column named DEPT_NO that holds each employee’s department
number, you can group by that column and then obtain aggregate information for the
employees in each department. If you specify more than one group-by column, rows belong
to the same group only if they have the same values in all of those columns.
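Grouping on more than one column can be pictured as keying each row by the combination
of its group-by values. The sketch below is a minimal Python illustration of that idea,
not the operator's implementation; the JOB_TITLE column and the group_rows helper are
hypothetical.

```python
from collections import defaultdict

# Hypothetical employee rows; JOB_TITLE is an invented second column.
employees = [
    {"DEPT_NO": 10, "JOB_TITLE": "Engineer"},
    {"DEPT_NO": 10, "JOB_TITLE": "Engineer"},
    {"DEPT_NO": 10, "JOB_TITLE": "Analyst"},
    {"DEPT_NO": 20, "JOB_TITLE": "Engineer"},
]

def group_rows(rows, group_columns):
    """Collect rows under a key built from all group-by column values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in group_columns)
        groups[key].append(row)
    return dict(groups)

groups = group_rows(employees, ["DEPT_NO", "JOB_TITLE"])
group_sizes = {key: len(members) for key, members in groups.items()}
print(group_sizes)
```

Here the two engineers in department 10 share a group, while the analyst in the same
department and the engineer in department 20 each form a group of their own.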
When defining a group-by column, you also have the option of including that column in
the output result set. In the example above, you could therefore choose to include the
DEPT_NO column in the result set. If you do not, the output contains only the columns
calculated by the aggregate functions.
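The effect of this option on the output schema might look like the following Python
sketch. The include_column flag, the EMP_COUNT column, and the count_by function are
hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical input: three employees in two departments.
staff = [{"DEPT_NO": 10}, {"DEPT_NO": 10}, {"DEPT_NO": 20}]

def count_by(rows, column, include_column):
    """Count rows per group, optionally emitting the group-by column."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[column]] += 1
    result = []
    for value, n in counts.items():
        out = {"EMP_COUNT": n}
        if include_column:
            # Put the group-by column in front of the aggregate column.
            out = {column: value, **out}
        result.append(out)
    return result

with_dept = count_by(staff, "DEPT_NO", include_column=True)
without_dept = count_by(staff, "DEPT_NO", include_column=False)
print(with_dept)
print(without_dept)
```

Without the group-by column, the output rows carry only the aggregate values, so
nothing in a row identifies which department it describes.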
The aggregate functions available for use are the same as in the Aggregate operator:
Sum, Average, Maximum, Minimum, Count, First, Last, Population Variance, Sample
Variance, Population Standard Deviation, and Sample Standard Deviation. They are
described in detail under “Aggregate functions” on page 110.
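For readers who want to check values by hand, the listed functions correspond roughly
to the following Python standard-library calls. This mapping is an assumption based on
the usual textbook definitions (population statistics divide by n, sample statistics by
n - 1); the product's exact behavior, for example with null values, is covered in the
section cited above.

```python
import statistics

# Assumed correspondence between the listed functions and Python calls.
AGGREGATES = {
    "Sum": sum,
    "Average": statistics.mean,
    "Maximum": max,
    "Minimum": min,
    "Count": len,
    "First": lambda values: values[0],
    "Last": lambda values: values[-1],
    "Population Variance": statistics.pvariance,
    "Sample Variance": statistics.variance,
    "Population Standard Deviation": statistics.pstdev,
    "Sample Standard Deviation": statistics.stdev,
}

# Hypothetical values of one column within a single group.
values = [2, 4, 4, 4, 5, 5, 7, 9]

results = {name: func(values) for name, func in AGGREGATES.items()}
for name, value in results.items():
    print(f"{name}: {value}")
```

In Python, the sample variants raise an error when given fewer than two values, which
is worth keeping in mind when a group contains a single row.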
Connections
The Group By operator takes exactly one input result set, which can be the output of
an Input Source or of any other operator. It produces exactly one output result set
whose schema is defined by the operator.
Required Connection