Synergex.SynergyDE.Select.GroupBy

namespace Synergex.SynergyDE.Select
public class GroupBy

The GroupBy class provides a collection of static methods for grouping a selection after the entire selection has been done. GroupBy provides a subset of the functionality of a SQL “GROUP BY” clause, which arranges identical data into groups and returns distinct records based on field specifications. If multiple records have the same values in specified fields, only the first one in field order will be returned.

Once created, any object used by the Select (From, Where, NoCaseWhere, GroupBy, NoCaseGroupBy, On, OrderBy, NoCaseOrderBy, and Sparse) must remain in contact with the same record(s) specified in the From object(s) as well as fields referenced in the Where, NoCaseWhere, On, OrderBy, and Sparse objects that are contained in the record(s). In other words, you can’t create one of these objects and pass it to another routine unless that routine has direct access to the original record. If this is not the case, an InvalidOperationException or $ERR_INVOPER error (“Where operator requires at least one field reference”) will result when creating the Select object. On the other hand, all data references outside a From object’s record are copied during the object’s creation and are not required to remain in contact with the original data.

Methods

Ascending

public static varargs Ascending(field1, ...), @GroupBy

Declares a list of fields (MISMATCH n), separated by commas, by which the selected records will be grouped in ascending order upon being returned.

Descending

public static varargs Descending(field1, ...), @GroupBy

Declares a list of fields (MISMATCH n), separated by commas, by which the selected records will be grouped in descending order upon being returned.

Operators

The logical and operator (&& or .AND.) is supported and is equivalent to its Synergy counterpart. See Expressions for more information about this operator.

Discussion

The purpose of GroupBy is to aggregate results returned from a query based on shared values in specified fields. For example, if multiple records returned from a Select statement have the value “Sacramento” in the City field and GroupBy(“City”) is specified, all the records will be treated as a single record.

When Ascending() or Descending() is used, the entire selection is read from the specified file when the AlphaEnumerator object is created (on Select.GetEnumerator() or FOREACH). Changes to the file after making the selection may not be reflected in the records retrieved, especially if a record that matches the selection criteria is updated or inserted.

The fields specified by Ascending() and Descending() must be in the record specified in the From class object. If they are not, an InvalidOperationException or $ERR_INVOPER error (“Field reference not entirely part of specified record”) will be generated at runtime.

You can combine multiple fields in your group criteria using the .AND. operator. For example,

sobj = new Select(from, wobj, 
&  GroupBy.Descending(fld1).and.GroupBy.Ascending(fld2))

GroupBy is case sensitive. If you need case-insensitive grouping of records, use Synergex.SynergyDE.Select.GroupBy. GroupBy and NoCaseGroupBy can be combined if only some fields require case-insensitivity.

Unlike GROUP BY in SQL, results returned by GroupBy are always sorted based on the fields specified by Ascending() and Descending(). As a result, there is no need to use OrderBy with GroupBy to sort the results based on fields already in the GroupBy. Currently, you cannot combine GroupBy and OrderBy in the same Select statement, so any field you want to use to sort the results from a GroupBy must be included as part of the GroupBy.

Because GroupBy only returns the first record in field order for each unique set of values in the specified fields, any fields in the record that are not specified in GroupBy belong to that record alone and may not represent the group as a whole. For example, if you have a file of employee records and specify GroupBy(Lastname), it may return a record with “Smith“ in the Lastname field and “Fred“ in the Firstname field. This record will represent all records where Lastname is “Smith” even if the file contains other records with different values for Firstname. The value of “Fred” may not be meaningful when dealing with the group as a whole. To more easily distinguish meaningful fields in results returned by GroupBy, you may want to use Sparse in the Select statement and specify the same fields in Sparse that you specify in GroupBy. In that case, all fields that aren’t part of the GroupBy will be blanked out, leaving data only in the meaningful fields.

Examples

The following example which stores records to a file and performs three select queries on the file:

An OrderBy query
A simple GroupBy query
A more complicated GroupBy query that uses additional queries to simulate aggregate functions that might be used with GROUP BY in a SQL query

structure employee
    key,	  d5
    ,		  a1
    firstname,	  a15
    ,		  a1
    lastname,	  a15
    ,		  a1
    department,  a15
    ,		  a1
    salary,	  d7
endstructure

import Synergex.SynergyDE.Select

.main
record
    from,	@from
    selobj,	@select
    selobj2,   @select
    renum,	@RestrictedAlphaEnumerator
record rec
    data, employee
record rec2
    data, employee

record
    count,		i4
    salarysum,		d8
    salaryavg,		d7

.define FILE "DAT:employees.ism"
.proc
    xcall flags(7000000,1)
    open(1, o, "TT:")
    xcall isamc(FILE, ^size(rec), 1,
&		"START=1, LENGTH=5, NAME=""Key1"", DUPS, ASCEND, ATEND")
    open(2, u:I, FILE)
    store(2, "00001 Andrew          McDonald        Development     0025010")
    store(2, "00002 Bob             Brown           HR              0030020")
    store(2, "00003 Beth            Jones           HR              0035030")
    store(2, "00004 Cathy           Brown           Sales           0040040")
    store(2, "00005 David           Smith           HR              0045050")
    store(2, "00006 Fred            Smith           QA              0050060")
    store(2, "00007 George          Cartwright      Shipping        0055070")
    store(2, "00008 Henry           Smith           Development     0060080")
    store(2, "00009 Jennifer        Jones           QA              0065090")
    store(2, "00010 James           McDonald        Development     0070000")
    store(2, "00011 John            Cartwright      Development     0075010")
    store(2, "00012 Laura           Jones           QA              0080020")
    store(2, "00013 Matthew         Cartwright      Shipping        0085030")
    store(2, "00014 Phillip         Brown           Sales           0090040")
    store(2, "00015 Rose            Jones           Sales           0095050")
    close 2
    open(2, I:I, FILE)
    open(3, I:I, FILE)
    from = new from(2, rec)
    writes(1, "")

    writes(1, "OrderBy Department:")
    writes(1, "KEY   FIRSTNAME       LASTNAME        DEPARTMENT      SALARY")
    selobj = new select(from, OrderBy.Ascending(data.department))
    foreach rec in selobj
      begin
        writes(1,rec)
      end
    writes(1, "")

    writes(1, "GroupBy Department:")
    writes(1, "KEY   FIRSTNAME       LASTNAME        DEPARTMENT      SALARY")
    selobj = new select(from, GroupBy.Ascending(data.department))
    foreach rec in selobj
      begin
        writes(1,rec)
      end
    writes(1, "")

    writes(1, "GroupBy Department with aggregate values:")
    writes(1, "DEPARTMENT         COUNT     AVG(SALARY)  SUM(SALARY)")
    selobj = new select(from, GroupBy.Ascending(data.department))
    foreach rec in selobj
      begin
        count = 0
        salarysum = 0
        salaryavg = 0
        selobj2 = new select(new from(3,rec2),(where)(rec2.data.department == rec.data.department))
        foreach rec2 in selobj2
          begin
            count += 1
            salarysum += rec2.data.salary
          end
          salaryavg = salarysum/count
          writes(1,rec.data.department + "    " + %string(count,"XX") +
          & "    " + %string(salaryavg,"$$$,$$$,$$$") +
          & "    " + %string(salarysum,"$$,$$$,$$$"))
      end
    writes(1, "")
    close 3
    close 2
    close 1
    stop
.end

The output from the program looks like this:

OrderBy Department:
KEY   FIRSTNAME       LASTNAME        DEPARTMENT      SALARY
00001 Andrew          McDonald        Development     0025010
00008 Henry           Smith           Development     0060080
00010 James           McDonald        Development     0070000
00011 John            Cartwright      Development     0075010
00002 Bob             Brown           HR              0030020
00003 Beth            Jones           HR              0035030
00005 David           Smith           HR              0045050
00006 Fred            Smith           QA              0050060
00009 Jennifer        Jones           QA              0065090
00012 Laura           Jones           QA              0080020
00004 Cathy           Brown           Sales           0040040
00014 Phillip         Brown           Sales           0090040
00015 Rose            Jones           Sales           0095050
00007 George          Cartwright      Shipping        0055070
00013 Matthew         Cartwright      Shipping        0085030

GroupBy Department:
KEY   FIRSTNAME       LASTNAME        DEPARTMENT      SALARY
00001 Andrew          McDonald        Development     0025010
00002 Bob             Brown           HR              0030020
00006 Fred            Smith           QA              0050060
00004 Cathy           Brown           Sales           0040040
00007 George          Cartwright      Shipping        0055070

GroupBy Department with aggregate values:
DEPARTMENT         COUNT     SUM(SALARY)  AVG(SALARY)
Development        04        $57,525      $230,100
HR                 03        $36,700      $110,100
QA                 03        $65,056      $195,170
Sales              03        $75,043      $225,130
Shipping           02        $70,050      $140,100