Grouping Numeric Data Into Ranges

In this section:

How to:

When you sort a report using a numeric sort field, you can group the sort field values together and define the range of each group.

There are several ways of defining groups. You can define groups of:

The FOR phrase is usually used to produce matrix reports and is part of the Financial Modeling Language (FML). However, you can also use it to create columnar reports that group sort field values in unequal ranges.

The FOR phrase displays the sort value for each individual row. The ranges do not have to be contiguous, that is, you can define your ranges with gaps between them. The FOR phrase is described in more detail in Creating Financial Reports With Financial Modeling Language (FML).

Note: If there is not any data for a group, a row for the group still appears in the report.

Syntax: How to Define Groups of Equal Range

{BY|ACROSS} sortfield IN-GROUPS-OF value [TOP limit]

where:

sortfield

Is the name of the sort field. The sort field must be numeric: its format must be I (integer), F (floating-point number), D (decimal number), or P (packed number).

value

Is a positive integer that specifies the range by which sort field values are grouped.

limit

Is an optional number that defines the highest group label to be included in the report.

Example: Defining Groups of Equal Ranges

The following illustrates how to show which employees fall into which salary ranges, and to define the ranges by $5,000 increments.

TABLE FILE EMPLOYEE
PRINT LAST_NAME
BY CURR_SAL IN-GROUPS-OF 5000
END

The output is:

  CURR_SAL  LAST_NAME
  --------  ---------
 $5,000.00  SMITH
            GREENSPAN
$10,000.00  STEVENS
            SMITH
$15,000.00  JONES
            MCCOY
            MCKNIGHT
$20,000.00  ROMANS
            BLACKWOOD
$25,000.00  BANNING
            IRVING
            CROSS

Syntax: How to Define Equal Ranges

{BY|ACROSS} sortfield IN-RANGES-OF value [TOP limit]

where:

sortfield

Is the name of the sort field. The sort field must be numeric: its format must be I (Integer), F (floating-point), D (double-precision), or P (packed).

value

Is an integer greater than zero indicating the range by which sort field values are grouped.

limit

Is an optional number that defines the highest range label to be included in the report. The range is extended to include all data values higher than this value.

Note: IN-RANGES-OF generates an internal sort phrase that must be counted in the total number of sort phrases.

Example: Defining Equal Ranges

TABLE FILE EMPLOYEE
PRINT LAST_NAME
BY CURR_SAL IN-RANGES-OF 5000
END

The output is:

CURR_SAL                           LAST_NAME
--------                           ---------
      $5,000.00 -       $9,999.99  SMITH    
                                   GREENSPAN
     $10,000.00 -      $14,999.99  STEVENS  
                                   SMITH    
     $15,000.00 -      $19,999.99  JONES    
                                   MCCOY    
                                   MCKNIGHT 
     $20,000.00 -      $24,999.99  ROMANS   
                                   BLACKWOOD
     $25,000.00 -      $29,999.99  BANNING  
                                   IRVING   
                                   CROSS

Syntax: How to Define Custom Groups of Data Values

FOR sortfield
begin1 TO end1 [OVER begin2 TO end2 ... ]

where:

sortfield

Is the name of the sort field.

begin

Is a value that identifies the beginning of a range.

end

Is a value that identifies the end of a range.

Example: Defining Custom Groups of Data Values

The following request displays employee salaries, but it groups them in an arbitrary way. Notice that the starting value of each range prints in the report.

TABLE FILE EMPLOYEE
PRINT LAST_NAME
FOR CURR_SAL
9000 TO 13500 OVER
14000 TO 19700 OVER
19800 TO 30000
END

The output is:

       LAST_NAME
       ---------
9000   STEVENS  
9000   SMITH    
9000   SMITH    
9000   GREENSPAN
14000  JONES    
14000  MCCOY    
14000  MCKNIGHT 
19800  BANNING  
19800  IRVING   
19800  ROMANS   
19800  BLACKWOOD
19800  CROSS

Grouping Numeric Data Into Tiles

How to:

Reference:

You can group numeric data into any number of tiles (percentiles, deciles, quartiles, etc.) in tabular reports. For example, you can group student test scores into deciles to determine which students are in the top ten percent of the class, or determine which sales representatives are in the top half of all sales representatives based on total sales.

Grouping is based on the values in the selected vertical (BY) field, and data is apportioned as equally as possible into the number of tile groups you specify.

The following occurs when you group data into tiles:

Syntax: How to Group Numeric Data Into Tiles

BY [ {HIGHEST|LOWEST} [k] ] tilefield [AS 'head1']
         IN-GROUPS-OF n TILES [TOP m] [AS 'head2']

where:

HIGHEST

Sorts the data in descending order so that the highest data values are placed in tile 1.

LOWEST

Sorts the data in ascending order so that the lowest data values are placed in tile 1. This is the default sort order.

k

Is a positive integer representing the number of tile groups to display in the report. For example, BY HIGHEST 2 displays the two non-empty tiles with the highest data values.

tilefield

Is the field whose values are used to assign the tile numbers.

head1

Is a heading for the column that displays the values of the tile sort field.

n

Is a positive integer not greater than 32,767, specifying the number of tiles to be used in grouping the data. For example, 100 tiles produces percentiles, while 10 tiles produces deciles.

m

Is a positive integer indicating the highest tile value to display in the report. For example, TOP 3 does not display any data row that is assigned a tile number greater than 3.

head2

Is a new heading for the column that displays the tile numbers.

Note:

  • The syntax accepts numbers that are not integers for k, n, and m. On z/OS, values with decimals are rounded to integers; on UNIX and Windows they are truncated. If the numbers supplied are negative or zero, an error message is generated.
  • Both k and m limit the number of rows displayed within each sort break in the report. If you specify both, the more restrictive value controls the display. If k and m are both greater than n (the number of tiles), n is used.

Example: Grouping Data Into Five Tiles

The following illustrates how to group data into five tiles.

TABLE FILE MOVIES
PRINT TITLE
BY CATEGORY
BY LISTPR IN-GROUPS-OF 5 TILES
WHERE CATEGORY EQ 'ACTION' OR 'CHILDREN'
END

The output is:

CATEGORY  LISTPR     TILE  TITLE
--------  ------     ----  -----
ACTION     14.95        1  TOP GUN
           19.95        2  JAWS
                           RAMBO III
           19.98        4  ROBOCOP
           19.99        5  TOTAL RECALL
CHILDREN   14.95        1  SESAME STREET-BEDTIME STORIES AND SONGS
           14.98        1  ROMPER ROOM-ASK MISS MOLLY
           19.95        2  SMURFS, THE
                           SCOOBY-DOO-A DOG IN THE RUFF
           26.99        3  BAMBI
           29.95        4  ALICE IN WONDERLAND
                           SLEEPING BEAUTY
           44.95        5  SHAGGY DOG, THE

Note that the tiles are assigned within the higher-level sort field CATEGORY. The ACTION category does not have any data assigned to tile 3. The CHILDREN category has all five tiles.

Example: Displaying the First Three Tile Groups

The following request prints only the first three tiles in each category:

TABLE FILE MOVIES
PRINT TITLE
BY CATEGORY
BY LOWEST 3 LISTPR IN-GROUPS-OF 5 TILES 
WHERE CATEGORY EQ 'ACTION' OR 'CHILDREN'
END

The output is:

CATEGORY  LISTPR     TILE  TITLE
--------  ------     ----  -----
ACTION     14.95        1  TOP GUN
           19.95        2  JAWS
                           RAMBO III
           19.98        4  ROBOCOP
CHILDREN   14.95        1  SESAME STREET-BEDTIME STORIES AND SONGS
           14.98        1  ROMPER ROOM-ASK MISS MOLLY
           19.95        2  SMURFS, THE
                           SCOOBY-DOO-A DOG IN THE RUFF
           26.99        3  BAMBI

Note that the request displays three tile groups in each category. Because no data was assigned to tile 3 in the ACTION category, tiles 1, 2, and 4 display for that category.

Example: Displaying Tiles With a Value of Three or Less

In the following request, the TOP 3 phrase restricts the display to tile numbers less than or equal to 3:

TABLE FILE MOVIES
PRINT TITLE
BY CATEGORY
BY LOWEST 3 LISTPR IN-GROUPS-OF 5 TILES TOP 3
WHERE CATEGORY EQ 'ACTION' OR 'CHILDREN'
END

The output is:

 CATEGORY  LISTPR     TILE  TITLE
 --------  ------     ----  -----
 ACTION     14.95        1  TOP GUN
            19.95        2  JAWS
                            RAMBO III
 CHILDREN   14.95        1  SESAME STREET-BEDTIME STORIES AND SONGS
            14.98        1  ROMPER ROOM-ASK MISS MOLLY
            19.95        2  SMURFS, THE
                            SCOOBY-DOO-A DOG IN THE RUFF
            26.99        3  BAMBI

Because no data was assigned to tile 3 in the ACTION category, only tiles 1 and 2 display for that category.

Example: Grouping Data Into Tiles and Customizing Column Headings

The following request changes the column headings for both the LISTPR and TILE columns:

TABLE FILE MOVIES
PRINT TITLE
BY CATEGORY
BY LISTPR AS 'PRICE' IN-GROUPS-OF 10 TILES TOP 3 AS 'DECILE'
WHERE CATEGORY EQ 'ACTION' OR 'CHILDREN'
END

The output is:

CATEGORY   PRICE   DECILE  TITLE
--------   -----   ------  -----
ACTION     14.95        1  TOP GUN
           19.95        3  JAWS
                           RAMBO III
CHILDREN   14.95        1  SESAME STREET-BEDTIME STORIES AND SONGS
           14.98        2  ROMPER ROOM-ASK MISS MOLLY
           19.95        3  SMURFS, THE
                           SCOOBY-DOO-A DOG IN THE RUFF

Reference: Usage Notes for Tiles

  • If a request retrieves data from segments that are descendants of the segment containing the tile field, multiple report rows may correspond to one instance of the tile field. These additional report rows do not affect the number of instances used to assign the tile values. However, if you retrieve fields from multiple segments and create a single-segment output file, this flat file will have multiple instances of the tile field, and this increased number of instances may affect the tile values assigned. Therefore, when you run the same request against the multi-level file and the single-segment file, different tile assignments may result.
  • Tiles are always calculated on a BY sort field in the request.
  • Only one tiles calculation is supported per request. However, the request can contain up to five (the maximum allowed) non-tile IN-GROUP-OF phrases in addition to the TILES phrase.
  • Comparisons for the purpose of assigning tile numbers use exact data values regardless of their display format. Therefore, if you display a floating-point value as D7, you may not be showing enough significant digits to indicate why values are placed in separate tiles.
  • The tile field can be a real field or a virtual field created with a DEFINE command or a DEFINE in the Master File. The COMPUTE command cannot be used to create a tile field.
  • Empty tiles do not display in the report output.
  • In requests with multiple sort fields, tiles are supported only at the lowest level and only with the BY LOWEST phrase.
  • Tiles are supported with output files. However, the field used to calculate the tiles propagates three fields to a HOLD file (the actual field value, the tile, and a ranking field) unless you set HOLDLIST to PRINTONLY.
  • Tiles are not supported with BY TOTAL, TABLEF, FML, and GRAPH.

Information Builders