Features Added in FOCUS 7.7.09

COALESCE: Returning the First Non-Missing Value

How to:

Return the First Non-Missing Value

Given a list of arguments, COALESCE returns the value of the first argument that is not missing. If all argument values are missing, it returns a missing value if MISSING is ON. Otherwise it returns a default value (zero or blank).

Syntax: How to Return the First Non-Missing Value

COALESCE(arg1, arg2, ...)

where:

arg1, arg2, ...

Any field, expression, or constant. The arguments should all be either numeric or alphanumeric.

Are the input parameters that are tested for missing values.

The output data type is the same as the input data types.

Example: Returning the First Non-Missing Value

This example uses the SALES data source with missing values added. The missing values are added by the following procedure named SALEMISS:

MODIFY FILE SALES                    
 FIXFORM STORE/4 DATE/5 PROD/4        
 FIXFORM UNIT/3 RETAIL/5 DELIVER/3    
 FIXFORM OPEN/3 RETURNS/C2 DAMAGED/C2 
 MATCH STORE                          
    ON NOMATCH REJECT                 
    ON MATCH CONTINUE                 
 MATCH DATE                           
    ON NOMATCH REJECT                 
    ON MATCH CONTINUE                 
 MATCH PROD_CODE                      
    ON NOMATCH INCLUDE                
    ON MATCH REJECT                   
DATA                                 
14Z 1017 C13 15 1.99 35 30   6       
14Z 1017 C14 18 2.05 30 25 4         
14Z 1017 E2  33 0.99 45 40           
END

The following request uses COALESCE to return the first non-missing value:

TABLE FILE SALES
PRINT DAMAGED RETURNS RETAIL_PRICE
COMPUTE
COAL1/D12.2 MISSING ON = COALESCE(DAMAGED, RETURNS, RETAIL_PRICE);
BY STORE_CODE
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The value of DAMAGED is returned, if it is not missing. If DAMAGED is missing, the value of RETURNS is returned, if it is not missing. If they are both missing, the value of RETAIL_PRICE is returned.

COMPACTFORMAT: Displaying Numbers in an Abbreviated Format

How to:

Display Numbers in an Abbreviated Format

COMPACTFORMAT displays numbers in a compact format where:

K is an abbreviation for thousands.
M is an abbreviation for millions.
B is an abbreviation for billions.
T is an abbreviation for trillions.

COMPACTFORMAT computes which abbreviation to use, based on the order of magnitude of the largest value in the column. The returned value is an alphanumeric string. Attempting to output this value to a numeric format will result in a format error, and the value zero (0) will be displayed.

Syntax: How to Display Numbers in an Abbreviated Format

COMPACTFORMAT(input)

where:

input: Is the name of a numeric field.

Example: Displaying Numbers in an Abbreviated Format

The following example uses the COMPACTFORMAT function to abbreviate the display of the summed values of the DAYSDELAYED, QUANTITY_SOLD, and COGS_US fields.

TABLE FILE WFLITE
SUM DAYSDELAYED QUANTITY_SOLD COGS_US
COMPUTE
CDAYS/A30= COMPACTFORMAT(DAYSDELAYED);
CQUANT/A30= COMPACTFORMAT(QUANTITY_SOLD);
CCOGS/A30= COMPACTFORMAT(COGS_US);
ON TABLE SET PAGE NOPAGE
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

FPRINT: Displaying a Value in a Specified Format

How to:

Display a Value in a Specified Format

Given an output format, the simplified conversion function FPRINT converts a value to alphanumeric format for display in that format.

Note: A legacy FPRINT function also exists and is still supported. The legacy function has an additional argument for the name or format of the returned value.

Syntax: How to Display a Value in a Specified Format

FPRINT(value, 'out_format')

where:

value

Any data type

Is the value to be converted.

'out_format'

Fixed length alphanumeric

Is the display format.

Example: Displaying a Value in a Specified Format

The following request displays COGS_US as format 'D9M', and TIME_DATE as format 'YYMtrD', by converting them to alphanumeric using FPRINT.

DEFINE FILE WFLITE
COGS_A/A25 = FPRINT(COGS_US, 'D9M');
DATE1/A25 = FPRINT(TIME_DATE, 'YYMtrD');
END
TABLE FILE WFLITE
PRINT LST.COGS_US COGS_A  DATE1
BY TIME_DATE
WHERE RECORDLIMIT EQ 10
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

FPRINT: Displaying a Value in a Specified Format

How to:

Display a Value in a Specified Format

Given an output format, the simplified conversion function FPRINT converts a value to alphanumeric format for display in that format.

Note: A legacy FPRINT function also exists and is still supported. The legacy function has an additional argument for the name or format of the returned value.

Syntax: How to Display a Value in a Specified Format

FPRINT(value, 'out_format')

where:

value

Any data type

Is the value to be converted.

'out_format'

Fixed length alphanumeric

Is the display format.

Example: Displaying a Value in a Specified Format

The following request displays COGS_US as format 'D9M', and TIME_DATE as format 'YYMtrD', by converting them to alphanumeric using FPRINT.

DEFINE FILE WFLITE
COGS_A/A25 = FPRINT(COGS_US, 'D9M');
DATE1/A25 = FPRINT(TIME_DATE, 'YYMtrD');
END
TABLE FILE WFLITE
PRINT LST.COGS_US COGS_A  DATE1
BY TIME_DATE
WHERE RECORDLIMIT EQ 10
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

GET_TOKEN: Extracting a Token Based on a String of Delimiters

How to:

Extract a Token Based on a String of Delimiters

GET_TOKEN extracts a token (substring) based on a string that can contain multiple characters, each of which represents a single-character delimiter.

GET_TOKEN can be optimized if there is a single delimiter character, not a string containing multiple delimiter characters.

Syntax: How to Extract a Token Based on a String of Delimiters

GET_TOKEN(string, delimiter_string, occurrence)

where:

string

Alphanumeric

Is the input string from which the token will be extracted. This can be an alphanumeric field or constant.

delimiter_string

Alphanumeric constant

Is a string that contains the list of delimiter characters. For example, '; ,' contains three delimiter characters, semi-colon, blank space, and comma.

occurrence

Integer constant

Is a positive integer that specifies the token to be extracted. A negative integer will be accepted in the syntax, but will not extract a token. The value zero (0) is not supported.

Example: Extracting a Token Based on a String of Delimiters

The following request defines an input string and two tokens based on a list of delimiters that contains the characters comma (,), semicolon (;), and slash (/).

DEFINE FILE EMPLOYEE
InputString/A20 = 'ABC,DEF;GHI/JKL';
FirstToken/A20 WITH DEPARTMENT = GET_TOKEN(InputString, ',;/', 1);
FourthToken/A20 WITH DEPARTMENT = GET_TOKEN(InputString, ',;/', 4);
ENDTABLE FILE EMPLOYEE
PRINT InputString FirstToken FourthToken
WHERE READLIMIT EQ 1
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID = OFF,$
END

The output is shown in the following image. The first token was extracted using the comma (,) as the delimiter. The fourth token was extracted using the slash (/) as the delimiter.

INITCAP: Capitalizing the First Letter of Each Word in a String

How to:

Capitalize the First Letter of Each Word in a String

INITCAP capitalizes the first letter of each word in an input string and makes all other letters lowercase. A word starts at the beginning of the string, after a blank space or after a special character.

Syntax: How to Capitalize the First Letter of Each Word in a String

INITCAP(input_string)

where:

input_string

Alphanumeric

Is the string to capitalize.

Example: Capitalizing the First Letter of Each Word in a String

The following request changes the last names in the EMPLOYEE data source to initial caps and capitalizes the first letter after each blank or special character in the NewName field.

TABLE FILE EMPLOYEE
PRINT LAST_NAME AND COMPUTE
Caps1/A30 = INITCAP(LAST_NAME);
NewName/A30 = 'abc,def!ghi'jKL MNO';
Caps2/A30 = INITCAP(NewName);
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

NULLIF: Returning a Null Value When Parameters Are Equal

How to:

Return a Null Value for Equal Parameters

NULLIF returns a null (missing) value when its parameters are equal. If they are not equal, it returns the first value. The field to which the value is returned should have MISSING ON.

Syntax: How to Return a Null Value for Equal Parameters

NULLIF(arg1,arg2)

where:

arg1,arg2

Any type of field, constant, or expression.

Are the input parameters that are tested for equality. They must either both be numeric or both be alphanumeric.

The output data type is the same as the input data types.

Example: Testing for Equal Parameters

The following request uses NULLIF to test the DAMAGED and RETURNS field values for equality.

DEFINE FILE SALES
NULL1/I4 MISSING ON = NULLIF(DAMAGED, RETURNS);
END
TABLE FILE SALES
PRINT DAMAGED RETURNS NULL1
BY STORE_CODE
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
END

The output is shown in the following image.

INCREASE: Calculating the Difference Between the Current and a Prior Value of a Field

How to:

Calculate the Difference Between the Current and a Prior Value of a Field

Given an aggregated input field and a negative offset, INCREASE calculates the difference between the value in the current row of the report output and a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

Note: The values used in the calculations depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

Syntax: How to Calculate the Difference Between the Current and a Prior Value of a Field

INCREASE([prefix.]field, offset)

where:

prefix

Is one of the following optional aggregation operators to apply to the field before using it in the calculation:

SUM. which calculates the sum of the field values. SUM is the default value.
CNT. which calculates a count of the field values.
AVE. which calculates the average of the field values.
MIN. which calculates the minimum of the field values.
MAX. which calculates the maximum of the field values.
FST. which retrieves the first value of the field.
LST. which retrieves the last value of the field.

field

Numeric

Is the field to be used in the calculation.

offset

Numeric

Is a negative number indicating the number of rows back from the current row to use for the calculation.

Example: Calculating the Increase Between the Current and a Prior Value of a Field

The following request uses the default value of SET PARTITION_ON (PENULITMATE) to calculate the increase within the PRODUCT_CATEGORY sort field between the current row and the previous row.

SET PARTITION_ON=PENULTIMATE
TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE INC = INCREASE(QUANTITY_SOLD,-1);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The first value for INC is the value in the Accessories category for Quantity Sold, as there is no prior value. The second value for INC is the difference between the values for Headphones and Charger, the third is the difference between Universal Remote Controls and Headphones. Then, the calculations start over for Camcorder, which is the reset point.

PCT_INCREASE: Calculating the Percentage Difference Between the Current and a Prior Value of a Field

How to:

Calculate the Percentage Difference Between the Current and a Prior Value of a Field

Given an aggregated input field and a negative offset, PCT_INCREASE calculates the percentage difference between the value in the current row of the report output and a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

The percentage increase is calculated using the following formula:

(current_value - prior_value) / prior_value

Note: The values used in the calculations depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

Syntax: How to Calculate the Percentage Difference Between the Current and a Prior Value of a Field

PCT_INCREASE([prefix.]field, offset)

where:

prefix

Is one of the following optional aggregation operators to apply to the field before using it in the calculation:

SUM. which calculates the sum of the field values. SUM is the default value.
CNT. which calculates a count of the field values.
AVE. which calculates the average of the field values.
MIN. which calculates the minimum of the field values.
MAX. which calculates the maximum of the field values.
FST. which retrieves the first value of the field.
LST. which retrieves the last value of the field.

field

Numeric

The field to be used in the calculation.

offset

Numeric

Is a negative number indicating the number of rows back from the current row to use for the calculation.

Example: PCT_INCREASE: Calculating the Percent Increase Between the Current and a Prior Value of a Field

The following request uses the default value of SET PARTITION_ON (PENULITMATE) to calculate the percent increase within the PRODUCT_CATEGORY sort field between the current row and the previous row.

SET PARTITION_ON=PENULTIMATE
TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE PCTINC/D8.2p = PCT_INCREASE(QUANTITY_SOLD,-1);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The first value for PCTINC is zero percent, as there is no prior value. The second value for PCTINC is the percent difference between the values for Headphones and Charger, the third is the percent difference between Universal Remote Controls and Headphones. Then, the calculations start over for Camcorder, which is the reset point.

PREVIOUS: Retrieving a Prior Value of a Field

How to:

Retrieve a Prior Value of a Field

Given an aggregated input field and a negative offset, PREVIOUS retrieves the value in a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

Note: The values used in the retrieval depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

Syntax: How to Retrieve a Prior Value of a Field

PREVIOUS([prefix.]field, offset)

where:

prefix

Is one of the following optional aggregation operators to apply to the field before using it in the calculation:

SUM. which calculates the sum of the field values. SUM is the default value.
CNT. which calculates a count of the field values.
AVE. which calculates the average of the field values.
MIN. which calculates the minimum of the field values.
MAX. which calculates the maximum of the field values.
FST. which retrieves the first value of the field.
LST. which retrieves the last value of the field.

field

Numeric

The field to be used in the calculation.

offset

Numeric

Is a negative number indicating the number of rows back from the current row to use for the retrieval.

Example: Retrieving a Prior Value of a Field

The following request sets the PARITITON_ON parameter to TABLE and retrieves the value of the QUANTITIY_SOLD field two rows back from the current row.

SET PARTITION_ON=TABLE
TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE PREV = PREVIOUS(QUANTITY_SOLD,-2);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The value of PREV in the first two rows is zero, as there are no prior rows for retrieval. From then on, each value of PREV is from the QUANTITY_SOLD value from two rows prior, with no reset points.

RUNNING_AVE: Calculating an Average Over a Group of Rows

How to:

Calculate Running Average Between the Current and a Prior Value of a Field

Given an aggregated input field and a negative offset, RUNNING_AVE calculates the average of the values between the current row of the report output and a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the sort field specified, the entire table, or the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

Syntax: How to Calculate Running Average Between the Current and a Prior Value of a Field

RUNNING_AVE(field, reset_key, lower)

where:

field

Numeric

The field to be used in the calculation.

reset_key

Identifies the point at which the running average restarts. Valid values are:

The name of a sort field in the request.
PRESET, which uses the value of the PARTITION_ON parameter, as described in Specify the Partition Size for Simplified Statistical Functions.
TABLE, which indicates that there is no break on a sort field.

Note: The values used in the calculations depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

lower

Is the starting point in the partition for the running average. Valid values are:

A negative number, which identifies the offset from the current row.
B, which specifies the beginning of the sort group.

Example: Calculating a Running Average

The following request calculates a running average of QUANTITY_SOLD within the PRODUCT_CATEGORY sort field, always starting from the beginning of the sort break.

TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE RAVE = RUNNING_AVE(QUANTITY_SOLD,PRODUCT_CATEGORY,B);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The first value for RAVE is the value in the Accessories category for Quantity Sold, as there is no prior value. The second value for RAVE is the average of the values for Headphones and Charger, the third is the average of the values for Headphones, Charger, and Universal Remote Controls. Then, the calculations start over for Camcorder, which is the reset point.

RUNNING_MAX: Calculating a Maximum Over a Group of Rows

How to:

Calculate Running Maximum Between the Current and a Prior Value of a Field

Given an aggregated input field and an offset, RUNNING_MAX calculates the maximum of the values between the current row of the report output and a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the sort field specified, the entire table, or the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

Syntax: How to Calculate Running Maximum Between the Current and a Prior Value of a Field

RUNNING_MAX(field, reset_key, lower)

where:

field

Numeric

The field to be used in the calculation.

reset_key

Identifies the point at which the running maximum restarts. Valid values are:

The name of a sort field in the request.
PRESET, which uses the value of the PARTITION_ON parameter, as described in Specify the Partition Size for Simplified Statistical Functions.
TABLE, which indicates that there is no break on a sort field.

Note: The values used in the calculations depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

lower

Is the starting point in the partition for the running maximum. Valid values are:

A negative number, which identifies the offset from the current row.
B, which specifies the beginning of the sort group.

Example: Calculating a Running Maximum

The following request calculates a running maximum for the rows from the beginning of the table to the current value of QUANTITY_SOLD, with no reset point.

TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE RMAX = RUNNING_MAX(QUANTITY_SOLD,TABLE,B);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The first value for RMAX is the value in the Accessories category for Quantity Sold, as there is no prior value. The second value for RMAX is the value for Headphones, as that value is larger. The third value for RMAX is still the value for Headphones, as that value is larger than the Quantity Sold value in the third row. Since the maximum value in the table occurs for Blu Ray, that value is repeated on all future rows, as there is no reset point.

RUNNING_MIN: Calculating a Minimum Over a Group of Rows

How to:

Calculate Running Minimum Between the Current and a Prior Value of a Field

Given an aggregated input field and an offset, RUNNING_MIN calculates the minimum of the values between the current row of the report output and a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the sort field specified, the entire table, or the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

Syntax: How to Calculate Running Minimum Between the Current and a Prior Value of a Field

RUNNING_MIN(field, reset_key, lower)

where:

field

Numeric

The field to be used in the calculation.

reset_key

Identifies the point at which the running minimum restarts. Valid values are:

The name of a sort field in the request.
PRESET, which uses the value of the PARTITION_ON parameter, as described in Specify the Partition Size for Simplified Statistical Functions.
TABLE, which indicates that there is no break on a sort field.

Note: The values used in the calculations depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

lower

Is the starting point in the partition for the running minimum. Valid values are:

A negative number, which identifies the offset from the current row.
B, which specifies the beginning of the sort group.

Example: Calculating a Running Minimum

The following request calculates a running minimum of QUANTITY_SOLD within the PRODUCT_CATEGORY sort field (the sort break defined by SET PARTITION_ON = PENULTIMATE), always starting from the beginning of the sort break.

SET PARTITION_ON=PENULTIMATE
TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE RMIN = RUNNING_MIN(QUANTITY_SOLD,PRESET,B);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The first value for RMIN is the value in the Accessories category for Quantity Sold, as there is no prior value. The second value for RMIN is the value from the first row again (Charger), as that is smaller than the value in the second row. The third is the same again, as it is still the smallest. Then, the calculations start over for Camcorder, which is the reset point.

RUNNING_SUM: Calculating a Sum Over a Group of Rows

How to:

Calculate Running Sum Between the Current and a Prior Value of a Field

Given an aggregated input field and an offset, RUNNING_SUM calculates the sum of the values between the current row of the report output and a prior row, within a sort break or the entire table. The reset point for the calculation is determined by the sort field specified, the entire table, or the value of the PARTITION_ON parameter described in Specify the Partition Size for Simplified Statistical Functions.

Syntax: How to Calculate Running Sum Between the Current and a Prior Value of a Field

RUNNING_SUM(field, reset_key, lower)

where:

field

Numeric

The field to be used in the calculation.

reset_key

Identifies the point at which the running sum restarts. Valid values are:

The name of a sort field in the request.
PRESET, which uses the value of the PARTITION_ON parameter, as described in Specify the Partition Size for Simplified Statistical Functions.
TABLE, which indicates that there is no break on a sort field.

Note: The values used in the calculations depend on the sort sequence (ascending or descending) specified in the request. Be aware that displaying a date or time dimension in descending order may produce different results than those you may expect.

lower

Is the starting point in the partition for the running sum. Valid values are:

A negative number, which identifies the offset from the current row.
B, which specifies the beginning of the sort group.

Example: Calculating a Running Sum

The following request calculates a running sum of the current value and previous value of QUANTITY_SOLD within the reset point set by the PARTITION_ON parameter, which is the sort field PRODUCT_CATEGORY.

SET PARTITION_ON=PENULTIMATE
TABLE FILE wflite
SUM QUANTITY_SOLD
COMPUTE RSUM = RUNNING_SUM(QUANTITY_SOLD,PRESET,-1);
BY PRODUCT_CATEGORY
BY PRODUCT_SUBCATEG
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The first value for RSUM is the value in the Accessories category for Quantity Sold, as there is no prior value. The second value for RSUM is the sum of the values for Headphones and Charger, the third is the sum of the values for Charger and Universal Remote Controls. Then, the calculations start over for Camcorder, which is the reset point.

SET PRFTITLE: Generating Descriptive Column Titles for Prefixed Fields

How to:

Generate Descriptive Column Titles for Prefixed Fields

By default, the column titles for prefixed fields have the name of the prefix operator added above or below the field name. Using the SET PRFTITLE command, you can generate longer, more descriptive column titles. These descriptive column titles use longer descriptions of the prefix operators, which will also be translated into any standard language for which FOCUS is configured.

Syntax: How to Generate Descriptive Column Titles for Prefixed Fields

SET PRFTITLE = {SHORT|LONG}

ON TABLE SET PRFTITLE {SHORT|LONG}

where:

SHORT: Places the prefix operator name above the field name to generate the column title.
LONG: Generates descriptive column titles for prefixed fields that can be translated to other languages.

Example: Generating Descriptive Column Titles for Prefixed Fields

The following request uses prefix operators with the default value for SET PRFTITLE.

TABLE FILE WFLITE
SUM COGS_US CNT.COGS_US AVE.COGS_US CNT.DST.COGS_US MIN.COGS_US MAX.COGS_US MDN.COGS_US 
BY PRODUCT_CATEGORY
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

The following version of the request sets PRFTITLE to LONG.

SET PRFTITLE = LONG
TABLE FILE WFLITE
SUM COGS_US CNT.COGS_US AVE.COGS_US CNT.DST.COGS_US MIN.COGS_US MAX.COGS_US MDN.COGS_US 
BY PRODUCT_CATEGORY
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

Support for C-Style In-Line Comments

C-style comments can now be used in FOCUS procedures.

This type of comment is enclosed within an opening /* tag (slash followed by asterisk) and a closing */ tag (asterisk followed by slash). A C-style comment can appear anywhere and span multiple lines.

Example: Placing C-Style Comments in a Procedure

The following example places C-style comments in a procedure.

TABLE FILE WFLITE /* this is a multi-line comment
that will not interfere with processing and will be ignored
until the comment is closed with */
SUM /* Another comment */ COGS_US 
   .
   .
   .

SET PCTFORMAT: Displaying a Percent Sign for Percent Prefix Operators

The SET PCTFORMAT command controls whether fields prefixed with the operators PCT., RPCT., and PCT.CNT. display with a percent sign or with the format associated with the original field.

The syntax is:

SET PCTFORMAT = {OLD|PERCENT}

where:

OLD: Displays columns prefixed with PCT., RPCT., and PCT.CNT. with the format associated with the original field.
PERCENT: Displays columns prefixed with PCT., RPCT., and PCT.CNT. with a percent sign. It also allows the prefixed fields to be reformatted. This is the default value.

PCT.CNT.field will always display with two decimal places, unless reformatted. For PCT.field and RPCT.field, with SET PCTFORMAT = PERCENT, if the original field has a:

Precision-based format (F, D, M, X), the column will display with length 7 and two decimal places.
Packed format, the column will display with its original number of decimal places.
Integer format, the column will display with no decimal places.

Example: Displaying Columns for Fields Prefixed With PCT., RPCT., and PCT.CNT.

The following request displays columns that use the prefix operators PCT. and PCT.CNT against fields formatted to show currency symbols. The value of the PCTFORMAT parameter is set to OLD.

DEFINE FILE GGSALES
DOLL1/D12.4M = DOLLARS;
UNIT1/D12.6M = UNITS;
DOLL2/P12.4M = DOLLARS;
UNIT2/I7M = UNITS;
END
TABLE FILE GGSALES
SUM DOLLARS UNITS PCT.UNIT2 PCT.DOLL1 PCT.DOLL2 PCT.CNT.UNIT1
BY CATEGORY
ON TABLE SET PCTFORMAT OLD
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image. The columns that display prefixed fields, except for the one using the PCT.CNT. prefix, display currency symbols instead of percent signs because the formats of the original fields are used.

Changing SET PCTFORMAT to PERCENT produces the following output, in which the currency symbols have been removed and all prefixed columns display percent signs.

New Punctuation Option for Numbers Using SET CDN

How to:

Specify Punctuation for Numbers

The CDN parameter has a new option, SPACES_DOT or SPACEP, that displays a space as the thousands separator and a period as the decimal separator.

Syntax: How to Specify Punctuation for Numbers

SET CDN = option

where:

option

Is one of the following.

DOTS_COMMA or ON sets the decimal separator as a comma and the thousands separator as a period.
COMMAS_DOT or OFF sets the decimal separator as a period and the thousands separator as a comma.
SPACES_COMMA or SPACE sets the decimal separator as a comma, and the thousands separator as a space.
SPACES_DOT or SPACEP sets the decimal separator as a period, and the thousands separator as a space.
QUOTES_COMMA or QUOTE sets the decimal point as a comma and the thousands separator as an apostrophe.
QUOTES_DOT or QUOTEP sets the decimal point as a period and the thousands separator as an apostrophe.

Example: Displaying a Space as the Thousands Separator and a Period as the Decimal Separator

The following request uses the SPACEP option for the CDN parameter to display a space as the thousands separator and a period as the decimal separator.

TABLE FILE WFLITE
SUM COGS_US GROSS_PROFIT_US
BY PRODUCT_CATEGORY
ON TABLE SET CDN SPACEP
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

Two-Part Names Support in ? DEFINE

If a DEFINE command is issued using a two-part name (appname/filename), the output from the ? DEFINE query command will show the two-part name.

Example: Returning a Two-Part Name From the ? DEFINE Command

The following DEFINE command is issued using a two-part name.

DEFINE FILE ibisamp/GGSALES
DIFF = DOLLARS-BUDDOLLARS;
END

Issuing the ? DEFINE ibisamp/GGSALES command returns the following output.

FILE NAME            FIELD NAME                       FORMAT  SEGMENT   TYPE  
IBISAMP/GGSALES      DIFF                             D12.2         1

Enhancements to Error Messages

Many error messages have been simplified and unnecessary messages have been eliminated.

Two lines have been added to error messages related to parsing a Master File. The line that triggered the message is displayed below a line with a pointer to the position of the error. The pointer consists of a series of dashes and a down arrow (-----v) that identifies the location of the error in the line.

For example, in the WFLITE Master File, the DEFINE command for SHIPMENT_UNITY has been edited to have an extra equal sign (=).

FILENAME=WFLITE, TRANS_FILE=_EDAHOME/NLS/dt,
 REMARKS='Lite Cluster Join of Fact Tables Sales and Shipments for Demo Database', BV_NAMESPACE=OFF, $
  SEGMENT=WF_RETAIL_SALES, PARENT=., CRFILE=retail8203/facts/wf_retail_sales, CRINCLUDE=ALL,
    DESCRIPTION='Sales Fact', $
    DEFINE SALE_UNITY/I9C  WITH ID_SALES=1;
      TITLE='Sale Unit(s)', DESCRIPTION='Unity Value (1) for each Sales Fact', $
  SEGMENT=WF_RETAIL_SHIPMENTS, PARENT=., CRFILE=retail8203/facts/wf_retail_shipments, CRINCLUDE=ALL,
    DESCRIPTION='Shipments Fact', $
    DEFINE = SHIPMENT_UNITY/I9C  WITH ID_SHIPFACT=1;
      TITLE='Shipment Unit(s)', DESCRIPTION='Unity Value (1) for each Shipment Fact', $

Running a request against this Master File produces the following messages.

(FOC03601) ERROR AT OR NEAR LINE 10 IN APP01/WFLITE

(FOC1822) Invalid symbol in MFD parser: >=<

-----------v

DEFINE = SHIPMENT_UNITY/I9C WITH ID_SHIPFACT=1;

DTIME: Extracting Time Components From a Date-Time Value

How to:

Extract a Time Component From a Date-Time Value

Given a date-time value and time component keyword as input, DTIME returns the value of all of the time components up to and including the requested component. The remaining time components in the value are set to zero. The field to which the time component is returned must have a time format that supports the component being returned.

Syntax: How to Extract a Time Component From a Date-Time Value

DTIME(datetime, component)

where:

datetime

Date-time

Is the date-time value from which to extract the time component. It can be a field name or a date-time literal.

component

Keyword

Valid values are:

TIME. The complete time portion is returned. Its smallest component depends on the input date-time format. Nanoseconds are not supported or returned.
HOUR. The time component up to and including the hour component is extracted.
MINUTE. The time component up to and including the minute component is extracted.
SECOND. The time component up to and including the second component is extracted.
MILLISECOND. The time component up to and including the millisecond component is extracted.
MICROSECOND. The time component up to and including the microsecond component is extracted.

Example: Extracting Time Components

The following request defines two date-time fields:

TRANSTIME contains the extracted time components from TRANSDATE down to the minute.
TRANSTIME2 extracts all of the time components from the literal date-time value 2018/01/17 05:45:22.777888.

DEFINE FILE VIDEOTR2
TRANSTIME/HHISsm = DTIME(TRANSDATE, MINUTE);
TRANSTIME2/HHISsm = DTIME(DT(2018/01/17 05:45:22.777888), TIME);
END
TABLE FILE VIDEOTR2
SUM  TRANSTIME TRANSTIME2
BY MOVIECODE
BY  TRANSDATE 
WHERE MOVIECODE CONTAINS 'MGM'
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE 
END

The output is shown in the following image.

EDAPRINT: Inserting a Custom Message in the EDAPRINT Log File

How to:

Insert a Message in the EDAPRINT File

The EDAPRINT function enables you to allocate a sequential file to DDNAME EDAPRINT, add a text message into it, and assign it a message type.

Syntax: How to Insert a Message in the EDAPRINT File

EDAPRINT(message_type, 'message')

where:

message_type

Keyword

Can be one of the following message types.

I. Informational message.
W. Warning message.
E. Error message.

'message'

Is the message to insert, enclosed in single quotation marks.

Example: Inserting a Custom Message in the EDAPRINT File

The following procedure inserts three messages in the EDAPRINT file.

DYNAM ALLOC DD EDAPRINT DA USER1.EDAPRINT.LOG SHR REU
-SET &I = EDAPRINT(I, 'This is a test informational message');
-SET &W = EDAPRINT(W, 'This is a test warning message');
-SET &E = EDAPRINT(E, 'This is a test error message');

The following is the contents of the file allocated to DDNAME EDAPRINT after running the request.

00001 04/04/2019 10:52:15.483 I This is a test informational message
00002 04/04/2019 10:52:15.490 W This is a test warning message      
00003 04/04/2019 10:52:15.518 E This is a test error message

PATTERNS: Returning a Pattern That Represents the Structure of the Input String

How to:

Return a String That Represents the Pattern Profile of the Input Argument

PATTERNS returns a string that represents the structure of the input argument. The returned pattern includes the following characters:

A is returned for any position in the input string that has an uppercase letter.
a is returned for any position in the input string that has a lowercase letter.
9 is returned for any position in the input string that has a digit.

Note that special characters (for example, +-/=%) are returned exactly as they were in the input string.

The output is returned as variable length alphanumeric.

Syntax: How to Return a String That Represents the Pattern Profile of the Input Argument

PATTERNS(string)

where:

string

Alphanumeric

Is a string whose pattern will be returned.

Example: Returning a Pattern Representing an Input String

The following request returns patterns that represent customer addresses.

DEFINE FILE WFLITE
Address_Pattern/A40V = PATTERNS(ADDRESS_LINE_1);
END
 
TABLE FILE WFLITE
PRINT FST.ADDRESS_LINE_1 OVER
Address_Pattern
BY ADDRESS_LINE_1 NOPRINT SKIP-LINE
WHERE COUNTRY_NAME EQ 'United States'
WHERE CITY_NAME EQ 'Houston' OR 'Indianapolis' OR 'Chapel Hill' OR 'Bronx'
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The partial output is shown in the following image. Note that the special characters (#-,) in an address are represented in the pattern as is.

Simplified Statistical Functions

In this section:

CORRELATION: Calculating the Degree of Correlation Between Two Sets of Data
KMEANS_CLUSTER: Partitioning Observations Into Clusters Based on the Nearest Mean Value
STDDEV: Calculating the Standard Deviation for a Set of Data Values
MULTIREGRESS: Creating a Multivariate Linear Regression Column

How to:

Specify the Partition Size for Simplified Statistical Functions

The following simplified statistical functions have been added.

CORRELATION, which returns the degree of correlation between two independent sets of data.
KMEANS_CLUSTER, which partitions n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
STDDEV, which quantifies the amount of variation or dispersion of a set of data values.
MULTIREGRESS, which derives a linear equation that best fits the data values of one or more numeric fields, and uses this equation to create a new column in the report output.
RSERVE, which runs an R script that returns vector output.

These functions can be called in a COMPUTE command and operate on the internal matrix that is generated during TABLE request processing. The STDDEV and CORRELATION functions can also be called as a verb object in a display command.

Prior to calling a statistical function, you need to establish the size of the partition on which these functions will operate, if the request contains sort fields.

Note: It is recommended that all numbers and fields used as parameters to these functions be double-precision.

Syntax: How to Specify the Partition Size for Simplified Statistical Functions

SET PARTITION_ON = {FIRST|PENULTIMATE|TABLE}

where:

FIRST: Uses the first (also called the major) sort field in the request to partition the values.
PENULTIMATE: Uses the next to last sort field where the COMPUTE is evaluated to partition the values. This is the default value.
TABLE: Uses the entire internal matrix to calculate the statistical function.

CORRELATION: Calculating the Degree of Correlation Between Two Sets of Data

How to:

Calculate the Correlation Coefficient Between Two Fields

The CORRELATION function calculates the correlation coefficient between two numeric fields. The function returns a numeric value between zero (-1.0) and 1.0.

Syntax: How to Calculate the Correlation Coefficient Between Two Fields

CORRELATION(field1, field2)

where:

field1

Numeric

Is the first set of data for the correlation.

field2

Numeric

Is the second set of data for the correlation.

Note: Arguments for CORRELATION cannot be prefixed fields. If you need to work with fields that have a prefix operator applied, apply the prefix operators to the fields in COMPUTE commands and save the results in a HOLD file. Then, run the correlation against the HOLD file.

Example: Calculating a Correlation

The following request calculates the correlation between the DOLLARS and BUDDOLLARS fields converted to double precision.

DEFINE FILE ibisamp/ggsales
DOLLARS/D12.2 = DOLLARS;
BUDDOLLARS/D12.2 = BUDDOLLARS;
END
TABLE FILE ibisamp/ggsales
SUM DOLLARS BUDDOLLARS 
CORRELATION(DOLLARS, BUDDOLLARS)
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

KMEANS_CLUSTER: Partitioning Observations Into Clusters Based on the Nearest Mean Value

How to:

Partition Observations Into Clusters Based on the Nearest Mean Value

The KMEANS_CLUSTER function partitions observations into a specified number of clusters based on the nearest mean value. The function returns the cluster number assigned to the field value passed as a parameter.

Note: If there are not enough points to create the number of clusters requested, the value -10 is returned for any cluster that cannot be created.

Syntax: How to Partition Observations Into Clusters Based on the Nearest Mean Value

KMEANS_CLUSTER(number, percent, iterations, tolerance, 
        [prefix1.]field1[, [prefix1.]field2 ...])

where:

number

Integer

Is number of clusters to extract.

percent

Numeric

Is the percent of training set size (the percent of the total data to use in the calculations). The default value is AUTO, which uses the internal default percent.

iterations

Integer

Is the maximum number of times to recalculate using the means previously generated. The default value is AUTO, which uses the internal default number of iterations.

tolerance

Numeric

Is a weight value between zero (0) and 1.0. The value AUTO uses the internal default tolerance.

prefix1, prefix2

Defines an optional aggregation operator to apply to the field before using it in the calculation. Valid operators are:

SUM. which calculates the sum of the field values. SUM is the default value.
CNT. which calculates a count of the field values.
AVE. which calculates the average of the field values.
MIN. which calculates the minimum of the field values.
MAX. which calculates the maximum of the field values.
FST. which retrieves the first value of the field.
LST. which retrieves the last value of the field.

Note: The operators PCT., RPCT., TOT., MDN., MDE., RNK., and DST. are not supported.

field1

Numeric

Is the set of data to be analyzed.

field2

Numeric

Is an optional set of data to be analyzed.

Example: Partitioning Data Values Into Clusters

The following request partitions the DOLLARS field values into four clusters and displays the result as a scatter chart in which the color represents the cluster. The request uses the default values for the percent, iterations, and tolerance parameters by passing them as the value 0 (zero).

SET PARTITION_ON = PENULTIMATE
GRAPH FILE GGSALES
PRINT UNITS DOLLARS
COMPUTE KMEAN1/D20.2 TITLE 'K-MEANS'=  KMEANS_CLUSTER(4, AUTO, AUTO, AUTO, DOLLARS);
ON GRAPH SET LOOKGRAPH SCATTER
ON GRAPH PCHOLD FORMAT JSCHART
ON GRAPH SET STYLE *
INCLUDE=Warm.sty,$
type = data, column = N2, bucket=y-axis,$
type=data, column= N1, bucket=x-axis,$
type=data, column=N3, bucket=color,$
GRID=OFF,$
*GRAPH_JS_FINAL
colorScale: {
		colorMode: 'discrete',
		colorBands: [{start: 1, stop: 1.99, color: 'red'}, {start: 2, stop: 2.99, color: 'green'}, 
               {start: 3, stop: 3.99, color: 'yellow'}, {start: 3.99, stop: 4, color: 'blue'} ]
	}
*END
ENDSTYLE
END

The output is shown in the following image.

STDDEV: Calculating the Standard Deviation for a Set of Data Values

The STDDEV function returns a numeric value that represents the amount of dispersion in the data. The set of data can be specified as the entire population or a sample. The standard deviation is the square root of the variance, which is a measure of how observations deviate from their expected value (mean). If specified as a population, the divisor in the standard deviation calculation (also called degrees of freedom) will be the total number of data points, N. If specified as a sample, the divisor will be N-1.

If x¡ is an observation, N is the number of observations, and µ is the mean of all of the observations, the formula for calculating the standard deviation for a population is:

To calculate the standard deviation for a sample, the mean is calculated using the sample observations, and the divisor is N-1 instead of N.

Reference: Calculate the Standard Deviation in a Set of Data

STDDEV(field, sampling)

where:

field

Numeric

Is the set of observations for the standard deviation calculation.

sampling

Keyword

Indicates the origin of the data set. Can be one of the following values.

P Entire population.
S Sample of population.

Note: Arguments for STDDEV cannot be prefixed fields. If you need to work with fields that have a prefix operator applied, apply the prefix operators to the fields in COMPUTE commands and save the results in a HOLD file. Then, run the standard deviation against the HOLD file.

Example: Calculating a Standard Deviation

The following request calculates the standard deviation of the DOLLARS field converted to double precision.

DEFINE FILE ibisamp/ggsales
DOLLARS/D12.2 = DOLLARS;
END
TABLE FILE ibisamp/ggsales
SUM DOLLARS STDDEV(DOLLARS,S) 
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

MULTIREGRESS: Creating a Multivariate Linear Regression Column

How to:

Create a Multivariate Linear Regression Column

MULTIREGRESS derives a linear equation that best fits a set of numeric data points, and uses this equation to create a new column in the report output. The equation can be based on one or more independent variables.

The equation generated is of the following form, where y is the dependent variable and x1, x2, and x3 are the independent variables.

y = a1*x1 [+ a2*x2 [+ a3*x3] ...] + b

When there is one independent variable, the equation represents a straight line. When there are two independent variables, the equation represents a plane, and with three independent variables, it represents a hyperplane. You should use this technique when you have reason to believe that the dependent variable can be approximated by a linear combination of the independent variables.

Syntax: How to Create a Multivariate Linear Regression Column

MULTIREGRESS(input_field1, [input_field2, ...])

where:

input_field1, input_field2 ...: Are any number of field names to be used as the independent variables. They should be independent of each other. If an input field is non-numeric, it will be categorized to transform it to numeric values that can be used in the linear regression calculation.

Example: Creating a Multivariate Linear Regression Column

The following request uses the DOLLARS and BUDDOLLARS fields to generate a regression column named Estimated_Dollars.

GRAPH FILE GGSALES
SUM BUDUNITS UNITS BUDDOLLARS DOLLARS
COMPUTE Estimated_Dollars/F8 = MULTIREGRESS(DOLLARS, BUDDOLLARS);
BY DATE
ON GRAPH SET LOOKGRAPH LINE
ON GRAPH PCHOLD FORMAT JSCHART
ON GRAPH SET STYLE *
INCLUDE=Warm.sty,$
type=data, column = n1, bucket = x-axis,$
type=data, column= dollars, bucket=y-axis,$
type=data, column= buddollars, bucket=y-axis,$
type=data, column= Estimated_Dollars, bucket=y-axis,$
*GRAPH_JS
"series":[
{"series":2, "color":"orange"}]
*END
ENDSTYLE
END

The output is shown in the following image. The orange line represents the regression equation.

Representing a Null String

In prior releases, two consecutive single quotation marks represented a string with one blank character. For example:

FIELD1/A1 = '';

Starting in this release, two consecutive single quotation marks will represent a null value with format A1V and an actual length of 0 (zero), when the field has MISSING ON. For example:

FIELD1/A1 MISSING ON = '';

If you want a blank character, add a blank between the single quotation marks. For example:

FIELD1/A1 = ' ';

Enhancements to the BYDISPLAY Parameter

How to:

Control Display of Sort Field Values on Report Output

By default, a sort field value displays only on the first row or column of the set of detail rows or columns generated for that sort field value. You can control this behavior using the BYDISPLAY parameter. In prior releases, BYDISPLAY was only supported for styled output formats and only for BY sort fields. However, it now applies to all output formats and can control display of ACROSS values as well as BY values.

Syntax: How to Control Display of Sort Field Values on Report Output

SET BYDISPLAY = {OFF|ON|BY|ACROSS|ALL}

ON TABLE SET BYDISPLAY {OFF|ON|BY|ACROSS|ALL}

where:

OFF: Displays a sort field value only on the first line or column of the report output for the sort group and on the first line or column of a page. OFF is the default value.
ON or BY: Displays the relevant BY field value on every line of report output produced. BY is a synonym for ON.
ACROSS: Displays the relevant ACROSS field value on every column of report output produced.
ALL: Displays the relevant BY field value on every line of report output and the relevant ACROSS field value on every column of report output.

Example: Controlling Display of Sort Field Values on Report Output

The following request generates a report on which sort field values only display when they change (BYDISPLAY OFF).

-SET &BYDISP = OFF;
SET BYDISPLAY = &BYDISP
TABLE FILE WFLITE
HEADING CENTER
" BYDISPLAY = &BYDISP"
" "
SUM QUANTITY_SOLD DAYSDELAYED
BY PRODUCT_CATEGORY 
BY PRODUCT_SUBCATEG
ACROSS BUSINESS_REGION 
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

Changing BYDISPLAY to ON or BY displays BY field values on every row, as shown in the following image.

Changing BYDISPLAY to ACROSS displays ACROSS field values over every column, as shown in the following image.

Changing BYDISPLAY to ALL displays BY field values on every row and ACROSS field values over every column, as shown in the following image.

Enhancement to SET COUNTWIDTH

You can now specify an integer value for the width of a COUNT column on report output. The syntax is:

SET COUNTWIDTH = {OFF|ON|n}

where:

OFF: Makes the format of the COUNT field I5. This is the default value.
ON: Makes the format of the COUNT field I9.
n: Enables you to specify a width for the COUNT field up to the maximum integer format supported in your operating environment.

Change to Default Value for SET MISSINGTEST

How to:

Control Evaluation of MISSING in IF-THEN-ELSE Expressions

The MISSINGTEST parameter now has three values, OLD, SPECIAL, and NEW. The value NEW has been added and is the default value.

In prior releases, by default, when an IF-THEN-ELSE expression was used to calculate a result and the IF expression evaluated to zero (for numeric expressions) or blank (for alphanumeric expressions), the left hand side was checked to see if it had MISSING ON. If it did, and only some values were needed (NEEDS SOME), the result of the IF expression was MISSING, not true or false. The outcome returned was also MISSING, not the result of evaluating the THEN or ELSE expression. The SET MISSINGTEST = NEW command eliminates the missing test for the IF expression so that either the THEN expression or the ELSE expression will be evaluated and returned as the result. This is the new default behavior.

Syntax: How to Control Evaluation of MISSING in IF-THEN-ELSE Expressions

SET MISSINGTEST = {NEW|OLD|SPECIAL}

where:

NEW: Excludes the IF expression from the missing values evaluation so that the IF expression results in either true or false, not MISSING. If it evaluates to true, the THEN expression is used to calculate the result. If it evaluates to false, the ELSE expression is used to calculate the result. This is the default value.
OLD: Includes the IF expression in the missing values evaluation. If the IF expression evaluates to MISSING and the missing field only needs some missing values, the result is also MISSING.
SPECIAL: Is required for passing parameters to RStat.

Example: Using SET MISSINGTEST With IF-THEN-ELSE Expressions

The following request defines a field named MISS_FIELD that contains a missing value for the country name Austria. In the TABLE request there are two calculated values, CALC1 and CALC2 that test this field in IF-THEN-ELSE expressions. Both of these fields have MISSING ON and need only some missing values to be missing:

SET MISSINGTEST = OLD
DEFINE FILE WFLITE
MISS_FIELD/A10 MISSING ON = IF COUNTRY_NAME NE 'Austria' THEN 'DATAEXISTS' ELSE MISSING;
END
 
TABLE FILE WFLITE
SUM COGS_US MISS_FIELD
COMPUTE CALC1/A7 MISSING ON = IF ((MISS_FIELD EQ '') OR (MISS_FIELD EQ MISSING)) THEN 'THEN' ELSE 'ELSE';
COMPUTE CALC2/A7 MISSING ON = IF ((MISS_FIELD EQ MISSING) OR (MISS_FIELD EQ '')) THEN 'THEN' ELSE 'ELSE';
BY COUNTRY_NAME
WHERE BUSINESS_REGION EQ 'EMEA'
WHERE COUNTRY_NAME LT 'E'
ON TABLE SET NODATA 'MISSING'
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

Running the request with MISSINGTEST=OLD produces the output shown in the following image:

Note that for Austria, MISS_FIELD is MISSING.

In CALC1, the expression MISS_FIELD EQ ' ' is evaluated to true. MISS_FIELD IS MISSING is not evaluated at all because evaluation stops when it can be determined that the result of the expression is true. However, since the expression compared the field to blank, it is checked to see if the result field supports missing values. Since it does, the final result is MISSING.
In CALC2, the expression MISS_FIELD EQ MISSING is true. MISS_FIELD EQ ' ' is not evaluated at all because evaluation stops when it can be determined that the result of the expression is true. No missing check is needed, so the result of the IF expression is TRUE, and the THEN expression is evaluated and returned as the result.

Changing the SET command to SET MISSINGTEST=NEW and rerunning the request produces the output shown in the following image. The IF expressions in CALC1 and CALC2 both evaluate to true because neither expression is checked to see if the result field supports missing, so the THEN expression is evaluated and returned as the result in both cases.

SET MISS_ON: Setting MISSING ON Behavior for DEFINE and COMPUTE

How to:

Set a Default Value for MISSING ON in DEFINE and COMPUTE

When a virtual field or calculated value can have missing values, you can specify whether all or some of the field values used in the expression that creates the DEFINE or COMPUTE field must be missing to make the result field missing. If you do not specify ALL or SOME for a DEFINE or COMPUTE with MISSING ON, the default value is SOME.

The SET parameter MISS_ON enables you to specify whether SOME or ALL should be used for MISSING ON in a DEFINE or COMPUTE that does not specify which to use.

Syntax: How to Set a Default Value for MISSING ON in DEFINE and COMPUTE

SET MISS_ON = {SOME|ALL}

where:

SOME: Indicates that if at least one field in the expression has a value, the temporary field has a value (the missing values of the field are evaluated as 0 or blank in the calculation). If all of the fields in the expression are missing values, the temporary field has a missing value. SOME is the default value.
ALL: Indicates that if all the fields in the expression have values, the temporary field has a value. If at least one field in the expression has a missing value, the temporary field has a missing value.

Example: Setting a Default Value for MISSING ON in DEFINE and COMPUTE

The following request creates three virtual fields that all have MISSING ON. Field AAA has all missing values. Field BBB is missing only when the category is Gifts and has the value 100 otherwise. Field CCC is the sum of AAA and BBB.

SET MISS_ON = SOME
DEFINE FILE GGSALES
AAA/D20 MISSING ON = MISSING;
BBB/D20 MISSING ON = IF CATEGORY EQ 'Gifts' THEN MISSING ELSE 100;
CCC/D20 MISSING ON = AAA + BBB;
END
TABLE FILE GGSALES
SUM
AAA
BBB
CCC
BY CATEGORY
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
END

Running the request with SET MISS_ON=SOME (the default) shows that CCC has a value unless both AAA and BBB are missing.

Changing SET MISS_ON to ALL, produces the following output. CCC is assigned a missing value because one of the fields used to calculate it is always missing.

Enhancement to the SUMPREFIX Parameter

How to:

Control Display of Aggregated Alphanumeric or Smart Date Fields

The SUMPREFIX parameter allows you to specify which value will be displayed when aggregating an alphanumeric or smart date field in the absence of any prefix operator. The default value is LST, which will return the physical last value within the sort group. FST has been available, which will return the first physical value in the sort group. The setting has been enhanced to include MIN and MAX to return either the minimum value or maximum value within the sort group.

MIN and MAX correspond to the SQL MIN and MAX functions and can be easily converted to SQL when optimizing a request against a relational database.

Syntax: How to Control Display of Aggregated Alphanumeric or Smart Date Fields

SET SUMPREFIX = {FST|LST|MIN|MAX}

where:

FST: Displays the first value when alphanumeric or smart date data types are aggregated.
LST: Displays the last value when alphanumeric or smart date data types are aggregated. LST is the default value.
MIN: Displays the minimum value in the sort order set by your FOCUS code page and configuration when alphanumeric or smart date data types are aggregated.
MAX: Displays the maximum value in the sort order set by your FOCUS code page and configuration when alphanumeric or smart date data types are aggregated.

Example: Displaying the Minimum Value for an Aggregated Alphanumeric Field

The following request sets SUMPREFIX to MIN and displays the aggregated PRODUCT_CATEGORY and DAYSDELAYED values as well as the minimum, maximum, first, and last PRODUCT_CATEGORY values. In each row, the aggregated PRODUCT_CATEGORY value matches the MIN.PRODUCT_CATEGORY value. The DAYSDELAYED numeric field is not affected by the SUMPREFIX value and is aggregated.

SET SUMPREFIX = MIN
TABLE FILE WFLITE
SUM PRODUCT_CATEGORY DAYSDELAYED MIN.PRODUCT_CATEGORY MAX.PRODUCT_CATEGORY 
     FST.PRODUCT_CATEGORY LST.PRODUCT_CATEGORY
BY BRAND
WHERE BRAND GT 'K' AND BRAND LT 'U'
ON TABLE SET PAGE NOLEAD
ON TABLE SET STYLE *
GRID=OFF,$
ENDSTYLE
END

The output is shown in the following image.

Enhancements to Sort Performance

Reference:

Sort Parameters

The sorting procedure has been enhanced to analyze the request being processed and the amount of sort memory available in order reduce the amount of disk I/O. The sort strategy is controlled by the specifics of the request and the values of the SORTMATRIX and SORTMEMORY parameters.

Reference: Sort Parameters

SORTMATRIX

The SORTMATRIX parameter controls whether to employ in-memory sorting with decreased use of external memory. The syntax is

SET SORTMATRIX = {SMALL|LARGE}

where:

SMALL: Creates a single sort matrix of up to 2048 rows, and uses a binary search based insertion sort with aggregation during retrieval. The maximum number of rows in this matrix has been determined to provide the best performance for this type of sort. If the sort matrix becomes full, it is written to a file called FOCSORT on disk, the in-memory matrix is emptied, and retrieval continues, writing to FOCSORT as many times as necessary. When the end of data is detected, the remaining rows are written to FOCSORT and the merge routine merges all of the sort strings in FOCSORT (which, in extreme cases, may require multiple merge phases), while also completing the aggregation.
LARGE: Creates a large matrix or multiple small matrices in memory, when adequate memory is available as determined by the SORTMEMORY parameter. LARGE is the default value. The goal of this strategy is to do as much sorting as possible in internal memory before writing any records to disk. Whether disk I/O is necessary at all in the sorting process depends on the amount of memory allocated for sorting and the size of the request output. If the amount of SORTMEMORY is not large enough to meaningfully make use of the LARGE strategy, the sort will default to the SMALL strategy. The LARGE strategy greatly reduces the need for disk I/O and, if disk I/O is required after all (for very large output), it virtually eliminates the need for multiple merge phases.

SORTMEMORY

The SORTMEMORY parameter controls the amount of internal memory available for sorting. The syntax is

SET SORTMEMORY = {n|512}

where:

n: Is the positive number of megabytes of memory available for sorting. The default value is 512.