Improving Efficiency With External Sorts

In this section:

How to:

Reference:

When a report is generated, by default it is sorted using an internal sorting procedure. This sorting procedure is optimized for reports of up to approximately 180 to 200K, although many factors affect the size of the data that can be handled by the internal sort.

The FOCSORT file used for the internal sort can grow to any size allowed by the operating system running and the available disk space. The user does not have to break a request up to accommodate massive files. In previous releases, the FOCSORT file was limited to 2 GB and the user received a FOC298 message when the WebFOCUS limit was exceeded. With no limit enforced by WebFOCUS, the operating system provides whatever warning and error handling it has for the management of a FOCSORT file that exceeds its limits.

You can generate larger reports somewhat faster by using dedicated sorting products, such as SyncSort, DFSORT, or, in non-Mainframe environments, the WebFOCUS external sort routines.

To use an external sort, the EXTSORT parameter must be ON. Use of a StyleSheet turns off external sorting.

Note that in Mainframe environments, external sorting is supported with the French, Spanish, German, and Scandinavian National Languages (Swedish, Danish, Finnish, and Norwegian). To specify the National Language Support Environment, use the LANG parameter as described in the Developing Reporting Applications manual.


Top of page

x
Reference: Requirements for External Sorting

You can use the DFSORT and SyncSort external sort products with any TABLE, FML, GRAPH, or MATCH request in all WebFOCUS Mainframe environments. In other operating environments, WebFOCUS has its own external sort routines.



x
Reference: Usage Notes for External Sorting in Non-Mainframe Environments

It is probably best not to use external sort if:


Top of page

x
Procedure: How to Determine the Type of Sort Used

To determine which sort is used, the following criteria are evaluated, in this sequence:

  1. BINS. If an entire report can be sorted within the work area (BINS), the external sort is not invoked, even if EXTSORT is set ON.
  2. EXTERNAL. If BINS is not large enough to sort the entire report and EXTSORT is set ON, the external sort utility will be invoked.


x
Syntax: How to Control External Sorting

You can turn the external sorting feature on and off using the SET EXTSORT command.

SET EXTSORT = {ON|OFF}

where:

ON
Enables the selective use of a dedicated external sorting product to sort reports. This value is the default in all Mainframe environments.
OFF
Uses the internal sorting procedure to sort all reports. This value is the default in all non-Mainframe environments.


x
Syntax: How to Query the Sort Type

To determine which sort is being used for a given report, issue the following command after the report request:

? STAT

The command displays the following values for the SORT USED parameter:

FOCUS
The internal sorting procedure was used to sort the entire report.
SQL
You are using a relational data source and the RDBMS supplied data already in order.
EXTERNAL
An external sorting product sorted the report.
NONE
The report did not require sorting.


x
Providing an Estimate of Input Records or Report Size for Sorting

How to:

There are two advantages to providing an estimate for the input size (ESTRECORDS) or the report size (ESTLINES):

  • If the request cannot be converted to a TABLEF request and the file size estimate shows that the external sort will be needed, FOCUS initiates the external sort immediately, which makes a FOCUS merge unnecessary. Without the estimate, such a request always performs this merge.
  • In Mainframe environments, FOCUS passes the file size to the external sort, which enables it to allocate work files of the appropriate size.


x
Syntax: How to Provide an Estimate of Input Records or Report Size for Sorting
ON TABLE SET ESTRECORDS nON TABLE SET ESTLINES n

where:

n
Is the estimated number of records or lines to be sorted.


x
Sort Work Files and Return Codes

Reference:

In non-Mainframe environments, external sorts use temporary work files to hold intermediate sorting results. For each type of external sort, you must be aware of how sort work files are created and used.



x
Reference: Sort Work Files on UNIX, Windows, and OpenVMS

While internal sorting uses only one work file, FOCSORT (allocated in the EDATEMP directory), external sort allows up to 31 work files, allocated on one or more disk drives (spindles) or directories. (Warning: Any one or more of these work files may become very large. Count on using many times the total disk space required by FOCSORT.)

By default, five work files are allocated in the /tmp directory on UNIX, or in the directory pointed to by the TMP environment variable in Windows. This may not be enough sort work space and, even if the files fit in the directory, five files are probably not enough for optimal performance. Also, having all of the sort work files on the same disk may further degrade performance.

You have two other options:

  • Define the TMPDIR shell variable (UNIX) or TMP environment variable (Windows) to point to some suitable writable directory. For best results, this directory should be on a disk with a lot of available space, and not the same disk as the data source or the EDATEMP directory. Again, you will get five temporary work files allocated on the same spindle, with consequent performance degradation.
  • Define 1 to 31 shell variables of the form IBITMPDIR01 ... IBITMPDIR31 to point to one or more writable directories.

    If the UNIX TMPDIR or Windows TMP variable is set, it must be "unset" in order to make use of the IBITMPDIRnn variables. The UNIX command for unsetting the TMPDIR variable is:

    unset TMPDIR

    The Windows command for unsetting the TMP variable is:

    SET TMP=

    Different variables may point to the same directory, if desired. If you wish to allocate n work files, you must define variables 01 through n. The first variable missing from the environment determines the number of work files that will be used. (If you define fewer than five, additional files will be allocated using the system default location to make up the difference.) The more work files you allocate, and the more separated they are across different spindles, the better performance you should achieve. The major constraint is the total disk space available.

The work file names are generated by the ANSI tempnam function, however, the names all begin with the characters srtwk. If the sorting process ends normally or terminates because of a detectable error (typically, disk space overflow), all of the allocated work files are deleted. There is no explicit way to save them. If there is another type of abnormal termination, srtwk files may be left on the disk. You can and should erase them.



x
Reference: Sort Work Files on i5/OS

On i5/OS (formerly AS/400), the number of work files is fixed at 9. They are virtual files.



x
Reference: WebFOCUS External Sort Return Codes

The WebFOCUS error message FOC909 is issued for all errors from external sort. An additional three-digit code is supplied, of which the last two digits are of interest. If you get an error number ending in:

  • 16, external sort did not have enough memory allocated. You can try reducing the number of work files.
  • 20, an I/O error occurred; in most cases, this means that one of the disks is not writable or has overflowed. Allocate the work files differently or reduce their number.
  • 28, one of the work files could not be opened. Check to make sure the pathname was specified correctly and that protections allow writing and reading.
  • 32, an internal logical error was detected in the sort processing. Report this problem to Information Builders.


x
Mainframe External Sort Utilities and Message Options

In this section:

How to:

By default, error messages created by a Mainframe external sort product are not displayed. However, you may wish to display these messages on your screen for diagnostic purposes.



x
Procedure: How to Select a Sort Utility and Message Options

You use the SET SORTLIB command to both specify the sort utility used at your site and, for DFSORT and SYNCSORT on z/OS, to display sort messages.

  1. Issue the SET SORTLIB command to specify the sort utility being used:
    SET SORTLIB = {sortutility|DEFAULT}

    where:

    sortutility
    Can be one of the following:

    DFSORT for DFSORT without messages.

    MVSMSGDF for DFSORT with messages.

    SYNCSORT for SyncSort without messages.

    MVSMSGSS for SyncSort with standard messages.

    MVSMSGSD for SyncSort with debug (verbose) messages.

    DEFAULT for DFSORT. However, It is more efficient and highly recommended that you explicitly specify the sort utility using one of the other values.

  2. If you specified a sort option that produces sort messages on z/OS, you must direct the sort messages to the batch output stream or a file.

    Allocate DDNAME SYSOUT to the batch output stream or a file on z/OS by inserting the appropriate following DD card into your server batch JCL, if it is not already there. For example, the following DD card allocates DDNAME SYSOUT to the batch output stream:

    //SYSOUT DD SYSOUT=*


x
Diagnosing External Sort Errors

When an external sort generates an error, you can generate a trace of sort processing and examine the FOCUS return codes and messages to diagnose the problem.





x
Reference: External Sort Messages and Return Codes

When you receive a FOC909 message, it includes a return code:

(FOC909)  CRITICAL ERROR IN EXTERNAL SORT.  RETURN CODE IS: xxxx

You may also receive one of the following messages:

(FOC1810) External sort not found
(FOC1899) Load of %1 (external-sort module) under %2 failed

The following notes apply when this message or a FOC1800 or FOC1899 message is generated by a TABLE request:

When REBUILD INDEX invokes an external sort that fails, it generates a message similar to the following:

ERROR OCCURRED IN THE SORT yyyyyyyyzzzzzzzz

In this case, the return code is yyyyyyyy and it is expressed in hex. The final eight digits (zzzzzzzz) should be ignored.

Translate the return code into decimal and follow the instructions for return codes in a TABLE request.

Note also that when a TABLE request generates a non-zero return code from an external sort, FOCUS is terminated. By contrast, when REBUILD INDEX gets a non-zero return code from an external sort, the REBUILD command is terminated but FOCUS continues.



x
Reference: Responding to an Indication of Inadequate Sort Work Space

Before following these instructions, make sure that external sort messages were generated (for information, see How to Select a Sort Utility and Message Options) and that they clearly show that the reason for failure was inadequate sort work space.

  1. Make an estimate of the number of lines of output the request will produce.
  2. Set the ESTLINES parameter in the request or FOCEXEC. For information, see Providing an Estimate of Input Records or Report Size for Sorting.

    WebFOCUS will pass this estimate to the external sort utility through the parameter list.

    Do not override the DD cards for SORTWKnn, S001WKnn, DFSPARM, or $ORTPARM without direct instructions from technical support. The instructions in How to Select a Sort Utility and Message Options, How to Trace Sort Processing, and Providing an Estimate of Input Records or Report Size for Sorting should provide equivalent capabilities.


Top of page

x
Aggregation by External Sort (Mainframe Environments Only)

How to:

Reference:

External sorts can be used to perform aggregation with a significant decrease in processing time in comparison to using the internal sort facility. The gains are most notable with relatively simple requests against large data sources.

When aggregation is performed by an external sort, the statistical variables &RECORDS and &LINES are equal because the external sort products do not return a line count for the answer set. This is a behavior change, and affects any code that checks the value of &LINES. (If you must test &LINES, do not use this feature.)



x
Syntax: How to Use Aggregation in Your External Sort
SET EXTAGGR = aggropt

where:

aggropt
Can be one of the following:

OFF disallows aggregation by an external sort.

NOFLOAT allows aggregation if there are no floating point data fields present.

ON allows aggregation by an external sort. This value is the default.



x
Reference: Usage Notes for Aggregating With an External Sort


Example: Changing Output by Using an External Sort for Aggregation

If you use SUM on an alphanumeric field in your report request without using an external sort, the last instance of the sorted fields is displayed in the output. Turning on aggregation in the external sort displays the first record instead. However, you can control the order of display using the SUMPREFIX parameter. With SUMPREFIX = LST (the default), the last instance displays even with EXTAGGR = ON.

The following command turns aggregation ON and leaves SUMPREFIX set to LST (the default) and, therefore, displays the last record:

SET EXTAGGR = ON
SET SUMPREFIX = LST
TABLE FILE CAR
SUM CAR BY COUNTRY
END

The output is:

COUNTRY     CAR     
-------     ---     
ENGLAND     TRIUMPH 
FRANCE      PEUGEOT 
ITALY       MASERATI
JAPAN       TOYOTA  
W GERMANY   BMW

Note: SUMPREFIX is described in Changing Retrieval Order With Aggregation (Mainframe Environments Only) .

With SUMPREFIX = FST, the output is:

COUNTRY     CAR       
-------     ---       
ENGLAND     JAGUAR    
FRANCE      PEUGEOT   
ITALY       ALFA ROMEO
JAPAN       DATSUN    
W GERMANY   AUDI

Top of page

x
Changing Retrieval Order With Aggregation (Mainframe Environments Only)

How to:

When an external sort product performs aggregation of alphanumeric or smart date formats, the order of the answer set returned differs from the order of the internally sorted answer sets.

External sort products return the first alphanumeric or smart date record that was aggregated. Conversely, internal sorting returns the last record.

The SUMPREFIX command allows users to choose the answer set display order.



x
Syntax: How to Set Retrieval Order
SET SUMPREFIX = {LST|FST}

where:

LST
Displays the last value when alphanumeric or smart date data types are aggregated. This value is the default.
FST
Displays the first value when alphanumeric or smart date data types are aggregated.


x
Creating a HOLD File With an External Sort (Mainframe Environments Only)

How to:

Reference:

You can use Mainframe external sort packages to create HOLD files, producing substantial savings in processing time. The gains are most notable with relatively simple requests against large data sources.



x
Syntax: How to Create HOLD Files With an External Sort
SET EXTHOLD = {OFF|ON}

where:

OFF
Disables HOLD files by an external sort.
ON
Enables HOLD files by an external sort. This value is the default.


x
Reference: Usage Notes for Creating a HOLD File With an External Sort
  • The default setting of EXTSORT=ON must be in effect.
  • EXTHOLD must be ON.
  • The request must contain a BY field.
  • The type of HOLD file created must be a FOCUS, XFOCUS, ALPHA, or BINARY file.
  • Your query should be simple. AUTOTABLEF analyzes a query and determines whether the combination of display commands and formatting options requires the internal matrix. In cases where it is determined that a matrix is not necessary to satisfy the query, you may avoid the extra internal costs associated with creating the matrix. The internal matrix is stored in a file or data set named FOCSORT. The AUTOTABLEF default is ON, in order to realize performance gains.
  • SET ALL must be OFF.
  • There cannot be an IF/WHERE TOTAL or BY TOTAL in the request.
  • If a request contains a SUM command, EXTAGGR must be set ON, and the only column prefixes allowed are SUM. and FST.

WebFOCUS