Multi-Dimensional Index (MDI)

In this section:

 

A multi-dimensional index (MDI) enables you to efficiently and flexibly retrieve information you need for business analysis. It looks at data differently from transaction processing systems in which the goal is to retrieve records based on a key. (WebFOCUS uses a B-tree index for this type of retrieval). The MDI is for retrieval only. It is not used for MODIFY or Maintain requests.

Business analysts may be interested in specific facts (data values, also called measures) about multiple categories of data in the data source. Categories of data, such as region or department, are referred to as dimensions. A multi-dimensional index uses dimensions and all of their hierarchical relationships to point to specific facts.

The MDI is a multi-field index that contains at least two dimensions. This index behaves like a virtual cube of values that intersect at measures of interest. The more dimensions used in a query, the better the retrieval performance.

For example, suppose that the CENTORD data source has an MDI with dimensions STATE, REGION, and PRODCAT. The MDI is used to retrieve the facts (LINEPRICE and QUANTITY data) that lie at the intersection of the dimension values specified in the following request:

TABLE FILE CENTORD
SUM QUANTITY LINEPRICE
WHERE REGION EQ 'EAST'
WHERE STATE EQ 'DC'
WHERE PRODCAT EQ 'Cameras'
END

The MDI also provides the following other retrieval enhancing features: MDI JOIN, Dimensional JOIN, MDI WITHIN, MAXVALUES, MDI Encoding, and AUTOINDEX for MDI.


Top of page

x
Specifying an MDI in the Access File

How to:

All MDI attributes are specified in the Access File for the data source. The only attribute needed in the Master File is the ACCESSFILE attribute to point to the Access File containing the MDI specifications.

An MDI can be partitioned into multiple MDI files. However, even if the data source on which the MDI is built is partitioned, each MDI partition spans all data source partitions.



x
Syntax: How to Specify an MDI in an Access File
MASTER = masterfile,$
    DATA = database_filename1,$
     .
     .
     .
    DATA = database_filenamen ,$
    MDI = mdiname, 
        TARGET_OF = segname ,$
        DIM = field1 [MAXVALUES = n1] [WITHIN = dimname1],$
          .
          .
          .
        DIM = fieldn [MAXVALUES = nn] [WITHIN = dimnamen],$
 
        MDIDATA = mdifile1 ,$
          .
          .
          .
        MDIDATA = mdifilen,$

where:

masterfile

Is the Master File name.

database_filename1, ..., database_filenamen

Are fully qualified physical file names in the syntax native to your operating environment. If the name contains blanks or special characters, it must be enclosed in single quotation marks. Multiple DATA declarations describe concatenated partitions.

mdiname

Is the logical name of the MDI.

segname

Is the segment that contains the facts pointed to by the MDI. If the target data is distributed among several segments, the target should be the top segment that contains MDI data in order to avoid the multiplicative effect.

field1, ..., fieldn

Are the fields to use as dimensions. At least two dimensions are required for an MDI.

mdifile1, ..., mdifilen

Are fully qualified physical file names for the MDI in the syntax native to your operating environment. If the name contains blanks or special characters, it must be enclosed in single quotation marks. Multiple MDIDATA declarations describe concatenated partitions.

n1, ..., nn

Is the number of distinct values the field can have. This number must be a positive integer.

dimname1, ..., dimnamen

Defines a hierarchy of dimensions. This dimension is defined within the dimname dimension. For example, CITY WITHIN STATE.



Example: Defining an MDI on UNIX

This example shows an MDI with two partitions:

MASTERNAME = CAR,$
 DATA = /user1/car.foc,$
 MDI = carmdi,
   TARGET_OF = ORIGIN,$
   DIM = CAR,$
   DIM = COUNTRY,$
   DIM = MODEL,$
   MDIDATA = /user1/car1.mdi,$
   MDIDATA = /user1/car2.mdi,$


Example: Defining an MDI on Windows

This example shows an MDI with two partitions:

MASTERNAME = CAR,$
 DATA = c:\user1\car.foc,$
 MDI = carmdi,
   TARGET_OF = ORIGIN,$
   DIM = CAR,$
   DIM = COUNTRY,$
   DIM = MODEL,$
   MDIDATA = c:\user1\car1.mdi,$
   MDIDATA = c:\user1\car2.mdi,$


Example: Defining an MDI on z/OS

This example shows an MDI with two partitions:

MASTER = CAR,$
 DATA = USER1.CAR.FOCUS,$
 MDI = CARMDI,
   TARGET_OF = ORIGIN,$
   DIM = CAR,$
   DIM = COUNTRY,$
   DIM = MODEL,$
   MDIDATA = USER1.CAR1.MDI,$
   MDIDATA = USER1.CAR2.MDI,$

Top of page

x
Creating a Multi-Dimensional Index

Each MDI is specified in an Access File for the data source. WebFOCUS uses the Access File in its retrieval analysis for each TABLE request.

You then use the REBUILD MDINDEX command to build the MDI. The MDI has the following DCB attributes: RECFM=F,LRECL=4096,BLKSIZE=4096.

A multi-dimensional index gives complex queries high-speed access to combinations of dimensions across data sources. If you know what information users want to retrieve and why, you can make intelligent choices about index dimensions.

An Access File can define more than one MDI. If the Access File defines multiple MDIs, the AUTOINDEX facility chooses the best index to use for each query.

The first step in designing an MDI is to find out what kind of information users need from the data source. You can get advice about your MDIs directly from WebFOCUS.


Top of page

x
Choosing Dimensions for Your Index

Reference:

The choice of index dimensions depends on knowing the data and on analyzing what is needed from it. Examine the record selection (IF and WHERE) tests in your queries to see how many index dimensions each application actually uses to select records. If different applications need different subsets of dimensions, consider separate MDIs for the separate subsets. Although WebFOCUS can produce high-speed reporting performance with indexes of up to 30 dimensions, smaller indexes usually generate less retrieval overhead. You can create an unlimited number of MDIs.

The following are good candidates for dimensions in an MDI:

Including a field that is updated frequently (such as an AUTODATE field) in the MDI, requires frequent rebuilding of the MDI in order to keep it current. WebFOCUS can advise you on selecting MDI dimensions.

DEFINE fields described in the Master File can be used as dimensions. Dynamic DEFINE fields cannot be dimensions.

An MDI is for retrieval only. FIND and LOOKUP are not supported on an MDI.



x
Reference: Guidelines for a Multi-Dimensional Index

The following guidelines apply to each MDI:


Top of page

x
Building and Maintaining a Multi-Dimensional Index

The REBUILD command is used to create or maintain a multi-dimensional index. This command can be issued in a FOCEXEC.

The best MDI is built by specifying the dimensions in order of best cardinality (most distinct values).

To issue the REBUILD command in a FOCEXEC, you place the REBUILD command and the user-supplied information needed for REBUILD processing in the FOCEXEC.

If the MDI file might be larger than two gigabytes or if you plan to add more data partitions to it, the MDI index file must be partitioned from the initial REBUILD phase. After the index has been created, you can use it in a retrieval request. You cannot use an MDI for modifying the data source. If you update the data source without rebuilding the MDI and then attempt to retrieve data with it, WebFOCUS displays a message indicating that the MDI is out of date. You must then rebuild the MDI.



Example: Creating a Multi-Dimensional Index

The following FOCEXEC creates the CARMDI MDI and contains each user-supplied value needed for REBUILD processing on a separate line of the FOCEXEC, entered in uppercase:

REBUILD
MDINDEX
NEW
CAR
CARMDI
NO

Top of page

x
Using a Multi-Dimensional Index in a Query

WebFOCUS allows you to use an MDI in a TABLE or SQL query. The performance is best when all of the dimensions in the MDI are used in selection criteria for the query.

There are two ways to use an MDI with a TABLE query:

You can use an MDI with an SQL query by issuing an SQL SELECT statement with a WHERE test using a field of an MDI. For example,

SELECT field_1, field_2 FROM table WHERE field_3 = value;

where field_3 is a dimension in an MDI.

Note: AUTOINDEX must be turned on for this feature to be operational.


Top of page

x
Querying a Multi-Dimensional Index

How to:

WebFOCUS provides a query command that generates statistics and descriptions for your MDIs. The command ? MDI allows you to display information about MDIs for a given FOCUS/XFOCUS Master File that hosts the target of your MDI.



x
Syntax: How to Query a Multi-Dimensional Index
? MDI mastername {mdiname|*} [HOLD [AS holdfile]]

where:

mastername

Is the logical name of the Master File. If you do not include any other parameters, a list of all MDI names specified is displayed with the command TARGET_OF in the Access File for this mastername. If the Access File for the mastername does not have any MDI information, a message will display.

mdiname

Is the logical name of an MDI. Specifying this parameter displays all the dimensions that are part of this MDI.

mdiname must be specified as TARGET_OF in the Access File for this mastername, or a message will display. If any of the dimensions are involved in a parent-child structure, a tree-like picture will display.

*

Displays a list of all dimensions, by MDI, whose targets are specified inside the Access File for this mastername.

HOLD

Saves the output in a text file.

holdfile

Is the file in which the output is saved. If this is not included with the AS phrase, the file is named HOLD.


Top of page

x
Using AUTOINDEX to Choose an MDI

How to:

When an Access File defines multiple MDIs, retrieval efficiency for a query may become a function of the index it uses to access the data source. The AUTOINDEX facility analyzes each retrieval request and chooses the MDI or B-tree index that provides the best access to the data. You can issue the AUTOINDEX command in a FOCEXEC or in a profile. The AUTOINDEX facility can be enabled or disabled. The default is to disable AUTOINDEX on reverse byte platforms and to enable it on forward byte platforms.

In its analysis, AUTOINDEX considers the following factors:

If the selection criteria in a request do not involve any MDI fields, WebFOCUS looks for an appropriate B-tree index to use for retrieval. If a field is both a B-tree index and a dimension in an MDI, the MDI is used for retrieval if two-thirds of the fields in selection tests are dimensions in the MDI. If it is less than two-thirds, the B-tree index is used. If there are multiple B-tree indexes, the one highest in the hierarchy is used.

If everything else is equal, WebFOCUS uses the first MDI it finds in the Access File.



x
Syntax: How to Enable or Disable AUTOINDEX for Retrieval
SET AUTOINDEX = {ON|OFF}

where:

ON

Optimizes MDI retrieval. AUTOINDEX can only be set to ON when the ALL parameter is set to OFF.

OFF

Disables the AUTOINDEX facility. No MDI will be used for retrieval.

Note: AUTOINDEX defaults to ON on reverse byte platforms and to OFF on forward byte platforms.

Note: WebFOCUS TABLE requests can automatically turn off AUTOINDEX and select the appropriate index for retrieval by indicating the index name in the request itself:

TABLE FILE filename.mdiname

You can also assure access with a specific MDI by creating an Access File that describes only that index.


Top of page

x
Joining to a Multi-Dimensional Index

How to:

Reference:

Joining to an MDI uses the power of the MDI to produce a fast, efficient join. Instead of joining one field in the source file to an indexed field in the target file, you can join to multiple dimensions of the MDI.

When the join is implemented, the answer set from the source file is created, and the values retrieved for the source fields serve as index values for the corresponding dimensions in the target MDI.

You can join to an MDI in two ways:

The source fields must form a one-to-one correspondence with the target dimensions. The MDI engine uses the source field values to get pointers to the target segment of the MDI, expediting data retrieval from the target file.

You can think of the source fields as mirror dimensions. If you put tighter constraints on the mirror dimensions, a smaller answer set is retrieved from the source file, and fewer index I/Os are required to locate the records from the target file. The speed of the join improves dramatically with the speed of retrieval of the source file answer set. Therefore, you can expedite any TABLE request against the source file by incorporating selection criteria on fields that participate in either a B-tree or MDI.

The following formula computes the time for a TABLE request that uses an MDI Join:

Total Time = Time to Retrieve the answer set from the source file (Ts)
+ Time to retrieve the MD index pointers (Tp)
+ Time to retrieve data from the target file (Tt)

Using a B-tree index or MDI in data retrieval reduces all types of retrieval time, reducing the total retrieval time.



x
Syntax: How to Join to All Dimensions of a Multi-Dimensional Index
JOIN field_1 [AND field_2 ...] IN sfile [TAG tag_1]
TO ALL mdiname IN tfile [TAG tag_2] [AS joinname]
[END]

where:

field_1, field_2

Are the join fields from the source file.

sfile

Is the source Master File.

tag_1, tag_2

Are one-character to eight-character names that can serve as unique qualifiers for field names in a request.

mdiname

Is the logical name of the MDI, built on tfile, to use in the join.

tfile

Is the target Master File.

joinname

Is a one-character to eight-character join name. A unique join name prevents a subsequent join from overwriting the existing join, allows you to selectively clear the join, and serves as a prefix for duplicate field names in a recursive join.

END

Is required to terminate the JOIN command if it is longer than one line.



x
Syntax: How to Create a Dimensional Join
JOIN field_1 [AND field_2 ...] IN sfile [TAG tag_1]
TO ALL dim_1 [AND dim_2 ...] IN tfile [TAG tag_2] [AS joinname]
[END]

where:

field_1, field_2

Are the join fields from the source file.

sfile

Is the source Master File.

tag_1, tag_2

Are one-character to eight-character names that can serve as unique qualifiers for field names in a request.

dim_1, dim_2

Are dimensions in tfile.

tfile

Is the target Master File.

joinname

Is a one-character to eight-character join name. A unique join name prevents a subsequent join from overwriting the existing join, allows you to selectively clear the join, and serves as a prefix for duplicate field names in a recursive join.

END

Is required to terminate the JOIN command if it is longer than one line.



x
Reference: Guidelines for Choosing a Source Field (Dimensional Joins Only)

Top of page

x
Encoding Values in a Multi-Dimensional Index

How to:

Reference:

WebFOCUS encodes indexed values any time a field or dimension of an MDI has a MAXVALUES attribute specified or is involved in a parent-child relationship. Encoded values are stored in the MDI file at rebuild time and can be retrieved and decoded with a TABLE request that specifies the MDIENCODING command. The MDIENCODING command allows the user to get output from the MDI file itself without having to read the data source.



x
Reference: Rules for Encoding a Multi-Dimensional Index

The following two rules apply to fields in a TABLE request that uses MDIENCODING:



x
Syntax: How to Retrieve Output From a Multi-Dimensional Index
SET MDIENCODING = {ON|OFF}

where:

ON

Enables retrieval of output from the MDI file without reading the data source.

OFF

Requires access of the data source to allow retrieval of MDI values.



x
Syntax: How to Encode a Multi-Dimensional Index
TABLE FILE mastername.mdiname request 
ON TABLE SET MDIENCODING ON
END

where:

mastername

Is the Master File.

mdiname

Is the logical name of the MDI.

request

Is the TABLE request that decodes the MDI.



Example: Encoding a Multi-Dimensional Index

The following examples show correct MDI encoding:

TABLE FILE COMPANY.I DATA1
PRINT CITY BY STATE
ON TABLE SET MDIENCODING ON
END
TABLE FILE COMPANY.I DATA1
COUNT CITY
IF STATE EQ NY
ON TABLE SET MDIENCODING ON
END
TABLE FILE COMPANY.I DATA1
PRINT CATEGORY
ON TABLE SET MDIENCODING ON
END

The following example is incorrect because CATEGORY is not part of the CITY-STATE hierarchy.

TABLE FILE COMPANY.I DATA1
PRINT CITY BY STATE
IF STATE EQ NY
IF CATEGORY EQ RESTAURANT
ON TABLE SET MDIENCODING ON
END


Example: Using a Multi-Dimensional Index in a Request

The following TABLE request accesses the CAR data source. It will use the CARMDI index for retrieval because CARMDI is the only MDI described in the Master File:

TABLE FILE CAR
SUM RETAIL_COST DEALER_COST
BY BODYTYPE
-* WHERE Condition utilizing MDI fields:
WHERE (COUNTRY EQ 'JAPAN' OR 'ENGLAND')
AND (CAR EQ 'TOYOTA' OR 'JENSEN' OR 'TRIUMPH')
AND (MODEL EQ 'COROLLA 4 DOOR DIX AUTO'
OR 'INTERCEPTOR III' OR 'TR7')
END

Top of page

x
Partitioning a Multi-Dimensional Index

If the data source has grown due to the addition of new data partitions, and these partitions need to be added to the MDI, you must perform the following steps:

  1. Update the Access File to include the new data partitions.
  2. Verify that your MDI is partitioned. Remember that the ADD function of the REBUILD utility cannot be executed on a non-partitioned MDI.
  3. Perform the REBUILD, MDINDEX, and ADD on the MDI.


Example: Adding a Partition to a Multi-Dimensional Index

The following FOCEXEC contains commands that add a partition to a multi-dimensional index named CARMDI defined on the CAR data source:

REBUILD
MDINDEX
ADD
CAR
CARMDI
NO

After the MDI is rebuilt to include the new data partitions, any retrieval query that uses the MDI will use the newly added data partitions within that MDI.


Top of page

x
Querying the Progress of a Multi-Dimensional Index

How to:

Use the SET MDIPROGRESS command to view messages about the progress of your MDI build. The messages will show the number of data records accumulated for every n records inserted into the MDI as it is processed.



x
Syntax: How to Query the Progress of a Multi-Dimensional Index
SET MDIPROGRESS = {0|n}

where:

n

Is an integer greater than 1000, which displays a progress message for every n records accumulated in the MDI build. 100,000 is the default value.

0

Disables progress messages.


Top of page

x
Displaying a Warning Message

How to:

The SET MDICARDWARN command displays a warning message every time a dimension cardinality exceeds a specified value, offering you the chance to study the MDI build. When the number of equal values of a dimension data reaches a specified percent, a warning message will be issued. In order for MDICARDWARN to be reliable, the data source should contain at least 100,000 records.

Note: In addition to the warning message, a number displays in brackets. This number is the least number of equal values for the dimension mentioned in the warning message text.



x
Syntax: How to Display a Warning Message
SET MDICARDWARN = n

where:

n

Is a percentage value from 0 to 50.


WebFOCUS