In this section: |
You can read files in which fields are separated by any type of delimiter including commas, tabs and other characters. Defining a Master File with the SUFFIX=DFIX attribute lets you specify any combination of characters as the field delimiter. Delimiters may consist of printable or non-printable characters, or any combination of printable and non-printable characters.
Two methods of describing delimited files are supported:
Note: SET HOLDLIST is not supported for delimited files.
How to: Reference: |
Delimiters in the Master File are defined using a special field named DELIMITER. The FILE declaration must include the attribute SUFFIX=DFIX.
Describe the delimiter characters in a special field or group named DELIMITER. The delimiter characters are specified in the ALIAS attribute of this special field or group.
To use a delimiter that consists of a single non-printable character or of one or more printable characters, the delimiter is defined as a field with the following attributes:
FIELDNAME=DELIMITER, ALIAS=delimiter, USAGE=ufmt, ACTUAL=afmt ,$
To use a delimiter that consists of multiple non-printable characters or a combination of printable and non-printable characters, the delimiter is defined as a group:
GROUP=DELIMITER, ALIAS= , USAGE=ufmtg, ACTUAL=afmtg ,$ FIELDNAME=DELIMITER, ALIAS=delimiter1, USAGE=ufmt1, ACTUAL=afmt1 ,$ . . . FIELDNAME=DELIMITER, ALIAS=delimitern, USAGE=ufmtn, ACTUAL=afmtn ,$
where:
Indicates that the field or group is used as the delimiter in the data source.
Identifies a delimiter, up to 30 characters long. For one or more printable characters, the value consists of the actual characters. The delimiter must be enclosed in single quotation marks if it includes characters used as delimiters in Master File syntax. For a non-printable character, the value is the decimal equivalent of the EBCDIC or ASCII representation of the character, depending on your operating environment.
Are the USAGE and ACTUAL formats for the delimiter. Possible values are:
Type of delimiter | USAGE | ACTUAL |
---|---|---|
Printable characters | An where n is the number of characters | An where n is the number of characters |
Non-printable character such as Tab | I4 | I1 |
Group (combination of printable and non-printable characters, or multiple non-printable characters) | Sum of the individual USAGE lengths | Sum of the individual ACTUAL lengths |
The following example shows a one-character alphanumeric delimiter:
FIELDNAME=DELIMITER, ALIAS=',' ,USAGE=A1, ACTUAL=A1 ,$
The following example shows a two-character alphanumeric delimiter:
FIELDNAME=DELIMITER, ALIAS=// ,USAGE=A2, ACTUAL=A2 ,$
The following example shows how to use the Tab character as a delimiter:
FIELDNAME=DELIMITER, ALIAS=05 ,USAGE=I4, ACTUAL=I1 ,$
The following example shows how to use a blank character described as a numeric delimiter:
FIELDNAME=DELIMITER, ALIAS=64 ,USAGE=I4, ACTUAL=I1 ,$
The following example shows a group delimiter (Tab-slash-Tab combination):
GROUP=DELIMITER, ALIAS= ,USAGE=A9, ACTUAL=A3 ,$ FIELDNAME=DEL1, ALIAS=05 ,USAGE=I4, ACTUAL=I1 ,$ FIELDNAME=DEL2, ALIAS=/ ,USAGE=A1, ACTUAL=A1 ,$ FIELDNAME=DEL3, ALIAS=05 ,USAGE=I4, ACTUAL=I1 ,$
The following Master File shows the MISSING attribute specified for the CAR field:
FILE=DFIXF01 ,SUFFIX=DFIX SEGNAME=SEG1 ,SEGTYPE=S0 FIELDNAME=COUNTRY ,ALIAS=F1 ,USAGE=A10 ,ACTUAL=A10 ,$ FIELDNAME=CAR ,ALIAS=F2 ,USAGE=A16 ,ACTUAL=A16 ,MISSING=ON, $ FIELDNAME=NUMBER ,ALIAS=F3 ,USAGE=P10 ,ACTUAL=Z10 ,$ FIELDNAME=DELIMITER ,ALIAS=',' ,USAGE=A1 ,ACTUAL=A1 ,$
In the source file, two consecutive comma delimiters indicate missing values for CAR:
GERMANY,VOLKSWAGEN,1111 GERMANY,BMW, USA,CADILLAC,22222 USA,FORD USA,,44444 JAPAN ENGLAND, FRANCE
The output is:
COUNTRY CAR NUMBER ------- --- ------ GERMANY VOLKSWAGEN 1111 GERMANY BMW 0 USA CADILLAC 22222 USA FORD 0 USA . 44444 JAPAN . 0 ENGLAND 0 FRANCE . 0
Reference: |
The Master File has the standard attributes for any sequential file. The SUFFIX value is DFIX. All of the delimiter information is in the Access File.
In addition, you can use the HOLD FORMAT DFIX command to create this type of Master and Access File for a token-delimited file. For information on HOLD formats, see the Creating Reports manual.
DELIMITER = delimiter, [ENCLOSURE = character,] [HEADER = {YES|NO}],$
where:
Is the delimiter sequence consisting of up to 30 printable or non-printable non-null characters. For a non-printable character, enter the hexadecimal value that represents the character. If you use a mixture of printable and non-printable characters, you must enter them all as hexadecimal values. To create a tab-delimited file, you can specify the delimiter value as TAB or as its hexadecimal equivalent (0x09 on ASCII platforms or 0x05 on EBCDIC platforms).
Note that numeric digits and symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-) cannot be used in the delimiter sequence.
Is the enclosure sequence. It can be up to four printable characters used to enclose each alphanumeric value in the file. Most alphanumeric characters can be used as all or part of the enclosure sequence. However, numeric digits and symbols used in numbers, such as a period (.), plus sign (+), or minus sign (-) cannot be used in the enclosure sequence. Also note that, in order to specify a single quotation mark as the enclosure character, you must enter four consecutive single quotation marks. The most common enclosure is one double quotation mark.
Specifies whether to include a header record that contains the names of the fields in the delimited sequential file generated by the request. NO is the default value.
The pipe delimited file named PIPE1 contains the following data in which each data value is delimited by a pipe character (|). Note that you can create a delimited file as output from a request using the HOLD FORMAT DFIX command in a request:
EAST|2000|3907|1145655.77 EAST|2001|495922|127004359.88 EAST|2002|543678|137470917.05 NORTH|2001|337168|85750735.54 NORTH|2002|370031|92609802.80 SOUTH|2000|3141|852550.45 SOUTH|2001|393155|99822662.88 SOUTH|2002|431575|107858412.0 WEST|2001|155252|39167974.18 WEST|2002|170421|42339953.45
The PIPE1 Master File is:
FILENAME=PIPE1 , SUFFIX=DFIX , $ SEGMENT=PIPE1, SEGTYPE=S2, $ FIELDNAME=REGION, ALIAS=E01, USAGE=A5, ACTUAL=A05, $ FIELDNAME=YEAR, ALIAS=E02, USAGE=YY, ACTUAL=A04, $ FIELDNAME=QUANTITY, ALIAS=E03, USAGE=I8C, ACTUAL=A08, $ FIELDNAME=LINEPRICE, ALIAS=E04, USAGE=D12.2MC, ACTUAL=A12, $
The PIPE1 Access File is:
SEGNAME=PIPE1, DELIMITER=|, HEADER=NO, $
In the following version of the PIPE1 delimited file, each alphanumeric value is enclosed in double quotation marks:
"EAST"|2000|3907|1145655.77 "EAST"|2001|495922|127004359.88 "EAST"|2002|543678|137470917.05 "NORTH"|2001|337168|85750735.54 "NORTH"|2002|370031|92609802.80 "SOUTH"|2000|3141|852550.45 "SOUTH"|2001|393155|99822662.88 "SOUTH"|2002|431575|107858412.01 "WEST"|2001|155252|39167974.18 "WEST"|2002|170421|42339953.45
The Master File does not change, but the Access File now specifies the enclosure character:
SEGNAME=PIPE1, DELIMITER=|, ENCLOSURE=", $
In this version of the PIPE1 delimited file, the first record in the file specifies the name of each field, and each alphanumeric value is enclosed in double quotation marks:
"REGION"|"YEAR"|"QUANTITY"|"LINEPRICE" "EAST"|2000|3907|1145655.77 "EAST"|2001|495922|127004359.88 "EAST"|2002|543678|137470917.05 "NORTH"|2001|337168|85750735.54 "NORTH"|2002|370031|92609802.80 "SOUTH"|2000|3141|852550.45 "SOUTH"|2001|393155|99822662.88 "SOUTH"|2002|431575|107858412.01 "WEST"|2001|155252|39167974.18 "WEST"|2002|170421|42339953.45
The Master File remains the same. The Access File now specifies that there is a header record in the data file:
SEGNAME=PIPE1, DELIMITER=|, ENCLOSURE=", HEADER=YES, $
Information Builders |