GETTOK: Extracting a Substring (Token)


The GETTOK function divides a character string into substrings, called tokens. The string must contain a specific character, called a delimiter, that separates it into tokens. GETTOK returns the token specified by the token_number argument. GETTOK ignores leading and trailing blanks in the source character string.

For example, suppose you want to extract the fourth word from a sentence. In this case, use the space character as the delimiter and the number 4 for token_number. GETTOK divides the sentence into words using this delimiter, then extracts the fourth word. If the string contains no delimiter to divide it, use the PARAG function instead. See PARAG: Dividing Text Into Smaller Lines.
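
The idea of splitting on a delimiter and selecting a token by position can be sketched outside FOCUS. The following Python fragment is only an illustration of the concept, not GETTOK itself; the sample sentence and the variable names are hypothetical.

```python
# Illustration only: plain Python, not FOCUS. The sentence is made-up sample data.
sentence = "THE QUICK BROWN FOX JUMPS"

# Split on the space delimiter, then pick the fourth token (positions start at 1).
tokens = sentence.split(" ")
token_number = 4
fourth_word = tokens[token_number - 1]

print(fourth_word)  # prints FOX
```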


Syntax: How to Extract a Substring (Token)
GETTOK(source_string, inlen, token_number, 'delim', outlen, output)

where:

source_string

Alphanumeric

Is the source string from which to extract the token.

inlen

Integer

Is the number of characters in source_string. If this argument is less than or equal to 0, the function returns spaces.

token_number

Integer

Is the number of the token to extract. If this argument is positive, the tokens are counted from left to right. If this argument is negative, the tokens are counted from right to left. For example, -2 extracts the second token from the right. If this argument is 0, the function returns spaces. Leading and trailing null tokens are ignored.

'delim'

Alphanumeric

Is the delimiter that separates tokens in the source string, enclosed in single quotation marks. If you specify more than one character, only the first character is used.

Note: In Dialogue Manager, to prevent the conversion of a delimiter space character (' ') to a double precision zero, include a non-numeric character after the space (for example, '%'). GETTOK uses only the first character (the space) as a delimiter, while the extra character (%) prevents conversion to double precision.

outlen

Integer

Is the size of the token extracted. If this argument is less than or equal to 0, the function returns spaces. If the token is longer than this argument, it is truncated; if it is shorter, it is padded with trailing spaces.

output

Alphanumeric

Is the name of the field that contains the token, or the format of the output value enclosed in single quotation marks. The delimiter is not included in the token.
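
The argument rules above can be summarized in a small Python model. This is a sketch for reasoning about the documented behavior, not FOCUS code and not the actual implementation. The function name gettok and its return-value form are hypothetical, and several details are assumptions that the text does not state: only the first inlen characters are examined, empty tokens between two adjacent delimiters are counted, a token number beyond the available tokens returns spaces, and "returns spaces" means outlen spaces (an empty string when outlen is 0 or less).

```python
def gettok(source: str, inlen: int, token_number: int, delim: str, outlen: int) -> str:
    """Rough model of the documented GETTOK rules (assumptions noted inline)."""
    # Documented: non-positive inlen or outlen, or token_number 0, returns spaces.
    # Assumption: the result is outlen spaces (empty when outlen <= 0).
    if inlen <= 0 or outlen <= 0 or token_number == 0:
        return " " * max(outlen, 0)

    # Assumption: only the first inlen characters are examined.
    # Documented: leading and trailing blanks in the source are ignored.
    text = source[:inlen].strip(" ")

    # Documented: only the first character of the delimiter is used.
    tokens = text.split(delim[0])

    # Documented: leading and trailing null tokens are ignored.
    # Assumption: null tokens in the middle are still counted.
    while tokens and tokens[0] == "":
        tokens.pop(0)
    while tokens and tokens[-1] == "":
        tokens.pop()

    # Positive numbers count from the left, negative numbers from the right.
    index = token_number - 1 if token_number > 0 else token_number
    if not -len(tokens) <= index < len(tokens):
        return " " * outlen  # assumption: no such token returns spaces

    # Documented: truncate to outlen, or pad with trailing spaces.
    return tokens[index][:outlen].ljust(outlen)


print(repr(gettok("RUTHERFORD NJ 07073", 20, -1, " ", 10)))  # '07073     '
```

Under these assumptions, a token_number of -1 with a space delimiter selects the last word of an address line, padded to the requested output length.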




Example: Extracting a Token

GETTOK extracts the last token from ADDRESS_LN3 and stores the result in LAST_TOKEN.

The delimiter is a space:

TABLE FILE EMPLOYEE
PRINT ADDRESS_LN3 AND COMPUTE
LAST_TOKEN/A10 = GETTOK(ADDRESS_LN3, 20, -1, ' ', 10, LAST_TOKEN);
AS 'LAST TOKEN,(ZIP CODE)'
WHERE TYPE EQ 'HSM';
END

The output is:

                      LAST TOKEN
ADDRESS_LN3           (ZIP CODE)
-----------           ----------
RUTHERFORD NJ 07073   07073
NEW YORK NY 10039     10039
FREEPORT NY 11520     11520
NEW YORK NY 10001     10001
FREEPORT NY 11520     11520
ROSELAND NJ 07068     07068
JERSEY CITY NJ 07300  07300
FLUSHING NY 11354     11354

Information Builders