SUBSTV: Extracting a Variable Length Substring

How to:

The SUBSTV function extracts a substring from a string and is similar to SUBSTR. However, the end position for the string is calculated from the starting position and the substring length. Therefore, it has fewer parameters than SUBSTR. Also, the actual length of the output field, if it is an AnV field, is determined based on the substring length.

Syntax: How to Extract a Variable Length Substring

SUBSTV(upper_limit, source_string, start, sub_limit, output)

where:

upper_limit

Integer

Is the limit for the length of the source string.

source_string

Alphanumeric of type An or AnV

Is the character string that contains the substring you want to extract. It can be the string enclosed in single quotation marks ('), or the field containing the string. If it is a field, it can have An or AnV format. If it is a field of type AnV, its length is taken from the length bytes stored in the field. If upper_limit is smaller than the actual length, the source string is truncated to the upper limit. The final length value determined by this comparison is referred to as p_length (see the description of the output parameter for related information).

start

Integer

Is the starting position of the substring in the source string. The starting position can exceed the source string length, which results in spaces being returned.

sub_limit

Integer

Is the length, in characters, of the substring. Note that the ending position can exceed the input string length depending on the provided values for start and sub_limit.

output

Alphanumeric of type An or AnV

Is the field to which the result is returned, or the format of the output value enclosed in single quotation marks ('). This field can be in An or AnV format.

If the format of output is AnV, and assuming end is the ending position of the substring, the actual length, outlen, is computed as follows from the values for end, start, and p_length (see the source_string parameter for related information):

If end > p_length or end < start, then outlen = 0. Otherwise, outlen = end - start + 1.

Example: Extracting a Variable Length Substring

The following request extracts a trailing definite or indefinite article from a movie title (such as ", THE" in "SMURFS, THE"). First it trims the trailing blanks so that the article is the trailing pattern. Next it finds the starting position and length of the pattern. Then SUBSTV extracts the pattern and TRIMV trims the pattern from the title:

DEFINE FILE MOVIES
  TITLEV/A39V = TRIMV('T',TITLE, 39,' ', 1, TITLEV);
  PSTART/I4 = POSITV(TITLEV,LENV(TITLEV,'I4'), ',', 1,'I4');
  PLEN/I4 = IF PSTART NE 0 THEN LENV(TITLEV,'I4') - PSTART +1
                    ELSE 0;
  PATTERN/A20V= SUBSTV(39, TITLEV, PSTART, PLEN, PATTERN);
  NEWTIT/A39V = TRIMV('T',TITLEV,39,PATTERN,LENV(PATTERN,'I4'), NEWTIT);
END
TABLE FILE MOVIES
  PRINT TITLE
   PSTART AS 'Pattern,Start' IN 25
   PLEN AS 'Pattern,Length'
  NEWTIT AS 'Trimmed,Title' IN 55
BY CATEGORY  NOPRINT
WHERE PLEN NE 0
END

The output is:

                        Pattern Pattern  Trimmed
TITLE                     Start  Length  Title
-----                    ------ -------  -------
SMURFS, THE                   7       5  SMURFS
SHAGGY DOG, THE              11       5  SHAGGY DOG
MALTESE FALCON, THE          15       5  MALTESE FALCON
PHILADELPHIA STORY, THE      19       5  PHILADELPHIA STORY
TIN DRUM, THE                 9       5  TIN DRUM
FAMILY, THE                   7       5  FAMILY
CHORUS LINE, A               12       3  CHORUS LINE
MORNING AFTER, THE           14       5  MORNING AFTER
BIRDS, THE                    6       5  BIRDS
BOY AND HIS DOG, A           16       3  BOY AND HIS DOG