Encoding Functions


Encoding functions enable you to encode, decode, and convert strings. This section lists and describes the encoding functions that you can use in iWay Service Manager.


_mod10(): Mod10 Check Digit Operations

The _mod10() function generates or checks a modulus 10 check digit. This function uses the following format:

_mod10(action,value)

action

keyword

Specifies the action to be performed:

  • Append. Add the valid check digit to the value.
  • Check. Validate the check digit within the value (true or false).
  • Generate. Generate and return the check digit for the value (single integer).

value

string

A numeric value to be used for check digit activities.

A modulus 10 check (also known as the Luhn algorithm) is a simple checksum used to validate a variety of common identification numbers, such as credit card numbers, National Provider Identifiers in the US, and DUNS numbers. The algorithm is specified in ISO/IEC 7812-1. It is designed to protect against accidental errors, such as a mistyped digit or a simple transposition, rather than malicious attacks. The formula verifies a number against its included check digit.
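The three actions can be illustrated with a short Python sketch of the Luhn algorithm. This is not the iSM implementation. The function names luhn_generate, luhn_append, and luhn_check are hypothetical, and the sketch assumes that the value contains only decimal digits.

```python
def luhn_generate(payload: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk the digits right to left, doubling every second digit,
    # starting with the rightmost digit of the payload.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def luhn_append(payload: str) -> str:
    """Return the payload with its check digit added at the end."""
    return payload + str(luhn_generate(payload))


def luhn_check(value: str) -> bool:
    """Return True if the last digit of value is a valid check digit."""
    if len(value) < 2 or not (value.isascii() and value.isdigit()):
        return False
    return luhn_generate(value[:-1]) == int(value[-1])


print(luhn_generate("7992739871"))  # 3
print(luhn_append("7992739871"))    # 79927398713
print(luhn_check("79927398713"))    # True
print(luhn_check("79927398710"))    # False
```

The three helpers correspond loosely to the Generate, Append, and Check actions described above.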


_url(): Convert String to MIME Format

The _url() function converts the URL string to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, consult the HTML specification. The _url() function uses the following format:

_url(URLString [,encoding])

URLString

string

The string to convert.

encoding

string

IANA encoding for the string.

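When encoding the string, the following rules apply:

  • The alphanumeric characters "a" through "z", "A" through "Z", and "0" through "9" remain the same.
  • The special characters "-", "_", ".", and "*" remain the same.
  • The space character is converted to a plus sign (+).
  • All other characters are converted into one or more bytes using the specified encoding. Each byte is then represented by the three-character string %xy, where xy is the two-digit hexadecimal representation of the byte.

For example, using UTF-8 as the encoding scheme, the string "The string ü@foo-bar" is converted to "The+string+%C3%BC%40foo-bar" because in UTF-8 the character ü is encoded as two bytes, C3 (hex) and BC (hex), and the character @ is encoded as one byte, 40 (hex).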

Note: The World Wide Web Consortium Recommendation states that UTF-8 should be used. Not doing so may introduce incompatibilities. For this reason, UTF-8 is the default encoding regardless of the encoding under which the listener is running.

If the function determines that the passed string is a valid URL, it encodes only the portion following the '?'. This is called the <query> in the URL specification. Otherwise, it encodes the complete string.

For example, the URL http://localhost:1456?value=1 test=2 will encode to http://localhost:1456?value=1+test=2.
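As a rough illustration of this behavior, the following Python sketch uses urllib.parse from the standard library. It is not the iSM implementation. The helper names form_encode and url_encode_query_only are hypothetical. The URL test (a scheme and a host are present) is an assumption, and so is the choice to leave = and & unencoded in the query, which the source shows only for = in its example.

```python
from urllib.parse import quote_plus, urlsplit


def form_encode(text, encoding="utf-8", keep=""):
    """Encode text as application/x-www-form-urlencoded."""
    # quote_plus() leaves letters, digits, and "_.-~" unchanged and turns
    # spaces into "+". Keep "*" as well and escape "~", so that only
    # alphanumerics and "-", "_", ".", "*" pass through unchanged.
    encoded = quote_plus(text, safe="*" + keep, encoding=encoding)
    return encoded.replace("~", "%7E")


def url_encode_query_only(text, encoding="utf-8"):
    """Encode only the part after '?' when text looks like a URL."""
    parts = urlsplit(text)
    if parts.scheme and parts.netloc and "?" in text:
        head, _, query = text.partition("?")
        # Assumption: "=" and "&" keep their role as query separators.
        return head + "?" + form_encode(query, encoding, keep="=&")
    # Not recognized as a URL: encode the complete string.
    return form_encode(text, encoding)


print(form_encode("The string ü@foo-bar"))
# The+string+%C3%BC%40foo-bar
print(url_encode_query_only("http://localhost:1456?value=1 test=2"))
# http://localhost:1456?value=1+test=2
```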


_urlencode(): Convert String to MIME Encoding

The _urlencode() function converts the entire passed-in string to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, consult the HTML specification. The string is not checked for URL format. The _urlencode() function uses the following format:

_urlencode(String [,encoding])

String

string

The string to convert.

encoding

string

IANA encoding for the string.

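When encoding the string, the following rules apply:

  • The alphanumeric characters "a" through "z", "A" through "Z", and "0" through "9" remain the same.
  • The special characters "-", "_", ".", and "*" remain the same.
  • The space character is converted to a plus sign (+).
  • All other characters are converted into one or more bytes using the specified encoding. Each byte is then represented by the three-character string %xy, where xy is the two-digit hexadecimal representation of the byte.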

Note: The World Wide Web Consortium Recommendation states that UTF-8 should be used. Not doing so may introduce incompatibilities. For this reason, UTF-8 is the default encoding regardless of the encoding under which the listener is running.

Unlike the _url() function, the _urlencode() function makes no effort to validate the input string as a URL. Instead, it encodes the complete string. For example, http://localhost:1456?value=1 test=2 encodes to http%3A%2F%2Flocalhost%3A1456%3Fvalue%3D1+test%3D2.
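The whole-string behavior, and the effect of the optional encoding, can be reproduced with Python's urllib.parse.quote_plus. This is only an analogy to show the byte-level result, not the iSM implementation.

```python
from urllib.parse import quote_plus

url = "http://localhost:1456?value=1 test=2"

# Every reserved character is escaped, including ":", "/", "?", and "=".
print(quote_plus(url, encoding="utf-8"))
# http%3A%2F%2Flocalhost%3A1456%3Fvalue%3D1+test%3D2

# The encoding decides which bytes stand behind each escape sequence.
print(quote_plus("ü", encoding="utf-8"))       # %C3%BC
print(quote_plus("ü", encoding="iso-8859-1"))  # %FC
```

The last two lines show why the sender and receiver must agree on the encoding: the same character yields different escape sequences.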


_urldecode(): Decode a String in MIME Format

The _urldecode() function decodes a string from the application/x-www-form-urlencoded MIME format into standard format for use as a parameter, inclusion in an XML value, and so on. It uses the following format:

_urldecode(URLString [,encoding])

URLString

string

The string to convert.

encoding

string

IANA encoding for the string.

The conversion process is the reverse of that used by the _urlencode() function. It is assumed that all characters in the encoded string are one of the following: "a" through "z", "A" through "Z", "0" through "9", "-", "_", ".", and "*". The character "+" is converted back to a space. The character "%" is allowed but is interpreted as the start of a special escaped sequence.

If the encoding is not specified, UTF-8 is assumed, in accordance with the recommendation described for the _urlencode() function.
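The reverse conversion can be sketched with Python's urllib.parse.unquote_plus, which also assumes UTF-8 by default. This is an analogy only, not the iSM implementation.

```python
from urllib.parse import unquote_plus

# "+" becomes a space and each %xy sequence becomes one byte,
# which is then interpreted under the chosen encoding.
print(unquote_plus("The+string+%C3%BC%40foo-bar"))  # The string ü@foo-bar

# The same escape sequence decodes differently under another encoding.
print(unquote_plus("%FC", encoding="iso-8859-1"))   # ü

# Decoding with the wrong encoding does not recover the character:
# a lone FC byte is not valid UTF-8, so Python substitutes U+FFFD.
print(unquote_plus("%FC") == "\ufffd")              # True
```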


_base64(): Encode Into Base64

The _base64() function encodes a value into base64. It uses the following format:

_base64(value [,encoding])

value

string

The value to encode.

encoding

string

The encoding to be used in creating the base64.

The input may be represented in an encoding other than the server encoding. In that case, use the encoding parameter to set the encoding for the conversion.

For example, if you want to transfer the current message (document payload) to a third-party in base64 form, configure the function as follows:

_base64(_flatof(),_docinfo('encoding'))
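To show why the encoding matters, the following Python sketch converts a string to bytes under a given encoding before producing base64. The helper names to_base64 and from_base64 are hypothetical, and the UTF-8 default is an assumption made for the sketch; the source does not state the default used by iSM.

```python
import base64


def to_base64(value, encoding="utf-8"):
    """Convert the string to bytes, then to base64 text."""
    return base64.b64encode(value.encode(encoding)).decode("ascii")


def from_base64(value, encoding="utf-8"):
    """Convert base64 text to bytes, then to a string."""
    return base64.b64decode(value).decode(encoding)


print(to_base64("abc"))              # YWJj
print(to_base64("ü", "utf-8"))       # w7w=
print(to_base64("ü", "iso-8859-1"))  # /A==
print(from_base64("YWJj"))           # abc
```

The same character produces different base64 text under different encodings, so the receiver needs to decode with the encoding that was used to encode.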

_frombase64(): Decode From Base64

The _frombase64() function decodes a value from base64. It uses the following format:

_frombase64(value [,encoding])

value

string

The string to convert.

encoding

string

The encoding to be used when converting from base64.

The passed value is converted from base64 representation to standard notation.


_encode64(): Conditionally Encode Into Base64

The _encode64() function encodes a value into base64 only when the value requires it. It uses the following format:

_encode64(value [,encoding])

value

string

The string to convert.

encoding

string

The encoding to be used in creating the base64.

If the value requires base64 encoding, it is converted to base64. Otherwise, it is returned with no conversion. Values that need base64 conversion include those containing characters with values lower than 0x20.

If conversion is required, the converted value is enclosed in base64() functional notation.


_decode64(): Conditionally Decode From Base64

The _decode64() function uses the following format:

_decode64(value [,encoding])

value

string

The string to convert.

encoding

string

The encoding to be used when converting from base64.

If the input value is enclosed in base64() functional notation, it is decoded. Otherwise, it is not changed.

Example 1:

_decode64('base64(YWJj)')

In this example, the string is decoded as 'abc'.

Example 2:

_decode64('abcd')

In this example, the string is returned unchanged because it is not enclosed in base64() functional notation.
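The conditional behavior of _encode64() and _decode64() can be sketched in Python as follows. The helper names encode64 and decode64 are hypothetical, and the only trigger modeled for encoding is the one the text names (characters below 0x20); iSM may apply further conditions. The UTF-8 default is also an assumption.

```python
import base64
import re


def encode64(value, encoding="utf-8"):
    """Wrap the value in base64(...) only if it needs encoding."""
    # Assumption: control characters below 0x20 are the only trigger.
    if any(ord(char) < 0x20 for char in value):
        packed = base64.b64encode(value.encode(encoding)).decode("ascii")
        return "base64(" + packed + ")"
    return value


def decode64(value, encoding="utf-8"):
    """Decode the value only if it is framed as base64(...)."""
    match = re.fullmatch(r"base64\((.*)\)", value, flags=re.DOTALL)
    if match:
        return base64.b64decode(match.group(1)).decode(encoding)
    return value


print(decode64("base64(YWJj)"))  # abc
print(decode64("abcd"))          # abcd
print(encode64("abc"))           # abc
```

Because the framing marks which values were encoded, decode64(encode64(x)) returns x for ordinary values whether or not encoding was needed. The exception is a plain value that already has the base64(...) form, which would be decoded on the way back.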


_fmtdec(): Insert a Decimal Value Into a Pattern Mask

The _fmtdec() function is useful when a value must be in a specific format. It uses the following format:

_fmtdec(pattern,intval)

pattern

string

Define the string to be created.

intval

decimal

Value to be inserted.

The value is inserted into the pattern mask to form a complete result. The mask consists of alphabetic and numeric characters and special symbols as defined for Java formatting. When the value is inserted, the appropriate pattern characters are replaced with the value. For example, _fmtdec('ab##.#x',17.3) yields ab17.3x.


_fmtint(): Insert an Integer Into a Pattern Mask

The _fmtint() function is useful when a value for a control number is read from the trading partner manager or another source. It uses the following format:

_fmtint(pattern,intval)

pattern

string

Define the string to be created.

intval

integer

Value to be inserted.

The integer is inserted into the pattern mask to form a complete result. The mask consists of alphabetic and numeric characters and special symbols. It should also contain one sequence of # characters. When the integer is inserted, the # characters are replaced with the integer, padded on the left with zeros to fill the sequence. For example, _fmtint('ab###x',17) yields ab017x.
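The substitution can be sketched in Python as a zero-padded replacement of the first run of # characters. The helper name fmt_int is hypothetical. The source does not say what happens when the integer has more digits than the mask has # characters; this sketch simply lets the result grow, which is an assumption.

```python
import re


def fmt_int(pattern: str, value: int) -> str:
    """Replace the first run of '#' with the zero-padded integer."""
    def fill(match):
        # Pad to the width of the '#' run; longer values are not truncated.
        return str(value).zfill(len(match.group()))

    return re.sub(r"#+", fill, pattern, count=1)


print(fmt_int("ab###x", 17))  # ab017x
```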


_urlparse(): Extract Portions of a URL/URI

The _urlparse() function uses the following format:

_urlparse(URLString, component [,query_kw [,default]])

URLString

string

The string to parse.

component

string

The name of the desired component.

query_kw

string

A keyword to be located in the query portion of the URL.

default

string

Value returned if the query keyword is not found.

The Uniform Resource Locator/Identifier (URL/URI) is parsed to extract useful pieces. The components are as described in RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax (http://www.ietf.org/rfc/rfc2396.txt).

The component parameter, which is required, must be one of the component names identified in the RFC, such as query.

When parsing for the query component, two additional parameters are supported. The first is a keyword to be located, and the second is a default. The keyword is a URL keyword contained in the query. If the keyword is not found, the default is returned. If the default is not present, an empty string is returned. For example:

_urlparse('http://www.url.com/look?q=iway','query','q','hello')

yields iway.

In addition, two more keywords, file and filename, are provided for simplicity of use. Note that file and filename are not synonyms: file returns the URL path plus any query string.

To extract specific portions of the returned information, the _token() function can be used.
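The query lookup and the file keyword can be approximated in Python with urllib.parse. The helper name url_component is hypothetical. Apart from query and file, the component names accepted here (scheme, authority, path, fragment) are taken from the RFC and are not confirmed as the names that iSM accepts.

```python
from urllib.parse import parse_qs, urlsplit


def url_component(url, component, query_kw=None, default=""):
    """Return one component of a URL, or one keyword from its query."""
    parts = urlsplit(url)
    if component == "query":
        if query_kw is None:
            return parts.query
        values = parse_qs(parts.query).get(query_kw)
        # Fall back to the default (or an empty string) if not found.
        return values[0] if values else default
    if component == "file":
        # As described in the text: the path plus any query string.
        return parts.path + ("?" + parts.query if parts.query else "")
    lookup = {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "fragment": parts.fragment,
    }
    return lookup[component]


url = "http://www.url.com/look?q=iway"
print(url_component(url, "query", "q", "hello"))  # iway
print(url_component(url, "query", "x", "hello"))  # hello
print(url_component(url, "file"))                 # /look?q=iway
```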


_deflate(): Compress (Deflate) a Value

The _deflate() function uses the following format:

_deflate(value [,encoding] [,output type] [,algorithm modifier])

value

string

The string value to be compressed.

encoding

string

The character set of the input string. The default is ISO-8859-1.

output type

keyword

The format of the resulting output string. The default is leadhex. For more information on the supported output types, see the table below.

algorithm modifier

keyword

The algorithm to be used. The default is standard. For more information on the supported algorithm modifier types, see the table below.

A string value such as a flattened XML tree is compressed using standard ZIP algorithms. The compression result is expressed as a Unicode string in a designated format. This string is appropriate for database updates into a varbinary column, for transmission, or for other storage.

The compression operation first converts the string to a byte representation based on the provided encoding. It then applies compression algorithms, and once compressed, the result is converted back to a string under encoding ISO-8859-1 in a requested format. The default format is leadhex (for example, 0x010203…) appropriate for direct insertion into most databases.

The supported output types are listed and described in the following table. The default is leadhex.

Output Type

Description

rawhex

The deflated bytes are represented as hexadecimal digits, two per byte.

leadhex

The deflated values are represented as hexadecimal digits, two per byte. The result is prepended with the two characters 0x creating a value appropriate for most database inserts. For example, using the SQL service object (com.ibi.agents.XDSQLAgent) in iIT to generate an insert for a table with two columns, an integer and a varbinary:

SQL INSERT INTO MYTABLE (INTCOL, VBCOL) VALUES(%INTX, %VB)

might result in the following:

SQL INSERT INTO MYTABLE (INTCOL, VBCOL) VALUES(1, 0X1234556788)

For more information on setting insert values using the SQL service (com.ibi.agents.XDSQLAgent), see the iWay Service Manager Component Reference Guide.

base64

The deflated bytes are represented in base64, with no iSM prefix (for example, 076572dfhe=). Typically, base64 representation results in a shorter string than with hex representation.

func64

The deflated bytes are represented in base64, enclosed in the iSM base64() marker. For example:

base64(076572dfhe=)

The supported algorithm modifier types are listed and described in the following table. The default is standard.

Algorithm Modifier

Description

standard

The default compression level. This is usually a good match for balancing performance with the size of the compressed result.

fastest

The compression uses fewer resources, but possibly at the expense of a larger compressed result.

smallest

The compression results in a smaller result, but may require additional time to complete the operation.

huffman

An entropy encoding algorithm that is well suited to English-language text.

none

No compression is performed. This is useful for diagnostic and testing purposes only.

The following is an example of the _deflate() function:

_deflate(_flatof(),,'base64','smallest')
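The sequence of steps (string to bytes, compression, then conversion to a text format) can be sketched with Python's zlib module. This is not the iSM implementation. The function name deflate is hypothetical, the use of the zlib stream format is an assumption, and so is the mapping of the algorithm modifiers onto zlib levels and strategies.

```python
import base64
import zlib

# Assumed mapping of the documented modifiers onto zlib compression levels.
LEVELS = {
    "standard": zlib.Z_DEFAULT_COMPRESSION,
    "fastest": zlib.Z_BEST_SPEED,
    "smallest": zlib.Z_BEST_COMPRESSION,
    "none": 0,
}


def deflate(value, encoding="iso-8859-1", output_type="leadhex",
            modifier="standard"):
    """Compress a string and express the result as text."""
    raw = value.encode(encoding)  # step 1: string to bytes
    if modifier == "huffman":
        compressor = zlib.compressobj(strategy=zlib.Z_HUFFMAN_ONLY)
    else:
        compressor = zlib.compressobj(level=LEVELS[modifier])
    packed = compressor.compress(raw) + compressor.flush()  # step 2

    # Step 3: express the compressed bytes in the requested format.
    if output_type == "rawhex":
        return packed.hex()
    if output_type == "leadhex":
        return "0x" + packed.hex()
    encoded = base64.b64encode(packed).decode("ascii")
    if output_type == "base64":
        return encoded
    if output_type == "func64":
        return "base64(" + encoded + ")"
    raise ValueError("unknown output type: " + output_type)


document = "<doc>" + "hello " * 50 + "</doc>"
print(deflate(document)[:2])  # 0x
print(len(deflate(document, modifier="smallest")) < len(document))  # True
```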

_inflate(): Inflate a Value

The _inflate() function uses the following format:

_inflate(value [,type])

value

string

The deflated value expressed as a string.

type

keyword

The representation type of the string. The following types are supported:

  • string. Analyze the value looking for type markers (default).
  • base64. The string is encoded in base64, either with or without the base64() markers.
  • leadhex. The value is hex characters, starting with 0x (for example, 0x010a45).
  • rawhex. The value is hex characters without the 0x marker.

The input is assumed to be a string version produced from a deflated message. How the string is created depends on the input document, but it can be expected to be either a base64 value or a string simply made from a byte array. If the input is in base64 format, do not use the _frombase64() function to preconvert the input to a string.

The standard representation of a database varbinary column as read in iSM (SQL listeners, XDSQLAgent, and so on) is marked base64. For example:

base64(71889875rdj02=)

It is therefore in a format that can be recognized without a type operand.

The standard ZIP inflate algorithms are attempted, and if successful the result returned is the inflated string.
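The type detection and decompression can be sketched with Python's zlib module. This is not the iSM implementation. The function name inflate is hypothetical, the zlib stream format is an assumption, and so is the fallback for unmarked input, which treats the string as bytes under ISO-8859-1.

```python
import base64
import zlib

PREFIX = "base64("


def inflate(value, kind="string", encoding="iso-8859-1"):
    """Decompress a deflated value given in one of the text formats."""
    framed = value.startswith(PREFIX) and value.endswith(")")
    if kind == "string":
        # Look for type markers; with none, treat the value as raw bytes.
        if framed:
            kind = "base64"
        elif value[:2] in ("0x", "0X"):
            kind = "leadhex"
        else:
            kind = "bytes"

    if kind == "base64":
        text = value[len(PREFIX):-1] if framed else value
        packed = base64.b64decode(text)
    elif kind == "leadhex":
        packed = bytes.fromhex(value[2:])
    elif kind == "rawhex":
        packed = bytes.fromhex(value)
    elif kind == "bytes":
        packed = value.encode("iso-8859-1")
    else:
        raise ValueError("unknown type: " + kind)
    return zlib.decompress(packed).decode(encoding)


packed = zlib.compress(b"<doc>hello</doc>")
framed_value = PREFIX + base64.b64encode(packed).decode("ascii") + ")"
print(inflate(framed_value))              # <doc>hello</doc>
print(inflate("0x" + packed.hex()))       # <doc>hello</doc>
print(inflate(packed.hex(), "rawhex"))    # <doc>hello</doc>
```

In this sketch, framed base64 and leadhex values are recognized without a type operand, while unmarked hexadecimal needs the rawhex type because it carries no marker.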


Working with BLOBs and Varbinary

Although some databases automatically compress and decompress character data (text or clob columns), others do not. For applications that expect to store large amounts of textual (string) data, the iFL functions _deflate() and _inflate() are available. For example, to store the current document into a nullable varbinary column, the name/value tokens for the insert statement might be:

thetree

_deflate()

The input from a BLOB or varbinary field is returned from iSM readers as framed base64. This can be passed into the _inflate() iFL function, which automatically recognizes the framing and decompresses the information back to the original data string. For more information, see _deflate(): Compress (Deflate) a Value and _inflate(): Inflate a Value.


iWay Software