Element Data Format Parameters

Data format parameters control how data is converted between the DQC internal representation and an external data format. Such a conversion is typically needed when DQC loads data from an external file or stores data to an external file. The parameters describe how values are parsed when a file is read and how they are formatted when a file is written.

Processing steps that support DataFormatParameters (DFP) can define them at the top (global) level of the step as well as at the local level of each column. If a column defines no local DFP, the global DFP are used. If no global DFP are defined either, the default values apply. A column that defines its own DFP must specify every attribute needed for successful parsing, and each attribute must be assigned a valid value. The only exception is thousandsSeparator: unlike the other DFP attributes, which must always have a value, it may remain empty, meaning that no thousands separator is used.
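The fallback order described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the DataFormatParameters class, its fields, and the resolve method are hypothetical names invented for this example and are not part of the DQC API, and only three of the attributes are modelled.

```java
import java.util.Objects;

// Hypothetical holder for one complete DFP set. The class and its fields are
// illustrative assumptions, not part of the DQC API; only three attributes
// are modelled to keep the sketch short.
final class DataFormatParameters {
    // Default values as listed in the parameter table.
    static final DataFormatParameters DEFAULTS =
            new DataFormatParameters("yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd", ",");

    final String dateTimeFormat;
    final String dayFormat;
    final String thousandsSeparator;

    DataFormatParameters(String dateTimeFormat, String dayFormat, String thousandsSeparator) {
        this.dateTimeFormat = requireValue(dateTimeFormat, "dateTimeFormat");
        this.dayFormat = requireValue(dayFormat, "dayFormat");
        // thousandsSeparator is the one attribute that may stay empty.
        this.thousandsSeparator = Objects.requireNonNull(thousandsSeparator, "thousandsSeparator");
    }

    private static String requireValue(String value, String name) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(name + " must have a value");
        }
        return value;
    }

    // Column-level DFP win; otherwise step-level DFP; otherwise the defaults.
    static DataFormatParameters resolve(DataFormatParameters column, DataFormatParameters step) {
        if (column != null) {
            return column;
        }
        if (step != null) {
            return step;
        }
        return DEFAULTS;
    }
}
```

Note that the sketch falls back on the whole parameter set rather than attribute by attribute, which matches the rule that a column-level DFP must be complete on its own.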

| Name | Type | Required | Description |
|---|---|---|---|
| Date Format Locale | String | No | Defines the locale for parsing non-numerical data, for example the short forms of month names in dates such as Sep 18, 1999. The value is the same as the value of the corresponding locale in Java (for details, see the Java locales documentation). Default value: en_US (English - United States) |
| Date Time Format | String | No | Defines the format used for processing 'datetime' data. The template is based on SimpleDateFormat, which uses the Java convention. Default value: yyyy-MM-dd HH:mm:ss |
| Day Format | String | No | Defines the format used for processing 'day' data. The template is based on SimpleDateFormat, which uses the Java convention. Default value: yyyy-MM-dd |
| Decimal Separator | String | No | Defines the character used as the decimal separator. A non-escaped character is expected. This attribute is reserved for future use, as there is currently no data type supporting it. Default value: . |
| False Value | String | No | String value that represents a logical 'false' value in the given data. The comparison is not case sensitive. Default value: false |
| Thousands Separator | String | No | Defines the string that represents the thousands separator used in numbers. A non-escaped character is expected. Numbers to be read do not need to contain this separator, but when the separator is present it is processed (stripped) accordingly. Default value: , |
| True Value | String | No | String value that represents a logical 'true' value in the given data. The comparison is not case sensitive. Default value: true |
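To make two of the parameters more concrete, the following Java sketch shows how a locale affects the parsing of month names and how a thousands separator might be stripped before a number is read. The class and method names are assumptions made for this example, and the stripping logic is a guess at the described behavior, not DQC's actual implementation.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative helpers only; none of these names come from DQC.
class ParameterTableSketch {
    // Parses text with locale-dependent month names and re-emits it
    // in the default Day Format (yyyy-MM-dd).
    static String toDefaultDay(String text, String pattern, Locale locale) {
        try {
            Date parsed = new SimpleDateFormat(pattern, locale).parse(text);
            return new SimpleDateFormat("yyyy-MM-dd", Locale.US).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "' as " + pattern, e);
        }
    }

    // Removes the thousands separator, if one is configured, before parsing.
    // An empty separator means that no thousands separator is in use.
    static long parseWholeNumber(String text, String thousandsSeparator) {
        String digits = thousandsSeparator.isEmpty()
                ? text
                : text.replace(thousandsSeparator, "");
        return Long.parseLong(digits);
    }
}
```

With these helpers, `toDefaultDay("Sep 18, 1999", "MMM d, yyyy", Locale.US)` yields `1999-09-18`, and `parseWholeNumber` accepts both `1,234,567` and `1234567` when the separator is a comma.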



Detailed Description of Element Data Format Parameters

Information about formatting dates

The datetime and day types each have a defined output format. The format is given by a text string that follows the conventions of the Java SimpleDateFormat class.

The meanings of the most used characters in the formatting string are as follows:

y ... year, M ... month in year, d ... day in month.

Changing the number of repetitions of a formatting character changes how the corresponding field is read or written.

Note: the formatting string is case sensitive, so a character has a different meaning in upper case than in lower case.

Reading a year:

If the formatting string contains more than two year characters (for example, "yyyy"), the input number is interpreted as is, without any century adjustment.

If the formatting string contains a shortened version ("y" or "yy"), the input number is interpreted relative to the current date, so that the resulting date falls in the range from 80 years before to 20 years after the current date. Note that this applies only when the number of digits in the year strictly matches the two-digit form. Otherwise, the year is read as specified in the input (for format '..yy' and input '...765' the resulting year is 765).
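The year-reading rules can be tried out directly with Java's SimpleDateFormat, on which the format strings are based. The sketch below is a standalone illustration; the class and method names are made up for this example, and it is assumed here that DQC behaves the same way as the underlying Java class.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Illustrative helper only; not part of DQC.
class YearParsingSketch {
    // Parses the input with the given pattern and returns the resulting year.
    static int parseYear(String pattern, String input) {
        try {
            Calendar calendar = new GregorianCalendar();
            calendar.setTime(new SimpleDateFormat(pattern, Locale.US).parse(input));
            return calendar.get(Calendar.YEAR);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + input + "' as " + pattern, e);
        }
    }

    public static void main(String[] args) {
        // Long form: the number is taken as is, even when it has two digits.
        System.out.println(parseYear("dd.MM.yyyy", "01.01.1970")); // 1970
        System.out.println(parseYear("dd.MM.yyyy", "01.01.70"));   // 70
        // Short form with exactly two digits: the century is chosen so that
        // the date falls between 80 years before and 20 years after today.
        System.out.println(parseYear("dd.MM.yy", "01.01.70"));
        // Short form with three digits: the year is read literally.
        System.out.println(parseYear("dd.MM.yy", "01.01.765"));    // 765
    }
}
```

The third call has no fixed result because it depends on the date on which the code runs.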

Year output format:

If the formatting string uses the "yy" form, the year is reduced to its last two digits; otherwise it is written as the full number, padded with leading zeroes if the template is longer than the number.

Month output format:

If the number of formatting characters for the month is 3 or more, the month is written as text (for example, Jan or January); otherwise it is written as a number (so January is 1 with "M" and 01 with "MM").
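The effect of the character count on year and month output can be seen with a short Java sketch built on SimpleDateFormat. The class and method names are hypothetical, and the en_US locale is assumed for the month names.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Illustrative helper only; not part of DQC.
class DateOutputSketch {
    // Formats the given calendar date (month is 1-based) with the pattern.
    static String format(String pattern, int year, int month, int day) {
        Date date = new GregorianCalendar(year, month - 1, day).getTime();
        return new SimpleDateFormat(pattern, Locale.US).format(date);
    }

    public static void main(String[] args) {
        System.out.println(format("yy", 1970, 1, 1));   // 70
        System.out.println(format("yyyy", 1970, 1, 1)); // 1970
        System.out.println(format("M", 1970, 1, 1));    // 1
        System.out.println(format("MM", 1970, 1, 1));   // 01
        System.out.println(format("MMM", 1970, 1, 1));  // Jan
        System.out.println(format("MMMM", 1970, 1, 1)); // January
    }
}
```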

Numerical output format:

The number of formatting characters in a numeric field sets the minimum number of digits written, so the template should contain at least as many characters as there are digits in the biggest number to be displayed. Shorter numbers are filled with zeroes from the left.

A number is represented by a template containing a sequence of formatting characters, where each formatting character represents one digit of the number. When reading, it is usually not necessary to specify the number of digits in the template exactly: a number with a different number of digits is still parsed correctly. The only exception is when two numbers to be parsed are adjacent, with no separator between them. To determine which character belongs to which number, the parsing templates are then applied in their exact form, and therefore the number of characters in the template matters.
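The difference between separated and adjacent numeric fields can be illustrated with SimpleDateFormat as follows. This is a sketch with made-up class and method names; it parses the input with one pattern and writes it back as yyyy-MM-dd, which also shows the zero padding on output.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative helper only; not part of DQC.
class AdjacentFieldsSketch {
    // Parses the input with the given pattern and rewrites it as yyyy-MM-dd.
    static String normalize(String pattern, String input) {
        try {
            Date parsed = new SimpleDateFormat(pattern, Locale.US).parse(input);
            return new SimpleDateFormat("yyyy-MM-dd", Locale.US).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + input + "' as " + pattern, e);
        }
    }

    public static void main(String[] args) {
        // Separated fields: the digit count in the input may differ from the template.
        System.out.println(normalize("yyyy-MM-dd", "1970-1-1"));  // 1970-01-01
        System.out.println(normalize("yyyy-M-d", "1970-01-01"));  // 1970-01-01
        // Adjacent fields: the template widths decide where each number ends.
        System.out.println(normalize("yyyyMMdd", "19700101"));    // 1970-01-01
    }
}
```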

Processing date from text:

If at least 4 formatting characters are specified, the full text form is written (for example, January); otherwise a short or abbreviated form is written (for example, Jan). When reading, the number of formatting characters does not matter and all forms are accepted.

Example of input: "yyyy-MM-dd" - accepts input "1970-01-01"

Example of output: "d.M.yyyy" - output is 1.1.1970
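The two examples above, together with the rule that both text forms are accepted on input, can be reproduced with a small Java sketch. The class and method names are assumptions for illustration, and the en_US locale is assumed for the month names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative helper only; not part of DQC.
class DateConversionSketch {
    // Reads the input with one pattern and writes it with another.
    static String convert(String inputPattern, String input, String outputPattern) {
        try {
            Date parsed = new SimpleDateFormat(inputPattern, Locale.US).parse(input);
            return new SimpleDateFormat(outputPattern, Locale.US).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + input + "' as " + inputPattern, e);
        }
    }

    public static void main(String[] args) {
        // Numeric input and output, as in the examples above.
        System.out.println(convert("yyyy-MM-dd", "1970-01-01", "d.M.yyyy"));         // 1.1.1970
        // Both the short and the full month name are accepted when reading.
        System.out.println(convert("MMM d, yyyy", "January 5, 1970", "d.M.yyyy"));   // 5.1.1970
        System.out.println(convert("MMMM d, yyyy", "Jan 5, 1970", "d.M.yyyy"));      // 5.1.1970
        // On output, the character count selects the short or the full form.
        System.out.println(convert("yyyy-MM-dd", "1970-01-05", "MMMM d, yyyy"));     // January 5, 1970
    }
}
```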

Information about parsing boolean values

For boolean types, the input/output format can be defined in the form "true-text | false-text", where true-text and false-text are the strings that represent true and false values, respectively. By default these values are set to "true" and "false".
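A minimal Java sketch of such a boolean conversion is shown below. The class and method names are hypothetical, and rejecting unrecognized input with an exception is an assumption of this example; the comparison is case-insensitive, as stated for the True Value and False Value parameters.

```java
// Illustrative helper only; not part of DQC.
class BooleanFormatSketch {
    // Reads a boolean, comparing against the configured texts without regard to case.
    static boolean parse(String text, String trueValue, String falseValue) {
        if (trueValue.equalsIgnoreCase(text)) {
            return true;
        }
        if (falseValue.equalsIgnoreCase(text)) {
            return false;
        }
        // Assumption: input matching neither text is treated as an error.
        throw new IllegalArgumentException("Not a boolean value: '" + text + "'");
    }

    // Writes a boolean using the configured texts.
    static String format(boolean value, String trueValue, String falseValue) {
        return value ? trueValue : falseValue;
    }

    public static void main(String[] args) {
        System.out.println(parse("TRUE", "true", "false")); // true
        System.out.println(parse("n", "Y", "N"));           // false
        System.out.println(format(true, "Y", "N"));         // Y
    }
}
```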


iWay Software