Detailed Description of Fixed Width File Reader

This step expects a text file as input, with data presented line by line. Each line may contain several fields (columns), each identified by its start position and length. Values are not separated by any delimiter; the start and end positions of each value on the line are derived from the current position (which depends on the fields read so far) and the column's attributes (skip and length).

Like the Text File Reader, this step supports all encodings supported by Java, including the Unicode formats UTF-8, UTF-16, UTF-16BE, and UTF-16LE. Input data is processed the same way (through a Unicode-aware reader), so all files are processed correctly, including those with a Byte Order Mark (BOM) signature. For more details about supported encodings, see the Text File Reader step.
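To illustrate what BOM-aware reading involves, here is a minimal Python sketch. It is not the step's implementation; the function name decode_with_bom and the rule that a detected BOM overrides the configured encoding are assumptions made for illustration only.

```python
import codecs

# BOM signatures for the Unicode encodings named above.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_with_bom(raw: bytes, default_encoding: str) -> str:
    """Decode raw bytes; a leading BOM (if any) selects the encoding.

    Assumption: a detected BOM takes precedence over the configured
    encoding and is removed from the decoded text.
    """
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(encoding)
    return raw.decode(default_encoding)


# A UTF-8 file with a BOM decodes cleanly even if another encoding is configured.
sample = codecs.BOM_UTF8 + "11.12.2001".encode("utf-8")
print(decode_with_bom(sample, "windows-1250"))
```

Without such handling, the BOM bytes would be read as part of the first field and shift or corrupt its value.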

The line structure is described by individual column definitions. Each column must define:

The values specified here are relative to the end of the previous field. The step computes absolute positions from the sequence of fields; therefore, the fields must be defined in the same order as the columns in the data record.
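The conversion from relative to absolute positions can be sketched in Python as follows. This is an illustration only, not the step's code; the Column type and the absolute_spans function are hypothetical names, and the layout values mirror the widths and skips used in the example configuration on this page.

```python
from typing import List, NamedTuple, Tuple


class Column(NamedTuple):
    name: str
    skip: int   # characters to skip after the end of the previous field
    width: int  # number of characters occupied by the field


def absolute_spans(columns: List[Column]) -> List[Tuple[str, int, int]]:
    """Turn relative (skip, width) pairs into absolute [start, end) offsets."""
    spans = []
    position = 0  # end of the previous field; 0 at the start of the line
    for col in columns:
        start = position + col.skip
        end = start + col.width
        spans.append((col.name, start, end))
        position = end
    return spans


# Three fields with widths 10, 25, 20 and skips 0, 3, 3.
layout = [Column("cDay", 0, 10), Column("cTime", 3, 25), Column("cText", 3, 20)]
spans = absolute_spans(layout)
for name, start, end in spans:
    print(name, start, end)
```

Because each start offset depends on the end of the field before it, reordering the column definitions changes every subsequent offset, which is why the definition order has to match the record layout.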

Input data is processed using the parameters specified in the dataFormatParameters element. For a detailed description, refer to DataFormatParameters.

This step may produce the following errors: SHORT_LINE, INVALID_DATE, UNPARSABLE_FIELD, LONG_LINE, EXTRA_DATA, PROCESSING_ERROR.

If a SHORT_LINE error occurs, the value passed on for further processing depends on the action of the SHORT_LINE instruction. If the action is NULL_VALUE, null is passed to further parsing. If the action is READ_POSSIBLE, the value read from the field is returned: null if the line contained no data for the field, or a non-null value if some data was present but did not fill the field.
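The difference between the two actions can be sketched as follows. This is a simplified Python illustration under the assumption that a field is addressed by an absolute start offset and a width; read_field is a hypothetical helper, not part of the product.

```python
from typing import Optional


def read_field(line: str, start: int, width: int, action: str) -> Optional[str]:
    """Return the raw field text, applying a SHORT_LINE action if the line ends early."""
    end = start + width
    if len(line) >= end:
        return line[start:end]  # the whole field is present, no SHORT_LINE
    # The line ends before the field does: the SHORT_LINE case.
    if action == "NULL_VALUE":
        return None
    if action == "READ_POSSIBLE":
        partial = line[start:end]  # slicing past the end yields what exists
        return partial if partial else None
    raise ValueError(f"unsupported action: {action}")


print(read_field("abcdef", 4, 4, "NULL_VALUE"))     # field is cut short
print(read_field("abcdef", 4, 4, "READ_POSSIBLE"))  # partial data is kept
print(read_field("abcdef", 8, 4, "READ_POSSIBLE"))  # no data for the field
```

In this sketch, READ_POSSIBLE preserves whatever characters are available, while NULL_VALUE discards a truncated field entirely.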

Error handling is defined by the errorHandlingStrategy element. The error handling strategy allows the step to exclude incorrect records from processing; such records can be sent to the "rejected" output file. For more information about error handling, see the data parsing strategies.

When creating a reject file, the following rules are observed:

The following example assumes that the input line consists of DAY, DATETIME, and STRING fields, configured as in the example below.

Consider the following input data:

11.12.2001xxx12-10-2000 12:51+GMT0----|||***This is a test***

The processing output is shown below, assuming that DATETIME uses the yyyy-MM-dd HH:mm:ss format, DAY uses the yyyy-MM-dd format, and the delimiter is ';':

2001-12-11;2000-10-12 12:51:00;This is a test
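The conversion of the sample line can be reproduced with a short Python sketch. It is an approximation rather than the step's actual logic and rests on two assumptions: fill characters are stripped from both ends of each raw value, and date parsing consumes only the leading part of the value that matches the pattern, so the trailing "+GMT0" is ignored. The offsets follow from the widths (10, 25, 20) and skips (0, 3, 3) in the example configuration.

```python
from datetime import datetime

line = "11.12.2001xxx12-10-2000 12:51+GMT0----|||***This is a test***"

# Absolute [start, end) offsets derived from widths 10/25/20 and skips 0/3/3.
raw_day = line[0:10]
raw_time = line[13:38]
raw_text = line[41:61]

# Assumption: fill characters (" ", "-", "*") are stripped from both ends.
day_text = raw_day.strip(" ")
time_text = raw_time.strip("-")
text = raw_text.strip("*")

# Input patterns dd.MM.yyyy and dd-MM-yyyy HH:mm written as strptime formats.
day = datetime.strptime(day_text, "%d.%m.%Y")
# Assumption: only the leading 16 characters (the length of a value in the
# "dd-MM-yyyy HH:mm" pattern) are parsed; the rest of the field is ignored.
moment = datetime.strptime(time_text[:16], "%d-%m-%Y %H:%M")

# Output formatting: yyyy-MM-dd, yyyy-MM-dd HH:mm:ss, and ';' as the delimiter.
result = ";".join([
    day.strftime("%Y-%m-%d"),
    moment.strftime("%Y-%m-%d %H:%M:%S"),
    text,
])
print(result)
```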



Example
<step className="cz.adastra.cif.tasks.io.text.read.FixedWidthFileReader" id="fwReader">
     <properties>
        <numberOfLinesInHeader>1</numberOfLinesInHeader>
        <numberOfLinesInFooter>0</numberOfLinesInFooter>
        <encoding>windows-1250</encoding>
        <fileName>someOutputFileName.csv</fileName>
        <lineSeparator>\r\n</lineSeparator>
        <lineMaxReadLength>65536</lineMaxReadLength>
        <columns>
            <fixedWidthColumn width="10" skip="0" type="DAY" name="cDay"
                              ignore="false" fillChar=" ">
                <dataFormatParameters dateFormatLocale="en_US"
                                      dayFormat="dd.MM.yyyy"/>
            </fixedWidthColumn>
            <fixedWidthColumn width="25" skip="3" type="DATETIME" name="cTime"
                              ignore="false" fillChar="-">
                <dataFormatParameters dateTimeFormat="dd-MM-yyyy HH:mm"
                                      dateFormatLocale="en_US"/>
            </fixedWidthColumn>
            <fixedWidthColumn width="20" skip="3" type="STRING" name="cText"
                              ignore="false" fillChar="*"/>
        </columns>
        <dataFormatParameters
            falseValue="false"
            dateTimeFormat="yyyy-MM-dd HH:mm:ss"
            trueValue="true"
            decimalSeparator="."
            dateFormatLocale="en_US"
            dayFormat="yyyy-MM-dd"
            thousandsSeparator=""
          />
        <errorHandlingStrategy rejectFileName="someRejectFile.txt">
            <errorInstructions>
                <errorInstruction putToReject="false" errorType="UNPARSABLE_FIELD"
                                  putToLog="true" dataStrategy="READ_POSSIBLE"/>
                <errorInstruction putToReject="false" errorType="SHORT_LINE"
                                  putToLog="false" dataStrategy="READ_POSSIBLE"/>
                <errorInstruction putToReject="true" errorType="PROCESSING_ERROR"
                                  putToLog="true" dataStrategy="STOP"/>
                <errorInstruction putToReject="false" errorType="INVALID_DATE"
                                  putToLog="true" dataStrategy="READ_POSSIBLE"/>
            </errorInstructions>
        </errorHandlingStrategy>
        <shadowColumns/>
     </properties>
</step>

iWay Software