Avro File Read Service

Syntax:

com.ibi.agents.XDAvroFileReadAgent

Description:

This service reads an Avro container in binary format and returns the objects it contains converted to XML. The Avro data may come from the input document or a file.

Avro requires the presence of a schema. The Avro File Read service can use the schema that is always stored in the container, or a reader schema can be specified, in which case Avro does its best to reconcile the two schemas. The effective schema is stored in the output document, so it can serve as a default for the Avro File Emit service.

The path to the Avro Schema or the Avro Data File can be a regular path in the file system, or a URL starting with hdfs://, which indicates that the file is in the Hadoop file system. When the Hadoop file system is used, the Hadoop Configuration and Default File System parameters can optionally be specified. Otherwise, they are ignored.
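
As a rough illustration of the path rule above, the following Python sketch tells a regular file system path apart from an hdfs:// URL. The function name is_hdfs_location and the sample locations are hypothetical and are not part of the service. The sketch only mirrors the distinction described here.

```python
def is_hdfs_location(location: str) -> bool:
    """Return True when the location starts with the hdfs:// prefix."""
    # Assumption: the prefix is matched exactly as written in the text above.
    return location.startswith("hdfs://")


# Hypothetical locations, for illustration only.
for location in ("hdfs://namenode.example.com/data/users.avro",
                 "/var/data/users.avro"):
    kind = "Hadoop file system" if is_hdfs_location(location) else "regular path"
    print(location, "->", kind)
```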

Parameters:

The following table describes the parameters of the Avro File Read service.

Parameter

Description

Avro Schema

Path to the Avro Schema file. If absent, the schema stored with the data will be used.

Input Source

Whether the Avro data is in the Input Document or in a File.

Avro Data File

Path to the Avro data file. This is ignored if the Input Source is the Input Document.

Hadoop Configuration

Path to the Hadoop configuration file, normally core-site.xml.

Default File System

In some Hadoop environments, this should be specified as the URI of the namenode, for example hdfs://[your namenode].

Edges:

The following table describes the edges that are returned by the Avro File Read service.

Edge

Description

success

The Avro data was successfully converted to XML.

fail_parse

An iFL expression could not be evaluated.

fail_notfound

A file path was specified but the file does not exist.

fail_operation

The operation could not be completed successfully.

Output Format:

The output document has the following format:

<av:avro xmlns:av="http://iwaysoftware.com/avro">
    <av:item>
        ...
    </av:item>
    <av:item>
        ...
    </av:item>
    ...
</av:avro>

The actual document is not indented. It is pretty-printed here for display purposes only.

The av:avro element represents the Avro container. Each av:item child element represents one Avro object in the container. The contents of each av:item vary depending on its type.
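
A consumer of the output document might walk the container as follows. This is a minimal sketch that uses Python's standard xml.etree.ElementTree module. The sample document and the variable names are made up for illustration, and the sample assumes a container whose objects are all of type int.

```python
import xml.etree.ElementTree as ET

AV_NS = "http://iwaysoftware.com/avro"

# Made-up sample output holding three Avro objects of type int.
sample = (
    '<av:avro xmlns:av="http://iwaysoftware.com/avro">'
    "<av:item>10</av:item><av:item>42</av:item><av:item>99</av:item>"
    "</av:avro>"
)

root = ET.fromstring(sample)

# Each direct av:item child of av:avro is one Avro object in the container.
items = root.findall(f"{{{AV_NS}}}item")
values = [int(item.text) for item in items]
print(values)
```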

The following table describes how the various Avro types are converted to XML:

Avro Type

XML Representation

null

The element has an xsi:nil attribute set to true and no contents. For example:

<av:item xsi:nil="true"/>

boolean

The string true or false. For example:

<av:item>true</av:item>

int

A numeric string. For example:

<av:item>123</av:item>

long

A numeric string. For example:

<av:item>123</av:item>

float

A numeric string in fixed point or scientific notation. For example:

<av:item>12.34</av:item>

double

A numeric string in fixed point or scientific notation. For example:

<av:item>1.23E-12</av:item>

string

The string. For example:

<av:item>abc</av:item>

enum

The symbol string. For example:

<av:item>SPADES</av:item>

bytes

A string of hexadecimal digits, each byte taking exactly two digits. For example:

<av:item>040AFCFF</av:item>

fixed

A fixed-length string of hexadecimal digits, each byte taking exactly two digits. For example:

<av:item>040AFCFF</av:item>

record

Each field becomes an unqualified sub-element with the same name as the field and no XML namespace. For example:

<av:item>
    <name>John Smith</name>
    <address>123 Main Street</address>
    <city>New York</city>
    <state>NY</state>
</av:item>

array

Each item in the array becomes an av:item sub-element. For example:

<av:item>
    <av:item>10</av:item>
    <av:item>42</av:item>
    <av:item>99</av:item>
</av:item>

The actual document is not indented.

map

Each entry in the map becomes an av:entry sub-element with the key attribute set to the key, and the contents set to the entry value. For example:

<av:item>
    <av:entry key="k1">val1</av:entry>
    <av:entry key="k2">val2</av:entry>
    <av:entry key="k3">val3</av:entry>
</av:item>

The actual document is not indented.

union

The element has an xsi:type attribute set to the selected type, and its content is the union value directly, as if the union did not exist. For example:

<av:item xsi:type="int">123</av:item>

The xsi:type attribute is omitted if the union has only two possible types, one of which is null. For example:

<av:item>123</av:item>

Or, when the value is null:

<av:item xsi:nil="true"/>

For more complex types, the rules are applied recursively. The name of the element representing a value is always chosen by the rules of its parent scope. The outermost element of an object is always av:item, then the sub-elements might be av:item, av:entry, or the name of a field in a record depending on the type.
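
The recursive rules can be sketched in Python as shown below. This is not the service's implementation, only an illustration of how the parent scope picks each element name while the value's own type decides the contents. The names fill, matches, and type_name are hypothetical. Union branches are selected by a rough Python type check, the xsi:type value for named types is a guess, and the formatting of floating point values is not modeled.

```python
import xml.etree.ElementTree as ET

AV = "http://iwaysoftware.com/avro"
XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("av", AV)
ET.register_namespace("xsi", XSI)


def type_name(schema):
    """Return the Avro type name of a schema given as parsed JSON."""
    if isinstance(schema, list):
        return "union"
    if isinstance(schema, dict):
        return schema["type"]
    return schema


def matches(schema, value):
    """Rough check of whether a Python value fits a union branch."""
    kind = type_name(schema)
    if kind == "null":
        return value is None
    if kind == "boolean":
        return isinstance(value, bool)
    if kind in ("int", "long"):
        return isinstance(value, int) and not isinstance(value, bool)
    if kind in ("float", "double"):
        return isinstance(value, float)
    if kind in ("string", "enum"):
        return isinstance(value, str)
    if kind in ("bytes", "fixed"):
        return isinstance(value, bytes)
    if kind == "array":
        return isinstance(value, list)
    return isinstance(value, dict)  # record or map


def fill(elem, schema, value):
    """Fill elem with value; the caller (parent scope) chose elem's name."""
    kind = type_name(schema)
    if kind == "union":
        branch = next(b for b in schema if matches(b, value))
        nullable_pair = len(schema) == 2 and "null" in schema
        if not nullable_pair:
            # Assumption: xsi:type carries the Avro type name of the branch.
            elem.set(f"{{{XSI}}}type", type_name(branch))
        fill(elem, branch, value)
    elif kind == "null":
        elem.set(f"{{{XSI}}}nil", "true")
    elif kind == "boolean":
        elem.text = "true" if value else "false"
    elif kind in ("bytes", "fixed"):
        elem.text = value.hex().upper()  # two hexadecimal digits per byte
    elif kind == "record":
        for field in schema["fields"]:
            child = ET.SubElement(elem, field["name"])  # unqualified name
            fill(child, field["type"], value[field["name"]])
    elif kind == "array":
        for entry in value:
            fill(ET.SubElement(elem, f"{{{AV}}}item"), schema["items"], entry)
    elif kind == "map":
        for key, entry in value.items():
            child = ET.SubElement(elem, f"{{{AV}}}entry", {"key": key})
            fill(child, schema["values"], entry)
    else:
        elem.text = str(value)  # int, long, string, enum


# Hypothetical schema and value, for illustration only.
schema = {"type": "record", "name": "Sample", "fields": [
    {"name": "id", "type": ["null", "int"]},
    {"name": "tags", "type": {"type": "array", "items": "string"}},
    {"name": "raw", "type": "bytes"}]}
item = ET.Element(f"{{{AV}}}item")  # the outermost element is always av:item
fill(item, schema, {"id": None, "tags": ["a", "b"], "raw": b"\x04\x0a\xfc\xff"})
print(ET.tostring(item, encoding="unicode"))
```

Passing the element into fill, rather than having fill create it, keeps the naming decision with the parent: a record names the child after its field, an array uses av:item, and a map uses av:entry.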

Consider the following Avro complex type:

{"type": "record", "name": "Outer", "fields": [
  {"name": "rec1", "type": {"type": "record", "name": "Inner", "fields": [
    {"name": "f1", "type": "string"},
    {"name": "f2", "type": "int"}]}},
  {"name": "map1", "type": {"type": "map", "values": "string"}},
  {"name": "array1", "type": {"type": "array", "items": "int"}},
  {"name": "union1", "type": ["null", "string"]},
  {"name": "union2", "type": ["null", "string"]},
  {"name": "union3", "type": ["int", "string"]}]}

An instance of this record might look like the following once it is converted to XML (indented here for display purposes only):

<av:avro xmlns:av="http://iwaysoftware.com/avro"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <av:item>
         <rec1>
             <f1>str1</f1>
             <f2>11</f2>
         </rec1>
         <map1>
             <av:entry key="k1">v1</av:entry>
             <av:entry key="k2">v2</av:entry>
         </map1>
         <array1>
             <av:item>10</av:item>
             <av:item>20</av:item>
             <av:item>30</av:item>
         </array1>
         <union1 xsi:nil="true"/>
         <union2>u2</union2>
         <union3 xsi:type="int">33</union3>
     </av:item>
</av:avro>
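
To show how a downstream consumer might read such a document back, the following Python sketch parses the sample above with the standard xml.etree.ElementTree module and rebuilds a plain dictionary. The helper nullable_text and the variable names are hypothetical, and the type conversions are hard-coded from the schema shown earlier rather than derived from it.

```python
import xml.etree.ElementTree as ET

NS = {"av": "http://iwaysoftware.com/avro"}
XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

doc = """<av:avro xmlns:av="http://iwaysoftware.com/avro"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <av:item>
         <rec1><f1>str1</f1><f2>11</f2></rec1>
         <map1>
             <av:entry key="k1">v1</av:entry>
             <av:entry key="k2">v2</av:entry>
         </map1>
         <array1>
             <av:item>10</av:item>
             <av:item>20</av:item>
             <av:item>30</av:item>
         </array1>
         <union1 xsi:nil="true"/>
         <union2>u2</union2>
         <union3 xsi:type="int">33</union3>
     </av:item>
</av:avro>"""

# The single av:item child of av:avro is the Outer record.
outer = ET.fromstring(doc).find("av:item", NS)


def nullable_text(elem):
    """Return None for an xsi:nil element, otherwise its text."""
    return None if elem.get(XSI + "nil") == "true" else elem.text


record = {
    # Record fields are unqualified elements named after the field.
    "rec1": {"f1": outer.findtext("rec1/f1"),
             "f2": int(outer.findtext("rec1/f2"))},
    # Map entries are av:entry elements keyed by the key attribute.
    "map1": {e.get("key"): e.text for e in outer.findall("map1/av:entry", NS)},
    # Array items are nested av:item elements.
    "array1": [int(i.text) for i in outer.findall("array1/av:item", NS)],
    # Two-branch unions with null carry no xsi:type, only xsi:nil when null.
    "union1": nullable_text(outer.find("union1")),
    "union2": nullable_text(outer.find("union2")),
}

# union3 carries xsi:type, which tells the reader how to interpret the text.
union3 = outer.find("union3")
if union3.get(XSI + "type") == "int":
    record["union3"] = int(union3.text)
else:
    record["union3"] = union3.text

print(record)
```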

iWay Software