Syntax:
com.ibi.agents.XDAvroFileReadAgent
Description:
This service reads an Avro container in binary format and returns the objects it contains converted to XML. The Avro data may come from the input document or a file.
Avro requires a schema. The Avro File Read service can use the writer schema, which is always stored in the container, or you can specify a reader schema, in which case Avro attempts to reconcile the two schemas. The effective schema is stored in the output document, so it can serve as a default for the Avro File Emit service.
The path to the Avro Schema or the Avro Data File can be a regular file system path or a URL starting with hdfs://, which indicates that the file is in the Hadoop file system. When the Hadoop file system is used, the Hadoop Configuration and Default File System parameters can optionally be specified. Otherwise, they are ignored.
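The path rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the service's implementation; the function names `is_hdfs_path` and `hadoop_settings_apply` are hypothetical, and the exact-case prefix match is an assumption.

```python
from typing import Optional


def is_hdfs_path(path: str) -> bool:
    """Return True when the path is a URL starting with hdfs://."""
    # Assumption: the prefix is matched exactly as written in the text.
    return path.startswith("hdfs://")


def hadoop_settings_apply(schema_path: Optional[str], data_path: Optional[str]) -> bool:
    """Hadoop Configuration and Default File System matter only for HDFS paths."""
    return any(p is not None and is_hdfs_path(p) for p in (schema_path, data_path))


# Hypothetical paths, for illustration only.
print(hadoop_settings_apply(None, "hdfs://namenode/data/users.avro"))
print(hadoop_settings_apply("/etc/schemas/user.avsc", "/var/data/users.avro"))
```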
Parameters:
The following table describes the parameters of the Avro File Read service.
Parameter | Description |
---|---|
Avro Schema | Path to the Avro Schema file. If absent, the schema stored with the data will be used. |
Input Source | Whether the Avro data is in the Input Document or in a File. |
Avro Data File | Path to the Avro data file. This is ignored if the Input Source is the Input Document. |
Hadoop Configuration | Path to the Hadoop configuration file, normally core-site.xml. |
Default File System | In some Hadoop environments, this should be specified as the URI of the namenode, for example hdfs://[your namenode]. |
Edges:
The following table describes the edges that are returned by the Avro File Read service.
Edge | Description |
---|---|
success | The Avro data was successfully converted to XML. |
fail_parse | An iFL expression could not be evaluated. |
fail_notfound | A file path was specified but the file does not exist. |
fail_operation | The operation could not be completed successfully. |
Output Format:
The output document has the following format:
<av:avro xmlns:av="http://iwaysoftware.com/avro">
  <av:item> ... </av:item>
  <av:item> ... </av:item>
  ...
</av:avro>
The actual document is not indented. It is pretty-printed here for display purposes only.
The av:avro element represents the Avro container. Each av:item child element represents one Avro object in the container. The contents of an av:item element vary depending on the type of the object.
The following table describes how the various Avro types are converted to XML:
Avro Type | XML Representation |
---|---|
null | The element has an xsi:nil attribute set to true and no contents. For example: <av:item xsi:nil="true"/> |
boolean | The string true or false. For example: <av:item>true</av:item> |
int | A numeric string. For example: <av:item>123</av:item> |
long | A numeric string. For example: <av:item>123</av:item> |
float | A numeric string in fixed point or scientific notation. For example: <av:item>12.34</av:item> |
double | A numeric string in fixed point or scientific notation. For example: <av:item>1.23E-12</av:item> |
string | The string. For example: <av:item>abc</av:item> |
enum | The symbol string. For example: <av:item>SPADES</av:item> |
bytes | A string of hexadecimal digits, each byte taking exactly two digits. For example: <av:item>040AFCFF</av:item> |
fixed | A fixed-length string of hexadecimal digits, each byte taking exactly two digits. For example: <av:item>040AFCFF</av:item> |
record | Each field becomes an unqualified sub-element with the same name as the field and no XML namespace. For example: <av:item> <name>John Smith</name> <address>123 Main Street</address> <city>New York</city> <state>NY</state> </av:item> |
array | Each item in the array becomes an av:item sub-element. For example: <av:item> <av:item>10</av:item> <av:item>42</av:item> <av:item>99</av:item> </av:item> The actual document is not indented. |
map | Each entry in the map becomes an av:entry sub-element with the key attribute set to the key, and the contents set to the entry value. For example: <av:item> <av:entry key="k1">val1</av:entry> <av:entry key="k2">val2</av:entry> <av:entry key="k3">val3</av:entry> </av:item> The actual document is not indented. |
union | The element has an xsi:type attribute set to the selected type, and its content is the union value, represented as if the union did not exist. For example: <av:item xsi:type="int">123</av:item> The xsi:type attribute is omitted if the union has only two possible types, one of which is null. For example: <av:item>123</av:item> or: <av:item xsi:nil="true"/> |
For more complex types, the rules are applied recursively. The name of the element representing a value is always chosen by the rules of its parent scope. The outermost element of an object is always av:item, then the sub-elements might be av:item, av:entry, or the name of a field in a record depending on the type.
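The recursive rules in the table can be sketched in Python using only the standard library. This is a minimal illustration of the mapping, not the service's own code: the names `to_avro_xml`, `fill`, `matches`, and `type_name` are hypothetical, schemas are plain parsed JSON, named type references are not resolved, and union branches are chosen with a rough check on the Python value. Using the Avro type name as the xsi:type value for complex branches is an assumption, and Python formats floating-point numbers slightly differently from the examples in the table (for instance a lowercase exponent marker).

```python
import xml.etree.ElementTree as ET

AV = "http://iwaysoftware.com/avro"
XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("av", AV)
ET.register_namespace("xsi", XSI)


def type_name(schema):
    """Return the Avro type name for a schema given as str, dict, or list (union)."""
    if isinstance(schema, list):
        return "union"
    if isinstance(schema, dict):
        return schema["type"]
    return schema


def matches(schema, value):
    """Rough check used to pick a union branch; covers common cases only."""
    t = type_name(schema)
    if t == "null":
        return value is None
    if t == "boolean":
        return isinstance(value, bool)
    if t in ("int", "long"):
        return isinstance(value, int) and not isinstance(value, bool)
    if t in ("float", "double"):
        return isinstance(value, float)
    if t in ("string", "enum"):
        return isinstance(value, str)
    if t in ("bytes", "fixed"):
        return isinstance(value, bytes)
    if t == "array":
        return isinstance(value, list)
    return t in ("map", "record") and isinstance(value, dict)


def fill(elem, schema, value):
    """Write one Avro value into elem; the parent has already chosen elem's name."""
    t = type_name(schema)
    if t == "union":
        branch = next(s for s in schema if matches(s, value))
        nullable_pair = len(schema) == 2 and any(type_name(s) == "null" for s in schema)
        if not nullable_pair:
            elem.set(f"{{{XSI}}}type", type_name(branch))
        fill(elem, branch, value)
    elif t == "null":
        elem.set(f"{{{XSI}}}nil", "true")
    elif t == "boolean":
        elem.text = "true" if value else "false"
    elif t in ("bytes", "fixed"):
        elem.text = value.hex().upper()  # exactly two hex digits per byte
    elif t == "record":
        for field in schema["fields"]:
            fill(ET.SubElement(elem, field["name"]), field["type"], value[field["name"]])
    elif t == "array":
        for entry in value:
            fill(ET.SubElement(elem, f"{{{AV}}}item"), schema["items"], entry)
    elif t == "map":
        for key, val in value.items():
            fill(ET.SubElement(elem, f"{{{AV}}}entry", {"key": key}), schema["values"], val)
    else:  # int, long, float, double, string, enum
        elem.text = str(value)


def to_avro_xml(schema, objects):
    """Wrap each object in an av:item child of an av:avro container element."""
    root = ET.Element(f"{{{AV}}}avro")
    for obj in objects:
        fill(ET.SubElement(root, f"{{{AV}}}item"), schema, obj)
    return root


doc = to_avro_xml({"type": "array", "items": "int"}, [[10, 42, 99]])
print(ET.tostring(doc, encoding="unicode"))
```

Each call to `fill` receives an element whose name was already chosen by its parent, which mirrors the rule that the name of an element is decided by the enclosing scope.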
Consider the following Avro complex type:
{"type": "record", "name": "Outer", "fields": [
  {"name": "rec1", "type": {"type": "record", "name": "Inner", "fields": [
    {"name": "f1", "type": "string"},
    {"name": "f2", "type": "int"}]}},
  {"name": "map1", "type": {"type": "map", "values": "string"}},
  {"name": "array1", "type": {"type": "array", "items": "int"}},
  {"name": "union1", "type": ["null", "string"]},
  {"name": "union2", "type": ["null", "string"]},
  {"name": "union3", "type": ["int", "string"]}]}
An instance of this record might look like the following once it is converted to XML (indented here for display purposes only):
<av:avro xmlns:av="http://iwaysoftware.com/avro" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <av:item>
    <rec1>
      <f1>str1</f1>
      <f2>11</f2>
    </rec1>
    <map1>
      <av:entry key="k1">v1</av:entry>
      <av:entry key="k2">v2</av:entry>
    </map1>
    <array1>
      <av:item>10</av:item>
      <av:item>20</av:item>
      <av:item>30</av:item>
    </array1>
    <union1 xsi:nil="true"/>
    <union2>u2</union2>
    <union3 xsi:type="int">33</union3>
  </av:item>
</av:avro>
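A downstream consumer could read such a document back into native values along the following lines. This is a sketch that assumes the Outer record layout shown above; the helper name `read_outer` is hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

NS = {"av": "http://iwaysoftware.com/avro"}
XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

# The sample instance from the text, indented for readability.
SAMPLE = """
<av:avro xmlns:av="http://iwaysoftware.com/avro"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <av:item>
    <rec1><f1>str1</f1><f2>11</f2></rec1>
    <map1>
      <av:entry key="k1">v1</av:entry>
      <av:entry key="k2">v2</av:entry>
    </map1>
    <array1>
      <av:item>10</av:item>
      <av:item>20</av:item>
      <av:item>30</av:item>
    </array1>
    <union1 xsi:nil="true"/>
    <union2>u2</union2>
    <union3 xsi:type="int">33</union3>
  </av:item>
</av:avro>
"""


def read_outer(item):
    """Turn one av:item holding an Outer record into a plain dict."""
    def text_or_none(elem):
        # xsi:nil="true" marks an Avro null.
        return None if elem.get(XSI + "nil") == "true" else elem.text

    union3 = item.find("union3")
    return {
        "rec1": {"f1": item.findtext("rec1/f1"), "f2": int(item.findtext("rec1/f2"))},
        "map1": {e.get("key"): e.text for e in item.findall("map1/av:entry", NS)},
        "array1": [int(e.text) for e in item.findall("array1/av:item", NS)],
        "union1": text_or_none(item.find("union1")),
        "union2": text_or_none(item.find("union2")),
        # xsi:type tells which branch of the ["int", "string"] union was selected.
        "union3": int(union3.text) if union3.get(XSI + "type") == "int" else union3.text,
    }


records = [read_outer(i) for i in ET.fromstring(SAMPLE).findall("av:item", NS)]
print(records)
```

Record fields are looked up without a namespace, while av:item and av:entry need the av namespace mapping, which follows the naming rules in the conversion table.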