Parallel Control Services


You can run multiple operations in parallel within a flow. For example, you can query multiple data stores simultaneously, or update multiple databases. This can be accomplished with parallel edges (multiple matching edges coming out of a node), though it is difficult to design a flow that controls the concurrent execution and the orderly termination of these operations.

The parallel control agent solves these issues by controlling the execution of an arbitrary number of identical parallel operations. It will then proceed down its flow on its output edge only when all of the parallel operations are complete. All threading control is managed by the agent, so that the application flow designer need not consider any synchronization issues.

The number of operations is determined by an iterator embedded inside the agent. The iterator is initialized with the input document reaching the agent. The agent uses each item returned by the iterator as the input document of a subflow it runs in parallel. When all subflows have returned, the agent itself terminates. The output is the accumulated result of all the subflows.
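The fan-out and join behavior described above can be sketched in Python. This is only an illustration of the pattern, not the agent's implementation: `run_parallel`, `items`, and `subflow` are hypothetical names standing in for the agent, its embedded iterator, and the published subflow.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_parallel(items, subflow):
    """Run subflow(item) for every item concurrently and accumulate the results.

    `items` stands in for the embedded iterator and `subflow` for the
    published subflow. Both are illustrative placeholders, not iWay APIs.
    """
    items = list(items)
    if not items:
        return []  # nothing to run, so the accumulation is empty
    accumulated = []
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        futures = [pool.submit(subflow, item) for item in items]
        # Results are collected in completion order, not submission order.
        for future in as_completed(futures):
            accumulated.append(future.result())
    return accumulated
```

The function returns only after every submitted task has finished, which mirrors the agent proceeding only when all subflows have returned.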

Each subflow is independent. For example, the subflow can itself contain iterators or other objects that maintain state without interfering with other parallel flows. Subflows should be designed to be relatively quick to complete, as the parallel control agent cannot proceed until all subflows either return or are timed out.

There are multiple variants of the parallel control agent depending on the type of iterator it embeds. For example, the XDParallelXMLSplitAgent embeds an XDIterXMLSplit iterator. The XDParallelCountAgent embeds an XDIterCount iterator.

The following table lists the parameters common to all parallel services.

| Parameter | Description |
| --- | --- |
| Flow Name | Name of the subflow to run in parallel. The subflow must already be published. |
| Maximum Parallel Tasks | Throttles execution to limit the resources needed. Specifies the maximum number of subflows to process simultaneously. If an additional subflow must be executed, it waits for a previous subflow to complete. The value 0 means unbounded, so all items returned by the iterator attempt to run in parallel. |
| Timeout | Specifies the maximum time to wait for all subflows to complete. If throttling is used, the agent timeout should generally be larger than the subflow timeout. The value 0 means there is no time limit. |
| Accumulation Version | Controls how the parallelized result is accumulated. The value is either Simple or Multiple. Simple means that only a single XML document is accumulated from each parallel subflow. Multiple allows several XML or non-XML documents to be accumulated from each parallel subflow. Simple uses less memory. Multiple allows a wider variety of parallelized flows. |

The remaining parameters are specific to the embedded iterator.

The agent assigns a unique integer value to the special register iway.taskindex for each subflow that it executes. The value of iway.taskindex is 1 for the first subflow, 2 for the second subflow, and so on, increasing by one each time.
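As a rough illustration of how Maximum Parallel Tasks and the task index interact, the following Python sketch throttles a thread pool and numbers each task from 1. The names `run_throttled` and `subflow` are hypothetical, and the peak-concurrency counter exists only to make the throttling visible.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_throttled(items, subflow, max_parallel_tasks=0):
    """Run subflow(task_index, item) per item, at most max_parallel_tasks at once.

    A value of 0 means unbounded, as with the Maximum Parallel Tasks parameter.
    Returns (results keyed by task index, peak number of concurrent tasks).
    """
    items = list(items)
    if not items:
        return {}, 0
    workers = max_parallel_tasks if max_parallel_tasks > 0 else len(items)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def tracked(task_index, item):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        try:
            return subflow(task_index, item)
        finally:
            with lock:
                state["running"] -= 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Task indexes start at 1, mirroring iway.taskindex.
        futures = {index: pool.submit(tracked, index, item)
                   for index, item in enumerate(items, start=1)}
        results = {index: future.result() for index, future in futures.items()}
    return results, state["peak"]
```

With `max_parallel_tasks=2`, a third task does not start until one of the first two finishes, which is the waiting behavior the parameter description refers to.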

The agent can return the following edges.

| Edge | Description |
| --- | --- |
| success | Every subflow was completed successfully, or the iterator returned no items. |
| fail_parse | Syntax error in one of the parameter expressions. |
| fail_notfound | Cannot find a subflow with that name. |
| fail_timeout | The agent timed out or was cancelled. |
| fail_operation | All of the subflows returned abnormally. |
| fail_incomplete | At least one subflow completed successfully and at least one returned abnormally. If fail_incomplete is not wired, it is treated as an error. Some flows may wish to wire this edge and the success edge to the same node, and do analysis later in the flow. fail_incomplete does not reflect decisions made in the execution of an item, but rather whether execution itself succeeded. |
| cancel | The overall flow was cancelled. |

Note that if a subflow times out, it appears to return abnormally. This causes the agent to return fail_operation or fail_incomplete, depending on the results of the other subflows. The agent returns the fail_timeout edge only if the agent itself times out.
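The relationship between subflow outcomes and the returned edge can be summarized in a small Python function. This is a sketch of the rules stated above, not the agent's code. The function name and parameters are hypothetical, the precedence of cancel over fail_timeout is an assumption, and fail_parse and fail_notfound are omitted because they arise before any subflow runs.

```python
def agent_edge(subflow_ok, agent_timed_out=False, flow_cancelled=False):
    """Pick the edge name from per-subflow outcomes.

    subflow_ok: list of booleans, True where a subflow completed successfully.
    A subflow that timed out counts as False (an abnormal return).
    """
    if flow_cancelled:
        return "cancel"          # assumed to take precedence over a timeout
    if agent_timed_out:
        return "fail_timeout"    # only the agent's own timeout, not a subflow's
    if all(subflow_ok):          # also true when the iterator returned no items
        return "success"
    if not any(subflow_ok):
        return "fail_operation"  # every subflow returned abnormally
    return "fail_incomplete"     # mixed outcome
```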

For the Simple Accumulation version, the result has a format as shown in the following example.

Note: In the XML output shown, the value of each <Test> element is _sreg('iway.taskindex').

<?xml version="1.0" encoding="UTF-8" ?>
<results>
  <result index="1" items="1" edge="success">
      <Test>1</Test>
  </result>
  <result index="3" items="1" edge="success">
      <Test>3</Test>
  </result>
  <result index="4" items="1" edge="success">
      <Test>4</Test>
  </result>
  <result index="2" items="1" edge="backout" error='true'>
      <status>This is an error document from the subflow</status>
  </result>
</results>

For the Multiple Accumulation version, the result has the following form.

Note: In the XML output shown, the value of the <Test> element is _sreg('iway.taskindex').

<?xml version="1.0" encoding="UTF-8" ?>
<results>
  <result index="1" items="1">
    <item edge="success" encoding="UTF-8" format="xml">
      <Test>1</Test>
    </item>
  </result>
  <result index="3" items="1">
    <item edge="success" encoding="UTF-8" format="string">value of string
    </item>
  </result>
  <result index="4" items="1">
    <item edge="success" encoding="UTF-8" format="bytes">value in base64
    </item>
  </result>
  <result index="2" items="1">
    <item edge="backout" encoding="UTF-8" format="xml" error='true'>
      <status>This is an error document from the subflow</status>
    </item>
  </result>
</results>

In this example, the subflows terminated in the end nodes named success and backout. The subflow with index 2 returned an error, and the error status of the document is noted for analysis. The subflows also returned different types of information. The application designer is responsible for the data and the format in which it is returned.

The following table shows informal attributes in the output accumulation.

| Attribute | Description |
| --- | --- |
| encoding | The IANA encoding associated with the document. (Multiple format only) |
| format | The payload type. See below for information on the types. (Multiple format only) |
| edge | The name of the end node in the subflow that returned this document. |
| error | If the returned document has the error state set, this attribute will be set to true. |

The following table shows the formats and how data values are returned in the output accumulation.

| Format | Description |
| --- | --- |
| xml | The returned information is carried as a child (subtree) of the <item> node. In our example, it is a small XML document rooted in the <Test> element. |
| string | The information is returned as the value of the <item> node. |
| bytes | The information is returned as the value of the <item> node. The information is base64 encoded. |
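A downstream consumer of a Multiple accumulation might decode each <item> according to its format attribute. The following Python sketch uses the standard library only. The sample document and the `decode_item` helper are illustrative assumptions, and stripping surrounding whitespace from string payloads is a simplification that may not suit every application.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on the Multiple accumulation shown earlier.
SAMPLE = """<results>
  <result index="3" items="1">
    <item edge="success" encoding="UTF-8" format="string">hello
    </item>
  </result>
  <result index="4" items="1">
    <item edge="success" encoding="UTF-8" format="bytes">aGVsbG8=
    </item>
  </result>
  <result index="1" items="1">
    <item edge="success" encoding="UTF-8" format="xml">
      <Test>1</Test>
    </item>
  </result>
</results>"""


def decode_item(item):
    """Return the payload of one <item> element according to its format."""
    fmt = item.get("format")
    if fmt == "xml":
        # The document is the child subtree of <item>.
        return item[0] if len(item) else None
    if fmt == "string":
        return (item.text or "").strip()
    if fmt == "bytes":
        return base64.b64decode((item.text or "").strip())
    raise ValueError(f"unexpected format: {fmt!r}")


decoded = [decode_item(item) for item in ET.fromstring(SAMPLE).iter("item")]
```

Here `decoded` holds a string, a bytes object, and an XML element, in document order.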

It is valid for the iterator to return no items. In this case, no subflow is executed and the <results> element is empty.

Each <result> element contains the result of one subflow. The index attribute specifies the value of the iway.taskindex special register for that subflow. The <result> elements are not sorted, since they appear in the order in which the subflows terminate. The <result> element contains one or more documents returned by the subflow. The agent ignores documents that do not contain information. If the subflow ended abnormally, an error document appears instead.
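Because the <result> elements arrive in completion order, a later step that needs them in task order has to sort on the index attribute. The following Python sketch does this for a Simple accumulation. The sample document and the `results_by_index` helper are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical Simple accumulation, deliberately out of index order.
SIMPLE = """<results>
  <result index="1" items="1" edge="success"><Test>1</Test></result>
  <result index="3" items="1" edge="success"><Test>3</Test></result>
  <result index="2" items="1" edge="backout" error="true">
    <status>This is an error document from the subflow</status>
  </result>
</results>"""


def results_by_index(xml_text):
    """Return (index, edge, is_error, child_elements) tuples sorted by index."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("result"):
        rows.append((
            int(result.get("index")),       # value of iway.taskindex
            result.get("edge"),             # end node name in the subflow
            result.get("error") == "true",  # error state of the document
            list(result),                   # returned document(s)
        ))
    return sorted(rows, key=lambda row: row[0])
```

An empty <results> element simply yields an empty list.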

Once the parallel agent completes, the resulting accumulation is available to subsequent flow nodes. These nodes can deal with the result in any manner appropriate to the application. For example, you can execute an XDIterXMLSplit iteration and process each result as needed. In this case, the agent did not remove the need for iteration. However, it executed the potentially lengthy computation in parallel, and iterating over a pre-computed result is much faster than computing each result in turn.

You can also execute a transform to produce an appropriate output document.

As usual with subflows, it is also possible to assign special registers in the parent scope, which allows a subflow to pass information back to the flow of the agent if needed. The use of a register lock is advisable when the Maximum Parallel Tasks property is greater than 1. The Scope and Lock Name properties of the XDSREGAgent are designed for this.
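The reason a lock is advisable can be shown with a generic Python analogy. Here a dictionary stands in for a parent-scope register and `threading.Lock` stands in for a named register lock. Neither corresponds to an actual iWay API, and the function name is hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_processed(items, max_parallel_tasks=4):
    """Have each parallel task increment a shared counter under a lock."""
    registers = {"processed": 0}  # stand-in for a parent-scope register
    lock = threading.Lock()       # stand-in for a named register lock

    def subflow(item):
        # The read-modify-write happens under the lock. Without it, two
        # tasks could read the same old value and one update would be lost.
        with lock:
            registers["processed"] += 1
        return item

    with ThreadPoolExecutor(max_workers=max_parallel_tasks) as pool:
        list(pool.map(subflow, items))
    return registers["processed"]
```

With the lock in place, the counter always equals the number of items, regardless of how the tasks interleave.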


Cancellation Concerns

If the outer flow is cancelled, the cancellation is passed to the subflows, and no further subflows are started. The parallel control agent awaits completion of the current set, and then passes the status document down the cancel edge.
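The cancellation behavior can be approximated in Python with a shared event: tasks that have not started are skipped once the event is set, and tasks already running are allowed to finish. This is a sketch of the pattern only. `run_until_cancelled`, the `cancelled` event, and the two-argument `subflow` signature are assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_SKIPPED = object()  # marks a subflow that was never started


def run_until_cancelled(items, subflow, cancelled, max_parallel_tasks=2):
    """Run subflow(item, cancelled) per item until `cancelled` is set.

    Returns (edge, results). Subflows that were skipped are left out.
    """
    def guarded(item):
        if cancelled.is_set():
            return _SKIPPED  # no further subflows are started
        # The event is passed on so a running subflow can react to it.
        return subflow(item, cancelled)

    with ThreadPoolExecutor(max_workers=max_parallel_tasks) as pool:
        futures = [pool.submit(guarded, item) for item in items]
    # Leaving the with-block waits for the current set to complete.
    outcomes = [future.result() for future in futures]
    results = [value for value in outcomes if value is not _SKIPPED]
    edge = "cancel" if cancelled.is_set() else "success"
    return edge, results
```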


Developer Cautions

Developers using the parallel control agent should consider the following issues when using and tuning the application.

  1. Parallel operations are passed among the available execution threads of the computer. On a computer with a large number of processors, this can result in significantly improved performance. However, on a computer with a few processors, control may be passed back and forth among the threads and little if any overall performance gain is achieved. It is easy to generate more computation/access requests than the system can satisfy.
  2. The best use of parallel processing is when passing work off the system to another system, such as a database or another computer through a remote call. In this case, most of the subflows will be waiting for results and many can be managed simultaneously.
  3. Memory use is cumulative. Since each subflow receives a copy of the input document, and each needs memory to do its work, memory considerations should be taken into account in design of the application.

The services support a Maximum Parallel Tasks parameter, which sets the number of executions that take place in parallel. iFL expressions, coupled with an understanding of the dynamics of the application, can help in setting this parameter. For example, the following iFL expression sets the number of attempted simultaneous parallel executions to a multiple of the available processors.

                 _int(_ceil(_mul(_div(_sysinfo(processors),_chaninfo(*,'active')),8)))
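To make the arithmetic concrete, the same calculation can be written in Python. This sketch assumes that _sysinfo(processors) yields the processor count, that _chaninfo(*,'active') yields a count of active channels, and that _div performs non-integer division. None of these assumptions is confirmed here, and the function name is hypothetical.

```python
import math


def max_parallel_tasks(processors, active_channels, factor=8):
    """Mirror _int(_ceil(_mul(_div(processors, active_channels), 8)))."""
    return int(math.ceil(processors / active_channels * factor))
```

Under these assumptions, 4 processors shared by 3 active channels gives ceil(10.67) = 11 parallel tasks, and 8 processors shared by 2 active channels gives 32.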

XDParallelCountAgent Properties

The following image shows the configuration parameters for Parallel Execution: Counted iteration service.


XDParallelXMLSplitAgent Properties

The following image shows the configuration parameters for Parallel Execution: XML iteration service.



iWay Software