NME Batch Interface

A batch.xml file is one of the main configuration files for the iWay MDS NME engine and defines data load and export operations and their modes. It is stored in [iWay MDS project]/etc/nme-batch.gen.xml and referenced in [iWay MDS project]/etc/nme-config.xml. For more information, see NME Configuration.

For the input interface, you can define the following modes: upsert (direct load), full load, and delta load.

Similarly, for the output interface, you can export full or incremental data from the instance layer and the master layer.

Once the load operation is defined, iWay MDS always runs a delta detection process when loading data: it compares all incoming data (columns) with the data (columns) already stored in the database and processes only the data that differs. You can omit some columns from the comparison using the Ignore column functionality in the load operation definition.

There are also some special operations designed to reprocess data already stored in the iWay MDS database without having input data.

Note: When several load operations are started sequentially, the first is launched and the rest are enqueued. The same principle applies to both export and reprocess operations (a reprocess operation is a type of load operation, so it is enqueued together with load operations).


Input Interfaces

Input interfaces consist of two elements: a load operation and a data provider.



Upsert Operation

The upsert operation is the simplest batch load operation: it reads the data from the source and upserts all the records it finds. It can be used across multiple systems and is especially useful for performing initial loads.

<op name="direct_load"
class="com.ataccama.nme.engine.batch.DirectLoadBatchOperation">
	<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
planFileName="../batch/load.plan" />
</op>

Note: The upsert operation is capable of loading already deleted data, which it stores with eng_active = false. These deleted instances must have change_type = D when being loaded.



Full Load Operation

The full load operation performs a full load of selected entities from a defined system. Entities not mentioned in the definition are not imported, even if the corresponding IntegrationOutput is present in the plan. If records from other systems appear in the load, the plan fails.

<op name="full_load_s1" class="com.ataccama.nme.engine.batch.FullLoadBatchOperation">
	<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
planFileName="../batch/load.plan" />
	<sourceSystem>s1</sourceSystem>
	<importedEntities>
		<entity name="Party" />
		<entity name="Address" />
	</importedEntities>
</op>

You can either specify the list of imported entities in the <importedEntities> element or choose all entities by specifying <importAllEntities>true</importAllEntities>.



Delta Load Operation

The delta load definition consists of a data source and several entity descriptors that define how each imported entity is handled. Entities without descriptors are not imported, regardless of whether the corresponding IntegrationOutput is present in the plan.

<op name="delta_load"
	class="com.ataccama.nme.engine.batch.DeltaLoadBatchOperation">
	<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
		planFileName="../batch/load.plan" />
	<descriptors>
		<descriptor class="com.ataccama.nme.engine.batch.load.AutonomousEntity"
			name="GenderLookup"/>
		<descriptor class="com.ataccama.nme.engine.batch.load.CentralEntity"
			name="Party" keyColumn="source_id"/>
		<descriptor class="com.ataccama.nme.engine.batch.load.CentralEntity"
			name="Contract" keyColumn="source_id"/>
		<descriptor class="com.ataccama.nme.engine.batch.load.DependentEntity"
			name="AkaName" keyColumn="party_source_id" centralEntity="Party"/>
		<descriptor class="com.ataccama.nme.engine.batch.load.PartitionedDependentEntity"
			name="Address">
			<partitions>
				<partition originId="sys1#PartyAddress" keyColumn="party_source_id"
					centralEntity="Party" />
				<partition originId="sys1#ContractAddress" keyColumn="contract_source_id"
					centralEntity="Contract" />
			</partitions>
		</descriptor>
	</descriptors>
</op>

AutonomousEntity

The entity is processed independently of the other entities. The change is determined by the input column change_type (I, U, D), and the data is matched with the hub using the (origin, source_id) combination. The only parameter is name, the name of the entity from model.xml.

CentralEntity

Similar to AutonomousEntity, however, it also defines change processing for the corresponding DependentEntity descriptors. The change is determined by the change_type column, and the data is matched with the hub using the (origin, source_id) combination. The corresponding dependent entities are identified by the value of the column whose name is given in keyColumn. The value must be unique for all records in AutonomousEntity. Usually, it will be source_id.

DependentEntity

An entity that depends directly on a CentralEntity. It has no change indication of its own on the input; if a central entity record appears in the load, all its dependent records will be there as well. The change is determined by the change_type of the corresponding central entity record and by locating all its dependent records in the hub. The two lists (incoming and stored dependent records) are compared using source_id, and the changes are processed accordingly.

The central entity record is obtained using the value of the keyColumn column in the entity given in centralEntity. An error will occur (and load will fail) if the load contains a dependent entity record without its central entity record.

A good example of a dependent entity is a subelement in an XML file. For example, the top element of the XML file holds the client data and has the change_type column. A subelement such as address or contact is dependent because all occurrences of these entities are included in the incoming data.
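As a rough illustration of this XML case, a source file might look like the following sketch. The element names (clients, client, address, contact), the attribute layout, and all values are hypothetical and are not taken from any iWay MDS format; only change_type and source_id come from the text.

```xml
<!-- Hypothetical source file; element names and values are illustrative only -->
<clients>
	<!-- Central entity record: the only place where a change indication appears -->
	<client source_id="C-1001" change_type="U">
		<name>John Smith</name>
		<!-- Dependent records: no change_type of their own; the complete current set arrives with the client -->
		<address source_id="A-1" type="home">12 Main Street</address>
		<address source_id="A-2" type="work">5 Market Square</address>
		<contact source_id="K-1" type="email">john.smith@example.com</contact>
	</client>
</clients>
```

Under this assumption, a stored address of client C-1001 that is missing from the incoming list would be detected when the two lists are compared by source_id.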

PartitionedDependentEntity

Similar to the dependent entity, but different centralEntity and keyColumn attributes can be prescribed for each origin.


Delta Detection Process

By default, all attributes with SOURCE origin are used for delta detection. Some of them can be removed from the comparison process using the following construct (applies to all load operations). Note the ignoredComparisonColumns element.

<op name="full_load_s1" class="com.ataccama.nme.engine.batch.FullLoadBatchOperation">
	<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
planFileName="../batch/load.plan" />
	<sourceSystem>s1</sourceSystem>
	<importAllEntities>true</importAllEntities>
	<ignoredComparisonColumns>
		<column names="src_system_change_date" entities="*" />
		<column names="src_aud_*" entities="Address,Contact" />
	</ignoredComparisonColumns>
</op>

The names and entities attributes are comma-separated lists of names and name patterns (with the * wildcard matching any number of characters).
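For instance, several entries can be combined in one column element. The following fragment is only a sketch: src_system_change_date and src_aud_* are reused from the example above, while the last_batch_id column and the Party* entity pattern are hypothetical.

```xml
<ignoredComparisonColumns>
	<!-- Two entries in one names list, applied to every entity -->
	<column names="src_system_change_date,src_aud_*" entities="*" />
	<!-- Hypothetical column, ignored for all entities whose name starts with Party -->
	<column names="last_batch_id" entities="Party*" />
</ignoredComparisonColumns>
```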


Deletion Strategy

There are two strategies available for a load operation to handle deletes. The strategy is configured by the sourceDeletionStrategy attribute:

<op name="direct_load"
 class="com.ataccama.nme.engine.batch.DirectLoadBatchOperation"
sourceDeletionStrategy="deactivate">
	<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
planFileName="../batch/load.plan" />
</op>

Data Providers

The PlanBatchDataSource provider executes an iWay DQS plan and maps its Integration Output steps to entity names. For example, the Integration Output step named party receives records and passes them on for further processing as records of the party entity.

<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
planFileName="../batch/load.plan" />

Output Interfaces

Export operations are used to export data from the MDM hub.

<op name="batch_export" class="com.ataccama.nme.engine.batch.StandardBatchExportOperation">
	<dataProviders>
		<dataProvider prefix="inst_"
			class="com.ataccama.nme.engine.batch.InstanceLayerDataSource"
			scope="ACTIVE"/>
		<dataProvider viewName="dst" prefix="dst_"
			class="com.ataccama.nme.engine.batch.MasterLayerDataSource"
			scope="EXISTING"/>
		<dataProvider prefix="instInc_"
			class="com.ataccama.nme.engine.batch.InstanceLayerDeltaDataSource"
			scope="EXISTING"/>
	</dataProviders>
	<publisher class="com.ataccama.nme.dqc.batch.PlanBatchExportPublisher"
		planFileName="../batch/batch_export.plan">
		<pathVariables>
			<pv name="BATCH_OUT" value="/mds/data/out"/>
		</pathVariables>
	</publisher>
</op>

where:

MasterLayerDataSource

Is for master data.

MasterLayerDeltaDataSource

Is for incremental master data.

InstanceLayerDataSource

Is for instance data.

InstanceLayerDeltaDataSource

Is for incremental instance data.

scope

Is either ACTIVE or EXISTING (default is EXISTING).

prefix

Is the prefix of integration inputs using this provider.

viewName

Is the name of the master layer from which to provide data (master only).

sourceSystem

If defined, only data from the specified source system will be exported (instance only, optional).
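A sketch of how these parameters might be combined. It assumes that sourceSystem is written as an attribute, like scope and prefix, and that MasterLayerDeltaDataSource lives in the same package as the other providers; neither is confirmed by the example above. The prefixes are illustrative, and s1 and dst are borrowed from the earlier examples.

```xml
<dataProviders>
	<!-- Assumed attribute form of sourceSystem: active instance records from system s1 only -->
	<dataProvider prefix="inst_s1_"
		class="com.ataccama.nme.engine.batch.InstanceLayerDataSource"
		scope="ACTIVE" sourceSystem="s1"/>
	<!-- Incremental master data from the dst view; scope is omitted, so the default EXISTING applies -->
	<dataProvider viewName="dst" prefix="dstInc_"
		class="com.ataccama.nme.engine.batch.MasterLayerDeltaDataSource"/>
</dataProviders>
```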

Output Interfaces With Conditions

There are four more data providers that allow advanced filtering:

Each of these data providers exports records of only one entity, but you can also configure record filtering. Apart from parameters defined above:

The following table shows the possible values for the operator parameter:

Value    Meaning
EQ       =
GT       >
LT       <
GE       >=
LE       <=
NE       <>
LIKE     LIKE

The LIKE operator syntax uses wildcard * (meaning any character(s)). For example:

<dataProvider prefix="inst_"
	class="com.ataccama.nme.engine.batch.export.MasterEntityDataSource"
	scope="ACTIVE" entity="party" viewName="mas">
	<conditions>
		<condition column="cio_gender" value="M" operator="EQ" />
		<condition column="score_instance" value="1000" operator="LT" />
		<condition column="cio_first_name" value="John*" operator="LIKE" />
	</conditions>
</dataProvider>


Special Interfaces

This section describes the different special interface operations.



Reprocess Operation

The reprocess operation is a special type of load operation. Its purpose is to reprocess selected records stored in the iWay MDS storage. Selected records are read from iWay MDS storage and sent to the MD Consolidation process.

Records of various origins (different entities and records from different source systems) can be reprocessed together in one load operation.

<op name="reprocess" class="com.ataccama.nme.engine.batch.ReprocessBatchOperation">
		<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
planFileName="../batch/reprocess.plan" />
		<recordIdentification>SOURCE_ID</recordIdentification>
</op>

where:

<dataProvider>

Is the same as in other load operations.

<recordIdentification>

Is SOURCE_ID (default) or ID, and specifies how the records are identified, whether the (origin, source_id) pair or the ID column is used.

The Reprocess Operation reads the record identifiers from the correctly named Integration Output(s) and reprocesses the corresponding records that have been found in the iWay MDS storage.
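As a sketch of the alternative identification mode, the operation below selects records by the ID column instead of the (origin, source_id) pair. The operation name and the plan file name are hypothetical; the plan is assumed to deliver the ID values through its Integration Output steps.

```xml
<op name="reprocess_by_id" class="com.ataccama.nme.engine.batch.ReprocessBatchOperation">
	<!-- Hypothetical plan file that supplies the identifiers of the records to reprocess -->
	<dataProvider class="com.ataccama.nme.dqc.batch.PlanBatchDataSource"
		planFileName="../batch/reprocess_by_id.plan" />
	<!-- Identify records by the ID column rather than by (origin, source_id) -->
	<recordIdentification>ID</recordIdentification>
</op>
```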



Full Reprocess Operation

This operation is similar to the Reprocess operation, but it processes all records of the selected entities. This is useful during migrations. There is neither a data provider nor a plan file for this operation.

<op name="full_reprocess" class="com.ataccama.nme.engine.batch.FullReprocessBatchOperation">
	<processAllEntities>false</processAllEntities>
	<processedEntities>
		<entity name="Party" />
	</processedEntities>
</op>

where:

<processAllEntities>

If true, all instance layer entities will be reprocessed.

<processedEntities>

Is the list of selected entities if processAllEntities is false.

In this operation, all data from the selected entities are read, so there is matching between data in the iWay MDS storage and input data. As in the Reprocess operation, unmodified records are reported as "nnn record(s) ignored" during the Committing phase.
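For comparison, a minimal sketch of the variant that reprocesses every instance layer entity. The operation name is hypothetical, and the sketch assumes that the processedEntities element can be left out when processAllEntities is true.

```xml
<op name="full_reprocess_all" class="com.ataccama.nme.engine.batch.FullReprocessBatchOperation">
	<!-- Reprocess all instance layer entities; no entity list is given (assumed to be optional here) -->
	<processAllEntities>true</processAllEntities>
</op>
```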



Task Executor Status Report

The task execution status is stored internally as a CLOB. To export it, use the following special operation.

<op name="task_info"
class="com.ataccama.nme.engine.batch.StandardBatchExportOperation">
	<dataProviders>
		<dataProvider class="com.ataccama.nme.engine.batch.TaskInfoDataSource"/>
	</dataProviders>
	<publisher class="com.ataccama.nme.dqc.batch.PlanBatchExportPublisher"
planFileName="../export/task_info.comp"/>
</op>

iWay Software