The nme-model.xml file is one of the main configuration files for the NME engine. It is stored in [iWay MDS project]/etc/nme-model.gen.xml and referenced in [iWay MDS project]/etc/nme-config.xml.
The main purpose is to define the following:
The following syntax shows the basic structure of the nme-model.xml file.
<?xml version='1.0' encoding='UTF-8'?>
<model>
  <instanceLayer>
    <entities> </entities>
    <sourceSystems> </sourceSystems>
  </instanceLayer>
  <masterLayers>
    <masterLayer name="Master">
      <entities> .. </entities>
    </masterLayer>
    <masterLayer name="..."> .. </masterLayer>
  </masterLayers>
  <eventHandlersConfigFile>nme-event_handler.gen.xml</eventHandlersConfigFile>
</model>
The model defines exactly one instance layer, which contains the entity and source system definitions, and one or more master layers, each with its own entities.
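The structure above can be inspected with a short script. The following is a minimal sketch, not part of the product: it assumes a trimmed-down, hypothetical model string and uses only the Python standard library to list the instance entities, source systems, and master layers. The function name summarize_model is an illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down model used only for illustration.
MODEL_XML = """
<model>
  <instanceLayer>
    <entities>
      <entity name="party"/>
      <entity name="address"/>
    </entities>
    <sourceSystems>
      <system name="crm"/>
    </sourceSystems>
  </instanceLayer>
  <masterLayers>
    <masterLayer name="Master">
      <entities>
        <entity name="party" instanceEntity="party"/>
      </entities>
    </masterLayer>
  </masterLayers>
  <eventHandlersConfigFile>nme-event_handler.gen.xml</eventHandlersConfigFile>
</model>
"""


def summarize_model(xml_text):
    """Return the layers, entities, and source systems found in a model document."""
    root = ET.fromstring(xml_text.strip())

    # The model is expected to contain a single instance layer.
    instance_layers = root.findall("instanceLayer")
    if len(instance_layers) != 1:
        raise ValueError("expected exactly one <instanceLayer>")
    instance_layer = instance_layers[0]

    return {
        "instance_entities": [
            e.get("name") for e in instance_layer.findall("entities/entity")
        ],
        "source_systems": [
            s.get("name") for s in instance_layer.findall("sourceSystems/system")
        ],
        # Map each master layer name to the names of its entities.
        "master_layers": {
            layer.get("name"): [e.get("name") for e in layer.findall("entities/entity")]
            for layer in root.findall("masterLayers/masterLayer")
        },
        "event_handlers_file": root.findtext("eventHandlersConfigFile"),
    }


summary = summarize_model(MODEL_XML)
print(summary)
```

A check like this can be handy before deployment to confirm that every master entity points at an instance entity that actually exists.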
This section describes the entities, relationships, columns, and operations of the model definitions.
There are two types of entities:
<entities>
  <entity name="party">
    <columns> .. </columns>
    <relationships/>
    <cleansingOperation> .. </cleansingOperation>
    <matchingOperation> .. </matchingOperation>
  </entity>
  <entity> .. </entity>
</entities>
Parameters:
<entities>
  <entity instanceEntity="party" name="party" class="com.ataccama.nme.engine.model.PersistentMasterEntity" groupingColumn="master_id">
    <columns> .. </columns>
    <relationships/>
    <mergingOperation> .. </mergingOperation>
  </entity>
  <entity instanceEntity="party" name="party_instance" class="com.ataccama.nme.engine.model.VirtualInstanceEntity">
    <relationships> .. </relationships>
  </entity>
</entities>
where:
PersistentMasterEntity: Defines a physical table to be stored in a database.
VirtualInstanceEntity: Defines a virtual entity, which is required to create a master-to-instance relationship.
name: Is the name of the entity.
instanceEntity: Is the corresponding entity on the instance layer.
groupingColumn: Specifies the grouping column used to create a golden record. The column must be of the long_int data type. This attribute applies only to PersistentMasterEntity. The default column name is master_id.
Usage of the PersistentMasterEntity:
There are two kinds of relationships:
<relationships>
  <rel reverseName="addresses" parentEntity="party" parentColumn="source_id" name="party" foreignKeyColumn="pty_source_id">
    <type class="com.ataccama.nme.engine.model.SameSystemRelationship"/>
  </rel>
</relationships>
where:
name: Is the relationship name. It can be used as a reference name for copy columns.
reverseName: Is the reverse relationship name. It can be used as a reference name for copy columns.
parentEntity: Is the name of the parent entity.
foreignKeyColumn: Is the foreign key column of the child entity.
type: Defines the type of relationship.
SameSystemRelationship: Creates the relationship only between IDs that have the same origin, for example, within a single source system. If no type is defined, SameSystemRelationship is used by default.
CrossSystemRelationship: Enables you to create a relationship between two entities regardless of the origin. Useful for code books and dictionaries.
A master relationship uses the same definition as the instance relationship, except that <type> is not defined because there is no longer a source system reference. Therefore, the behavior is the same as described for SameSystemRelationship above.
<relationships>
  <rel reverseName="addresses" parentEntity="party" parentColumn="id" name="party" foreignKeyColumn="pty_id">
    <type class="com.ataccama.nme.engine.model.CrossSystemRelationship"/>
  </rel>
</relationships>
The following syntax shows the column definition per each entity.
-- instance columns definition
<columns>
  <column name="src_name" origin="source" type="string" size="100"/>
  <column name="std_name" origin="clean" type="string" size="100"/>
  <column name="src_ssn" origin="source" type="string" size="100"/>
  <column name="std_ssn" origin="clean" type="string" size="100"/>
  <column name="uni_can_id" origin="match" type="long" size="20"/>
</columns>
-- master columns definition
<columns>
  <column name="cmo_first_name" type="string" size="100"/>
  <column name="cmo_last_name" type="string" size="100"/>
</columns>
where:
name: Is the column name.
origin: Defines the place in a plan where the column appears (column definition context, instance columns only).
source: Is the source value and appears in the cleanse plan input.
clean: Is the cleansed value and appears as a cleanse plan output.
match: Is the match value and appears as a match plan output.
type: Is the DQS data type.
size: Is the column length (used as the DB column size).
Copy Columns
The copy column functionality is necessary for copying an attribute from one entity to another. To define a copy column, a relationship definition must exist between the entities. After it is defined, you can define the direction of copy columns: parent to child or child to parent.
<column name="mrg_addresses" origin="COPY_CLEAN" type="STRING" size="100">
  <valueSource relationshipName="addresses" columnName="std_value">
    <aggregation class="com.ataccama.nme.engine.model.ConcatenateDistinct" separator=";" maxLength="100"/>
    <filterExpression>source.std_address_type = "R"</filterExpression>
  </valueSource>
</column>
where:
name: Is the name of the column.
type: Is the type of the column.
size: Is the size of the column.
origin: In the copy column context, defines which columns are taken and the place where the copied columns appear.
copy_source: Is the source column added to a cleansing plan output.
copy_clean: Is the cleansed result column added to a match plan input.
copy_match: Is the matched result column added to a merge plan input.
valueSource: Contains the columnName and the relationshipName.
columnName: Is the source column for the copy.
relationshipName: Is the relationship definition and direction to be used for the copy.
aggregation: Contains the different values and concatenations.
ConcatenateDistinct: Is used to perform child-to-parent copy columns. Because of the 1:M relationship, the distinct values are concatenated so that all values related to a single attribute can be stored on the parent entity.
separator: Is the values separator definition for concatenation.
maxLength: Is the maximum length of the concatenated outcome.
Minimum aggregation: Is used to perform child-to-parent copy columns. It takes the minimum value from all related entities (because of the 1:M relationship).
Maximum aggregation: Is used to perform child-to-parent copy columns. It takes the maximum value from all related entities (because of the 1:M relationship).
FirstValue: Is used to perform parent-to-child copy columns. It can also be used for child-to-parent copy columns, but the selection is not deterministic.
filterExpression: Is a boolean expression, and optional. It lets you determine when column copying should be performed. Two pseudo inputs are available for constructing the expression: source (the origin of the copy) and target.
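To make the child-to-parent aggregation more concrete, the following is a minimal sketch of what a ConcatenateDistinct copy column with a filter expression conceptually produces. It is not the engine's implementation. The first-seen ordering, the skipping of empty values, and the plain truncation to maxLength are assumptions made for illustration, and the address records are hypothetical.

```python
def concatenate_distinct(values, separator=";", max_length=100):
    """Join distinct, non-empty values and cap the result length.

    Assumptions (not confirmed by the documentation): values keep their
    first-seen order and the result is simply cut at max_length.
    """
    seen = []
    for value in values:
        if value and value not in seen:
            seen.append(value)
    return separator.join(seen)[:max_length]


# Hypothetical child (address) records related to a single parent (party).
addresses = [
    {"std_value": "1 Main St", "std_address_type": "R"},
    {"std_value": "9 Dock Rd", "std_address_type": "M"},
    {"std_value": "1 Main St", "std_address_type": "R"},
    {"std_value": "5 Hill Ave", "std_address_type": "R"},
]

# Counterpart of <filterExpression>source.std_address_type = "R"</filterExpression>:
# only source records that satisfy the expression take part in the copy.
residential = [a["std_value"] for a in addresses if a["std_address_type"] == "R"]

mrg_addresses = concatenate_distinct(residential, separator=";", max_length=100)
print(mrg_addresses)  # 1 Main St;5 Hill Ave
```

Note that the column size in the model should be at least as large as maxLength, as in the mrg_addresses definition above where both are 100.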
History Collector
Enables you to keep the historical values of a given column by storing them, concatenated, in a newly defined column. This functionality can be useful in the match process.
<column name="std_uir_adr_id_h" origin="CLEAN" type="STRING" size="40000">
  <historyCollector columnName="std_uir_adr_id" separator="^~" maxCount="20"/>
</column>
where:
columnName: Is the source column.
separator: Is the values separator definition for concatenation.
maxCount: Is the number of values kept in history.
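The effect of separator and maxCount can be pictured with a small simulation. This is only a sketch under stated assumptions, not the engine's logic: it assumes that new values are appended at the end, that an unchanged value is not recorded again, and that the oldest values are dropped once maxCount is exceeded. The function name collect_history is hypothetical.

```python
def collect_history(previous_history, new_value, separator="^~", max_count=20):
    """Append new_value to a separator-joined history, keeping max_count values.

    Assumptions (for illustration only): newest value goes last, a value
    equal to the most recent one is not stored again, oldest values drop first.
    """
    values = previous_history.split(separator) if previous_history else []
    if new_value is not None and (not values or values[-1] != new_value):
        values.append(new_value)
    return separator.join(values[-max_count:])


# Simulate four successive loads of the std_uir_adr_id column, keeping 2 values.
history = ""
for address_id in ["A1", "A1", "B2", "C3"]:
    history = collect_history(history, address_id, separator="^~", max_count=2)

print(history)  # B2^~C3
```

Because the collected values share one string column, the column size (40000 in the example above) has to accommodate maxCount values plus separators.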
OldValueProvider
OldValueProvider provides access to the old value of an instance entity column in the transition plans.
<column name="std_expiration_date" origin="clean" type="datetime" size="100">
  <oldValueProvider columnName="std_expiration_date_old"/>
</column>
where:
columnName: Defines the target column that holds the old value and is available in the transition plans.
The following table shows how oldValueProvider can be defined for columns with their origins.
Origin of Column | oldValue Column Availability in Transition Plan |
---|---|
SOURCE | clean, match, merge |
COPY_SOURCE | clean, match, merge |
CLEAN | clean, match, merge |
COPY_CLEAN | match, merge |
MATCH | match, merge |
COPY_MATCH | merge |
Note: Do not use oldValue based expressions after the Extended Unification step in the match plan. Some additional records can be expanded from the unification repository during the match process, and the result will not be exact. Instead, move this computation to the merge plan, where all the relevant data and their oldValues are available.
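The availability table above can also be expressed as a simple lookup, which may help when reviewing a model for oldValue columns that are referenced too early. This is a minimal sketch: the dictionary only restates the table, and the helper name old_value_available is hypothetical.

```python
# Restates the table above: column origin -> transition plans in which
# the oldValue column is available.
OLD_VALUE_AVAILABILITY = {
    "SOURCE": ("clean", "match", "merge"),
    "COPY_SOURCE": ("clean", "match", "merge"),
    "CLEAN": ("clean", "match", "merge"),
    "COPY_CLEAN": ("match", "merge"),
    "MATCH": ("match", "merge"),
    "COPY_MATCH": ("merge",),
}


def old_value_available(origin, plan):
    """Return True if a column with this origin exposes its oldValue in the plan."""
    return plan.lower() in OLD_VALUE_AVAILABILITY[origin.upper()]


print(old_value_available("clean", "clean"))       # True
print(old_value_available("copy_clean", "clean"))  # False
```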
This section describes the operations involved for the NME configuration.
Cleanse Operation
Defines a cleanse logic for a given entity, which is represented by an iWay DQS plan.
<cleansingOperation class="com.ataccama.nme.dqc.operations.CleansingPlan" planFileName="../trans/party/party_clean.comp"/>
Match Operation
Defines a match logic for a given entity, which is represented by an iWay DQS plan.
<matchingOperation enableIdentify="false" class="com.ataccama.nme.dqc.operations.MatchingPlan" planFileName="../trans/party/party_match.comp"/>
where:
enableIdentify: Is boolean and defines whether the match plan is also used to perform identify services. The setting also serves as a performance optimization. The default is true.
Custom Identify Plan Definition
The NME engine allows you to define a special matching plan to perform the identifyService using the identifyPlanFileName attribute.
<matchingOperation enableIdentify="false" class="com.ataccama.nme.dqc.operations.MatchingPlan" planFileName="../trans/party/party_match.comp" identifyPlanFileName="../trans/party/identify_party_match.comp"/>
Merge Operation
This operation links the NME plan that represents a merge operation to the master entity definition.
<mergingOperation customActivityTracking="false" class="com.ataccama.nme.dqc.operations.MergingPlan" planFileName="../trans/party/Master_party_merge.comp">
  <recordFilterExpression>eng_active=true</recordFilterExpression>
</mergingOperation>
where:
customActivityTracking: Is boolean. When set to true, custom rules are used for the activity calculation, for example, overwriting the eng_active value in the merge plan. The default is false.
recordFilterExpression: Is a boolean expression, and optional. It enables you to apply a filter to the merge plan record input.
Defines a list of source systems, which are connected to the NME engine.
<sourceSystems>
  <system name="crm">
    <instanceMappings>
      <originId name="crm_customer#Party" entity="party"/>
      <originId name="crm_address#Address" entity="address"/>
    </instanceMappings>
  </system>
  <system name="life">
    <instanceMappings>
      <originId name="life_contract#Party" entity="party"/>
      <originId name="life_contract#Address" entity="address"/>
    </instanceMappings>
  </system>
</sourceSystems>
where:
name (system): Is the source system name. It is available in the eng_source_system column in the DQS plans.
name (originId): Is the descriptive name, which provides text information about the record origin. It is available as the eng_origin attribute in the DQS plans.
entity: Is the name of the instance entity.
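The mapping between origins, source systems, and instance entities can be extracted from the definition with a few lines of code. The following is a minimal sketch using only the Python standard library and the sourceSystems sample shown above. The function name origin_index is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sourceSystems definition from the sample above.
SOURCE_SYSTEMS_XML = """
<sourceSystems>
  <system name="crm">
    <instanceMappings>
      <originId name="crm_customer#Party" entity="party"/>
      <originId name="crm_address#Address" entity="address"/>
    </instanceMappings>
  </system>
  <system name="life">
    <instanceMappings>
      <originId name="life_contract#Party" entity="party"/>
      <originId name="life_contract#Address" entity="address"/>
    </instanceMappings>
  </system>
</sourceSystems>
"""


def origin_index(xml_text):
    """Map each originId name to its (source system, instance entity) pair."""
    root = ET.fromstring(xml_text.strip())
    index = {}
    for system in root.findall("system"):
        for origin in system.findall("instanceMappings/originId"):
            index[origin.get("name")] = (system.get("name"), origin.get("entity"))
    return index


origins = origin_index(SOURCE_SYSTEMS_XML)
print(origins["crm_address#Address"])  # ('crm', 'address')
```

In a DQS plan, the first element of each pair corresponds to the value of eng_source_system and the dictionary key corresponds to eng_origin.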
The configuration file for event handlers is referenced here.
For more information, see NME Event Handler.
<eventHandlersConfigFile>nme-event_handler.gen.xml</eventHandlersConfigFile>
The following nme-model.xml configuration sample does not show all configuration possibilities.
<?xml version='1.0' encoding='UTF-8'?>
<model>
  <instanceLayer>
    <entities>
      <entity name="party">
        <columns>
          <column name="src_name" origin="source" type="string" size="100"/>
          <column name="std_name" origin="clean" type="string" size="100"/>
          <column name="exp_name" origin="clean" type="string" size="500"/>
          <column name="sco_name" origin="clean" type="integer" size="10"/>
          <column name="std_first_name" origin="clean" type="string" size="100"/>
          <column name="std_last_name" origin="clean" type="string" size="100"/>
          <column name="src_ssn" origin="source" type="string" size="100"/>
          <column name="std_ssn" origin="clean" type="string" size="100"/>
          <column name="exp_ssn" origin="clean" type="string" size="500"/>
          <column name="sco_ssn" origin="clean" type="integer" size="10"/>
          <column name="uni_can_id" origin="match" type="long" size="20"/>
          <column name="uni_rule_name" origin="match" type="string" size="100"/>
          <column name="uni_role" origin="match" type="string" size="10"/>
          <column name="add_std_zip" origin="copy_clean" type="string" size="1000">
            <valueSource columnName="std_zip" relationshipName="pat_add">
              <aggregation maxLength="1000" class="com.ataccama.nme.engine.model.ConcatenateDistinct" separator="~"/>
            </valueSource>
          </column>
        </columns>
        <relationships/>
        <matchingOperation enableIdentify="false" class="com.ataccama.nme.dqc.operations.MatchingPlan" planFileName="../trans/party/party_match.comp"/>
        <cleansingOperation class="com.ataccama.nme.dqc.operations.CleansingPlan" planFileName="../trans/party/party_clean.comp"/>
      </entity>
      <entity name="address">
        <columns>
          <column name="pty_source_id" origin="source" type="string" size="100"/>
          <column name="src_street" origin="source" type="string" size="100"/>
          <column name="std_street" origin="clean" type="string" size="100"/>
          <column name="src_city" origin="source" type="string" size="100"/>
          <column name="std_city" origin="clean" type="string" size="100"/>
          <column name="src_state" origin="source" type="string" size="100"/>
          <column name="std_state" origin="clean" type="string" size="100"/>
          <column name="src_zip" origin="source" type="string" size="100"/>
          <column name="std_zip" origin="clean" type="string" size="100"/>
          <column name="exp_address" origin="clean" type="string" size="500"/>
          <column name="sco_address" origin="clean" type="integer" size="10"/>
          <column name="pty_master_id" origin="copy_clean" type="long_int" size="30">
            <valueSource columnName="master_id" relationshipName="party">
              <aggregation class="com.ataccama.nme.engine.model.FirstValue"/>
            </valueSource>
          </column>
        </columns>
        <relationships>
          <rel reverseName="addresses" parentEntity="party" parentColumn="source_id" name="party" foreignKeyColumn="pty_source_id">
            <type class="com.ataccama.nme.engine.model.SameSystemRelationship"/>
          </rel>
        </relationships>
        <matchingOperation enableIdentify="false" class="com.ataccama.nme.dqc.operations.MatchingPlan" planFileName="../trans/address/address_match.comp"/>
        <cleansingOperation class="com.ataccama.nme.dqc.operations.CleansingPlan" planFileName="../trans/address/address_clean.comp"/>
      </entity>
    </entities>
    <sourceSystems>
      <system name="crm">
        <instanceMappings>
          <originId name="crm_customer#Party" entity="party" code="10"/>
          <originId name="crm_address#Address" entity="address" code="11"/>
        </instanceMappings>
      </system>
      <system name="life">
        <instanceMappings>
          <originId name="life_contract#Party" entity="party" code="20"/>
          <originId name="life_contract#Address" entity="address" code="21"/>
        </instanceMappings>
      </system>
    </sourceSystems>
  </instanceLayer>
  <masterLayers>
    <masterLayer name="Master">
      <entities>
        <entity instanceEntity="party" name="party" class="com.ataccama.nme.engine.model.PersistentMasterEntity" groupingColumn="master_id">
          <columns>
            <column name="cmo_first_name" type="string" size="100"/>
            <column name="cmo_last_name" type="string" size="100"/>
            <column name="cmo_id_card" type="string" size="30"/>
            <column name="cmo_ssn" type="string" size="30"/>
          </columns>
          <relationships/>
          <mergingOperation customActivityTracking="false" class="com.ataccama.nme.dqc.operations.MergingPlan" planFileName="../trans/party/Master_party_merge.comp">
            <recordFilterExpression>eng_active=true</recordFilterExpression>
          </mergingOperation>
        </entity>
        <entity instanceEntity="address" name="address" class="com.ataccama.nme.engine.model.PersistentMasterEntity" groupingColumn="master_id">
          <columns>
            <column name="pty_id" type="long_int" size="30"/>
            <column name="cmo_street" type="string" size="100"/>
            <column name="cmo_city" type="string" size="100"/>
            <column name="cmo_state" type="string" size="100"/>
            <column name="cmo_zip" type="string" size="100"/>
          </columns>
          <relationships>
            <rel reverseName="addresses" parentEntity="party" parentColumn="id" name="party" foreignKeyColumn="pty_id"/>
          </relationships>
          <mergingOperation customActivityTracking="false" class="com.ataccama.nme.dqc.operations.MergingPlan" planFileName="../trans/party/Master_address_merge.comp">
            <recordFilterExpression>eng_active=true</recordFilterExpression>
          </mergingOperation>
        </entity>
        <entity instanceEntity="party" name="party_instance" class="com.ataccama.nme.engine.model.VirtualInstanceEntity">
          <relationships>
            <rel parentEntity="party" parentColumn="id" name="fk_party_inst" foreignKeyColumn="master_id"/>
          </relationships>
        </entity>
      </entities>
    </masterLayer>
  </masterLayers>
  <eventHandlersConfigFile>nme-event_handler.gen.xml</eventHandlersConfigFile>
</model>