Detailed Description of Validate In Res

Tests the existence of the input IO (property ico) in the Czech Register of Economic Subjects (RES) by looking up the business name registered for it. The obtained business name is compared with the input business name (property businessName). The obtained name is stored in the property realNameOut if this property is defined. The dictionary file, which contains all economic subjects, is defined by the property databaseFile (for a detailed description of the dictionary file, see the section Files - Register of Economic Subjects).

If the input IO is not filled in, then DQC reports an error and the step ends. If the input IO is filled in, then the step reads the input business name and compares it to the business name found in the dictionary file under that IO value. The Create Matching Value step is applied to both the input name and the dictionary name, and the resulting values are then compared. This "internal" matching generator is configured via the matchingValueGeneratorConfig property. If the values differ, then the scoring flag IC_EXT_NAME_MISMATCH is set. If there is no record matching the input IO value in the dictionary file, then the scoring flag IC_EXT_NOT_FOUND is set.
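
The decision flow described above can be sketched in Python as follows. This is only an illustrative sketch, not the DQC implementation: the function validate_in_res, the in-memory dict standing in for the dictionary file, the default matching_value normalizer, and the sample IO and name are all hypothetical. Mapping an empty IO to the key IC_EXT_NULL is an assumption based on the scoring keys that appear in the example configuration.

```python
from typing import Callable, Dict, Optional, Tuple


def validate_in_res(
    ico: Optional[str],
    business_name: Optional[str],
    register: Dict[str, str],
    # Hypothetical stand-in for the internal matching generator:
    # uppercase the name and collapse runs of whitespace.
    matching_value: Callable[[str], str] = lambda s: " ".join(s.upper().split()),
) -> Tuple[Optional[str], Optional[str]]:
    """Return (scoring_flag, real_name); the flag is None when the names agree."""
    # Empty IO: nothing to look up (IC_EXT_NULL is an assumed mapping).
    if ico is None or not ico.strip():
        return "IC_EXT_NULL", None

    # Look up the registered name by IO; the dict stands in for the dictionary file.
    real_name = register.get(ico.strip())
    if real_name is None:
        return "IC_EXT_NOT_FOUND", None

    # Compare matching values of both names, not the raw strings.
    if matching_value(business_name or "") != matching_value(real_name):
        return "IC_EXT_NAME_MISMATCH", real_name

    return None, real_name


# Hypothetical sample data, not taken from RES.
sample_register = {"12345678": "EXAMPLE TRADING SRO"}
print(validate_in_res("12345678", "example  trading sro", sample_register))
```

In this sketch the registered name is returned alongside the flag, which loosely mirrors storing the obtained name in realNameOut.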

In order for the input and dictionary business names to match, the same transformation rules must be applied to the input name as those applied to the names in the dictionary. (For the RES dictionary file, the applied steps are, in order, Transform Legal Forms and Matching Value. At a minimum, Transform Legal Forms must be applied before the data enters this step; Matching Value may be executed via the internal property.) During dictionary generation, Matching Value is configured so that all transformations except the removal of repeated characters in business names are executed: doRemoveDia, doSqueezeWS, doUpperCase, and doRemoveSpecialChars are enabled, while doRemoveRepeatedChars is false. The input string must therefore be uppercase, without diacritics, without special characters, and without multiple spaces. A configuration that brings the input into this format is shown in the Example section at the end of this page.
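
To make the format requirement concrete, the following Python sketch approximates the four enabled transformations. It is not the Create Matching Value implementation: the function create_matching_value is hypothetical, the order of operations is a guess, and "special character" is assumed here to mean anything that is neither alphanumeric nor whitespace. The sample business name is invented, and the Transform Legal Forms step is not modeled.

```python
import re
import unicodedata


def create_matching_value(
    text: str,
    remove_dia: bool = True,
    remove_repeated_chars: bool = False,
    squeeze_ws: bool = True,
    upper_case: bool = True,
    remove_special_chars: bool = True,
) -> str:
    """Approximate the transformations named in the text (hypothetical helper)."""
    if remove_dia:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if upper_case:
        text = text.upper()
    if remove_special_chars:
        # Assumption: keep only letters, digits, and whitespace.
        text = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    if remove_repeated_chars:
        # Collapse runs of the same character; disabled for the RES dictionary.
        text = re.sub(r"(.)\1+", r"\1", text)
    if squeeze_ws:
        # Collapse whitespace runs to one space and trim the ends.
        text = re.sub(r"\s+", " ", text).strip()
    return text


# Invented sample name, not taken from RES.
print(create_matching_value("Příklad  Obchodní, s.r.o."))  # prints PRIKLAD OBCHODNI SRO
```

Special characters are removed before whitespace is squeezed so that gaps left by removed punctuation do not survive as double spaces.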



Example

The following example illustrates a typical step configuration involving an input business name transformation according to the rules used in RES (Transform Legal Forms, Create Matching Value).

<step id='legalForms' className='cz.adastra.cif.tasks.clean.TransformLegalFormsAlgorithm'>
        <properties>
                <in>company</in>
                <out>matching_company</out>
                <legalFormsLookupFileName>c:\tempdata\configs\validateinres\legal_forms_cz.cif</legalFormsLookupFileName>
                <scorer>
                        <scoringEntries>
                                <scoringEntry key='TLF_NULL' score='100' explain='true' />
                                <scoringEntry key='TLF_CHANGED' score='100' explanationColumn='expl_tlf'/>
                        </scoringEntries>
                </scorer>
        </properties>
</step>
<connection className='cz.adastra.cif.model.elements.connections.StandardFlowConnection'>
        <source step='legalForms' endpoint='out'/>
        <target step='matchingValue' endpoint='in'/>
</connection>
<step id='matchingValue' className='cz.adastra.cif.tasks.clean.CreateMatchingValueAlgorithm'>
        <properties>
                <in>matching_company</in>
                <out>matching_company</out>
                <config
                        doRemoveDia='true'
                        doRemoveRepeatedChars='false'
                        doSqueezeWS='true'
                        doUpperCase='true'
                        doRemoveSpecialChars='true'
                />
                <scorer explanationColumn='expl_mva'>
                        <scoringEntries>
                                <scoringEntry key='CM_CHANGED' score='10' explain='true' />
                        </scoringEntries>
                </scorer>
        </properties>
</step>
<connection className='cz.adastra.cif.model.elements.connections.StandardFlowConnection'>
        <source step='matchingValue' endpoint='out'/>
        <target step='alg' endpoint='in'/>
</connection>
<step id='alg' className='cz.adastra.cif.tasks.clean.ValidateInResAlgorithm'>
        <properties>
                <ico>ico</ico>
                <businessName>matching_company</businessName>
                <realNameOut>clean_company</realNameOut>
                <databaseFile>c:\tempdata\configs\validateinres\res_cz.cif</databaseFile>
                <matchingNameGeneratorConfig doRemoveDia='true' doRemoveRepeatedChars='true'
                                             doRemoveSpecialChars='true' doSqueezeWS='true' doUpperCase='true'/>
                <scorer explanationColumn='expl_vir'>
                        <scoringEntries>
                                <scoringEntry key='IC_EXT_NULL' score='100' explain='true'/>
                                <scoringEntry key='IC_EXT_NOT_FOUND' score='200' explain='true' />
                                <scoringEntry key='IC_EXT_NAME_MISMATCH' score='400' explain='true' />
                        </scoringEntries>
                </scorer>
        </properties>
</step>

iWay Software