Detailed Description of RC Validator

This step verifies input birth numbers and recognizes and completes missing information. Processing depends on the input data provided and follows one of three scenarios (A, B, or C), which are described in the Detailed Description below.

Input/Output columns

The step identifies or verifies parts of the birth number. Expected input columns are the following:

Unfilled gender or birthDate columns are treated as empty values during verification.

Used output columns are the following:

In the best case, all output columns are filled in (the input data passed validation) and no scoring flag is set.

The value of the output birth number is influenced by the following parameters:

Detailed Description

Scenario A:

The birth number is read from the input column rc and cleansed (non-digit characters are removed). The birth number is then validated (mod11 condition, date validity, etc.). The gender and birth date computed from the birth number are compared with the gender and birth date read from the input.

The term "mod11 condition" denotes the following birth number validation method:

The input birth number is split into two parts: part A, formed by the first 9 digits of the birth number, and part n, the last digit of the birth number. The test then verifies that A modulo 11 = n. If A modulo 11 = 10, the result of the modulo operation is taken to be 0. The mod11 condition can be tested for 10-digit birth numbers only.

Examples:
Input birth number 8151010043: A = 815101004, n = 3, A modulo 11 = 3, so the birth number meets the condition.
Input birth number 8102060110: A = 810206011, n = 0, A modulo 11 = 10, which is treated as 0, so this birth number also meets the condition.
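The mod11 condition can be sketched in a few lines of Python. This is a minimal illustration of the rule as described in this section, not the code of the step itself; the function names cleanse and meets_mod11 are hypothetical.

```python
import re


def cleanse(raw: str) -> str:
    """Remove every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", raw)


def meets_mod11(birth_number: str) -> bool:
    """Check the mod11 condition for a 10-digit birth number."""
    digits = cleanse(birth_number)
    if len(digits) != 10:
        raise ValueError("the mod11 condition applies to 10-digit birth numbers only")
    a = int(digits[:9])   # part A: the first 9 digits
    n = int(digits[9])    # part n: the last digit
    remainder = a % 11
    if remainder == 10:   # a remainder of 10 is treated as 0
        remainder = 0
    return remainder == n


print(meets_mod11("8151010043"))  # first sample birth number from the text
print(meets_mod11("8102060110"))  # second sample birth number from the text
```

Both sample birth numbers from the examples satisfy the condition, so both calls print True.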

Gender and birth date values can be computed from the birth number only if the birth number contains a valid date, i.e. if none of the following scoring flags are set: FLAG_RC_MISSING, FLAG_RC_INVALID, FLAG_DATE_INVALID, FLAG_DATE_ARTIF. If the birth number contains an invalid date value and the input gender and birth date values are empty, the output gender and birth date values also remain empty (no values are generated from the invalid birth number).

Artificial birth numbers containing 0 as a day value or 50 as a month value are treated as invalid as well.
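The artificial-date rule can be illustrated with a short Python check. It is only a sketch: it assumes that the first six digits of a cleansed birth number are laid out as YYMMDD (the usual layout of Czech birth numbers, which this section does not state explicitly), and the function name has_artificial_date is hypothetical.

```python
def has_artificial_date(birth_number_digits: str) -> bool:
    """Return True if the day part is 0 or the month part is 50.

    Assumption: digits 1-2 are the year, 3-4 the month, 5-6 the day.
    """
    if (len(birth_number_digits) < 6
            or not birth_number_digits.isascii()
            or not birth_number_digits.isdigit()):
        raise ValueError("expected at least six digits")
    month = int(birth_number_digits[2:4])
    day = int(birth_number_digits[4:6])
    return day == 0 or month == 50


print(has_artificial_date("8150010043"))  # month part is 50
print(has_artificial_date("8151000043"))  # day part is 00
```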

Birth number validation:

The following rules apply during birth number validation:

All birth numbers whose length, after removal of non-digit characters, is 6, 9, or 10 digits are acceptable. Further validation depends on the length, according to the following scenarios:

When an invalid birth number is repaired by a trailer modification, appropriate scoring flags are set. The trailer modification is applied in the following cases:

If the input birthDate is present, the computed birth date is compared with it. If the two dates differ, the scoring flag RC_DATE_MISMATCH is set and the input birthDate value is written to the output. Furthermore, if the computed birth date lies in the future (no matter how the result was obtained), 100 years are subtracted from the computed date.

The computed and input gender values are compared in the same way. If they differ, the scoring flag RC_GNDR_MISMATCH is set and the input gender value is written to the output.
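The comparison of computed and input values could look roughly like the following Python sketch. It rests on two assumptions that the text leaves open: the 100-year correction is applied before the dates are compared, and an input value, when present, is always the one written to the output. The names reconcile and shift_back_century are hypothetical.

```python
from datetime import date
from typing import Optional, Set, Tuple


def shift_back_century(d: date) -> date:
    """Subtract 100 years; use 28 February if 29 February does not exist."""
    try:
        return d.replace(year=d.year - 100)
    except ValueError:
        return d.replace(year=d.year - 100, day=28)


def reconcile(
    computed_date: Optional[date],
    computed_gender: Optional[str],
    input_date: Optional[date],
    input_gender: Optional[str],
    today: date,
) -> Tuple[Optional[date], Optional[str], Set[str]]:
    """Return (output date, output gender, scoring flags)."""
    flags: Set[str] = set()

    # Assumption: a computed date in the future is corrected before comparison.
    if computed_date is not None and computed_date > today:
        computed_date = shift_back_century(computed_date)

    out_date = computed_date
    if input_date is not None:
        if computed_date is not None and computed_date != input_date:
            flags.add("RC_DATE_MISMATCH")
        out_date = input_date  # the input value is written to the output

    out_gender = computed_gender
    if input_gender is not None:
        if computed_gender is not None and computed_gender != input_gender:
            flags.add("RC_GNDR_MISMATCH")
        out_gender = input_gender  # the input value is written to the output

    return out_date, out_gender, flags
```

For instance, reconcile(date(1981, 1, 1), "F", date(1981, 1, 2), "M", date(2024, 1, 1)) returns the input date and gender together with both mismatch flags.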

Scenario B and C:

If the birth number is not present in the input, the step attempts to generate a fake birth number. This operation requires valid input birth date and gender values. If either of these two values is missing, the scoring flag RC_NOT_GENERATED is set and the input data are copied directly to the output. If the fake birth number is generated successfully, the scoring flag RC_GENERATED is set.
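The decision logic for a missing birth number can be sketched in Python as follows. The format of the generated fake birth number is not described in this section, so the sketch takes the generator as a parameter; placeholder_generator is a purely illustrative stand-in and does not reflect the format the step produces. The accepted gender values M and F are taken from maleDefinition and femaleDefinition in the example configuration, and all function names are hypothetical.

```python
from datetime import date
from typing import Callable, Optional, Set, Tuple

# Values of maleDefinition / femaleDefinition in the example configuration.
VALID_GENDERS = ("M", "F")


def fill_missing_birth_number(
    birth_date: Optional[date],
    gender: Optional[str],
    generate: Callable[[date, str], str],
) -> Tuple[Optional[str], Set[str]]:
    """Return (birth number or None, scoring flags) when rc is absent."""
    if birth_date is None or gender not in VALID_GENDERS:
        # Nothing is generated; the input data are passed through unchanged.
        return None, {"RC_NOT_GENERATED"}
    return generate(birth_date, gender), {"RC_GENERATED"}


def placeholder_generator(birth_date: date, gender: str) -> str:
    # Illustrative only: NOT the fake birth number format used by the step.
    return f"{gender}-{birth_date:%Y%m%d}"


print(fill_missing_birth_number(None, "M", placeholder_generator))
print(fill_missing_birth_number(date(1980, 1, 2), "F", placeholder_generator))
```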



Example:
<step id='alg' className='cz.adastra.cif.tasks.clean.RCValidatorAlgorithm'>
        <binding name='rc' column='rc' />
        <binding name='rcOut' column='rc_out' />
        <binding name='birthDate' column='date' />
        <binding name='birthDateOut' column='date_out' />
        <binding name="gender" column="gender" />
        <binding name="genderOut" column="gender_out" />
        <properties>
                <canFix10digitBn>true</canFix10digitBn>
                <omitInvalidBn>true</omitInvalidBn>
                <maleDefinition>M</maleDefinition>
                <femaleDefinition>F</femaleDefinition>
                <dummyDate>1800-01-01</dummyDate>
                <birthNumberSince>1954-01-01</birthNumberSince>
                <allowArtificialTrailers>true</allowArtificialTrailers>
                <preserveInputValue>true</preserveInputValue>
                <suffix>1234</suffix>
                <trailers>
                        <trailer>888</trailer>
                        <trailer>8888</trailer>
                </trailers>
                <scorer explanationColumn='expl'>
                        <scoringEntries>
                                <scoringEntry key="RC_MISSING" score="1000" explain="true" />
                                <scoringEntry key="RC_INVALID" score="1000" explain="true" />
                                <scoringEntry key="RC_DATE_INVALID" score="1000" explain="true" />
                                <scoringEntry key="RC_DATE_ARTIF" score="1000" explain="true" />
                                <scoringEntry key="RC_DATE_MISMATCH" score="1000" explain="true" />
                                <scoringEntry key="RC_TRLR_FIXED" score="1000" explain="true" />
                                <scoringEntry key="RC_TRLR_MISSING" score="1000" explain="true" />
                                <scoringEntry key="RC_TRLR_ARTIF" score="1000" explain="true" />
                                <scoringEntry key="RC_9DIGITS" score="1000" explain="true" />
                                <scoringEntry key="RC_GNDR_MISMATCH" score="1000" explain="true" />
                                <scoringEntry key="RC_GNDR_FROM_HINT" score="1000" explain="true" />
                                <scoringEntry key="RC_GNDR_HINT_MISSING" score="1000" explain="true" />
                                <scoringEntry key="RC_GNDR_HINT_INVALID" score="1000" explain="true" />
                                <scoringEntry key="RC_DATE_HINT_MISSING" score="1000" explain="true" />
                                <scoringEntry key="RC_GENERATED" score="1000" explain="true" />
                                <scoringEntry key="RC_NOT_GENERATED" score="1000" explain="true" />
                                <scoringEntry key="RC_DATE_BEFORE_BN_SINCE" score="1000" explain="true" />
                        </scoringEntries>
                </scorer>
        </properties>
</step>

iWay Software