Detailed Description of Dictionary Lookup Identifier
An identifier that uses dictionary-based lookups. It identifies
the input record, possibly down to its primary key, and writes the
address proposals determined during processing to a dedicated output
endpoint (out_proposals). The step proceeds as follows:
- Examines the input stream: searches for occurrences of known values
  of the reference data component, using the dictionaries defined in
  the reference data.
- Matches the found values with supporting vectors and performs lookups
  into indices for proposals.
- Each vector from the previous step may return a bulk of proposals.
  Each bulk is processed independently: each proposal is compared with
  the input protoaddress and scored by means of user-defined scoring,
  and the proposal with the best score is selected as the best
  proposal. If a proposal has a score less than or equal to the
  predefined value (see SupportingVectorDefinition), it is selected as
  the result and no more vectors are processed.
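
The selection logic of the last step can be sketched as follows. This is only an illustration, not the component's actual implementation: the function name select_best_proposal, the score callback, and the threshold parameter are hypothetical. It assumes that a lower score is better (which is what the "less than or equal to" rule suggests) and that, when no bulk reaches the predefined value, the best proposal seen so far is returned.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple


def select_best_proposal(
    bulks: Iterable[Sequence[str]],
    score: Callable[[str], float],
    threshold: float,
) -> Optional[Tuple[str, float]]:
    """Return (proposal, score) or None if there are no proposals.

    bulks     -- one bulk of proposals per supporting vector (hypothetical shape)
    score     -- user-defined scoring; ASSUMPTION: lower is better
    threshold -- stands in for the predefined value of the vector definition
    """
    overall: Optional[Tuple[str, float]] = None
    for bulk in bulks:
        # Each bulk is processed independently: find its best proposal.
        bulk_best: Optional[Tuple[str, float]] = None
        for proposal in bulk:
            s = score(proposal)
            if bulk_best is None or s < bulk_best[1]:
                bulk_best = (proposal, s)
        if bulk_best is None:
            continue  # this vector returned no proposals
        # ASSUMPTION: keep the best proposal across bulks as a fallback.
        if overall is None or bulk_best[1] < overall[1]:
            overall = bulk_best
        # Good enough: take it and do not process any further vectors.
        if bulk_best[1] <= threshold:
            break
    return overall
```

Passing the bulks as a lazy iterable (for example a generator) means that the lookups for the remaining vectors are never performed once a proposal meets the threshold.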
When proposal components are compared with the input text approximately,
a space between two characters of different types (such as between
a dot and a letter) may be missing in the input text. Such cases
are NOT considered an error (this situation cannot trigger any
scoring case).
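
The space rule can be illustrated with a small sketch. It is not the component's matching algorithm: it performs an exact comparison rather than an approximate one, and the character types used here (letter, digit, other) as well as the names char_class and matches_with_optional_spaces are assumptions made for the example.

```python
import re


def char_class(ch: str) -> str:
    # ASSUMPTION: three character types; the real component may use others.
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def tolerant_pattern(component: str) -> "re.Pattern[str]":
    """Build a regex in which a space between characters of different
    types is optional, while every other character must match exactly."""
    parts = []
    for i, ch in enumerate(component):
        inner_space = ch == " " and 0 < i < len(component) - 1
        if inner_space and char_class(component[i - 1]) != char_class(component[i + 1]):
            parts.append(" ?")  # the space may be missing in the input
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches_with_optional_spaces(component: str, input_text: str) -> bool:
    return tolerant_pattern(component).fullmatch(input_text) is not None
```

Under these assumptions, the component "St. Peter" matches the input "St.Peter" because the missing space lies between a dot and a letter, whereas "New York" does not match "NewYork" because both neighbors of the space are letters.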
Note: each vector may map onto the input string more than
once. See the description of SupportingVectorCase.