Detailed Description of Representative Creator

This step computes a set of representative records from the source data. Records are first classified by a specified rule and then grouped by an appropriate key.

Within each group, the records are sorted and the best records are selected. The new representative record is built either from a specific data record or from values aggregated over the group of records. New values can also be stored in the original records.
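The process described above can be illustrated with a minimal Python sketch. This is not the product's implementation. The function name, the parameters, and the field names (name, salary, instance_id, best_id) are hypothetical, and the sketch assumes that the first record after sorting is treated as the best one.

```python
from collections import defaultdict
from statistics import median


def create_representatives(records, rule, group_key, order_key, descending=False):
    """Return one representative per group and annotate the source records.

    Hypothetical illustration only: `rule` is a predicate, `group_key` and
    `order_key` are field names in the record dictionaries.
    """
    # Classify by rule, then group the matching records by key.
    groups = defaultdict(list)
    for rec in records:
        if rule(rec):
            groups[rec[group_key]].append(rec)

    representatives = []
    for key, members in groups.items():
        # Sort the group; assumption: the first record after sorting is the best.
        ranked = sorted(members, key=lambda r: r[order_key], reverse=descending)
        best = ranked[0]

        representatives.append({
            group_key: key,
            "name": best["name"],                            # taken from the best record
            "salary": median(r["salary"] for r in members),  # aggregated over the group
            "group_size": len(members),                      # aggregated over the group
        })

        # Store a new value in each original record of the group.
        for rec in members:
            rec["best_id"] = best["instance_id"]

    return representatives
```

Records that do not match the rule are left untouched and produce no representative in this sketch.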


Example
<step id='alg' className='cz.adastra.cif.tasks.experimental.bob.RepresentativeCreator'>
        <properties>
                <rules>
                        <rule when="system_id=1">
                                <groupBy>
                                        <key expression="group_id" />
                                </groupBy>
                                <attributeSets>
                                        <!-- attributes holding the group and rule keys -->
                                        <attributeSet>
                                                <representativeAttributes>
                                                        <attribute name="system_id" expression="system_id" />
                                                        <attribute name="group_id" expression="group_id" />
                                                </representativeAttributes>
                                        </attributeSet>
                                        <!-- creation of the best name -->
                                        <attributeSet acceptanceCondition="f_name is not null and l_name is not null">
                                                <selectionRules>
                                                        <orderBy expression="name_priority" orderDescending="true" />
                                                </selectionRules>
                                                <representativeAttributes>
                                                        <attribute name="full_name" expression="f_name + ' ' + l_name" resultIfNotFound="'BAD_NAME'" />
                                                        <attribute name="name" expression="f_name" />
                                                        <attribute name="surname" expression="l_name" />
                                                </representativeAttributes>
                                                <instanceAttributes>
                                                        <attribute name="name_bob_id" expression="best.instance_id" />
                                                        <attribute name="best_name" expression="best.f_name + ' ' + best.l_name" resultIfNotFound="nvl(f_name, l_name, 'NA')" />
                                                </instanceAttributes>
                                        </attributeSet>
                                        <!-- creation of the best address -->
                                        <attributeSet>
                                                <selectionRules>
                                                        <orderBy expression="score_address" />
                                                </selectionRules>
                                                <representativeAttributes>
                                                        <attribute name="street" expression="street" />
                                                        <attribute name="city" expression="city" />
                                                        <attribute name="zip" expression="zip" />
                                                </representativeAttributes>
                                                <instanceAttributes>
                                                        <attribute name="addr_bob_id" expression="best.instance_id" />
                                                        <attribute name="addr_newer_than_bob" expression="this.updated &gt; best.updated" />
                                                </instanceAttributes>
                                        </attributeSet>
                                        <!-- aggregate functions used to create representative attributes -->
                                        <attributeSet>
                                                <selectionRules>
                                                        <orderBy expression="score_instance" />
                                                </selectionRules>
                                                <representativeAttributes>
                                                        <attribute name="salary" expression="median(salary)" />
                                                        <attribute name="last_update" expression="maximum(updated)" />
                                                        <attribute name="group_size" expression="count()" />
                                                </representativeAttributes>
                                        </attributeSet>
                                </attributeSets>
                        </rule>
                </rules>
        </properties>
</step>

iWay Software