Scoring Conventions

It is a best practice to score data quality errors as either small or big. All small errors are scored as 10, and big errors are scored as 1000.

How do you distinguish between a small and a big error?

A small error occurs when you transform a field (leave spaces, change the structure, or add information like the international phone prefix for a phone number). Applying a safe replacement is also scored as a small error.
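As an illustration of a transformation that is scored as a small error, the following minimal Python sketch cleans a phone number and reports a score of 10 only when the value had to be changed. The function name clean_phone, the explanation code PHONE_TRANSFORMED, and the default prefix +44 are assumptions made for this example and are not part of any product API.

```python
from typing import Tuple

SMALL_ERROR = 10  # score for a small error, per the convention above


def clean_phone(raw: str, prefix: str = "+44") -> Tuple[str, int, str]:
    """Return (cleaned value, score, explanation code) for a phone number.

    Hypothetical helper: assumes a leading 0 marks a national number
    that should receive the international prefix.
    """
    digits = "".join(ch for ch in raw if ch.isdigit())
    if raw.strip().startswith("+"):
        cleaned = "+" + digits            # already international, drop separators
    elif digits.startswith("0"):
        cleaned = prefix + digits[1:]     # add the international prefix
    else:
        cleaned = digits                  # only remove separators
    if cleaned == raw:
        return cleaned, 0, ""             # nothing was transformed, no error
    return cleaned, SMALL_ERROR, "PHONE_TRANSFORMED"
```

For example, clean_phone("020 7946 0018") returns the value +442079460018 with a score of 10, while a value that is already in the target format is returned unchanged with a score of 0.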

A big error occurs when a value is completely wrong or inconsistent. For example, an NHS number is supplied, but its structure or checksum is wrong. Other big errors include the inability to validate a United Kingdom address (its consistency) and a mandatory field that is empty.
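The NHS number case can be sketched as follows. The example checks the structure (ten digits) and the Modulus 11 check digit, and scores an empty or invalid value as a big error. The function names and the explanation codes NHS_EMPTY and NHS_INVALID are assumptions made for this example, as is the treatment of the NHS number as a mandatory field.

```python
from typing import Tuple

BIG_ERROR = 1000  # score for a big error, per the convention above


def nhs_number_is_valid(value: str) -> bool:
    """Check the structure and Modulus 11 check digit of an NHS number."""
    compact = value.replace(" ", "")
    if len(compact) != 10 or not (compact.isascii() and compact.isdigit()):
        return False                      # wrong structure
    # Weight the first nine digits by 10, 9, ..., 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(compact[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number can never be valid.
    return check != 10 and check == int(compact[9])


def score_nhs_number(value: str) -> Tuple[int, str]:
    """Return (score, explanation code); hypothetical codes for illustration."""
    if not value or not value.strip():
        return BIG_ERROR, "NHS_EMPTY"     # assumes the field is mandatory
    if not nhs_number_is_valid(value):
        return BIG_ERROR, "NHS_INVALID"
    return 0, ""
```

With this sketch, score_nhs_number("943 476 5919") returns a score of 0, while changing the last digit to 8 produces a score of 1000.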

When deciding how to score an error, determine how significant it is. You may need to consult the business users to assign the proper score, because the severity of an error may depend on the business context.

Numerous small errors can accumulate and raise the overall score at the instance or record level. As a result, the instance-level or record-level score can be classified as a big error even though each individual error is small.

For both scoring and explanation, it is a best practice to define score columns for each attribute or attribute group (for example, names, NHS, or gender), and then aggregate all the partial information into the instance (overall) score and explanation.
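The aggregation of partial scores can be sketched as follows. This minimal Python example assumes that the instance score is the sum of the per-attribute scores and that an instance counts as a big error once its total reaches 1000. Both rules, along with the function names and explanation codes, are assumptions made for illustration and should be adapted to your own conventions.

```python
from typing import Dict, Tuple

SMALL_ERROR = 10
BIG_ERROR = 1000


def aggregate(partials: Dict[str, Tuple[int, str]]) -> Tuple[int, str]:
    """Combine per-attribute (score, explanation) pairs into instance values.

    Assumption: the overall score is the plain sum of the partial scores.
    """
    total = sum(score for score, _ in partials.values())
    explanation = ", ".join(code for _, code in partials.values() if code)
    return total, explanation


def classify(total: int, big_threshold: int = BIG_ERROR) -> str:
    """Classify an instance score; the threshold is an assumed convention."""
    if total == 0:
        return "clean"
    return "big" if total >= big_threshold else "small"


# One score column per attribute or attribute group.
record_scores = {
    "names": (SMALL_ERROR, "NAME_TRIMMED"),
    "nhs": (BIG_ERROR, "NHS_INVALID"),
    "gender": (0, ""),
}
overall_score, overall_explanation = aggregate(record_scores)
```

Here overall_score is 1010 and overall_explanation is "NAME_TRIMMED, NHS_INVALID", so the record is classified as a big error. Under the same assumptions, one hundred small errors on a single record also add up to 1000 and are classified as big.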


iWay Software