mHUB - Matching Matrices

 

mHUB uses matching matrices for personal names and organization names to determine the match level that two compared records achieve.

Location: <settings><advanced><matchingRules>

<matchingRules>
<xxxLevel>
<nameMatchingMatrix>
<matching-matrix type="name">...</matching-matrix>
</nameMatchingMatrix>
<organizationMatchingMatrix>
<matching-matrix type="organization">...</matching-matrix>
</organizationMatchingMatrix>
</xxxLevel>
</matchingRules>

OR

<matchingRules>
<xxxLevel>
<nameMatchingMatrix>filename</nameMatchingMatrix>
<organizationMatchingMatrix>filename</organizationMatchingMatrix>
</xxxLevel>
</matchingRules>

where xxxLevel indicates the matching level (individualLevel, businessLevel, etc.) and filename is the pathname of an XML file containing the matching matrix in the following form (described in more detail below):

<matching-matrix type="name|organization">...</matching-matrix>
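The two configuration forms above (inline matrix or filename) could be handled by a loader along the following lines. This is only a sketch of the idea, not mHUB's actual loading code: the function name resolve_matrix is hypothetical, and it assumes the container element holds either a matching-matrix child or a pathname as its text.

```python
import xml.etree.ElementTree as ET

def resolve_matrix(container):
    """Return the <matching-matrix> element for a container such as
    <nameMatchingMatrix>. Hypothetical helper, not part of mHUB."""
    # Form 1: the matrix is embedded directly in the container.
    inline = container.find("matching-matrix")
    if inline is not None:
        return inline
    # Form 2: the container's text is the pathname of a separate XML file
    # whose root element is <matching-matrix>.
    filename = (container.text or "").strip()
    return ET.parse(filename).getroot()

# Inline form, using an empty matrix purely for illustration.
container = ET.fromstring(
    '<nameMatchingMatrix><matching-matrix type="name"/></nameMatchingMatrix>'
)
print(resolve_matrix(container).get("type"))
```

Keeping the matrix in a separate file is convenient when the same matrix is shared by several matching levels.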

Name Matching Matrix

There is one name matching matrix for each of the five matching levels (individual, family, address, business, custom). Each is a three-dimensional matrix whose dimensions represent the individual name fields: last name, first name, and middle name. The matrix maps the combination of match types for these fields (equal, both_empty, one_empty, sounds_equal, etc.) to an overall match level (sure, likely, possible, etc.). Each match level has an associated score value, which is multiplied by the relevant level's component weight to produce an overall score.

<matching-matrix type="name">
<lastnames match="equal">
<firstnames match="equal">
<middlenames match="equal">sure</middlenames>
<middlenames match="both_empty">sure</middlenames>
<middlenames match="one_empty">sure</middlenames>
<middlenames match="approx">likely</middlenames>
<middlenames match="contains">likely</middlenames>
<middlenames match="unequal">possible</middlenames>
</firstnames>
<firstnames match="sounds_equal">
...
</firstnames>
...
</lastnames>
<lastnames match="sounds_equal">
...
</lastnames>
...
</matching-matrix>

match: The type of match between the two values being compared:

  • equal
  • sounds_equal (not middlenames)
  • both_empty (not lastnames)
  • one_empty (not lastnames)
  • approx
  • sounds_approx (not middlenames)
  • contains
  • unequal
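The restrictions in the list above can be expressed as a small check, which may be useful when reviewing a hand-edited name matrix. This is an illustrative sketch only: the function and constant names are hypothetical, and it assumes that "(not middlenames)" and "(not lastnames)" mean the match type is not permitted on that element.

```python
# All match types listed for the name matching matrix.
ALL_MATCH_TYPES = {
    "equal", "sounds_equal", "both_empty", "one_empty",
    "approx", "sounds_approx", "contains", "unequal",
}

# Match types assumed NOT to be permitted on each element, per the list above.
DISALLOWED = {
    "lastnames": {"both_empty", "one_empty"},
    "firstnames": set(),
    "middlenames": {"sounds_equal", "sounds_approx"},
}

def is_valid_match(element, match):
    """Return True if `match` is a known type that is allowed on `element`."""
    return match in ALL_MATCH_TYPES and match not in DISALLOWED[element]

print(is_valid_match("middlenames", "sounds_equal"))
print(is_valid_match("firstnames", "sounds_equal"))
```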

Valid values for the matrix entries are:

  • sure
  • likely
  • possible
  • zero

Additionally, a fractional value between 0 and 1 can be used in place of these identifiers to indicate a proportion of the Sure weight (for example, a value of 0.8 with a Sure weight of 60 results in a name score of 48).
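The arithmetic for a fractional entry can be sketched as follows. The function name name_score and the weights dictionary are hypothetical. Only the Sure weight of 60 comes from the example above; the likely and possible weights are placeholder values, and treating zero as a score of 0 is an assumption.

```python
def name_score(entry, weights):
    """Convert a matrix entry into a score (illustrative sketch only).

    `weights` maps "sure", "likely" and "possible" to the weights configured
    for the matching level; this structure is assumed for illustration.
    """
    entry = entry.strip()
    if entry == "zero":
        return 0.0  # assumption: a zero match contributes nothing
    if entry in weights:
        return float(weights[entry])
    # Otherwise the entry is a fraction of the Sure weight.
    fraction = float(entry)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError(f"fraction out of range: {entry}")
    return fraction * weights["sure"]

# Sure weight from the example above; the other two are placeholders.
weights = {"sure": 60, "likely": 40, "possible": 20}
print(name_score("0.8", weights))
```

With these weights, an entry of 0.8 gives 0.8 x 60, matching the worked example of 48.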

The example above shows that when the last names of the two records being compared are equal and the first names are also equal, the result of the comparison depends on the data in the middle name fields. For example, if the middle names are also equal, the result is sure. The actual score that this sure match is worth depends on the matching weights defined in your configuration for a sure match on name at the corresponding matching level.
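To make the lookup concrete, the sketch below parses the part of the example matrix shown above and reads the entry for a given combination of match types. It uses Python's standard xml.etree.ElementTree module and is not mHUB's own code: the lookup function is hypothetical, and returning None for a missing combination is an assumption, since the behavior for missing entries is not described here.

```python
import xml.etree.ElementTree as ET

# Only the entries spelled out in the example above; elided parts are omitted.
MATRIX_XML = """
<matching-matrix type="name">
  <lastnames match="equal">
    <firstnames match="equal">
      <middlenames match="equal">sure</middlenames>
      <middlenames match="both_empty">sure</middlenames>
      <middlenames match="one_empty">sure</middlenames>
      <middlenames match="approx">likely</middlenames>
      <middlenames match="contains">likely</middlenames>
      <middlenames match="unequal">possible</middlenames>
    </firstnames>
  </lastnames>
</matching-matrix>
"""

def lookup(root, last, first, middle):
    """Return the matrix entry for the three match types, or None if absent."""
    path = (
        f"lastnames[@match='{last}']"
        f"/firstnames[@match='{first}']"
        f"/middlenames[@match='{middle}']"
    )
    node = root.find(path)
    return node.text.strip() if node is not None and node.text else None

root = ET.fromstring(MATRIX_XML)
print(lookup(root, "equal", "equal", "approx"))   # likely
print(lookup(root, "equal", "equal", "unequal"))  # possible
```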

Organization Matching Matrix

The organization matching matrices are the same as the name matching matrices except that the three dimensions are name1, name2, and name3, representing the first three words of the organization's name, instead of lastnames, firstnames, and middlenames.

<matching-matrix type="organization">
<name1 match="equal">
<name2 match="equal">
<name3 match="equal">sure</name3>
<name3 match="both_empty">sure</name3>
<name3 match="one_empty">sure</name3>
<name3 match="approx">sure</name3>
<name3 match="contains">likely</name3>
<name3 match="unequal">possible</name3>
</name2>
<name2 match="sounds_equal">
...
</name2>
...
</name1>
<name1 match="sounds_equal">
...
</name1>
...
</matching-matrix>
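As a rough illustration of how name1, name2, and name3 relate to an organization's name, the sketch below takes the first three whitespace-separated words and pads short names with empty strings. The function name is hypothetical, and mHUB's actual tokenization (handling of punctuation, noise words, and so on) is not described here and may differ.

```python
def first_three_words(org_name):
    """Split an organization name into (name1, name2, name3).

    Assumes simple whitespace splitting; missing words become empty strings,
    which is where match types such as one_empty and both_empty would apply.
    """
    words = org_name.split()[:3]
    return tuple(words + [""] * (3 - len(words)))

print(first_three_words("Acme Widget Company Ltd"))
print(first_three_words("Acme"))
```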