mSQL - SSIS Tutorial - Tight Business Matching

Previously we suggested you could adjust the minimum score to report, the simple approach is to simply raise or lower it

 

By default, without running address validation beforehand – neither of these business matches may show up,  they’re just below the minimum threshold.

 

When we match on a company, we generally look left to right on the company and score it, score the address, and score the zip, then add it all up.

 

Deming solar panels        

201 W SPRUCE Ave

 

DEMING

NM 

88030

deming center                   

200 E SPRUCE ST 

 

demin

NM 

88031

 

Walmart

100 CHIPPEWA TOWN CTR 

BEAVER FALLS, 

PA 

15010

15010

The Wal-Mart Supercenter of Beaver Falls

100 CHIPPAWA TOWN CENTER 

BEAVERFALLS, 

PA 

15001

15001

 

By default we’ll strip out all business and noise words, so we’re looking at these as Deming and Deming, walmart and wal mart - We call this ‘loose business matching’

 

(if you’re a more advanced user, you can modify our names & words to control what we ignore and what we won’t ignore, but that’s something you’re better talking to a support rep about before making changes).

 

For most clients, if you wanted to just lower the minimum to 71 to let in that wal mart match, without letting in the Deming example which is a false match, you could just enable ‘must match premise’.

 

If you’re not like most clients, then you want to switch to what we call ‘tight business matching’

 

There’s 4 changes we suggest – we’ll go through all 4

1) lets start out by making a copy of our template and setting up a new template just for tight business matching – you want a separate template for tight business matching as one of the settings changes can only be controlled through the template

2) open up generate keys and go into your template manager, make a copy of your template and rename it appropriately

 

click edit – this will launch notepad

  • first lets find the datpath, change it to end with \tight



This directory contains data files that have all of our matching equivalents, so we know business words like autos and motors are interchangeable for example, while words like The,Of,incorporated, etc are noise words that we ignore, we want to use the tight names and words to ensure we’re ignoring those noise words like Inc and Corp still.

Next lets find the the setting <useequivalentname> - this in conjunction with the names and words controls what words we strip out – there’s two of these in the template file, only change the one near <organization>

 

  • Right below – there’s a line called <normalizationtruncation> - change the 0 to a  9

    <normalizationTruncation>0</normalizationTruncation> <!--integer-->
    This tells us to look at more of the company

    so say we had wal-mart supercenter of beaver falls – normally we’d look at the first 3 significant parts  Wal, Mart, and Supercenter - and ignore the beaver falls part, as generally the more words in a company name, the less significant they are.

    but what if we also had a Wal Mart Supercenter of New Castle.  We don’t want that matching to the Beaver falls wal mart, so normalization truncation will take the first x letters of those extra words, so we put emphasis on the whole company name, instead of the first 3 significant words.  This makes for tighter matches.
  • Finally, there’s a change to the matching matrix – the matrix’s control what a match scores

  • So if we match wal mart to wal mart , that = sure match

Wal mart supercenter to wal mart = likely
supercenter at wal mart to  wal mart  = possible (transposed company words)
target supercenter to wal mart supercenter = no match

locate this node with a search  </businessLevel>

right above it, there’s this node

<organizationMatchingMatrix>C:\mSQL\config\matchingMatrices\businessLevel\organizationMatchingMatrix.xml</organizationMatchingMatrix> <!--can specify an absolute or a relative (to config) pathname-->

replace organizationMatchingMatrix.xml with  organizationMatchingMatrixTight.xml

 

  • Save your changes, then choose the new template, you’ll get prompted with this (only click yes if you’re in a generate keys task and purposely want to change it to tight matching)

 

 

If you’re not comfortable editing the XML, or don’t want to lose other changes you’ve done, you can make 3 of the 4 changes right in the generatekeys task

Click advanced if it's not showing already, here you can set use equivalent name to yes, and change the dat directory

 

 


under further settings you’ll see normalization truncation

 

 

The matrix path can only be controlled via the template

 

Ideally with this 4 changes, your business matches will have more good matches, less false matches, and be much closer to something you can automate on a regular basis.

 

Most clients will also enable the must match premise or other subtle changes to the low volume keys or use the ‘high volume’ keys in conjunction with these changes.

 

Because you tightened the matches, you may be missing a couple, but you can always run a second pass on loose business matching to see what you may be missing.