Match Logic for Vendor
Currently, on-prem vendor matching uses the SSA Name3 algorithm. Internally, SSA Name3 combines several techniques.
Overview
- Phonetic matching: SSA Name3 can match records based on the phonetic similarity of the data. This is useful for matching records that contain different spellings of the same name or that are in different languages. Some of the phonetic matching algorithms used in SSA Name3 include:
  - Soundex: a phonetic algorithm that converts a word into a four-character code (the first letter followed by three digits) based on its pronunciation. Codes are computed per word. For example, "John" and "Jon" both convert to "J500", and "Smith" converts to "S530".
  - Double Metaphone: a phonetic algorithm that produces a primary and a secondary code for each word, so that alternative pronunciations can be matched. For example, "John" and "Jon" both produce the primary code "JN".
  - Cologne Phonetic (Kölner Phonetik): a phonetic algorithm tuned to German pronunciation that converts a word into a variable-length string of digits. For example, "John" and "Jon" both convert to "06".
- Exact matching: SSA Name3 can also match records whose values are identical. This is useful for matching records that contain the same data, such as the same name and address.
- Fuzzy matching: SSA Name3 can also match records whose values are similar but not identical. For example, it can match records that contain the names "John Smith" and "Jon Smith". The fuzzy matching algorithms include:
  - Jaro-Winkler
  - Levenshtein distance
  - Dice coefficient
  - Needleman-Wunsch algorithm
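To make the fuzzy matching idea concrete, here is a minimal Python sketch of Levenshtein distance, one of the measures listed above. It is only an illustration: the `similarity` helper and its 0..1 scaling are assumptions made for this example, and nothing here reflects how SSA Name3 computes or weights its scores internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to 0..1 (1.0 = identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest


if __name__ == "__main__":
    print(levenshtein("John Smith", "Jon Smith"))  # 1 (one deleted "h")
    print(similarity("John Smith", "Jon Smith"))
```

A matcher would typically accept a pair when the similarity exceeds a chosen threshold. The threshold is a tuning decision and is not specified on this page.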
On-prem Approach:
- When a vendor is created, the vendor name is used to generate search tokens with a phonetic technique, and the tokens are stored in the <entity_name>_STRP table.
- When a match is requested for a vendor name, the lookup is run against the STRP table.
As a POC, we need to implement the Soundex technique with PostgreSQL or Elasticsearch.
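Before wiring this into a database, the flow can be prototyped in plain Python. The sketch below implements standard American Soundex and uses it as the phonetic key for an in-memory index. The `PhoneticIndex` class, the `tokens` helper, and the choice of one Soundex key per word are all assumptions made for illustration. The dictionary only stands in for the STRP table and does not reproduce the keys SSA Name3 actually generates.

```python
from collections import defaultdict

# Soundex digit groups: letters in the same group sound alike.
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex(word: str) -> str:
    """Return the four-character American Soundex code, or "" if no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    last = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        digit = _CODES.get(c, "")  # vowels map to "" and reset `last`
        if digit and digit != last:
            code += digit
        last = digit
    return (code + "000")[:4]


def tokens(name: str) -> set:
    """Hypothetical tokenizer: one Soundex key per whitespace-separated word."""
    return {soundex(w) for w in name.split()} - {""}


class PhoneticIndex:
    """Hypothetical in-memory stand-in for the <entity_name>_STRP table."""

    def __init__(self):
        self._rows = defaultdict(set)  # phonetic key -> vendor ids

    def add(self, vendor_id, name):
        """On vendor creation: store the vendor under each of its keys."""
        for key in tokens(name):
            self._rows[key].add(vendor_id)

    def candidates(self, name):
        """On match: return every vendor sharing at least one key."""
        found = set()
        for key in tokens(name):
            found |= self._rows.get(key, set())
        return found


if __name__ == "__main__":
    index = PhoneticIndex()
    index.add(1, "John Smith")
    index.add(2, "Acme Supplies")
    print(soundex("John"), soundex("Jon"))  # J500 J500
    print(index.candidates("Jon Smyth"))    # {1}
```

The lookup returns candidates only. A real matcher would still score each candidate (for example with one of the fuzzy measures) before declaring a match. For the POC itself, PostgreSQL provides a `soundex()` function in its `fuzzystrmatch` extension, and Elasticsearch offers a Soundex encoder through its phonetic analysis plugin, so a hand-written implementation may not be needed there.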