System and method for contextual and free format matching of addresses

US 8,595,219 B1
Filed: 06/29/2012
Issued: 11/26/2013
Est. Priority Date: 05/16/2012
Status: Active Grant

Abstract

A system and method for matching addresses is provided. Addresses may be received from a search engine or other source for purposes of matching. Address parts in the addresses may be contextually identified. Identified address parts that have like address part types, together with their associated data, may be compared to one another, and a contextual matching score may be calculated and assigned. A free format token analysis of the addresses may also be performed in parallel with, before, or after the contextual identification, and a free format matching score may be calculated. An address likeness score may be calculated and assigned based on the contextual matching score and the free format matching score.
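
As an illustration of how the two scores described in the abstract might be combined, the following is a minimal sketch and not the patented implementation. It assumes a Jaccard token overlap as a stand-in for the free format token analysis and a weighted average for the address likeness score. The function names and the 0.6 weight are hypothetical; the source does not specify them.

```python
import re

def tokens(address: str) -> set:
    """Lowercase alphanumeric tokens, ignoring punctuation and word order."""
    return set(re.findall(r"[a-z0-9]+", address.lower()))

def free_format_score(a: str, b: str) -> float:
    """Jaccard overlap of the two token sets, in [0, 1] (assumed metric)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def address_likeness(contextual: float, free_format: float,
                     contextual_weight: float = 0.6) -> float:
    """Weighted average of the two scores; the 0.6 weight is hypothetical."""
    return contextual_weight * contextual + (1.0 - contextual_weight) * free_format

# Same tokens in a different order give full free format overlap.
ff = free_format_score("12 Main St, Apt 4", "apt 4 12 main st")
print(address_likeness(contextual=1.0, free_format=ff))
```

Because the token comparison ignores order and punctuation, it can still score two addresses as alike when no address part can be identified from context.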

18 Claims

1. A method for matching a first address and a second address using a processor, the first address and the second address each associated with one or more consumers, the method comprising:
- receiving the first address and the second address at the processor;
  
  contextually identifying a first address part of the first address and a second address part of the second address, using the processor, wherein the first address part and the second address part each have an address part type that is alike, and wherein contextually identifying comprises:
  
  deterministically evaluating a first string in the first address to identify the first address part and a second string in the second address to identify the second address part, using the processor; and
  
  extracting first data from the first address and second data from the second address using the processor, based on the address part type of the first address part and the second address part;
  
  normalizing, using the processor, the first address part to produce a first normalized address part and the second address part to produce a second normalized address part, based on a normalization rule;
  
  comparing the first normalized address part and the second normalized address part, using the processor;
  
  calculating a contextual matching score, based on comparing the first normalized address part and the second normalized address part, using the processor;
  
  performing a free format token analysis of the first address and the second address, using the processor;
  
  calculating a free format matching score, based on performing the free format token analysis of the first address and the second address, using the processor;
  
  calculating an address likeness score, based on the contextual matching score and the free format matching score, using the processor; and
  
  transmitting the address likeness score from the processor.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
- - 2. The method of claim 1, wherein contextually identifying comprises:
    - matching a first key word in the first address using the processor, the first key word for identifying the address part type of the first address part; and
      
      matching a second key word in the second address using the processor, the second key word for identifying the address part type of the second address part;
      
      wherein the first data and the second data are associated with the address part type.
  - 3. The method of claim 2, wherein matching the first key word comprises matching an acronym in the first address, using the processor.
  - 4. The method of claim 2, wherein extracting comprises:
    - extracting the first data following or before the first key word of the first address, using the processor; and
      
      extracting the second data following or before the second key word of the second address, using the processor.
  - 5. The method of claim 1, wherein the address part type of the first address part and the second address part comprises one or more of an apartment number, a house number, a post office box, a floor, a building, a complex, a street, a geographical direction, a district, a tehsil, a stand number, a barrio, a village, a suburb, a town, a city, or a state.
  - 6. The method of claim 1, wherein comparing comprises comparing first data from the first address and second data from the second address, using the processor, wherein the first data and the second data are associated with the address part type of the first address part and the second address part.
  - 7. The method of claim 1, wherein calculating the contextual matching score comprises:
    - calculating a subscore for the address part type of the first address part and the second address part, using the processor;
      
      weighting the subscore based on a specificity of the address part type, using the processor; and
      
      calculating the contextual matching score based on the weighted subscore, using the processor.
  - 8. The method of claim 1, wherein performing the free format token analysis comprises:
    - comparing variations of a string in the first address and the second address, using the processor; and
      
      performing a phonetic analysis on the first address and the second address, using the processor.
  - 9. The method of claim 1, wherein calculating the address likeness score comprises:
    - weighting one or more of the contextual matching score or the free format matching score, using the processor; and
      
      calculating the address likeness score based on one or more of the weighted contextual matching score or the weighted free format matching score, using the processor.
  - 10. The method of claim 1, wherein transmitting the address likeness score comprises:
    - determining an address matching strength, based on the address likeness score, using the processor; and
      
      transmitting the address matching strength from the processor.
  - 11. The method of claim 1, further comprising:
    - determining whether to merge a first database record and a second database record, based on the address likeness score, using the processor, wherein the first database record is associated with the first address and the second database record is associated with the second address; and
      
      transmitting a merge flag from the processor, the merge flag indicating that the first database record and the second database record are matches.
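
The key word matching, extraction, normalization, weighted subscore, and matching strength steps recited in claims 1, 2, 4, 7, and 10 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the key word table, the specificity weights, the leading-zero normalization rule, and the strength thresholds are all hypothetical values chosen for the example.

```python
import re

# Hypothetical key words and acronyms mapped to an address part type.
KEY_WORDS = {
    "apartment": "apartment", "apt": "apartment", "flat": "apartment",
    "p o box": "po_box", "po box": "po_box",
    "floor": "floor", "fl": "floor",
}
# Hypothetical specificity weights: more specific part types count for more.
SPECIFICITY = {"apartment": 3.0, "po_box": 3.0, "floor": 2.0}

def identify_parts(address):
    """Return {address part type: normalized data} found by key word lookup."""
    # Lowercase and turn every run of punctuation or spaces into one space.
    text = re.sub(r"[^a-z0-9]+", " ", address.lower()).strip()
    parts = {}
    for key_word, part_type in KEY_WORDS.items():
        # Extract the token following the key word (claim 4 also allows "before").
        found = re.search(rf"\b{re.escape(key_word)} (\w+)", text)
        if found and part_type not in parts:
            # Assumed normalization rule: drop leading zeros ("04" -> "4").
            parts[part_type] = found.group(1).lstrip("0") or "0"
    return parts

def contextual_score(first, second):
    """Weighted share of like address part types whose data agree."""
    a, b = identify_parts(first), identify_parts(second)
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    total = sum(SPECIFICITY[t] for t in shared)
    matched = sum(SPECIFICITY[t] for t in shared if a[t] == b[t])
    return matched / total

def matching_strength(likeness, strong=0.85, possible=0.5):
    """Map a score in [0, 1] to a label; both thresholds are hypothetical."""
    if likeness >= strong:
        return "strong"
    return "possible" if likeness >= possible else "weak"

score = contextual_score("Apt 04, Floor 2", "Flat 4, FL 3")
print(score, matching_strength(score))
```

In this example the apartment numbers agree after normalization and the floors differ, so only the apartment weight contributes to the score. A merge decision such as the one in claim 11 could be a further threshold on the resulting address likeness score.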

12. A method for matching an address with a plurality of candidate addresses, using a processor, the address and the plurality of candidate addresses associated with one or more consumers, the method comprising:
- receiving the address and the plurality of candidate addresses at the processor;
  
  identifying an address part of the address and a plurality of candidate address parts of each of the plurality of candidate addresses, based on a contextual identification analysis of the address and the plurality of candidate addresses, using the processor, wherein the address part and the plurality of candidate address parts each have an address part type that is alike, and wherein identifying comprises:
  
  deterministically evaluating a first string in the address to identify the address part and a plurality of strings in each of the plurality of candidate addresses to identify the plurality of candidate address parts, using the processor; and
  
  extracting the address data from the address and the plurality of candidate address data from the plurality of candidate addresses using the processor, based on the address part type;
  
  comparing address data with a plurality of candidate address data, using the processor, wherein the address data and the plurality of candidate address data is associated with the address part type;
  
  calculating a contextual matching score, based on comparing the address data with the plurality of candidate address data, using the processor;
  
  performing a free format token analysis of the address and the plurality of candidate addresses, using the processor;
  
  calculating a free format matching score, based on performing the free format token analysis, using the processor;
  
  calculating an address likeness score, based on the contextual matching score and the free format matching score, using the processor; and
  
  transmitting one or more matching addresses from the plurality of candidate addresses from the processor, based on the address likeness score.
- Dependent Claims (13, 14, 15, 16, 17, 18)
- - 13. The method of claim 12, wherein the address data and the plurality of candidate address data are associated with the address part type.
  - 14. The method of claim 12, wherein the address part type comprises one or more of an apartment number, a house number, a post office box, a floor, a building, a complex, a street, a geographical direction, a district, a tehsil, a stand number, a barrio, a village, a suburb, a town, a city, or a state.
  - 15. The method of claim 12:
    - further comprising normalizing, using the processor, the address part to produce a normalized address part and the plurality of candidate address parts to produce a plurality of normalized candidate address parts, based on a normalization rule;
      
      wherein comparing comprises comparing the normalized address part with the plurality of normalized candidate address parts, using the processor.
  - 16. The method of claim 12, wherein calculating the contextual matching score comprises:
    - calculating a subscore for the address part type of the address part and the plurality of candidate address parts, using the processor;
      
      weighting the subscore based on a specificity of the address part type, using the processor; and
      
      calculating the contextual matching score based on the weighted subscore, using the processor.
  - 17. The method of claim 12, wherein calculating the address likeness score comprises:
    - weighting one or more of the contextual matching score or the free format matching score, using the processor; and
      
      calculating the address likeness score based on one or more of the weighted contextual matching score or the weighted free format matching score, using the processor.
  - 18. The method of claim 12, further comprising performing the contextual identification analysis of the address and the plurality of candidate addresses, using the processor, wherein performing the contextual identification analysis comprises:
    - matching a key word in the address using the processor, the key word for identifying the address part type of the address part;
      
      matching a plurality of key words in the plurality of candidate addresses using the processor, the plurality of key words for identifying the address part type of the plurality of candidate address parts; and
      
      extracting the address data from the address and the plurality of candidate address data from the plurality of candidate addresses using the processor, based on the address part type of the address part and the plurality of candidate address parts, wherein the address data and the plurality of candidate address data are associated with the address part type.
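
The final step of claim 12, transmitting one or more matching addresses from a plurality of candidates based on the address likeness score, might look like the sketch below. It is a minimal illustration, not the claimed method: `match_candidates`, the 0.8 threshold, and the pluggable `likeness` parameter are hypothetical, and the default scorer uses Python's `difflib.SequenceMatcher` purely as a placeholder for the combined contextual and free format score.

```python
from difflib import SequenceMatcher

def default_likeness(a, b):
    """Placeholder likeness score in [0, 1]; not the scoring the claims describe."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_candidates(address, candidates, likeness=default_likeness, threshold=0.8):
    """Return (candidate, score) pairs at or above the threshold, best first."""
    scored = [(candidate, likeness(address, candidate)) for candidate in candidates]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

for candidate, score in match_candidates(
    "12 Main St", ["12 Main Street", "99 Elm Rd", "12 main st"]
):
    print(candidate, round(score, 2))
```

Passing the scorer as a parameter keeps the candidate selection separate from how the address likeness score is calculated, so a scorer that combines contextual and free format scores can be substituted without changing the ranking step.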

Current Assignee: TransUnion LLC (TransUnion Corporation)
Original Assignee: TransUnion LLC (TransUnion Corporation)
Inventors: Thompson, Douglas
Primary Examiner(s): SYED, FARHAN M
Application Number: US13/539,009
Publication Number: US 20130311448A1
Time in Patent Office: 515 Days

Field of Search
US Class Current: 707/722
CPC Class Codes:
G06F 16/24578   using ranking
G06F 16/3334   Selection or weighting of t...
G06F 16/9017   using directory or table lo...
G06F 16/902   using more than one table i...
G06F 16/90344   by using string matching te...
G06F 16/90348   by searching ordered data, ...
