Natural language parsers to normalize addresses for geocoding
Abstract
The present invention provides a technique for building natural language parsers by implementing a country- and/or jurisdiction-specific set of training data that is automatically converted during a build phase to a respective predictive model, i.e., an automated country-specific natural language parser. The predictive model can be used without the training data to quantify any input address. This model may be included as part of a larger Geographic Information System (GIS) dataset or as a standalone quantifier. The build phase may also be run on demand and the resultant predictive model kept in temporary storage for immediate use.
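As a rough illustration of the build phase described in the abstract, the sketch below converts a small labelled training set into probability tables that can then be used without the training data. The training examples, the classification names, and the `build_model` function are all hypothetical; the abstract does not specify how the conversion is performed, so simple relative-frequency counting is assumed here.

```python
from collections import Counter, defaultdict

# Hypothetical labelled training data for one country: each example is a
# list of (address field, classification) pairs. Not taken from the patent.
TRAINING_US = [
    [("12", "HOUSE_NUMBER"), ("main", "STREET_NAME"), ("st", "STREET_TYPE")],
    [("7", "HOUSE_NUMBER"), ("st", "STREET_NAME"), ("ave", "STREET_TYPE")],
    [("40", "HOUSE_NUMBER"), ("oak", "STREET_NAME"), ("st", "STREET_TYPE")],
]


def build_model(examples):
    """Assumed build phase: turn labelled examples into probability tables."""
    field_counts = defaultdict(Counter)  # field -> classification counts
    edge_counts = defaultdict(Counter)   # classification -> next classification counts
    for example in examples:
        previous = "START"
        for field, label in example:
            field_counts[field][label] += 1
            edge_counts[previous][label] += 1
            previous = label

    def normalise(table):
        # Convert raw counts to relative frequencies that sum to 1 per key.
        return {
            key: {k: n / sum(counter.values()) for k, n in counter.items()}
            for key, counter in table.items()
        }

    return {"fields": normalise(field_counts), "edges": normalise(edge_counts)}


# Only the returned tables are needed at parse time, not the training set.
model_us = build_model(TRAINING_US)
print(model_us["fields"]["st"])
```

In this toy data the field "st" is labelled as a street type twice and as a street name once, so the resulting model keeps both classifications with different probabilities.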
13 Claims
1. A method for normalizing an input address, comprising:
under control of a computer system comprising computer hardware:
identifying an input address indicative of a physical address;
selecting a predictive model corresponding to the input address from a plurality of predictive models, each of the plurality of predictive models being an automated country-specific natural language parser uniquely defined for a corresponding one of a plurality of countries and jurisdictions, the selected predictive model comprising a graph having address field nodes and edges connecting the address field nodes, each of at least some of the address field nodes comprising an address field and a corresponding set of one or more address field classifications each assigned a first probability value, each of at least some of the edges assigned a second probability value; and
executing the selected predictive model to determine an address field classification for the input address from one of the address fields in the graph based at least on the first and second probability values of the address field nodes and the edges that correspond to the address field classification.
Dependent claims: 2, 3, 4, 5, 6
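One way to picture the graph recited in claim 1 is a best-path search in which each node contributes a first probability value and each edge a second. The sketch below is an assumption-laden illustration, not the patented implementation: the `MODELS` table, the classification names, the probabilities, and the multiply-and-take-the-maximum scoring are all hypothetical.

```python
# Hypothetical per-country models; all names and numbers are illustrative.
MODELS = {
    "US": {
        # Address field nodes: field text -> {classification: first probability}
        "nodes": {
            "12": {"HOUSE_NUMBER": 1.0},
            "main": {"STREET_NAME": 1.0},
            "st": {"STREET_TYPE": 0.7, "STREET_NAME": 0.3},
        },
        # Edges between classifications: (from, to) -> second probability
        "edges": {
            ("START", "HOUSE_NUMBER"): 1.0,
            ("HOUSE_NUMBER", "STREET_NAME"): 0.9,
            ("HOUSE_NUMBER", "STREET_TYPE"): 0.1,
            ("STREET_NAME", "STREET_TYPE"): 1.0,
        },
    },
}


def classify(address, country):
    """Return the most probable (field, classification) sequence."""
    model = MODELS[country]  # select the country-specific model
    # best[label] = (probability of the best path ending in label, that path)
    best = {"START": (1.0, [])}
    for token in address.lower().split():
        step = {}
        for label, node_p in model["nodes"].get(token, {}).items():
            for prev, (path_p, path) in best.items():
                # Combine path so far, edge probability, and node probability.
                p = path_p * model["edges"].get((prev, label), 0.0) * node_p
                if p > step.get(label, (0.0, None))[0]:
                    step[label] = (p, path + [(token, label)])
        if not step:
            raise ValueError(f"no classification path for field {token!r}")
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


print(classify("12 St St", "US"))
```

With these made-up numbers, the first "st" in "12 St St" is classified as a street name even though its node probability favours street type, because the edge probabilities make a house number followed directly by a street type unlikely.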
7. A system for normalizing address input, the system comprising:
computer hardware programmed to:
identify an input address indicative of a physical address;
select a predictive model corresponding to the input address from a plurality of predictive models, each of the plurality of predictive models representing an automated country-specific natural language parser uniquely defined for a corresponding one of a plurality of jurisdictions, the selected predictive model comprising (1) address field data including address fields and address field classifications corresponding to the address fields and (2) probability values assigned to the address field data; and
execute the selected predictive model to match a selected one of the address field classifications with the input address based at least on the probability values.
Dependent claims: 8, 9, 10, 11, 12, 13
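Claim 7 does not require a graph, only address field data with assigned probability values. A minimal sketch of that looser arrangement follows, assuming a flat list of entries per jurisdiction; the `FieldEntry` type, the `MODELS` registry, the country codes used as keys, and the highest-probability matching rule are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldEntry:
    field: str           # address field text, e.g. "rue"
    classification: str  # address field classification
    probability: float   # probability value assigned to this pairing


# Hypothetical models keyed by country code; values are illustrative only.
MODELS = {
    "FR": [
        FieldEntry("rue", "STREET_TYPE", 0.95),
        FieldEntry("rue", "STREET_NAME", 0.05),
    ],
    "US": [
        FieldEntry("st", "STREET_TYPE", 0.7),
        FieldEntry("st", "STREET_NAME", 0.3),
    ],
}


def select_model(country_code):
    """Pick the jurisdiction-specific model, failing loudly if none exists."""
    try:
        return MODELS[country_code.upper()]
    except KeyError:
        raise LookupError(f"no parser defined for {country_code!r}") from None


def match_classification(field, country_code):
    """Return the highest-probability classification for a field, or None."""
    entries = [e for e in select_model(country_code) if e.field == field.lower()]
    if not entries:
        return None
    return max(entries, key=lambda e: e.probability).classification


print(match_classification("Rue", "fr"))
```

Returning `None` for an unknown field is a design choice of this sketch; a real system might instead fall back to a default classification.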
Specification