Training a probabilistic spelling checker from structured data

  • US 8,626,681 B1
  • Filed: 01/04/2011
  • Issued: 01/07/2014
  • Est. Priority Date: 01/04/2011
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for generating a geographic language model for computing probabilities of occurrence of geographic queries, comprising:

  • accessing a geographic database comprising:

    a plurality of geographic entities, each geographic entity corresponding to a geographic region and having one or more names and an entity type, and

    a plurality of links between pairs of the geographic entities;

    accessing a query log comprising geographic queries previously entered by users, a plurality of the geographic queries including names of ones of the geographic entities in the geographic database;

    generating, from the query log, a template distribution quantifying probabilities that entity types of the geographic entities named in the geographic queries correspond to ones of a plurality of query templates, each query template comprising an ordered set of the entity types appearing in the geographic database;

    generating a geographic distribution from the query log quantifying probabilities of queries in the query log referencing ones of the geographic entities in the geographic database;

    generating the geographic language model from the template distribution and the geographic distribution, the geographic language model comprising a set of combinations of names of the geographic entities and associated scores, the scores based on probabilities of occurrence of the combinations in a geographic query; and

    storing the geographic language model on a computer readable storage device.
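
The steps recited in the claim can be illustrated with a minimal sketch. Nothing below is taken from the patent's specification: the toy database, the query log (assumed to be already resolved to entity ids), the function names, and in particular the scoring rule (template probability multiplied by the probability of each entity in the combination, keeping only combinations whose consecutive entities are linked) are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy database: entity id -> (name, entity type).
ENTITIES = {
    "e1": ("Main Street", "street"),
    "e2": ("Springfield", "city"),
    "e3": ("Shelbyville", "city"),
}
# Hypothetical links between pairs of entities (a street lying in a city).
LINKS = {("e1", "e2")}

# Hypothetical query log, each query already resolved to ordered entity ids.
QUERY_LOG = [["e1", "e2"], ["e2"], ["e3"], ["e1", "e2"]]


def normalize(counts):
    """Turn raw counts into probabilities that sum to 1."""
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def template_distribution(log):
    """P(template); a template is the ordered tuple of entity types in a query."""
    return normalize(Counter(tuple(ENTITIES[e][1] for e in q) for q in log))


def geographic_distribution(log):
    """P(entity): each entity's share of all entity references in the log."""
    return normalize(Counter(e for q in log for e in q))


def linked(a, b):
    return (a, b) in LINKS or (b, a) in LINKS


def build_language_model(log):
    """Map each name combination to a score (assumed scoring rule, see lead-in)."""
    templates = template_distribution(log)
    geo = geographic_distribution(log)
    by_type = {}
    for eid, (_, etype) in ENTITIES.items():
        by_type.setdefault(etype, []).append(eid)

    model = {}
    for template, p_template in templates.items():
        for combo in product(*(by_type[t] for t in template)):
            # Keep only combinations whose consecutive entities are linked.
            if not all(linked(a, b) for a, b in zip(combo, combo[1:])):
                continue
            score = p_template
            for eid in combo:
                score *= geo.get(eid, 0.0)
            model[" ".join(ENTITIES[eid][0] for eid in combo)] = score
    return model


if __name__ == "__main__":
    for names, score in build_language_model(QUERY_LOG).items():
        print(f"{names}: {score:.4f}")
```

In this toy log, "Main Street Shelbyville" never enters the model because the two entities are not linked in the database. The final storing step could be as simple as serializing the resulting dictionary to disk.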
