Training a probabilistic spelling checker from structured data

  • US 9,558,179 B1
  • Filed: 12/05/2013
  • Issued: 01/31/2017
  • Est. Priority Date: 01/04/2011
  • Status: Active Grant
First Claim

1. A computer-implemented method for generating a language model for computing probabilities of occurrence of queries, comprising:

    accessing a database comprising a plurality of entities, each entity having one or more names and an entity type;

    accessing a query log comprising queries previously entered by users, a plurality of the queries including names of ones of the entities in the database;

    generating, from the query log, a template distribution quantifying probabilities that entity types of the entities named in the queries correspond to ones of a plurality of query templates, each query template comprising an ordered set of the entity types appearing in the database;

    generating the language model from the template distribution, the language model comprising a set of combinations of names of the entities and associated scores, the scores based on probabilities of occurrence of the combinations in a query; and

    storing the language model on a computer readable storage device;

    wherein generating, from the query log, a template distribution comprises identifying, for each of the plurality of queries, a query template matching each query based on the names of the entities associated with the queries and an ordering of the names of the entities associated with the queries;

    and wherein generating the template distribution further comprises determining, for each distinct query template, a count of the plurality of the queries that correspond to the template.
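
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy `ENTITY_DB`, the function names, the single-word name matching, and the uniform split of each template's probability across its name combinations are all assumptions made for illustration. The claim itself only requires that scores be based on probabilities of occurrence of the combinations in a query.

```python
from collections import Counter
from itertools import product

# Hypothetical toy database: entity name -> entity type (an assumption).
ENTITY_DB = {
    "pizza": "cuisine",
    "sushi": "cuisine",
    "boston": "city",
    "seattle": "city",
}


def match_template(query, entity_db):
    """Return the ordered tuple of entity types named in the query.

    Simplification: names are single tokens, and a query matches only
    if every token is a known entity name. Otherwise return None.
    """
    types = []
    for token in query.lower().split():
        if token not in entity_db:
            return None
        types.append(entity_db[token])
    return tuple(types) if types else None


def template_distribution(query_log, entity_db):
    """Count queries per distinct template, then normalize to probabilities."""
    counts = Counter()
    for query in query_log:
        template = match_template(query, entity_db)
        if template is not None:
            counts[template] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def build_language_model(distribution, entity_db):
    """Map each combination of entity names to a score.

    Assumption: a template's probability is split uniformly across all
    name combinations that instantiate it.
    """
    names_by_type = {}
    for name, etype in entity_db.items():
        names_by_type.setdefault(etype, []).append(name)

    model = {}
    for template, p_template in distribution.items():
        slots = [names_by_type[t] for t in template]
        n_combos = 1
        for slot in slots:
            n_combos *= len(slot)
        for combo in product(*slots):
            key = " ".join(combo)
            model[key] = model.get(key, 0.0) + p_template / n_combos
    return model


if __name__ == "__main__":
    log = ["pizza boston", "sushi seattle", "boston pizza", "pizza", "weather today"]
    dist = template_distribution(log, ENTITY_DB)
    lm = build_language_model(dist, ENTITY_DB)
    for names, score in sorted(lm.items()):
        print(f"{names}: {score:.4f}")
```

In this sketch the model assigns a score to combinations such as "sushi boston" that never occur in the log, because the template (cuisine, city) was observed. Persisting `lm` to a storage device, the final step of the claim, is omitted here.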
