Determining word boundary likelihoods in potentially incomplete text

  • US 8,364,709 B1
  • Filed: 11/22/2010
  • Issued: 01/29/2013
  • Est. Priority Date: 11/22/2010
  • Status: Active Grant
First Claim

1. A system comprising:

  • a data processing apparatus; and

    a computer storage medium encoded with a computer program, the program comprising data processing apparatus instructions that when executed by the data processing apparatus cause the data processing apparatus to perform operations comprising:

    accessing queries stored in query logs, each query being one or more characters in a first sequence constituting one or more words in a second sequence;

    for each query:

    selecting query sequences from the query, each query sequence being at least a portion of a word n-gram, the word n-gram being a subsequence of up to n words selected from the second sequence of words of the query, and for each selected query sequence;

    determining one or more query sequence keys for the query sequence;

    determining at least one of a word boundary count and a non-word boundary count for each query sequence key, each word-boundary count and non-word boundary count being dependent on the context of the query sequence; and

    associating, in a data storage device, the at least one word boundary count and non-word boundary counts with each query sequence key.
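The counting operations recited in claim 1 can be illustrated with a short sketch. This is only one possible reading of the claim language, not the patented implementation. It assumes that words are separated by whitespace, that a "query sequence" is a character prefix of a word n-gram ending inside that n-gram's last word, that the query sequence key is the sequence text itself, and that an in-memory dictionary stands in for the data storage device. The function name `boundary_counts` and its parameters are hypothetical.

```python
from collections import defaultdict


def boundary_counts(queries, n=2):
    """Count, per key, how often a query sequence ends at a word boundary.

    Returns a dict mapping key -> [word_boundary_count, non_word_boundary_count].
    Assumption: the key is the query sequence text itself.
    """
    counts = defaultdict(lambda: [0, 0])
    for query in queries:
        words = query.split()  # the "second sequence" of words
        for start in range(len(words)):
            for length in range(1, n + 1):
                if start + length > len(words):
                    break
                ngram_words = words[start:start + length]
                ngram = " ".join(ngram_words)
                # Only take prefixes that end inside the last word, so shorter
                # prefixes are not counted again for every longer n-gram.
                first_end = len(ngram) - len(ngram_words[-1]) + 1
                for end in range(first_end, len(ngram) + 1):
                    key = ngram[:end]
                    if end == len(ngram):
                        counts[key][0] += 1  # sequence ends where a word ends
                    else:
                        counts[key][1] += 1  # sequence stops mid-word
    return dict(counts)


if __name__ == "__main__":
    table = boundary_counts(["new york", "news"], n=2)
    for key in ("new", "new york", "news"):
        print(key, table[key])
```

Under these assumptions, the key "new" is seen once as a complete word (in "new york") and once as an incomplete prefix (in "news"), which is the kind of context-dependent evidence the claim associates with each key.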
