
Method for searching in large databases of automatically recognized text

  • US 6,662,180 B1
  • Filed: 05/12/1999
  • Issued: 12/09/2003
  • Est. Priority Date: 05/12/1999
  • Status: Expired due to Fees
First Claim

1. A method of searching for a query word from among a plurality of words in a hierarchical data structure having branch nodes and leaf nodes, each branch node representing a respective component of one or more of the words and each leaf node representing a respective one of the words, the method comprising the steps of:

    a) selecting a root node in the hierarchical data structure as the current node;

    b) identifying all possible child nodes of the current node in the hierarchical data structure;

    c) calculating, for each of the identified child nodes, a respective estimated probability value for matching each component of the query word with the component associated with a respective one of the branch nodes in a path taken in the hierarchical data structure from the root node through the current node, wherein the estimated probability is based on a simplified error model which is independent of specific letters, until a sufficient amount of training data is available and after the sufficient amount of training data is available the estimated probability is based on a more detailed error model;

    d) adding the identified child nodes to a list of candidate nodes;

    e) selecting, from the list of candidate nodes, one node having the respective estimated probability value which is greater than any other probability value as the current node;

    f) determining if the current node is a leaf node and, if so, then determining whether to store the word representing the leaf node into a list of best matches; and

    g) repeating steps (b) through (g) until all components of the query word have been processed.
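
Steps (a) through (g) describe a best-first traversal of a trie-like structure. The following is a minimal sketch of that idea, not the patented implementation. It assumes that components are single letters, that recognition errors are substitutions only (so candidate words have the same length as the query), and that the simplified, letter-independent error model is a single probability `p_correct` for a correctly recognized letter. The names `Node`, `build_trie`, `search`, `END`, `p_correct`, and `max_matches` are hypothetical, and the later switch to a more detailed error model is not shown.

```python
import heapq
import itertools
import math

END = ""  # hypothetical marker for the edge from a branch node to a leaf node


class Node:
    def __init__(self, word=None):
        self.children = {}  # component (here: one letter) -> child Node
        self.word = word    # set only on leaf nodes


def build_trie(words):
    """Build the hierarchical structure: branch nodes per letter, one leaf per word."""
    root = Node()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, Node())
        node.children[END] = Node(word=w)
    return root


def search(root, query, max_matches=3, p_correct=0.9):
    """Best-first search with a letter-independent, substitution-only error model.

    Assumes 0 < p_correct < 1. Costs are negative log probabilities, so the
    heap's smallest cost is the candidate with the highest probability.
    """
    p_error = 1.0 - p_correct
    tie = itertools.count()   # tie-breaker so Node objects are never compared
    candidates = []           # heap of (cost, tie, depth, node)
    best = []
    current, depth, cost = root, 0, 0.0            # step (a)
    while True:
        # steps (b)-(d): score every child of the current node and queue it
        for comp, child in current.children.items():
            if comp == END:
                if depth == len(query):            # whole query consumed
                    heapq.heappush(candidates, (cost, next(tie), depth, child))
            elif depth < len(query):
                p = p_correct if comp == query[depth] else p_error
                heapq.heappush(
                    candidates, (cost - math.log(p), next(tie), depth + 1, child)
                )
        if not candidates:
            break
        # step (e): the most probable candidate becomes the current node
        cost, _, depth, current = heapq.heappop(candidates)
        # step (f): a leaf node contributes its word to the list of best matches
        if current.word is not None:
            best.append((current.word, math.exp(-cost)))
            if len(best) >= max_matches:
                break
    return best


trie = build_trie(["cat", "car", "cot", "dog"])
results = search(trie, "cat")
```

Because candidates are popped in order of decreasing probability, the exact match is returned first, followed by words that differ from the query in one letter. A per-letter confusion table could replace the `p_correct` / `p_error` pair once enough training data is available, which is the transition the claim describes.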
