Using an ID domain to improve searching

  • US 8,538,964 B2
  • Filed: 12/08/2011
  • Issued: 09/17/2013
  • Est. Priority Date: 07/25/2008
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

  • under control of a computing device having one or more processors with executable instructions, segmenting text in an image of a document into elements, each element representing a character in the text;

    grouping similar elements into clusters and assigning each cluster an identifier;

    replacing each element in a cluster of similar elements with the identifier allocated to the cluster of similar elements;

    ordering the identifiers within the document according to an order of the characters in the text;

    creating an index of identifiers in the document;

    receiving a text query and converting the text query into an image of the text query;

    segmenting the image of the text query into elements and matching each element to at least one cluster using a cluster table, the cluster table comprising mappings between identifiers and element characteristics, at least a first element matching to at least two clusters;

    replacing each element in the image of the text query with at least one identifier based on the matching to formulate a query defined in terms of identifiers, replacing each element in the image of the text query including replacing the first element with at least two identifiers based on the matching; and

    searching the index of identifiers using the query defined in terms of identifiers.
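
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the claim does not specify a clustering algorithm, a feature representation, or an index layout, so everything below is an assumption made for illustration. Glyph images are replaced by toy 2-D feature points (`GLYPHS`), `segment` stands in for rendering text to an image and segmenting it, clustering is a simple greedy threshold scheme, and the two thresholds are arbitrary. All function names are hypothetical.

```python
import math

# Assumption: toy 2-D "element characteristics" standing in for glyph images.
# "l" and "I" are placed close together to mimic visually similar characters.
GLYPHS = {
    "c": (0.0, 0.0), "a": (1.0, 0.0), "t": (2.0, 0.0),
    "l": (3.0, 0.0), "I": (3.2, 0.0),
}

def segment(text):
    """Hypothetical stand-in for imaging the text and segmenting it into elements."""
    return [GLYPHS[ch] for ch in text]

def build_clusters(elements, threshold):
    """Greedy clustering (an assumption): returns the cluster table and the
    document rewritten as identifiers in reading order."""
    table = {}  # identifier -> representative element characteristics
    ids = []
    for el in elements:
        for cid, rep in table.items():
            if math.dist(el, rep) <= threshold:
                ids.append(cid)
                break
        else:
            cid = len(table)  # allocate a new identifier
            table[cid] = el
            ids.append(cid)
    return table, ids

def build_index(ids):
    """Index of identifiers: identifier -> positions where it occurs."""
    index = {}
    for pos, cid in enumerate(ids):
        index.setdefault(cid, []).append(pos)
    return index

def query_to_ids(query, table, threshold):
    """Map each query element to the set of every cluster it matches."""
    return [
        {cid for cid, rep in table.items() if math.dist(el, rep) <= threshold}
        for el in segment(query)
    ]

def search(index, id_query):
    """Return start positions where each query slot matches the indexed identifier."""
    if not id_query or any(not slot for slot in id_query):
        return []
    pos_to_id = {pos: cid for cid, positions in index.items() for pos in positions}
    starts = sorted(pos for cid in id_query[0] for pos in index.get(cid, []))
    return [
        s for s in starts
        if all(pos_to_id.get(s + i) in slot for i, slot in enumerate(id_query))
    ]

# Tight threshold when clustering the document, looser one when matching the query,
# so that a query element can match more than one cluster.
table, doc_ids = build_clusters(segment("catIatlat"), threshold=0.1)
index = build_index(doc_ids)
lat_query = query_to_ids("lat", table, threshold=0.25)
lat_hits = search(index, lat_query)
cat_hits = search(index, query_to_ids("cat", table, threshold=0.25))
```

In this toy document, "l" and "I" fall into separate clusters, but the first element of the query "lat" matches both, so the query is expressed with two identifiers in its first slot and finds both "Iat" and "lat". That mirrors the claim's requirement that at least a first element matches at least two clusters.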
