Identifying matching canonical documents consistent with visual query structural information

  • US 8,811,742 B2
  • Filed: 12/01/2011
  • Issued: 08/19/2014
  • Est. Priority Date: 12/02/2009
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method of processing a visual query performed by a server system having one or more processors and memory storing one or more programs for execution by the one or more processors, the method comprising:

  • at the server system:

    receiving a visual query from a client system distinct from the server system, the visual query including an image;

    performing optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters including a plurality of textual characters in a contiguous region of the image of the visual query, and structural information associated with the plurality of textual characters in the contiguous region of the image of the visual query, the structural information specifying a position of at least one of the plurality of textual characters with respect to one or more reference point elements in the image of the visual query;

    scoring each textual character in the plurality of textual characters;

    identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the image of the visual query;

    retrieving, using the one or more high quality textual strings and the structural information, a canonical document that includes the one or more high quality textual strings at a location in the canonical document that is consistent with the structural information; and

    sending at least a portion of the canonical document to the client system.
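
The scoring, string-selection, and retrieval steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the names OcrChar, high_quality_strings, and find_canonical_document, the score threshold, and the tolerance are all hypothetical, and the "structural information" is reduced to a single assumed value, the relative position (0.0 to 1.0) of a string within the page, compared against the relative position of the same string in each candidate document.

```python
from dataclasses import dataclass


@dataclass
class OcrChar:
    """One recognized character (hypothetical structure, not from the patent)."""
    char: str
    score: float   # assumed per-character quality score in [0, 1]
    offset: float  # assumed relative position (0.0-1.0) w.r.t. a reference point


def high_quality_strings(chars, min_score=0.8, min_len=3):
    """Group consecutive characters scoring >= min_score into strings.

    Returns (string, offset of its first character) pairs.
    """
    runs, current = [], []
    for c in chars:
        if c.score >= min_score:
            current.append(c)
        else:
            if len(current) >= min_len:
                runs.append(current)
            current = []
    if len(current) >= min_len:
        runs.append(current)
    return [("".join(c.char for c in run), run[0].offset) for run in runs]


def find_canonical_document(strings, documents, tolerance=0.1):
    """Return the id of the first document that contains every string at a
    relative position within `tolerance` of the position seen in the query."""
    if not strings:
        return None
    for doc_id, text in documents.items():
        if not text:
            continue
        consistent = True
        for s, offset in strings:
            # Collect the relative position of every occurrence of s.
            positions = []
            start = text.find(s)
            while start != -1:
                positions.append(start / len(text))
                start = text.find(s, start + 1)
            if not any(abs(p - offset) <= tolerance for p in positions):
                consistent = False
                break
        if consistent:
            return doc_id
    return None


# Usage: "hello" near the top of the page, one low-scoring character,
# then "world" near the bottom.
query_chars = (
    [OcrChar(ch, 0.95, 0.0) for ch in "hello"]
    + [OcrChar("?", 0.2, 0.5)]
    + [OcrChar(ch, 0.9, 0.9) for ch in "world"]
)
documents = {
    "doc_b": "world" + "x" * 40 + "hello",  # same strings, wrong positions
    "doc_a": "hello" + "x" * 40 + "world",  # same strings, matching positions
}
strings = high_quality_strings(query_chars)
match = find_canonical_document(strings, documents)
```

Both candidate documents contain the two high quality strings, so text matching alone cannot separate them; only the position check does, which is the role the claim assigns to the structural information.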

  • 2 Assignments