Identifying matching canonical documents consistent with visual query structural information

US 8,811,742 B2
Filed: 12/01/2011
Issued: 08/19/2014
Est. Priority Date: 12/02/2009
Status: Expired due to Fees

Abstract

A server system receives a visual query from a client system, performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system also produces structural information associated with the textual characters in the visual query. Textual characters in the plurality of textual characters are scored. The method further includes identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. A canonical document that includes the one or more high quality textual strings and that is consistent with the structural information is retrieved. At least a portion of the canonical document is sent to the client system.
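The string-identification step in the abstract can be illustrated with a minimal sketch. It assumes that per-character scores lie in [0, 1], that a character is "high quality" when its score meets a fixed threshold, and that a high quality string is a run of consecutive high quality characters. The function name, the threshold, and the minimum run length are hypothetical choices for illustration; the record does not specify them.

```python
from typing import List, Sequence


def high_quality_strings(chars: str, scores: Sequence[float],
                         threshold: float = 0.8, min_len: int = 3) -> List[str]:
    """Return runs of consecutive characters whose score meets the threshold.

    threshold and min_len are illustrative assumptions, not values from the patent.
    """
    if len(chars) != len(scores):
        raise ValueError("chars and scores must have the same length")
    runs: List[str] = []
    current: List[str] = []
    for ch, score in zip(chars, scores):
        if score >= threshold:
            current.append(ch)
        else:
            # A low-scoring character ends the current run.
            if len(current) >= min_len:
                runs.append("".join(current))
            current = []
    if len(current) >= min_len:
        runs.append("".join(current))
    return runs
```

Runs shorter than `min_len` are dropped on the assumption that very short strings are poor retrieval keys.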

24 Claims

1. A computer-implemented method of processing a visual query performed by a server system having one or more processors and memory storing one or more programs for execution by the one or more processors, the method comprising:
- at the server system:
  
  receiving a visual query from a client system distinct from the server system, the visual query including an image;
  
  performing optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters including a plurality of textual characters in a contiguous region of the image of the visual query, and structural information associated with the plurality of textual characters in the contiguous region of the image of the visual query, the structural information specifying a position of at least one of the plurality of textual characters with respect to one or more reference point elements in the image of the visual query;
  
  scoring each textual character in the plurality of textual characters;
  
  identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the image of the visual query;
  
  retrieving, using the one or more high quality textual strings and the structural information, a canonical document that includes the one or more high quality textual strings at a location in the canonical document that is consistent with the structural information; and
  
  sending at least a portion of the canonical document to the client system.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 22)
- - 2. The method of claim 1, wherein the structural information further specifies one or more of:
    - relative positions of the textual characters in the image of the visual query, relative sizes of the textual characters in the image of the visual query, an ordering of the textual characters in the image of the visual query, a count of the textual characters in the image of the visual query, and a font category of the textual characters.
  - 3. The method of claim 1, wherein the portion of the canonical document is an image segment of the canonical document.
  - 4. The method of claim 3, wherein the image segment presented visually matches text and non-text elements of the visual query.
  - 5. The method of claim 1, wherein the portion of the canonical document is a machine readable text segment of the canonical document.
  - 6. The method of claim 1, wherein identifying the one or more high quality strings includes:
    - scoring a plurality of words each in accordance with the textual character scores of the textual characters comprising a respective word to produce word scores; and
      
      identifying, in accordance with the word scores, one or more high quality textual strings, each comprising a plurality of high quality words.
  - 7. The method of claim 1, wherein scoring of a respective textual character comprises scoring the respective textual character as either a high quality textual character or a low quality textual character.
  - 8. The method of claim 1, wherein scoring of a respective textual character includes generating a language-conditional character probability for the respective textual character indicating how consistent the respective textual character and a set of characters that precede the respective textual character in a text segment are with a respective language model.
  - 9. The method of claim 1, wherein the scoring of a respective textual character is based on both an OCR quality score of the respective textual character alone and a scoring of one or more neighboring textual characters.
  - 10. The method of claim 1, wherein the sending includes sending the visual query, a canonical document image segment, and a canonical document machine readable text segment for simultaneous presentation.
  - 22. The method of claim 1, wherein the one or more reference point elements comprise at least one of a text character, a margin of the image of the visual query, an edge of the image of the visual query, and a line break.
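Claims 6 and 9 can be sketched as follows. This is only one possible reading: the neighbor adjustment blends a character's own OCR score with the mean of its immediate neighbors, and a word score is the mean of its character scores. The function names, the blending weight, and the use of a mean are assumptions made for illustration and are not taken from the claims.

```python
from typing import List, Sequence, Tuple


def neighbor_adjusted(raw: Sequence[float], weight: float = 0.5) -> List[float]:
    """Blend each character's own score with the mean of its adjacent scores.

    weight is a hypothetical mixing factor (0 = own score only).
    """
    raw = list(raw)
    adjusted: List[float] = []
    for i, own in enumerate(raw):
        neighbors = raw[max(0, i - 1):i] + raw[i + 1:i + 2]
        if neighbors:
            mean_neighbors = sum(neighbors) / len(neighbors)
            adjusted.append((1 - weight) * own + weight * mean_neighbors)
        else:
            adjusted.append(own)  # a lone character keeps its own score
    return adjusted


def word_scores(text: str, char_scores: Sequence[float]) -> List[Tuple[str, float]]:
    """Score each space-delimited word as the mean of its character scores."""
    if len(text) != len(char_scores):
        raise ValueError("text and char_scores must have the same length")
    result: List[Tuple[str, float]] = []
    start = None
    # Iterate one position past the end so the final word is flushed.
    for i in range(len(text) + 1):
        is_space = i == len(text) or text[i] == " "
        if not is_space and start is None:
            start = i
        elif is_space and start is not None:
            span = char_scores[start:i]
            result.append((text[start:i], sum(span) / len(span)))
            start = None
    return result
```

Words whose score clears a chosen cutoff could then be joined into high quality strings, in the spirit of claim 6.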

11. A computer-implemented method of processing a visual query performed by a server system having one or more processors and memory storing one or more programs for execution by the one or more processors, the method comprising:
- at the server system:
  
  receiving a visual query from a client system distinct from the server system;
  
  performing optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters including a plurality of textual characters in a contiguous region of the visual query, and structural information associated with the plurality of textual characters in the contiguous region of the visual query;
  
  scoring each textual character in the plurality of textual characters;
  
  identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query;
  
  retrieving a canonical document that includes the one or more high quality textual strings and that is consistent with the structural information, wherein the retrieving a canonical document further includes:
  
  calculating a quality score corresponding to at least one respective high quality textual string of the one or more high quality textual strings;
  
  retrieving an image version of the canonical document if the quality score is below a predetermined value; and
  
  retrieving a machine readable text version of the canonical document if the quality score is at or above a predetermined value; and
  
  sending at least a portion of the canonical document to the client system.
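The version selection recited in claim 11 amounts to a threshold test, sketched below. The claim only says the quality score is compared with "a predetermined value"; the default of 0.9, the use of a mean over character scores as the string's quality score, and all names here are hypothetical.

```python
from enum import Enum
from typing import Sequence


class Version(Enum):
    IMAGE = "image"
    TEXT = "text"


def string_quality(char_scores: Sequence[float]) -> float:
    """Quality score of a string, assumed here to be its mean character score."""
    if not char_scores:
        raise ValueError("char_scores must not be empty")
    return sum(char_scores) / len(char_scores)


def choose_version(quality_score: float, predetermined_value: float = 0.9) -> Version:
    """Pick the text version at or above the cutoff, the image version below it."""
    if quality_score >= predetermined_value:
        return Version.TEXT
    return Version.IMAGE
```

For example, `choose_version(string_quality([1.0, 0.5]))` selects the image version, because 0.75 is below the assumed cutoff.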

12. A server system, for processing a visual query, comprising:
- one or more central processing units for executing programs;
  
  memory storing one or more programs to be executed by the one or more central processing units;
  
  the one or more programs comprising instructions for:
  
  receiving a visual query from a client system, the visual query including an image;
  
  performing optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters including a plurality of textual characters in a contiguous region of the image of the visual query, and structural information associated with the plurality of textual characters in the contiguous region of the image of the visual query, the structural information specifying a position of at least one of the plurality of textual characters with respect to one or more reference point elements in the image of the visual query;
  
  scoring each textual character in the plurality of textual characters;
  
  identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the image of the visual query;
  
  retrieving, using the one or more high quality textual strings and the structural information, a canonical document that includes the one or more high quality textual strings at a location in the canonical document that is consistent with the structural information; and
  
  sending at least a portion of the canonical document to the client system.
- Dependent claims (13, 14, 15, 16, 23)
- - 13. The system of claim 12, wherein the structural information further specifies one or more of:
    - relative positions of the textual characters in the image of the visual query, relative sizes of the textual characters in the image of the visual query, an ordering of the textual characters in the image of the visual query, a count of the textual characters in the image of the visual query, and a font category of the textual characters.
  - 14. The server system of claim 12, wherein the portion of the canonical document is an image segment of the canonical document.
  - 15. The server system of claim 14, wherein the image segment presented visually matches text and non-text elements of the visual query.
  - 16. The server system of claim 12, wherein the portion of the canonical document is a machine readable text segment of the canonical document.
  - 23. The server system of claim 12, wherein the one or more reference point elements comprise at least one of a text character, a margin of the image of the visual query, an edge of the image of the visual query, and a line break.

17. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions for:
- receiving a visual query from a client system, the visual query including an image;
  
  performing optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters including a plurality of textual characters in a contiguous region of the image of the visual query, and structural information associated with the plurality of textual characters in the contiguous region of the image of the visual query, the structural information specifying position of the plurality of textual characters with respect to one or more reference point elements in the image of the visual query;
  
  scoring each textual character in the plurality of textual characters;
  
  identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the image of the visual query;
  
  retrieving, using the one or more high quality textual strings and the structural information, a canonical document that includes the one or more high quality textual strings at a location in the canonical document that is consistent with the structural information; and
  
  sending at least a portion of the canonical document to the client system.
- Dependent claims (18, 19, 20, 21, 24)
- - 18. The non-transitory computer readable storage medium of claim 17, wherein the structural information further specifies one or more of:
    - relative positions of the textual characters in the image of the visual query, relative sizes of the textual characters in the image of the visual query, an ordering of the textual characters in the image of the visual query, a count of the textual characters in the image of the visual query, and a font category of the textual characters.
  - 19. The non-transitory computer readable storage medium of claim 17, wherein the portion of the canonical document is an image segment of the canonical document.
  - 20. The non-transitory computer readable storage medium of claim 19, wherein the image segment presented visually matches text and non-text elements of the visual query.
  - 21. The non-transitory computer readable storage medium of claim 17, wherein the portion of the canonical document is a machine readable text segment of the canonical document.
  - 24. The non-transitory computer readable storage medium of claim 17, wherein the one or more reference point elements comprise at least one of a text character, a margin of the image of the visual query, an edge of the image of the visual query, and a line break.
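One simple form of the structural consistency recited in the independent claims is ordering, which the dependent claims list among the kinds of structural information. The sketch below checks whether a candidate document contains the high quality strings in the same order as the query and reports where each one starts. It is an illustrative assumption, not the retrieval method of the patent; the function name and the plain substring search are hypothetical, and a real system would also have to tolerate OCR errors and use positional reference points such as margins or line breaks.

```python
from typing import List, Optional, Sequence


def locate_in_order(document: str, strings: Sequence[str]) -> Optional[List[int]]:
    """Return the start offset of each string if all occur in the given order.

    Returns None when any string is missing or appears out of order.
    """
    offsets: List[int] = []
    position = 0
    for s in strings:
        index = document.find(s, position)
        if index == -1:
            return None
        offsets.append(index)
        # The next string must start after the end of this one.
        position = index + len(s)
    return offsets
```

A candidate for which this returns `None` would be rejected as inconsistent with the query's ordering, even if it contains every string somewhere.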

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Petrou, David; Popat, Ashok C.; Casey, Matthew R.
Primary Examiner(s): Werner, Brian P
Application Number: US13/309,471
Publication Number: US 20120128251A1
Time in Patent Office: 992 Days
Field of Search: None
US Class Current: 382/187
CPC Class Codes:
G06F 16/5846   using extracted text
G06F 16/93   Document management systems
G06F 16/951   Indexing; Web crawling tech...
G06V 30/10   Character recognition
G06V 30/133   Evaluation of quality of th...
G06V 30/262   using context analysis, e.g...
G06V 30/413   Classification of content, ...
