Text selection from images of documents using auto-completion

US 6,766,069 B1
Filed: 12/21/1999
Issued: 07/20/2004
Est. Priority Date: 12/21/1999
Status: Expired due to Term

Abstract

A user-interface for selecting text from images of documents using auto-completion is described. The auto-completion process may be used to complete words (or text sequences), phrases, sentences, paragraphs, or other groupings of words. In response to user input, the OCR results for one or more images of documents are searched. The user input may include typing in a partial word (or the initial characters in a text sequence) via an input device or alternatively, annotations made by a user on a hardcopy document prior to scanning the document. One or more word matches are presented to the user for acceptance until the user accepts a word match or until all word matches have been presented to the user. Once a user accepts a word match, the word match is copied into an electronic document such as a word processing document, spreadsheet document, or other electronic document created by an application program. The auto-completion process may be repeated until the selected text region is copied into the electronic document.
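The search, present, and accept cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `find_word_matches` and `autocomplete`, the case-insensitive prefix comparison, and the use of a callback to stand in for user acceptance are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Optional


def find_word_matches(partial: str, ocr_words: Iterable[str]) -> List[str]:
    """Return distinct OCR words starting with `partial`, in document order.

    Case-insensitive prefix matching is an assumption of this sketch.
    """
    prefix = partial.lower()
    seen = set()
    matches = []
    for word in ocr_words:
        key = word.lower()
        if prefix and key.startswith(prefix) and key not in seen:
            seen.add(key)
            matches.append(word)
    return matches


def autocomplete(partial: str,
                 ocr_words: Iterable[str],
                 accept: Callable[[str], bool]) -> Optional[str]:
    """Present each match in turn until one is accepted or all were shown."""
    for candidate in find_word_matches(partial, ocr_words):
        if accept(candidate):  # hypothetical stand-in for the user's decision
            return candidate
    return None


# Hypothetical OCR output for one scanned page.
ocr_words = "scanned documents and a document image".split()
target_document = []

# The simulated user rejects "documents" and accepts "document".
chosen = autocomplete("doc", ocr_words, accept=lambda w: w == "document")
if chosen is not None:
    target_document.append(chosen)  # copy the accepted match into the target
```

Returning `None` when every match has been rejected mirrors the "until all word matches have been presented" stopping condition; the caller then leaves the partial word as typed.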

21 Claims

1. A method of selecting a text region from at least one source document, comprising:
- (a) in response to user input of characters defining a partial word, searching optical character recognition (OCR) results representing one or more images of the at least one source document for word matches;
  
  (b) presenting a word match for user acceptance;
  
  (c) repeating (b) until the word match is accepted by a user or until all word matches have been presented; and
  
  (d) in response to the user accepting the word match, copying the word match into a target electronic document;
  
  wherein the word match copied into the target electronic document automatically completes the partial word with characters from the text region of the at least one source document.
2. The method of claim 1, further comprising, prior to (a):
3. The method of claim 1, wherein (a) further comprises:
- in response to the user typing characters of the partial word into the target electronic document, searching the OCR results representing one or more images for word matches.
4. The method of claim 1, wherein (a) further comprises:
- in response to the user typing characters of the partial word into the target electronic document and selecting a designated auto-completion request key, searching the OCR results representing one or more images for word matches.
5. The method of claim 1, wherein (a) further comprises:
- in response to the user input of the partial word, (i) searching the OCR results of an image representing a hardcopy document within a field of view of an image capture device;
  
  (ii) searching text in the target electronic document the user is creating; and
  
  (iii) searching the OCR results of one or more images of previously scanned hardcopy documents.
6. The method of claim 1, wherein (d) further comprises:
- in response to the user accepting the word match by typing additional information into the target electronic document, copying the word match into the target electronic document.
7. The method of claim 1, further comprising after (d):
- (e) repeating (a)-(d).
15. The method of claim 1, wherein the characters of the partial word and the word match are one of numeric, alphabetic, and alphanumeric values.
16. The method of claim 1, further comprising (e) in response to the user requesting additional words from a phrase in the at least one source document containing the word match, copying the additional words from the phrase into the target electronic document.
17. The method of claim 1, wherein (d) further comprises presenting the word match for user acceptance by highlighting the text region from the one source document.
18. The method of claim 1, wherein the one or more images of the at least one source document is one of multiple bit-per-pixel image data, binary image data, and a rendered version of text image data.
19. The method of claim 18, further comprising:
- (a1) capturing image data with an image capture device representing the at least one source document; and
  
  (a2) converting the image data representing the at least one source document into coded text using optical character recognition (OCR).
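Claim 5 names three places to search: the document currently in the camera's field of view, the target document being written, and previously scanned documents. A rough sketch of searching them together is shown below. The claim does not fix a priority order, so the ordering used here, the function name `search_sources`, the source labels, and the removal of duplicate words across sources are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple


def search_sources(partial: str,
                   sources: Sequence[Tuple[str, Sequence[str]]]
                   ) -> List[Tuple[str, str]]:
    """Search several word lists in the order given.

    `sources` is a sequence of (label, words) pairs. Each match is returned
    as (label, word), and a word already found in an earlier source is not
    reported again (an assumption of this sketch).
    """
    prefix = partial.lower()
    seen = set()
    results = []
    if not prefix:
        return results
    for label, words in sources:
        for word in words:
            key = word.lower()
            if key.startswith(prefix) and key not in seen:
                seen.add(key)
                results.append((label, word))
    return results


# Hypothetical word lists for the three sources named in claim 5.
sources = [
    ("camera", ["invoice", "total"]),      # page under the image capture device
    ("target", ["interest", "Invoice"]),   # text already in the target document
    ("archive", ["inventory", "total"]),   # previously scanned documents
]

matches = search_sources("in", sources)
```

Keeping the source label with each match lets a user interface show where a suggestion came from, which matters if matches from the live page should be offered first.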

8. A method of selecting a text region from a source document, comprising:
- (a) in response to user input of a partial word, searching optical character recognition (OCR) results of an image of the source document for word matches;
  
  (b) displaying the image with all word matches highlighted with emphasis and one of the word matches highlighted with additional emphasis to indicate it is being offered to a user for acceptance; and
  
  (c) in response to the user accepting the offered word match, providing feedback to the user indicating that the offered word match represents a word selected for copying into a target electronic document.
9. The method of claim 8, further comprising:
10. The method of claim 8, wherein (c) further comprises:
- (i) offering one or more words adjacent to the accepted word match to the user for acceptance; and
  
  (ii) in response to user acceptance of the one or more offered words, providing feedback to the user indicating that the one or more offered words represent one or more words selected for copying into the target electronic document.
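The two levels of emphasis in claim 8 can be illustrated with a plain-text rendering, where single brackets mark every match and double brackets mark the one currently offered. The bracket notation, the function name `render_with_highlights`, and the `offered_rank` parameter are assumptions of this sketch; an actual interface would draw highlights on the document image itself.

```python
from typing import List


def render_with_highlights(words: List[str], partial: str,
                           offered_rank: int = 0) -> str:
    """Mark all prefix matches with [..] and the offered match with [[..]].

    `offered_rank` selects which match is currently offered and wraps
    around, so repeated rejections cycle through the matches.
    """
    prefix = partial.lower()
    positions = [i for i, w in enumerate(words)
                 if prefix and w.lower().startswith(prefix)]
    offered = positions[offered_rank % len(positions)] if positions else None

    rendered = []
    for i, word in enumerate(words):
        if i == offered:
            rendered.append(f"[[{word}]]")   # additional emphasis
        elif i in positions:
            rendered.append(f"[{word}]")     # ordinary emphasis
        else:
            rendered.append(word)
    return " ".join(rendered)


# Hypothetical OCR words for one line of a page.
words = "scan the page then save the scanned page".split()

first_offer = render_with_highlights(words, "sc", offered_rank=0)
second_offer = render_with_highlights(words, "sc", offered_rank=1)
```

Here `first_offer` marks "scan" as the offered word and `second_offer` moves the extra emphasis to "scanned" while both remain highlighted.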

11. A method of selecting a text region from a source document having user annotations of hand written mark-ups, comprising:
- (a) retrieving an image of the annotated source document;
  
  (b) identifying a selected text region based on the user annotations;
  
  (c) searching optical character recognition (OCR) results of the image of the annotated document for a match of the selected text region;
  
  (d) displaying to a user the match corresponding to the selected text region to indicate it is being offered for user acceptance;
  
  (e) in response to user input accepting the offered match corresponding to the selected text region, copying characters defining the selected text region into a target electronic document; and
  
  (f) providing feedback to the user to indicate the selected text region was copied into the target electronic document.
20. The method of claim 11, wherein the characters of the selected text region is one of numeric, alphabetic, and alphanumeric values.
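Claim 11 does not say how a text region is identified from handwritten mark-ups. One plausible approach, sketched below purely as an assumption, is to compare the bounding box of an annotation (for example a highlighter stroke) with the bounding boxes of the OCR words and keep the words that overlap it. The `Box` type, the coordinate convention (origin at top left, y increasing downward), and the function names are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned rectangle in image coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles share some area."""
    return (a.left < b.right and b.left < a.right and
            a.top < b.bottom and b.top < a.bottom)


def words_in_annotation(ocr_words: Sequence[Tuple[str, Box]],
                        annotation: Box) -> List[str]:
    """Return the OCR words whose boxes overlap the annotation box."""
    return [text for text, box in ocr_words if overlaps(box, annotation)]


# Hypothetical OCR output: each word with its bounding box.
ocr_words = [
    ("Total", Box(10, 10, 50, 20)),
    ("amount", Box(55, 10, 110, 20)),
    ("due", Box(115, 10, 140, 20)),
    ("Thank", Box(10, 30, 55, 40)),
]

# Hypothetical bounding box of a highlighter stroke over "amount due".
annotation = Box(52, 8, 142, 22)

selected = words_in_annotation(ocr_words, annotation)
```

The selected words would then be offered to the user for acceptance before being copied, as steps (d) and (e) of the claim require.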

12. A method of selecting a text region from a source document, the source document including a table with a plurality of cells, comprising:
- (a) in response to user input of a character string, searching optical character recognition (OCR) results of an image of the table for cell matches;
  
  (b) displaying a cell match to indicate it is being offered for user acceptance; and
  
  (c) in response to user input accepting the offered cell match, copying the cell match into a target electronic document;
  
  wherein the cell match copied into the target electronic document automatically completes the character string with characters from the text region of the source document.
13. The method of claim 12, further comprising:
14. The method of claim 13, wherein (d) further comprises:
- (i) displaying one or more cells adjacent to the cell match for vertical cell selection and/or horizontal cell selection.
21. The method of claim 12, wherein characters of the character string and the cell match are one of numeric, alphabetic, and alphanumeric values.
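The table variant in claims 12 and 14 can be sketched by treating the OCR results as rows of cell strings. The representation of the table as a list of lists, the prefix matching rule, and the function names `find_cell_matches` and `adjacent_cells` are assumptions for illustration; the claims do not prescribe a data structure.

```python
from typing import Dict, List, Tuple

Table = List[List[str]]


def find_cell_matches(table: Table, text: str) -> List[Tuple[int, int]]:
    """Return (row, column) positions of cells that start with `text`."""
    prefix = text.lower()
    return [(r, c)
            for r, row in enumerate(table)
            for c, cell in enumerate(row)
            if prefix and cell.lower().startswith(prefix)]


def adjacent_cells(table: Table, row: int, col: int) -> Dict[str, List[str]]:
    """Return the immediate horizontal and vertical neighbours of a cell."""
    horizontal = [table[row][j] for j in (col - 1, col + 1)
                  if 0 <= j < len(table[row])]
    vertical = [table[i][col] for i in (row - 1, row + 1)
                if 0 <= i < len(table) and col < len(table[i])]
    return {"horizontal": horizontal, "vertical": vertical}


# Hypothetical OCR results for a small scanned table.
table = [
    ["Item", "Qty", "Price"],
    ["Paper", "10", "4.50"],
    ["Pens", "25", "12.00"],
]

cell_matches = find_cell_matches(table, "pe")      # alphabetic input
numeric_matches = find_cell_matches(table, "1")    # numeric input
neighbours = adjacent_cells(table, 2, 0)           # cells next to "Pens"
```

Offering the neighbours separately by direction gives the user the choice between extending the selection along a row or down a column.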

Current Assignee: Xerox Corporation (Xerox Holdings Corp.)
Original Assignee: Xerox Corporation (Xerox Holdings Corp.)
Inventors: Taylor, Stuart A.; Dance, Christopher R.; Newman, William M.; Taylor, Alex S.
Primary Examiner(s): Mehta, Bhavesh M.
Assistant Examiner(s): Azarian, Seyed H.
Application Number: US09/469,958
Time in Patent Office: 1,673 Days
Field of Search: 382/203, 382/177, 382/229, 382/181, 382/224, 382/173, 382/175, 382/309, 382/187, 382/310, 382/188, 382/311, 382/189, 382/305, 382/306, 382/231, 707/3, 707/6, 707/512, 707/513, 707/2, 704/8, 704/9, 704/251, 704/270, 358/1.6, 358/406
US Class Current: 382/309
CPC Class Codes

G06F 40/117   Tagging; Marking up details...

G06F 40/169   Annotation, e.g. comment da...

G06F 40/274   Converting codes to words; ...

G06V 30/142   using hand-held instruments...
