Using extracted image text
Abstract
Methods, systems, and apparatus including computer program products for using extracted image text are provided. In one implementation, a computer-implemented method is provided. The method includes receiving an input of one or more image search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image search terms, and presenting the image.
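The retrieval flow described in the abstract can be sketched as a minimal inverted index mapping keywords extracted from image text to images. This is an illustrative reconstruction only, not the patent's implementation; the class and method names are hypothetical.

```python
from collections import defaultdict

class ImageTextIndex:
    """Hypothetical minimal index over keywords extracted from image text."""

    def __init__(self):
        self._index = defaultdict(set)

    def add_image(self, image_id, extracted_text):
        # Index each keyword found in the image's extracted text.
        for keyword in extracted_text.lower().split():
            self._index[keyword].add(image_id)

    def search(self, *search_terms):
        # Return images whose extracted text matches any search term.
        results = set()
        for term in search_terms:
            results |= self._index.get(term.lower(), set())
        return results

index = ImageTextIndex()
index.add_image("img1.jpg", "Joe's Pizza Open 24 Hours")
index.add_image("img2.jpg", "City Parking Garage")
print(index.search("pizza"))  # {'img1.jpg'}
```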
Claims
1. A computer-implemented method comprising:
dividing an image into a plurality of sub-regions;
identifying two or more adjacent sub-regions of the image that contain text, wherein the identified adjacent sub-regions share overlapping image pixels;
combining the identified adjacent sub-regions into a candidate text region;
determining a minimum size for candidate text regions;
determining that the candidate text region is smaller than the determined minimum size for candidate text regions; and
based on determining that the candidate text region is smaller than the minimum size for candidate text regions, bypassing optical character recognition for the candidate text region.
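The steps of claim 1 can be sketched as follows. This is an illustrative reconstruction under assumed parameters (grid step, tile size, area threshold), not the patent's implementation; the text detector itself is stubbed out.

```python
# Sketch of the claimed flow: divide an image into overlapping sub-regions,
# merge adjacent text-bearing sub-regions into a candidate region, and bypass
# OCR when the candidate is below a minimum size.

def divide_into_subregions(width, height, step):
    # Overlapping tiles: each tile shares pixels with its neighbors.
    tile = step * 2
    return [(x, y, tile, tile)
            for y in range(0, height - step, step)
            for x in range(0, width - step, step)]

def merge_regions(regions):
    # Bounding box covering all merged sub-regions.
    x0 = min(r[0] for r in regions)
    y0 = min(r[1] for r in regions)
    x1 = max(r[0] + r[2] for r in regions)
    y1 = max(r[1] + r[3] for r in regions)
    return (x0, y0, x1 - x0, y1 - y0)

def should_run_ocr(candidate, min_area):
    # Bypass OCR for candidate regions smaller than the minimum size.
    _, _, w, h = candidate
    return w * h >= min_area

subs = divide_into_subregions(64, 64, 16)
candidate = merge_regions(subs[:2])  # two adjacent text-bearing sub-regions
print(should_run_ocr(candidate, 4096))  # False: too small, OCR bypassed
```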
2. The method of claim 1, wherein identifying the adjacent sub-regions that contain text comprises:
extracting one or more features from each sub-region; and
providing the extracted features to a trained classifier to determine whether each sub-region contains text.
3. The method of claim 2, wherein the classifier has been trained on images of city street scenes, and wherein each image of a city street scene includes identified regions of text and regions of non-text.
4. The method of claim 2, further comprising:
scaling the image to multiple sizes prior to extracting the one or more features; and
determining that the sub-region contains text at two or more sizes.
5. The method of claim 2, wherein extracting features from each sub-region includes detecting corner features within the sub-region.
6. The method of claim 2, wherein extracting features from each sub-region includes computing projection profiles in each sub-region.
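The projection profiles recited in claim 6 can be illustrated on a toy binarized sub-region: the horizontal profile counts text pixels per row and the vertical profile counts them per column, a standard way to expose the periodic structure of text. A hypothetical sketch; the patent does not specify this exact computation.

```python
def projection_profiles(binary_region):
    # binary_region: list of rows, 1 = text pixel, 0 = background.
    rows = len(binary_region)
    cols = len(binary_region[0])
    horizontal = [sum(row) for row in binary_region]            # per-row counts
    vertical = [sum(binary_region[r][c] for r in range(rows))   # per-column counts
                for c in range(cols)]
    return horizontal, vertical

# Toy 4x6 binarized sub-region containing a letterform-like shape.
region = [
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
h, v = projection_profiles(region)
print(h)  # [4, 2, 4, 0]
print(v)  # [0, 3, 2, 2, 3, 0]
```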
7. The method of claim 1, further comprising:
determining that a different second candidate text region is larger than the minimum size for candidate text regions; and
in response to determining that the different second candidate text region is larger than the minimum size for candidate text regions, performing optical character recognition for the second candidate text region.
8. The method of claim 1, further comprising:
receiving ranging data associated with the image;
generating a planar map of the image using the ranging data; and
determining that the candidate text region of the image is not located on a single plane, wherein bypassing optical character recognition for the candidate text region is further based on determining that the candidate text region of the image is not located on a single plane.
9. The method of claim 8, wherein the ranging data comprises a distance from a camera position when the image was taken to an object shown in the image.
10. The method of claim 8, wherein generating a planar map comprises decomposing the image into planar and non-planar regions, and wherein determining that the candidate text region is not located on a single plane is based on determining that the candidate text region corresponds to a non-planar region.
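The planarity test in claims 8 and 10 can be illustrated with a simple property of ranging data: across an ideal plane, depth varies linearly, so second-order differences along each row of depth samples are zero (columns are analogous). This is a hypothetical sketch of one way to flag non-planar regions, not the patent's decomposition method.

```python
def is_planar(depth_rows, tolerance=0.05):
    # depth_rows: grid of distances from the camera to the scene.
    for row in depth_rows:
        for a, b, c in zip(row, row[1:], row[2:]):
            # Second difference is (near) zero when depth varies linearly.
            if abs((c - b) - (b - a)) > tolerance:
                return False
    return True

flat_wall = [[1.0, 1.1, 1.2, 1.3], [1.0, 1.1, 1.2, 1.3]]   # planar: skip nothing
round_pole = [[2.0, 1.2, 1.0, 1.2], [2.0, 1.2, 1.0, 1.2]]  # curved: bypass OCR
print(is_planar(flat_wall))   # True
print(is_planar(round_pole))  # False
```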
11. The method of claim 1, further comprising:
correcting perspective distortion in the image.
12. The method of claim 1, wherein determining a minimum size for candidate text regions comprises:
determining a number of false positive results based on the minimum size for candidate text regions;
determining that the number of false positive results satisfies a threshold; and
in response to determining that the number of false positive results satisfies the threshold, determining an increased minimum size for candidate text regions.
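The calibration loop in claim 12 can be sketched as: while the current minimum size still admits more false positives than the threshold allows, increase it. Everything here is hypothetical, including `count_false_positives`, which stands in for evaluation against labeled data.

```python
def calibrate_min_size(min_size, count_false_positives,
                       threshold, step=10, limit=1000):
    # Increase the minimum candidate-region size until the false-positive
    # count no longer exceeds the threshold (or a safety limit is reached).
    while count_false_positives(min_size) > threshold and min_size < limit:
        min_size += step  # admit fewer small, noise-prone regions
    return min_size

# Toy evaluation: smaller minimums let through more spurious regions.
fp = lambda size: max(0, 50 - size // 2)
print(calibrate_min_size(20, fp, threshold=5))  # 90
```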
13. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
dividing an image into a plurality of sub-regions;
identifying two or more adjacent sub-regions of the image that contain text, wherein the identified adjacent sub-regions share overlapping image pixels;
combining the identified adjacent sub-regions into a first candidate text region;
determining a minimum size for candidate text regions;
determining that the candidate text region is smaller than the determined minimum size for candidate text regions; and
based on determining that the candidate text region is smaller than the minimum size for candidate text regions, bypassing optical character recognition for the candidate text region.
14. The system of claim 13, wherein identifying the adjacent sub-regions that contain text comprises:
extracting one or more features from each sub-region; and
providing the extracted features to a trained classifier to determine whether each sub-region contains text.
15. The system of claim 14, wherein the classifier has been trained on images of city street scenes, and wherein each image of a city street scene includes identified regions of text and regions of non-text.
16. The system of claim 14, wherein the operations further comprise:
scaling the image to multiple sizes prior to extracting the one or more features; and
determining that the sub-region contains text at two or more sizes.
17. The system of claim 14, wherein extracting features from each sub-region includes detecting corner features within the sub-region.
18. The system of claim 14, wherein extracting features from each sub-region includes computing projection profiles in each sub-region.
19. The system of claim 13, wherein the operations further comprise:
determining that a different second candidate text region is larger than the minimum size for candidate text regions; and
in response to determining that the different second candidate text region is larger than the minimum size for candidate text regions, performing optical character recognition for the second candidate text region.
20. The system of claim 13, wherein the operations further comprise:
receiving ranging data associated with the image;
generating a planar map of the image using the ranging data; and
determining that the candidate text region of the image is not located on a single plane, wherein bypassing optical character recognition for the candidate text region is further based on determining that the candidate text region of the image is not located on a single plane.
21. The system of claim 20, wherein generating a planar map comprises decomposing the image into planar and non-planar regions, and wherein determining that the candidate text region is not located on a single plane is based on determining that the candidate text region corresponds to a non-planar region.
22. The system of claim 13, wherein determining a minimum size for candidate text regions comprises:
determining a number of false positive results based on the minimum size for candidate text regions;
determining that the number of false positive results satisfies a threshold; and
in response to determining that the number of false positive results satisfies the threshold, determining an increased minimum size for candidate text regions.
23. A non-transitory computer readable medium comprising instructions that, when executed by one or more data processing apparatus, cause the one or more data processing apparatus to perform operations comprising:
dividing an image into a plurality of sub-regions;
identifying two or more adjacent sub-regions of the image that contain text, wherein the identified adjacent sub-regions share overlapping image pixels;
combining the identified adjacent sub-regions into a candidate text region;
determining a minimum size for candidate text regions;
determining that the candidate text region is smaller than the determined minimum size for candidate text regions; and
based on determining that the candidate text region is smaller than the minimum size for candidate text regions, bypassing optical character recognition for the candidate text region.
24. The computer readable medium of claim 23, wherein identifying the adjacent sub-regions that contain text comprises:
extracting one or more features from each sub-region; and
providing the extracted features to a trained classifier to determine whether each sub-region contains text.
25. The computer readable medium of claim 24, wherein the classifier has been trained on images of city street scenes, and wherein each image of a city street scene includes identified regions of text and regions of non-text.
26. The computer readable medium of claim 24, wherein the operations further comprise:
scaling the image to multiple sizes prior to extracting the one or more features; and
determining that the sub-region contains text at two or more sizes.
27. The computer readable medium of claim 24, wherein extracting features from each sub-region includes detecting corner features within the sub-region.
28. The computer readable medium of claim 24, wherein extracting features from each sub-region includes computing projection profiles in each sub-region.
29. The computer readable medium of claim 23, wherein the operations further comprise:
determining a minimum size for text in candidate text regions;
determining that a different second candidate text region is larger than the minimum size for candidate text regions; and
in response to determining that the different second candidate text region is larger than the minimum size for candidate text regions, performing optical character recognition for the second candidate text region.
30. The computer readable medium of claim 23, wherein the operations further comprise:
receiving ranging data associated with the image;
generating a planar map of the image using the ranging data; and
determining that the candidate text region of the image is not located on a single plane, wherein bypassing optical character recognition for the candidate text region is further based on determining that the candidate text region of the image is not located on a single plane.
31. The computer readable medium of claim 30, wherein the ranging data comprises a distance from a camera position when the image was taken to an object shown in the image.
32. The computer readable medium of claim 30, wherein generating a planar map comprises decomposing the image into planar and non-planar regions, and wherein determining that the candidate text region is not located on a single plane is based on determining that the candidate text region corresponds to a non-planar region.
33. The computer readable medium of claim 23, wherein the operations further comprise:
correcting perspective distortion in the image.
Specification