Text image quality based feedback for improving OCR
Abstract
An electronic device and method capture multiple images of a scene of real world at several zoom levels, the scene of real world containing text of one or more sizes. Then the electronic device and method extract from each of the multiple images, one or more text regions, followed by analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of the multiple images. When an attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, the version of the first text region is provided as input to OCR.
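The selection step in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the OCR-relevant attribute is text height in pixels and that the limit is a minimum height of 40 pixels; the text above names neither. `RegionVersion`, `meets_ocr_limit`, and `pick_version_for_ocr` are hypothetical names, and preferring the lowest sufficient zoom is an arbitrary tie-break.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption for illustration only: the attribute is text height in pixels
# and the OCR limit is a minimum height. The source specifies neither.
MIN_TEXT_HEIGHT_PX = 40.0


@dataclass(frozen=True)
class RegionVersion:
    region_id: str         # same id for the same physical text in every image
    zoom_level: float      # zoom at which the source image was captured
    text_height_px: float  # measured attribute for this version


def meets_ocr_limit(version: RegionVersion,
                    limit: float = MIN_TEXT_HEIGHT_PX) -> bool:
    """True when this version's attribute value meets the assumed OCR limit."""
    return version.text_height_px >= limit


def pick_version_for_ocr(versions: Sequence[RegionVersion],
                         limit: float = MIN_TEXT_HEIGHT_PX
                         ) -> Optional[RegionVersion]:
    """Return one version that meets the limit, or None if no version does."""
    candidates = [v for v in versions if meets_ocr_limit(v, limit)]
    if not candidates:
        return None
    # Arbitrary tie-break: prefer the lowest zoom level that is sufficient.
    return min(candidates, key=lambda v: v.zoom_level)
```

A caller would pass the returned version to its OCR engine, and treat `None` as the case where no captured version is yet good enough.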
28 Claims
1. A method to improve text recognition by using multiple images of identical text, the method comprising:
capturing a plurality of images of a scene of real world at a plurality of zoom levels, said scene of real world containing text of one or more sizes;
extracting from each of the plurality of images, one or more text regions;
analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of said plurality of images;
when the attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, providing the version of the first text region as input to OCR;
when the value of the attribute does not meet the limit of OCR, calculating a new zoom level at which the attribute of the first text region meets the limit of OCR, and storing at least an identification of the first text region in a list;
repeating the providing or the calculating, with other text regions extracted from the plurality of images;
using the list to identify a maximum zoom level that retains all text regions in the list within a field of view of a camera; and
based on the maximum zoom level, generating feedback to capture at least one additional image.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
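The zoom calculation and the maximum-zoom step of claim 1 can be sketched as follows. This is one possible reading under stated assumptions, not the patented implementation: text height is assumed to scale linearly with zoom, zoom is assumed to be centered on the frame, and region boxes are assumed to be given in pixel coordinates of the zoom-1.0 image. `PendingRegion`, `ZoomFeedback`, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class PendingRegion:
    region_id: str
    # (left, top, right, bottom) in pixels of the zoom-1.0 image (assumed).
    box: Tuple[float, float, float, float]
    text_height_px: float  # attribute value measured at zoom_level
    zoom_level: float


@dataclass(frozen=True)
class ZoomFeedback:
    zoom_level: float      # suggested zoom for the additional image
    unresolved: List[str]  # regions still below the limit at that zoom


def required_zoom(region: PendingRegion, limit_px: float) -> float:
    """Zoom at which text height would reach the limit (linear model)."""
    return region.zoom_level * limit_px / region.text_height_px


def max_zoom_retaining(regions: Sequence[PendingRegion],
                       width: float, height: float) -> float:
    """Largest centered zoom that keeps every listed box inside the frame."""
    cx, cy = width / 2.0, height / 2.0
    far_x = max(max(abs(r.box[0] - cx), abs(r.box[2] - cx)) for r in regions)
    far_y = max(max(abs(r.box[1] - cy), abs(r.box[3] - cy)) for r in regions)
    limits = []
    if far_x > 0:
        limits.append(cx / far_x)
    if far_y > 0:
        limits.append(cy / far_y)
    return min(limits) if limits else float("inf")


def zoom_feedback(pending: Sequence[PendingRegion], width: float,
                  height: float, limit_px: float) -> Optional[ZoomFeedback]:
    """Suggest a zoom for one more capture, or None if the list is empty."""
    if not pending:
        return None
    z_max = max_zoom_retaining(pending, width, height)
    unresolved = [r.region_id for r in pending
                  if required_zoom(r, limit_px) > z_max]
    return ZoomFeedback(zoom_level=z_max, unresolved=unresolved)
```

In this sketch a non-empty `unresolved` list means the suggested zoom keeps every listed region in view but still leaves some text below the assumed limit; what feedback to give in that case is left to the caller.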
14. At least one non-transitory computer readable storage media comprising a plurality of instructions to be executed by at least one processor to obtain multiple images for use in text recognition, the plurality of instructions comprising:
first instructions to capture a plurality of images of a scene of real world at a plurality of zoom levels, said scene of real world containing text of one or more sizes;
second instructions to extract from each of the plurality of images, one or more text regions;
third instructions to analyze an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of said plurality of images;
fourth instructions to provide the version of the first text region as input to OCR, when the attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region;
fifth instructions to calculate a new zoom level at which the attribute of the first text region meets the limit of OCR and store at least an identification of the first text region in a list, when the value of the attribute does not meet the limit of OCR;
sixth instructions to repeatedly execute the fourth instructions and the fifth instructions, with other text regions extracted from the plurality of images;
seventh instructions to use the list to identify a maximum zoom level that retains all text regions in the list within a field of view of a camera; and
based on the maximum zoom level, eighth instructions to generate feedback to capture at least one additional image.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
22. A mobile device to decode text in real world images, the mobile device comprising:
a camera;
a memory operatively connected to the camera to receive at least an image therefrom, the image comprising one or more text regions;
at least one processor operatively connected to the memory to execute a plurality of instructions stored in the memory;
wherein the plurality of instructions cause the at least one processor to:
capture a plurality of images of a scene of real world at a plurality of zoom levels, said scene of real world containing text of one or more sizes;
extract from each of the plurality of images, one or more text regions;
analyze an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of said plurality of images; and
when the attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, provide the version of the first text region as input to OCR;
when the value of the attribute does not meet the limit of OCR, calculate a new zoom level at which the attribute of the first text region meets the limit of OCR, and store at least an identification of the first text region in a list;
repeat execution of instructions to provide or instructions to calculate, with other text regions extracted from the plurality of images;
use the list to identify a maximum zoom level that retains all text regions in the list within a field of view of a camera; and
based on the maximum zoom level, generate feedback to capture at least one additional image.
Dependent claims: 23, 24, 25, 26, 27
28. A mobile device comprising:
a camera configured to capture a plurality of images of a scene of real world at a plurality of zoom levels, said scene of real world containing text of one or more sizes;
a memory coupled to the camera for storing the plurality of images;
means, coupled to the memory, for extracting from each of the plurality of images, one or more text regions;
means for analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of said plurality of images;
responsive to the attribute having a value that meets a limit of optical character recognition (OCR) in a version of the first text region, means for providing the version of the first text region as input to OCR;
responsive to the value of the attribute not meeting the limit of OCR, means for calculating a new zoom level at which the attribute of the first text region meets the limit of OCR, and storing at least an identification of the first text region in a list;
means for repeatedly invoking the means for providing or the means for calculating, with other text regions extracted from the plurality of images;
means for using the list to identify a maximum zoom level that retains all text regions in the list within a field of view of a camera; and
based on the maximum zoom level, means for generating feedback to capture at least one additional image.
Specification