Hybrid optical character recognition
First Claim
1. A computer-implemented method comprising:
generating an image at a mobile computing device using a camera;
determining features corresponding to a first word and a second word of text in the image;
generating, on the mobile computing device, mobile optical character recognition (OCR) data including mobile OCR results associated with the first word by performing OCR on the features associated with the first word;
determining an OCR latency of the mobile computing device;
determining an OCR accuracy of the mobile computing device;
sending the image to a remote device to perform remote OCR on the image;
causing the mobile OCR results to be displayed on the mobile computing device, including causing a first textual output associated with the first word to be displayed;
receiving remote OCR data, wherein the remote OCR data includes remote OCR results from the remote device associated with the second word;
determining differences between the mobile OCR data and the remote OCR data;
generating hybrid OCR results based on the differences by merging the mobile OCR data and the remote OCR data, including generating hybrid OCR results that include a second textual output associated with the second word and wherein the generating occurs based on at least one of the OCR latency being less than a threshold amount of time or the OCR accuracy being less than an accuracy threshold; and
causing the hybrid OCR results to be displayed on the mobile computing device, including causing the first textual output to be displayed with the second textual output.
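The merging step recited above can be illustrated with a minimal sketch. It is not taken from the patent: the `WordResult` type, the per-word confidence score, the region ids used as dictionary keys, the default threshold values, and the rule of preferring the higher-confidence result are all assumptions made for illustration. Only the overall flow (find differences, merge, gate the merge on the latency or accuracy condition) follows the claim language.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class WordResult:
    """One recognized word. Hypothetical type, not from the patent."""
    text: str
    confidence: float  # assumed per-word score in [0.0, 1.0]


def diff_results(mobile: Dict[int, WordResult],
                 remote: Dict[int, WordResult]) -> Set[int]:
    """Return ids of word regions that the remote data adds or changes."""
    return {
        rid for rid, word in remote.items()
        if rid not in mobile or mobile[rid].text != word.text
    }


def merge_results(mobile: Dict[int, WordResult],
                  remote: Dict[int, WordResult],
                  ocr_latency_s: float,
                  ocr_accuracy: float,
                  latency_threshold_s: float = 0.5,   # assumed value
                  accuracy_threshold: float = 0.9     # assumed value
                  ) -> Dict[int, WordResult]:
    """Merge mobile and remote OCR data into hybrid results.

    The merge runs only when the on-device latency is below the latency
    threshold or the on-device accuracy is below the accuracy threshold,
    mirroring the "at least one of" condition in the claim.
    """
    if not (ocr_latency_s < latency_threshold_s
            or ocr_accuracy < accuracy_threshold):
        return dict(mobile)

    hybrid = dict(mobile)
    for rid in diff_results(mobile, remote):
        local = mobile.get(rid)
        candidate = remote[rid]
        # Assumed tie-break: take the remote word when the device has no
        # result for this region or the remote confidence is at least as high.
        if local is None or candidate.confidence >= local.confidence:
            hybrid[rid] = candidate
    return hybrid


if __name__ == "__main__":
    mobile = {0: WordResult("Hello", 0.95)}            # first word, on device
    remote = {0: WordResult("Hello", 0.99),
              1: WordResult("world", 0.90)}            # second word, from server
    hybrid = merge_results(mobile, remote, ocr_latency_s=0.2, ocr_accuracy=0.8)
    print(" ".join(hybrid[i].text for i in sorted(hybrid)))  # prints "Hello world"
```

In this sketch the first word keeps its on-device result because the remote text is identical, and the second word is filled in from the remote data, so both textual outputs end up displayed together.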
Abstract
Embodiments of the subject technology provide a hybrid OCR approach that combines server-side and device-side processing to offset the disadvantages of performing OCR solely on the server or solely on the device. More specifically, the subject technology uses image characteristics such as glyph details and image quality measurements to opportunistically schedule OCR processing on the mobile device, the server, or both. Text extracted by a “faster” OCR engine (e.g., one with less latency) is displayed to a user first and is then updated with the result of a more accurate OCR engine (e.g., an OCR engine provided by the server). This approach also allows additional parameters, such as network latency and user preference, to be factored into scheduling decisions. Thus, the subject technology may provide significant gains in terms of reduced latency and increased accuracy by implementing one or more techniques associated with this hybrid OCR approach.
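The scheduling idea in the abstract can be sketched as a small decision function. This is an illustrative assumption, not the patented scheduler: the function name `schedule_ocr`, the `Target` flags, the input scales, and every threshold below are hypothetical. The abstract only states that image characteristics, network latency, and user preference may feed the decision.

```python
from enum import Flag, auto


class Target(Flag):
    """Where OCR should run. Hypothetical names for illustration."""
    DEVICE = auto()
    SERVER = auto()


def schedule_ocr(image_quality: float,
                 min_glyph_height_px: int,
                 network_latency_ms: float,
                 prefer_low_latency: bool = True) -> Target:
    """Pick OCR targets from image characteristics and network conditions.

    All thresholds are assumed values chosen only to make the sketch concrete.
    """
    # Assume the on-device engine copes with clean images and large glyphs.
    device_ok = image_quality >= 0.6 and min_glyph_height_px >= 12
    # Assume the server round trip is worthwhile below this latency.
    network_ok = network_latency_ms <= 800

    if device_ok and network_ok:
        # Run both: show the fast device result, then refine from the server.
        # A user who prefers accuracy over speed waits for the server only.
        return Target.DEVICE | Target.SERVER if prefer_low_latency else Target.SERVER
    if network_ok:
        return Target.SERVER
    # Poor network (whatever the image quality): stay on the device so the
    # user still sees some result.
    return Target.DEVICE


if __name__ == "__main__":
    print(schedule_ocr(0.9, 20, 100))    # sharp image, fast network
    print(schedule_ocr(0.3, 8, 100))     # blurry image, small glyphs
    print(schedule_ocr(0.9, 20, 5000))   # sharp image, slow network
```

Returning a combinable flag rather than a single choice reflects the "mobile device and/or server" wording: the two engines are not mutually exclusive, and running both is what enables the display-then-update behavior.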
20 Claims
4. A system comprising:
at least one processor; and
a memory device including instructions that, when executed by the at least one processor, cause the at least one processor to:
receive an image at the system;
determine an OCR latency of the system;
determine an OCR accuracy of the system;
send the image to a remote device to perform remote optical character recognition (OCR) on the image;
generate mobile OCR data, wherein the mobile OCR data comprises mobile OCR results;
receive remote OCR data, wherein the remote OCR data includes remote OCR results from the remote device;
determining differences between the mobile OCR data and the remote OCR data;
generate hybrid OCR results based at least in part on: the differences, the mobile OCR data, the remote OCR data, and at least one of the OCR latency being less than a threshold amount of time or the OCR accuracy being less than an accuracy threshold; and
cause the hybrid OCR results to be displayed.
13. A non-transitory computer-readable medium including instructions stored therein that, when executed by at least one computing device, cause the at least one computing device to:
receive an image at the at least one computing device;
determine an OCR latency of the at least one computing device;
determine an OCR accuracy of the at least one computing device;
send the image to a remote device to perform remote optical character recognition (OCR) on the image;
generate mobile OCR data, wherein the mobile OCR data comprises mobile OCR results;
receive remote OCR data, wherein the remote OCR data includes remote OCR results from the remote device;
determining differences between the mobile OCR data and the remote OCR data;
generate hybrid OCR results based at least in part on: the differences, the mobile OCR data, the remote OCR data, and at least one of the OCR latency being less than a threshold amount of time or the OCR accuracy being less than an accuracy threshold; and
cause the hybrid OCR results to be displayed.
Specification