On-screen guideline-based selective text recognition
First Claim
1. A computer-implemented method for selectively recognizing text in a live video stream, comprising:
receiving a video frame from a camera in real time;
displaying a guideline overlaid on the video frame on a display device;
identifying a text region in the video frame associated with the guideline, the text region comprising text; and
converting the text in the text region into an editable symbolic form, the converting comprising:
identifying a candidate language for a line of text in the text region based at least in part on an orientation of the line of text;
using OCR functions associated with the candidate language to determine a plurality of candidate texts in the editable symbolic form;
displaying the plurality of candidate texts;
receiving a user selection of one of the plurality of candidate texts; and
identifying the selected candidate text as the converted text for the text region.
Abstract
A live video stream captured by an on-device camera is displayed on a screen with an overlaid guideline. Video frames of the live video stream are analyzed for a video frame with acceptable quality. A text region is identified in the video frame approximate to the on-screen guideline and cropped from the video frame. The cropped image is transmitted to an optical character recognition (OCR) engine, which processes the cropped image and generates text in an editable symbolic form (the OCR'ed text). A confidence score is determined for the OCR'ed text and compared with a threshold value. If the confidence score exceeds the threshold value, the OCR'ed text is outputted.
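The region-selection and thresholding steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Box` type, the nearest-center rule, the `max_distance` cutoff, and the function names are all assumptions introduced here, since the abstract only says that a region near the guideline is identified and that text is output when its confidence score exceeds a threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a detected text line, in pixels (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2


def region_near_guideline(boxes: List[Box], guideline_y: int,
                          max_distance: int) -> Optional[Box]:
    """Pick the box whose vertical center is closest to a horizontal guideline.

    Assumption: "near the guideline" is modeled as vertical distance between
    the box center and the guideline. Returns None if no box is close enough.
    """
    nearby = [b for b in boxes if abs(b.center_y - guideline_y) <= max_distance]
    if not nearby:
        return None
    return min(nearby, key=lambda b: abs(b.center_y - guideline_y))


def accept(ocr_text: str, confidence: float, threshold: float) -> Optional[str]:
    """Output the OCR'ed text only if its confidence score exceeds the threshold."""
    return ocr_text if confidence > threshold else None
```

A caller would crop the frame to the returned box before passing it to an OCR engine, and would keep analyzing later frames when either function returns `None`.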
25 Claims
1. A computer-implemented method for selectively recognizing text in a live video stream, comprising:
receiving a video frame from a camera in real time;
displaying a guideline overlaid on the video frame on a display device;
identifying a text region in the video frame associated with the guideline, the text region comprising text; and
converting the text in the text region into an editable symbolic form, the converting comprising:
identifying a candidate language for a line of text in the text region based at least in part on an orientation of the line of text;
using OCR functions associated with the candidate language to determine a plurality of candidate texts in the editable symbolic form;
displaying the plurality of candidate texts;
receiving a user selection of one of the plurality of candidate texts; and
identifying the selected candidate text as the converted text for the text region.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 25.
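The converting steps of claim 1 can be sketched in Python as follows. This is a hedged illustration only: the orientation test (bounding-box aspect ratio), the `LANGUAGES_BY_ORIENTATION` table, and the `ocr` callable are hypothetical stand-ins, because the claim states that orientation informs the candidate language but does not fix a particular mapping or OCR interface.

```python
from typing import Callable, Dict, List

# Hypothetical mapping from line orientation to candidate languages.
# Assumption: vertical lines are more likely to be CJK scripts.
LANGUAGES_BY_ORIENTATION: Dict[str, List[str]] = {
    "vertical": ["ja", "zh", "ko"],
    "horizontal": ["en", "ja", "zh", "ko"],
}


def line_orientation(width: int, height: int) -> str:
    """Classify a text line by the aspect ratio of its bounding box (an assumption)."""
    return "vertical" if height > width else "horizontal"


def candidate_texts(image: bytes, width: int, height: int,
                    ocr: Callable[[bytes, str], str]) -> List[str]:
    """Run the language-specific OCR function for each candidate language.

    `ocr(image, lang)` is a caller-supplied placeholder for a real OCR engine.
    Empty and duplicate results are dropped, and order is preserved.
    """
    results: List[str] = []
    for lang in LANGUAGES_BY_ORIENTATION[line_orientation(width, height)]:
        text = ocr(image, lang)
        if text and text not in results:
            results.append(text)
    return results


def converted_text(candidates: List[str], selected_index: int) -> str:
    """Treat the candidate the user selected as the converted text."""
    return candidates[selected_index]
```

In an application, the list returned by `candidate_texts` would be displayed to the user, and the index of the tapped entry would be passed to `converted_text`.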
15. A non-transitory computer-readable storage medium encoded with executable computer program code for selectively recognizing text in a live video stream, the computer program code comprising program code for:
receiving a video frame from a camera in real time;
displaying a guideline overlaid on the video frame on a display device;
identifying a text region in the video frame associated with the guideline, the text region comprising text; and
converting the text in the text region into an editable symbolic form, the converting comprising:
identifying a candidate language for a line of text in the text region based at least in part on an orientation of the line of text;
using OCR functions associated with the candidate language to determine a plurality of candidate texts in the editable symbolic form;
displaying the plurality of candidate texts;
receiving a user selection of one of the plurality of candidate texts; and
identifying the selected candidate text as the converted text for the text region.
16. A computer system for selectively recognizing text in a live video stream, comprising:
a computer-readable storage medium comprising executable computer program code for:
a video User Interface (UI) module for receiving a video frame from a camera in real time and displaying a guideline overlaid on the video frame on a display device;
a text region identification module for identifying a text region in the video frame associated with the guideline, the text region comprising text; and
an OCR module for:
converting the text in the text region into an editable symbolic form, the converting comprising:
identifying a candidate language for a line of text in the text region based at least in part on an orientation of the line of text;
using OCR functions associated with the candidate language to determine a plurality of candidate texts in the editable symbolic form;
displaying the plurality of candidate texts;
receiving a user selection of one of the plurality of candidate texts; and
identifying the selected candidate text as the converted text for the text region.
17. A computer-implemented method for converting text in a series of received images into text in an editable symbolic form, comprising:
receiving a series of images from a client, the series of images comprising a first image;
processing the first image using OCR functions to generate text in the editable symbolic form;
determining whether the generated text includes a spelling error;
determining a confidence score for the generated text based on text generated for other images in the series of images received from the client, the confidence score being higher if the generated text does not include a spelling error than if the generated text includes a spelling error; and
responsive to the confidence score exceeding a threshold value, transmitting the generated text to the client.
Dependent claims: 18, 19, 20, 21, 22.
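One way to picture the confidence scoring of claim 17 is the Python sketch below. The scoring formula is an assumption made for illustration: agreement with text from other images is measured with `difflib.SequenceMatcher`, spelling is checked against a caller-supplied word set, and the two are blended with a hypothetical `spelling_weight`. The claim requires only that the score depend on text from other images and be higher when no spelling error is present.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Set


def has_spelling_error(text: str, dictionary: Set[str]) -> bool:
    """Return True if any word is missing from the dictionary (a simplistic check)."""
    return any(word.strip(".,;:!?").lower() not in dictionary
               for word in text.split())


def confidence_score(text: str, other_texts: List[str], dictionary: Set[str],
                     spelling_weight: float = 0.3) -> float:
    """Blend cross-image agreement with a spelling bonus into a score in [0, 1].

    Assumption: agreement is the mean similarity ratio between `text` and the
    text generated for the other images in the series.
    """
    if other_texts:
        agreement = sum(SequenceMatcher(None, text, other).ratio()
                        for other in other_texts) / len(other_texts)
    else:
        agreement = 0.0
    spelling_bonus = 0.0 if has_spelling_error(text, dictionary) else 1.0
    return (1 - spelling_weight) * agreement + spelling_weight * spelling_bonus


def text_to_transmit(text: str, other_texts: List[str], dictionary: Set[str],
                     threshold: float) -> Optional[str]:
    """Return the text for transmission only if its score exceeds the threshold."""
    score = confidence_score(text, other_texts, dictionary)
    return text if score > threshold else None
```

Because the spelling bonus is additive, the same text always scores strictly higher without a spelling error than with one, which matches the ordering the claim describes.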
23. A non-transitory computer-readable storage medium encoded with executable computer program code for converting text in a series of received images into text in an editable symbolic form, the computer program code comprising program code for:
receiving a series of images from a client, the series of images comprising a first image;
identifying a candidate language for a line of text in the first image based at least in part on an orientation of the line of text;
processing the first image using OCR functions associated with the candidate language to generate text in the editable symbolic form;
determining a confidence score for the generated text based on text generated for other images in the series of images received from the client; and
responsive to the confidence score exceeding a threshold value, transmitting the generated text to the client in response to the series of images.
24. A computer system for converting text in a series of received images into text in an editable symbolic form, comprising:
a computer-readable storage medium comprising executable computer program code for:
an OCR engine for:
receiving a series of images from a client, the series of images comprising a first image, and
processing the first image using OCR functions to generate a plurality of candidate texts in the editable symbolic form; and
a confidence evaluation module for:
determining a confidence score for each of the plurality of generated candidate texts based on text generated for other images in the series of images received from the client to quantify a confidence of the candidate text matching text in the first image, and
transmitting ones of the generated candidate texts to the client in response to the confidence scores of the ones of the candidate texts exceeding a threshold value.
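The confidence evaluation module of claim 24 scores several candidate texts rather than one. A minimal Python sketch, assuming a simple voting rule that the claim itself does not specify, scores each candidate by the fraction of other images whose generated text matches it exactly; the function names are hypothetical.

```python
from typing import Dict, List


def score_candidates(candidates: List[str],
                     other_image_texts: List[str]) -> Dict[str, float]:
    """Score each candidate by the share of other images that produced the same text.

    Assumption: exact string equality counts as a vote. With no other images,
    every candidate scores 0.0.
    """
    total = len(other_image_texts)
    if total == 0:
        return {c: 0.0 for c in candidates}
    return {c: other_image_texts.count(c) / total for c in candidates}


def candidates_to_transmit(candidates: List[str], other_image_texts: List[str],
                           threshold: float) -> List[str]:
    """Return the candidates whose score exceeds the threshold, best first."""
    scores = score_candidates(candidates, other_image_texts)
    passing = [c for c in candidates if scores[c] > threshold]
    return sorted(passing, key=lambda c: scores[c], reverse=True)
```

Exact-match voting is deliberately strict; a similarity-based measure would tolerate single-character OCR differences between frames at the cost of admitting more near-duplicates.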
Specification