METHOD FOR PROCESSING OPTICAL CHARACTER RECOGNIZER OUTPUT
Abstract
A method, a system, and a computer program product for processing the output of an OCR are disclosed. The system receives a first character sequence from the OCR. A first set of characters from the first character sequence is converted to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores.
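As a rough illustration of the conversion step described in the abstract, the following Python sketch expands each OCR character through a look-up table of plausible replacements and keeps the candidate sequence with the best language score. The table contents, the toy bigram log-probabilities, and the names `LOOKUP`, `BIGRAM_LOGP`, `UNSEEN_LOGP`, `language_score`, and `convert` are assumptions made for this example; the source does not specify the form of the look-up table or of the language model.

```python
from itertools import product

# Hypothetical look-up table: OCR character -> characters it may stand for.
LOOKUP = {
    "0": ["0", "o"],
    "1": ["1", "l"],
    "5": ["5", "s"],
}

# Toy character-bigram log-probabilities standing in for a language model.
BIGRAM_LOGP = {
    ("h", "e"): -0.5,
    ("e", "l"): -0.7,
    ("l", "l"): -0.9,
    ("l", "o"): -0.6,
}
UNSEEN_LOGP = -5.0  # assumed penalty for bigrams the toy model has never seen


def language_score(seq: str) -> float:
    """Sum the bigram log-probabilities over adjacent character pairs."""
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(seq, seq[1:]))


def convert(first_sequence: str) -> str:
    """Return the candidate second sequence with the highest language score."""
    # Characters missing from the table are kept unchanged.
    options = [LOOKUP.get(ch, [ch]) for ch in first_sequence]
    candidates = ("".join(chars) for chars in product(*options))
    return max(candidates, key=language_score)


print(convert("he110"))
```

Enumerating every combination grows exponentially with the number of ambiguous characters, so this only suits short sequences; a beam or lattice search would be a more plausible choice at scale.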
29 Citations
20 Claims
1. A computer implemented method for processing an output of an optical character recognizer (OCR), the computer implemented method comprising:
- receiving a first character sequence from the OCR; and
- converting a first set of characters from the first character sequence to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores generated by a language model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer implemented method for processing an output of an optical character recognizer (OCR), the computer implemented method comprising:
- receiving a first character sequence from the OCR; and
- converting a first set of characters from the first character sequence to a corresponding second set of characters to generate a second character sequence based on one or more finite state transducers (FSTs) corresponding to each character of the first character sequence and language scores generated by a language model, wherein weights associated with the one or more FSTs and the language model are determined using a Minimum Error Rate Training (MERT) technique.
11. A computer implemented method for language translation comprising:
- receiving a first character sequence from an optical character recognizer (OCR), the first character sequence being in a first language;
- converting a first set of characters from the first character sequence to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores generated by a language model, wherein the second character sequence is in the first language; and
- translating a first word sequence corresponding to the second character sequence to a second word sequence in a second language.
12. A system for processing an output of an optical character recognizer (OCR), the system comprising a processor coupled to a memory, the memory having stored therein one or more program modules comprising:
- a conversion module configured for converting a first set of characters in a first character sequence to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores, wherein the first character sequence is received from the OCR.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A system for language translation, the system comprising a processor coupled to a memory, the memory having stored therein one or more program modules comprising:
- a conversion module configured for converting a first set of characters in a first character sequence to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores generated by a language model, the first character sequence being received from an Optical Character Recognizer (OCR), wherein the first character sequence and the second character sequence are in a first language; and
- a translation module configured for translating a first word sequence corresponding to the second character sequence to a second word sequence in a second language.
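Claim 10 recites weights for the FSTs and the language model that are determined using MERT. The sketch below only illustrates the underlying idea: candidates are ranked by a weighted sum of a conversion score and a language score, and the weights are chosen to minimize character errors on a small development set. It uses a plain grid search as a simplified stand-in, not MERT itself, and the development data, the score values, and the names `DEV_SET`, `edit_distance`, `best_candidate`, `total_errors`, and `tune_weights` are all hypothetical.

```python
from itertools import product

# Hypothetical development set: (reference, n-best list), where each n-best
# entry is (candidate, conversion_score, language_score) in log space.
DEV_SET = [
    ("hello", [("he110", 0.0, -12.0), ("hello", -3.0, -2.7)]),
    ("2015", [("2015", 0.0, -4.0), ("zols", -4.0, -3.5)]),
]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as the character error count."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def best_candidate(nbest, w_conv: float, w_lm: float) -> str:
    """Pick the candidate with the highest weighted combined score."""
    return max(nbest, key=lambda c: w_conv * c[1] + w_lm * c[2])[0]


def total_errors(weights, dev_set) -> int:
    """Count character errors of the top candidates under the given weights."""
    w_conv, w_lm = weights
    return sum(
        edit_distance(best_candidate(nbest, w_conv, w_lm), ref)
        for ref, nbest in dev_set
    )


def tune_weights(dev_set, grid=(0.0, 0.5, 1.0, 1.5, 2.0)):
    """Grid search (a stand-in for MERT) for the error-minimizing weights."""
    return min(product(grid, repeat=2), key=lambda w: total_errors(w, dev_set))


weights = tune_weights(DEV_SET)
print(weights, total_errors(weights, DEV_SET))
```

In this toy data, relying on the language score alone turns the valid string "2015" into "zols", while relying on the conversion score alone leaves "he110" uncorrected, which is why the two weights need to be balanced against an error measure.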
Specification