METHOD AND SYSTEM FOR THE SPOTTING OF ARBITRARY WORDS IN HANDWRITTEN DOCUMENTS
First Claim
1. A method for the spotting of keywords in a handwritten document, comprising the steps of:
inputting an image of the handwritten document;
performing word segmentation on the image to obtain segmented words;
performing word matching, consisting in the sub-steps of:
performing character segmentation on the segmented words;
performing character recognition on the segmented characters;
performing distance computations on the recognized characters using a Generalized Hidden Markov Model with ergodic topology to identify words based on character models;
performing non-keyword rejection using a classifier based on a combination of Gaussian Mixture Models, Hidden Markov Models and Support Vector Machines;
outputting the spotted keywords.
Abstract
A method and system for the spotting of keywords in a handwritten document, the method comprising the steps of inputting an image of the handwritten document, performing word segmentation on the image to obtain segmented words, performing word matching, and outputting the spotted keywords. The word matching itself consists of the sub-steps of performing character segmentation on the segmented words, performing character recognition on the segmented characters, performing distance computations on the recognized characters using a Generalized Hidden Markov Model with ergodic topology to identify words based on character models, and performing non-keyword rejection using a classifier based on a combination of Gaussian Mixture Models, Hidden Markov Models and Support Vector Machines.
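The distance computation named in the abstract can be illustrated with a minimal sketch. It assumes a plain Hidden Markov Model whose states are character models and whose topology is ergodic, meaning every state may follow every other state; the "Generalized" variant is not defined in this section and is not reproduced here. The function names, the uniform transition model, and the use of negative log-likelihood as the distance are all assumptions made for illustration.

```python
import math

def uniform_ergodic_model(alphabet):
    """Build a fully connected (ergodic) character model with uniform
    start and transition probabilities. Hypothetical helper."""
    log_p = -math.log(len(alphabet))
    log_start = {c: log_p for c in alphabet}
    log_trans = {a: {b: log_p for b in alphabet} for a in alphabet}
    return log_start, log_trans

def keyword_distance(keyword, char_posteriors, log_start, log_trans):
    """Negative log-likelihood of the state path that spells `keyword`.

    char_posteriors holds one dict per segmented character, mapping a
    character label to the recognizer's probability for that segment
    (an assumed recognizer output format). Smaller distance means a
    better match; math.inf means the keyword cannot explain the segments.
    """
    if not keyword or len(keyword) != len(char_posteriors):
        return math.inf
    log_score = log_start[keyword[0]]
    previous = None
    for ch, posterior in zip(keyword, char_posteriors):
        p = posterior.get(ch, 0.0)
        if p <= 0.0:
            return math.inf
        if previous is not None:
            # Ergodic topology: a transition exists for every label pair.
            log_score += log_trans[previous][ch]
        log_score += math.log(p)
        previous = ch
    return -log_score

# Illustrative recognizer output for a three-character word image.
posteriors = [
    {"c": 0.8, "a": 0.1, "t": 0.1},
    {"a": 0.7, "c": 0.2, "t": 0.1},
    {"t": 0.9, "a": 0.05, "c": 0.05},
]
log_start, log_trans = uniform_ergodic_model("act")
for word in ("cat", "act"):
    print(word, keyword_distance(word, posteriors, log_start, log_trans))
```

Under these assumptions the keyword with the smallest distance would be passed on to the rejection stage, which this sketch does not cover.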
25 Claims
1. A method for the spotting of keywords in a handwritten document, comprising the steps of:
inputting an image of the handwritten document;
performing word segmentation on the image to obtain segmented words;
performing word matching, consisting in the sub-steps of:
performing character segmentation on the segmented words;
performing character recognition on the segmented characters;
performing distance computations on the recognized characters using a Generalized Hidden Markov Model with ergodic topology to identify words based on character models;
performing non-keyword rejection using a classifier based on a combination of Gaussian Mixture Models, Hidden Markov Models and Support Vector Machines;
outputting the spotted keywords.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 21
13-14. (canceled)
22. A method for improving word segmentation of a document, comprising the steps of:
obtaining extracted text lines from an image of the document;
generating word segmentation hypotheses for each of the extracted text lines using one of a Markov Chain and a Hidden Markov Model;
performing a threshold selection on the word segmentation hypotheses using a segmentation threshold;
selecting the most likely word segmentation hypotheses based on the segmentation threshold;
providing the segmented words.
Dependent claims: 24
23. (canceled)
25-54. (canceled)
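The steps of claim 22 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes each text line is reduced to the pixel widths of the gaps between adjacent connected components, labels each gap as a word break or not with a two-state Hidden Markov Model, and enumerates every labeling exhaustively. The Gaussian gap-width likelihoods, the parameter values, and all function names are hypothetical.

```python
import math
from itertools import product

def gap_likelihood(width, is_break, break_mean=20.0, intra_mean=5.0, sigma=5.0):
    """Unnormalized Gaussian likelihood of a gap width (assumed model)."""
    mean = break_mean if is_break else intra_mean
    return math.exp(-0.5 * ((width - mean) / sigma) ** 2)

def segmentation_hypotheses(gap_widths, p_break=0.3, p_stay=0.5):
    """Return (labels, posterior) pairs sorted by decreasing posterior.

    labels[i] is True when gap i is hypothesized to be a word break.
    Exhaustive enumeration is exponential in the number of gaps, so it
    only suits short lines; it is used here purely for clarity.
    """
    scored = []
    for labels in product((False, True), repeat=len(gap_widths)):
        joint = 1.0
        for i, (width, is_break) in enumerate(zip(gap_widths, labels)):
            if i == 0:
                prior = p_break if is_break else 1.0 - p_break
            else:
                prior = p_stay if is_break == labels[i - 1] else 1.0 - p_stay
            joint *= prior * gap_likelihood(width, is_break)
        scored.append((labels, joint))
    total = sum(joint for _, joint in scored)
    hypotheses = [(labels, joint / total) for labels, joint in scored]
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)

def select_hypotheses(hypotheses, threshold):
    """Threshold selection: keep hypotheses at or above the threshold."""
    return [h for h in hypotheses if h[1] >= threshold]

def split_words(components, labels):
    """Group components into words according to the break labels."""
    if not components:
        return []
    words, current = [], [components[0]]
    for component, is_break in zip(components[1:], labels):
        if is_break:
            words.append(current)
            current = [component]
        else:
            current.append(component)
    words.append(current)
    return words

# Four components separated by three gaps (widths in pixels, illustrative).
hyps = segmentation_hypotheses([4, 22, 6])
for labels, _ in select_hypotheses(hyps, threshold=0.5):
    print(split_words(["c1", "c2", "c3", "c4"], labels))
```

A practical system would likely replace the enumeration with an n-best Viterbi search and estimate the gap model from training data; both choices are outside what the claim text specifies.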
Specification