HANDWRITTEN WORD SPOTTER USING SYNTHESIZED TYPED QUERIES
First Claim
1. A method comprising:
receiving a query string;
generating at least one computer-generated image based on the query string;
training a model based on the at least one computer generated image;
scoring candidate handwritten word images of a collection of handwritten word images using the trained model; and
based on the scores, identifying a subset of the word images.
Abstract
A wordspotting system and method are disclosed for processing candidate word images extracted from handwritten documents. In response to a user inputting a selected query string, such as a word to be searched for in one or more of the handwritten documents, the system automatically generates at least one computer-generated image based on the query string in a selected font or fonts. A model is trained on the computer-generated image(s) and is thereafter used in scoring the candidate handwritten word images. The candidate or candidates with the highest scores, and/or the documents containing them, can be presented to the user, tagged, or otherwise processed differently from other candidate word images or documents.
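The pipeline summarized in the abstract (synthesize images of the query, train a model on them, score candidates, keep the best) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `GLYPHS` bitmaps and the `stretch` factor stand in for real font rendering, `scribble` stands in for handwritten word images, the per-column ink count stands in for real features, and a nearest-template model scored by dynamic time warping is swapped in for the hidden Markov model recited in claim 24. It is not the disclosed implementation.

```python
# Hypothetical 5-pixel-high glyphs, one string per pixel column ("1" = ink).
GLYPHS = {
    "a": ["01111", "10100", "01111"],
    "c": ["01110", "10001", "10001"],
    "d": ["11111", "10001", "01110"],
    "o": ["01110", "10001", "01110"],
    "t": ["10000", "11111", "10000"],
}

def render(text, stretch=1):
    """Synthesize a word image as a list of pixel columns.
    `stretch` repeats each column and acts as a crude stand-in for a font."""
    return [col for ch in text for col in GLYPHS[ch] for _ in range(stretch)]

def scribble(word):
    """Stand-in for a handwritten word image: columns stretched unevenly."""
    out = []
    for i, col in enumerate(render(word)):
        out.extend([col] * (2 if i % 2 == 0 else 1))
    return out

def features(image):
    """One feature per column: the number of ink pixels."""
    return [col.count("1") for col in image]

def train(query, stretches=(1, 2)):
    """'Train' a template model: one feature sequence per synthesized font."""
    return [features(render(query, s)) for s in stretches]

def dtw(a, b):
    """Dynamic-time-warping distance between two feature sequences."""
    inf = float("inf")
    prev = [0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]

def score(model, image):
    """Higher is better: negated distance to the closest template."""
    feats = features(image)
    return max(-dtw(template, feats) for template in model)

def spot(query, candidates, top_k=1):
    """Score every candidate image and return the indices of the top_k."""
    model = train(query)
    scores = [score(model, img) for img in candidates]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k], scores

words = ["dot", "cat", "cot", "tac"]
hits, scores = spot("cat", [scribble(w) for w in words])
print(hits, scores)
```

In this toy setup the candidate at index 1 (the scribbled "cat") ranks first, because warping absorbs the uneven column stretching while the other words differ in their ink profiles. A real system would replace each stand-in with actual font rendering, richer features, and a statistical model.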
25 Claims
1. A method comprising:
receiving a query string;
generating at least one computer-generated image based on the query string;
training a model based on the at least one computer generated image;
scoring candidate handwritten word images of a collection of handwritten word images using the trained model; and
based on the scores, identifying a subset of the word images.
Dependent claims: 2-18
19. A computer implemented processing system comprising:
a synthesizer which synthesizes at least one computer-generated image based on a received query string;
a model which is trained on features extracted from the at least one computer-generated image;
a scoring component which scores candidate handwritten word images of a collection of candidate handwritten word images against the model and, based on the scores, identifies a subset of the handwritten word images.
Dependent claims: 20-23
24. A computer implemented method for wordspotting comprising:
receiving a query string to be searched for in a collection of candidate handwritten word images extracted from one or more documents;
for each of a set of fonts, automatically generating an image based on the query string;
modeling the query string with a semi-continuous hidden Markov model, a subset of the parameters of the semi-continuous hidden Markov model being estimated based on features extracted from the images in the different fonts, and other parameters of the semi-continuous hidden Markov model being previously trained on sample handwritten word images without consideration of the query string;
scoring candidate handwritten word images of the collection against the trained semi-continuous hidden Markov model; and
based on the scoring, labeling one or more of the candidate handwritten word images, or a document containing one or more of the candidate handwritten word images.
Dependent claims: 25
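As background for the semi-continuous hidden Markov model recited in claim 24: in the standard semi-continuous formulation, all states share a single pool of Gaussians and differ only in their mixture weights. The notation below (state index j, feature vector x, K shared Gaussians with means \mu_k and covariances \Sigma_k, weights w_{jk}) is assumed for illustration and does not appear in the claims.

```latex
b_j(x) \;=\; \sum_{k=1}^{K} w_{jk}\,\mathcal{N}\!\left(x;\ \mu_k, \Sigma_k\right),
\qquad \sum_{k=1}^{K} w_{jk} = 1, \quad w_{jk} \ge 0 .
```

Under this formulation, one plausible reading of the claim is that the shared Gaussians (\mu_k, \Sigma_k) are among the parameters trained beforehand on sample handwritten word images, while state-specific parameters such as the weights w_{jk} are among those estimated from the synthesized images. The claim itself does not state which parameters fall into which subset.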
Specification