Trie based method for indexing handwritten databases
First Claim
1. A method for matching an input sequence of continuously connected handwritten objects to one of a plurality of objects which are modeled by concatenating members of a set of component objects, the method comprising the steps of:
- (a) generating a Trie data structure representing the plurality of objects, the Trie data structure having a plurality of nodes divided into a plurality of levels, wherein each node includes a plurality of elements, including the steps of;
assigning component objects of each of the plurality of objects to respective elements of respective nodes of the Trie data structure,associating identifying characteristics of the respective component objects with each element in each node of the Trie data structure, wherein the identifying characteristics are numbers of local minima, local maxima and inflection points in each of the component objects, andassociating a respective hidden Markov model with each element of each node, the hidden Markov model representing the respective component object of the element;
(b) selecting a node of the Trie data structure;
(c) applying the input sequence of continuously connected handwritten objects to each of the hidden Markov models associated with the respective plurality of elements of the selected node to generate a respective plurality of acceptance values;
(d) identifying a segment of the input sequence with the element of the selected node that generates the acceptance value which is larger than any other one of the acceptance values, based on the identifying characteristics associated with the element, without interference from a portion of the input sequence that does not correspond to the component object assigned to the element; and
(e) deleting the identified segment from the input sequence of continuously connected handwritten objects.
3 Assignments
0 Petitions
Accused Products
Abstract
A method is disclosed for matching input data representing a continuous combination of input objects to a plurality of objects in a trie database structure. This data structure has a plurality of nodes partitioned into a plurality of levels. Each node in the Trie includes a plurality of elements where each element corresponds to a respective one of the component objects. In addition, a hidden Markov model corresponding to the component object is associated with the element in the database. According to the method, the input object is applied to each of the hidden Markov models associated with the respective plurality of elements of a node to generate a respective plurality of acceptance values. The element which generates the largest acceptance value is identified with a segment of the input data. The component object for this element is recorded and the identified segment is deleted from the input data string. These steps are repeated at successive levels of the Trie data structure until each segment of the input data has been identified with an element of a node of the Trie data structure. The input data is matched with the objects stored in the database by concatenating the component objects associated with the respective identified elements.
-
Citations
9 Claims
-
1. A method for matching an input sequence of continuously connected handwritten objects to one of a plurality of objects which are modeled by concatenating members of a set of component objects, the method comprising the steps of:
-
(a) generating a Trie data structure representing the plurality of objects, the Trie data structure having a plurality of nodes divided into a plurality of levels, wherein each node includes a plurality of elements, including the steps of; assigning component objects of each of the plurality of objects to respective elements of respective nodes of the Trie data structure, associating identifying characteristics of the respective component objects with each element in each node of the Trie data structure, wherein the identifying characteristics are numbers of local minima, local maxima and inflection points in each of the component objects, and associating a respective hidden Markov model with each element of each node, the hidden Markov model representing the respective component object of the element; (b) selecting a node of the Trie data structure; (c) applying the input sequence of continuously connected handwritten objects to each of the hidden Markov models associated with the respective plurality of elements of the selected node to generate a respective plurality of acceptance values; (d) identifying a segment of the input sequence with the element of the selected node that generates the acceptance value which is larger than any other one of the acceptance values, based on the identifying characteristics associated with the element, without interference from a portion of the input sequence that does not correspond to the component object assigned to the element; and (e) deleting the identified segment from the input sequence of continuously connected handwritten objects. - View Dependent Claims (2, 3, 4)
-
-
5. A method for matching an input sequence of continuously connected handwritten objects representing a plurality of handwritten letters to one of a plurality of words which are modeled by concatenating letters, the method comprising the steps of:
-
(a) generating a Trie data structure representing the plurality of words, the Trie data structure having a plurality of nodes divided into a plurality of levels, wherein each node includes a plurality of elements, including the steps of; assigning respective letters of each of the words to each of the respective elements of respective nodes of the Trie data structure, associating identifying characteristics of the respective letters with each element in each node of the Trie data structure, wherein the identifying characteristics are numbers of local minima, local maxima and inflection points in each of the letters, and associating a respective hidden Markov model with each element of each node, the hidden Markov model representing the respective letter of the element; (b) selecting a node of the Trie data structure; (c) applying the input sequence of continuously connected handwritten objects to each of the hidden Markov models associated with the respective plurality of elements of the selected node to generate a respective plurality of acceptance values; (d) identifying a segment of the input sequence with the element of the selected node that generates the acceptance value which is larger than any other one of the acceptance values, based on the identifying characteristics associated with the element, without interference from a portion of the input sequence adjacent to the segment, wherein the segment of the input sequence forms a single letter or a portion of a single letter and said portion represents a connecting stroke adjacent to the letters; and (e) deleting the identified segment from the input sequence. - View Dependent Claims (6)
-
-
7. A method of segmenting a continuous data stream representing a first object and a second object, where the first object and second object are modeled by respective first and second hidden Markov models, each having a plurality of states, the method comprising the steps of:
-
concatenating the first and second hidden Markov models to generate a combined hidden Markov model of the continuous data stream; identifying a first transition state of the combined hidden Markov model as representing a last state of the first hidden Markov model; identifying a second transition state of the combined hidden Markov model as representing a first state of the second hidden Markov model; applying the continuous data stream to the combined hidden Markov model to generate a state transition matrix; identifying, from the state transition matrix, a most probable state sequence which represents the continuous data stream, including the steps of; comparing each target state in the most probable state sequence with the second transition state to determine equality; if the target state equals the second transition state, comparing the state preceding the target state in the most probable state sequence to the first transition state; if the state preceding the target state is found to be equal to the first transition state, identifying a portion of the continuous data stream which corresponds to the target state as a segment boundary in the continuous data stream. - View Dependent Claims (8, 9)
-
Specification