Method and apparatus for storage and retrieval of handwritten information
First Claim
1. A method for generating and querying an indexed database stored in a computer system, comprising the steps of:
- (a) establishing a database which includes a plurality of data objects each defined by a respective tuple of attributes, the attributes including at least one attribute having a domain of values that includes handwritten objects, each handwritten object including a plurality of symbols ordered in an output sequence;
each symbol being a feature vector;
(b) establishing an index having a root node and a plurality of leaf nodes, each connected to the root node by a respective path, such that each path from the root node to one of the plurality of leaf nodes corresponds to a respective input sequence of symbols, for which input sequence the respective leaf node includes a set of pointers to a subset of the tuples;
(c) analyzing the output sequence of each handwritten object and determining a respective probability that each input sequence matches the output sequence;
(d) including, in the respective set of pointers in each respective leaf node, a pointer to any tuple for which the respective output sequence has at least a threshold probability of matching the input further plurality of symbols ordered in an input sequence and traversing one of the paths, the one path corresponding to a sequence of symbols which matches the respective input sequence of symbols of the handwritten input object.
3 Assignments
0 Petitions
Accused Products
Abstract
A method is provided for generating an indexed database stored in a computer system. A database is established. The database includes a plurality of data objects. Each data object is defined by a respective tuple of attributes. The attributes include at least one attribute having a domain of values that includes handwritten objects. Each handwritten object includes a plurality of symbols ordered in an output sequence. An index is established, having a root node and a plurality of leaf nodes. Each leaf node is connected to the root node by a respective path, such that each path from the root node to one of the plurality of leaf nodes corresponds to a respective input sequence of symbols. The input sequence for the respective leaf node includes a set of pointers to a subset of the tuples. A respective Hidden Markov Model (HMM) is executed to analyze the output sequence of each handwritten object and to determine a respective probability that each input sequence matches the output sequence. A pointer to any tuple for which the respective output sequence has at least a threshold probability of matching the input sequence (corresponding to the leaf node) is included in the respective set of pointers in each respective leaf node. The probability is determined by the respective HMM for the output sequence of each handwritten object.
-
Citations
11 Claims
-
1. A method for generating and querying an indexed database stored in a computer system, comprising the steps of:
-
(a) establishing a database which includes a plurality of data objects each defined by a respective tuple of attributes, the attributes including at least one attribute having a domain of values that includes handwritten objects, each handwritten object including a plurality of symbols ordered in an output sequence;
each symbol being a feature vector;(b) establishing an index having a root node and a plurality of leaf nodes, each connected to the root node by a respective path, such that each path from the root node to one of the plurality of leaf nodes corresponds to a respective input sequence of symbols, for which input sequence the respective leaf node includes a set of pointers to a subset of the tuples; (c) analyzing the output sequence of each handwritten object and determining a respective probability that each input sequence matches the output sequence; (d) including, in the respective set of pointers in each respective leaf node, a pointer to any tuple for which the respective output sequence has at least a threshold probability of matching the input further plurality of symbols ordered in an input sequence and traversing one of the paths, the one path corresponding to a sequence of symbols which matches the respective input sequence of symbols of the handwritten input object. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A method for generating and querying an indexed database stored in a computer system, comprising the steps of:
-
(a) establishing a database which includes a plurality of data objects each defined by a respective tuple of attributes, the attributes including at least one attribute having a domain of values that includes handwritten objects, each handwritten object including a plurlaity of symbols ordered in an output sequence, each symbol being a feature vector;
for which input sequence the respective leaf node includes a respective set of pointers to a subset of the tuples;(c) analyzing the output sequence of each handwritten object and determining a respective probability that each input sequence matches the output sequence; (d) including, in the respective set of pointers in each respective leaf node, a pointer to any tuple for which the respective output sequence has at least a threshold probability of matching the input sequence corresponding to the leaf node, as determined for the output sequence of each handwritten object; and (e) querying the database by inputting a handwritten input object to the computer system, said querying comprising the steps of, (1) executing a first query matching algorithm to generate an approximate response to the query, (2) displaying the approximate response, (3) executing a second query matching algorithm after executing the first query matching algorithm, to generate a second response to the query that is more accurate than the approximate response, and (4) displaying the second response.
-
-
10. A method for indexing and querying a database, comprising the steps of:
-
(a) establishing a database which includes a plurality of data objects defined by tuples of attributes, the attributes including at least one attribute having a domain of values that includes handwritten objects, each handwritten object including a plurality of output symbols ordered in an output sequence, each output symbol being a feature vector; (b) modelling each handwritten object by a common alphabet including n output symbols, and a common output sequence length of T symbols, where n and T are integers; (c) establishing an index for the one attribute, the index having T levels, each level between zero and T-1 having a respective minimum probability value, each level having at least one node; (d) indexing the symbols in one of the handwritten objects by performing, for each node in one of the levels of the index, the steps of; (1) determining a probability that a symbol stored in said node represents a corresponding output symbol in the one handwritten object, (2) adding a node in a next level of the index, if the probability determined in step (d) (1) exceeds the minimum probability value of the one level and the next level is between the oneth level and the T-1th level, (3) executing step (d) for the added node in the next level, if the node is added in step (d) (2), and (4) adding a pointer to the one handwritten object in a list of pointers stored in a node in the Tth level of the index, if the next level is the Tth level and the probability determined in step (d) (1) is greater than the minimum probability value of the T-1th level; (e) repeating step (d), for each of the plurality of handwritten objects other than the one handwritten object; and (f) querying the database by inputting to the computer system a handwritten input objects including a further plurality of symbols ordered in an input sequence and matching the respective input sequence of symbols of the handwritten input object to the output sequence of output symbols included in one of said handwritten objects.
-
-
11. A method for generating and querying a database stored in a computer system, comprising the steps of:
-
(a) establishing a database which includes a plurality of data objects each defined by a respective tuple of attributes, the attributes including at least one attribute having a domain of values that includes handwritten objects, each handwritten object including a probability that each input sequence matches the output sequence; (c) including in an index of the database a pointer to any tuple for which the respective output sequence has at least a threshold probability of matching the input sequence, as determined for the output sequence of each handwritten object; and (d) querying the database by inputting to the computer system a handwritten input object including a further plurality of symbols ordered in an input sequence and matching the respective inputs sequence of symbols of the handwritten input object to the output sequence of output symbols included in said handwritten objects.
-
Specification