System and method for storage, management and automatic indexing of structured documents
First Claim
1. A structured document storage and management system, comprising:
a data store having one or more computer useable storage media;
a generic document model tree in said data store representing a structured document model, said generic document model tree containing one or more structured document nodes without storing node data that is unique to any particular structured document;
a symbol conversion module in said data store containing untagged data associated with said one or more structured document nodes, said untagged data representing node data for one or more particular structured documents conforming to said structured document model, said symbol conversion module also maintaining a value code in association with each element of said node data; and
an encoded vector set in said data store comprising a set of encoded vectors, each encoded vector being assigned to one of said structured document nodes of said generic document model tree, each encoded vector being comprised of a set of vector elements that each store a copy of one of said symbol conversion module value codes for said structured document node to which said encoded vector is assigned, each value code being stored at an index position in said encoded vector that corresponds to a particular structured document that conforms to said generic document model tree;
whereby said encoded vector set represents encoded document storage for a set of structured documents that conform to said generic document model tree, with said node data for each node of a structured document being determinable by consulting each of said encoded vectors at an index position corresponding to said structured document and using said value code stored at said index position to consult said symbol conversion module and identify said node data that is associated with said value code.
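The storage and lookup path recited in the claim can be illustrated with a minimal sketch. Everything named here (`SymbolTable`, `DocumentStore`, the node paths) is hypothetical and not taken from the patent. The sketch also assumes one symbol table per node, a model tree flattened to a list of node paths, and documents that supply a value for every node; the claim itself does not fix any of these choices.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolTable:
    """Hypothetical stand-in for the symbol conversion module of one node."""
    code_of: dict = field(default_factory=dict)   # untagged value -> value code
    value_of: list = field(default_factory=list)  # value code -> untagged value

    def encode(self, value):
        # Reuse the existing code when the value has been seen before.
        if value not in self.code_of:
            self.code_of[value] = len(self.value_of)
            self.value_of.append(value)
        return self.code_of[value]


class DocumentStore:
    """Hypothetical store: one encoded vector per node of the model tree."""

    def __init__(self, node_paths):
        # Assumption: the model tree is flattened to node paths for brevity.
        self.node_paths = list(node_paths)
        self.symbols = {p: SymbolTable() for p in self.node_paths}
        self.vectors = {p: [] for p in self.node_paths}

    def add(self, document):
        """Append one document (node path -> value) and return its index."""
        for path in self.node_paths:
            self.vectors[path].append(self.symbols[path].encode(document[path]))
        return len(self.vectors[self.node_paths[0]]) - 1

    def get(self, index):
        """Rebuild a document: read each vector at `index`, then decode."""
        return {
            p: self.symbols[p].value_of[self.vectors[p][index]]
            for p in self.node_paths
        }


store = DocumentStore(["/book/title", "/book/author"])
first = store.add({"/book/title": "Emma", "/book/author": "Austen"})
second = store.add({"/book/title": "Persuasion", "/book/author": "Austen"})
print(store.vectors["/book/author"])  # both documents share one value code
print(store.get(second))
```

In this sketch the index position in every vector identifies the document, so a repeated value such as an author name is stored once in the symbol table and referenced by code thereafter.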
Abstract
A structured document storage and management technique utilizes a generic document model tree, a symbol conversion module and an encoded vector set to store structured documents. The generic document model tree represents a structured document model and contains one or more structured document nodes without storing node data unique to any particular structured document. The symbol conversion module contains untagged data associated with the one or more structured document nodes, and representing node data for particular structured documents. The symbol conversion module also maintains a value code in association with each untagged data element. The encoded vector set includes one or more encoded vectors corresponding to the one or more structured document nodes having associated untagged data. Each encoded vector contains one of the value codes at an index position that corresponds to a particular structured document. The disclosed technique allows structured documents to be efficiently stored, organized, and searched.
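The abstract states that the technique allows documents to be searched, but this excerpt does not describe a search procedure. One plausible reading, offered only as an assumption, is that a query value is converted to its value code once and the encoded vector for the node is then scanned for that code. The function name `find_documents` and the sample data below are hypothetical.

```python
def find_documents(code_of, encoded_vector, value):
    """Return index positions (documents) whose node holds `value`.

    code_of:        value -> value code mapping for one node (assumed layout)
    encoded_vector: value codes for that node, one per document index
    """
    code = code_of.get(value)
    if code is None:
        # The value was never stored, so no document can match.
        return []
    # Compare small codes rather than the original untagged values.
    return [i for i, c in enumerate(encoded_vector) if c == code]


# Hypothetical data for an "author" node across four documents.
author_codes = {"Melville": 0, "Austen": 1}
author_vector = [0, 1, 1, 0]
matches = find_documents(author_codes, author_vector, "Austen")
print(matches)
```

Under this assumption the scan touches only one vector and never parses a stored document, which is one way the claimed layout could support the search the abstract mentions.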
20 Claims
1. A structured document storage and management system, comprising:
a data store having one or more computer useable storage media;
a generic document model tree in said data store representing a structured document model, said generic document model tree containing one or more structured document nodes without storing node data that is unique to any particular structured document;
a symbol conversion module in said data store containing untagged data associated with said one or more structured document nodes, said untagged data representing node data for one or more particular structured documents conforming to said structured document model, said symbol conversion module also maintaining a value code in association with each element of said node data; and
an encoded vector set in said data store comprising a set of encoded vectors, each encoded vector being assigned to one of said structured document nodes of said generic document model tree, each encoded vector being comprised of a set of vector elements that each store a copy of one of said symbol conversion module value codes for said structured document node to which said encoded vector is assigned, each value code being stored at an index position in said encoded vector that corresponds to a particular structured document that conforms to said generic document model tree;
whereby said encoded vector set represents encoded document storage for a set of structured documents that conform to said generic document model tree, with said node data for each node of a structured document being determinable by consulting each of said encoded vectors at an index position corresponding to said structured document and using said value code stored at said index position to consult said symbol conversion module and identify said node data that is associated with said value code.
Dependent claims: 2, 3, 4, 5, 6
7. A structured document storage and management method, comprising:
establishing the following in a data store having one or more computer useable storage media;
a generic document model tree representing a structured document model, said generic document model tree containing one or more structured document nodes without storing node data that is unique to any particular structured document;
a symbol conversion module containing untagged data associated with said one or more structured document nodes, said untagged data representing node data for one or more structured documents conforming to said structured document model, said symbol conversion module also maintaining a value code in association with each element of said node data; and
an encoded vector set comprising a set of encoded vectors, each encoded vector being assigned to one of said structured document nodes of said generic document model tree, each encoded vector being comprised of a set of vector elements that each store a copy of one of said symbol conversion module value codes for said structured document node to which said encoded vector is assigned, each value code being stored at an index position in said encoded vector that corresponds to a particular structured document that conforms to said generic document model tree;
whereby said encoded vector set represents encoded document storage for a set of structured documents that conform to said generic document model tree, with said node data for each node of a structured document being determinable by consulting each of said encoded vectors at an index position corresponding to said structured document and using said value code stored at said index position to consult said symbol conversion module and identify said node data that is associated with said value code.
Dependent claims: 8, 9, 10, 11, 12
13. A computer program product, comprising:
one or more computer useable storage media;
program logic stored on said computer useable storage media for programming a data processing platform to implement structured document storage and management, as by;
establishing a generic document model tree representing a structured document model, said generic document model tree containing one or more structured document nodes without storing node data that is unique to any particular structured document;
establishing a symbol conversion module containing untagged data associated with said one or more structured document nodes, said untagged data representing node data for one or more particular structured documents conforming to said structured document model, said symbol conversion module also maintaining a value code in association with each element of node data; and
establishing an encoded vector set comprising a set of encoded vectors, each encoded vector being assigned to one of said structured document nodes of said generic document model tree, each encoded vector being comprised of a set of vector elements that each store a copy of one of said symbol conversion module value codes for said structured document node to which said encoded vector is assigned, each value code being stored at an index position in said encoded vector that corresponds to a particular structured document that conforms to said generic document model tree;
whereby said encoded vector set represents encoded document storage for a set of structured documents that conform to said generic document model tree, with said node data for each node of a structured document being determinable by consulting each of said encoded vectors at an index position corresponding to said structured document and using said value code stored at said index position to consult said symbol conversion module and identify said node data that is associated with said value code.
Dependent claims: 14, 15, 16, 17, 18
19. A computer program product, comprising:
one or more computer useable storage media;
program logic stored on said computer useable storage media for programming a data processing platform to implement structured document storage and management, as by;
establishing a generic document model tree representing a structured document model, said generic document model tree containing one or more structured document nodes without storing node data that is unique to any particular structured document;
establishing a symbol conversion module containing untagged data associated with said one or more structured document nodes, said untagged data representing node data for one or more particular structured documents conforming to said structured document model, said symbol conversion module also maintaining a value code in association with each element of said node data;
establishing an encoded vector set comprising a set of encoded vectors, each encoded vector being assigned to one of said structured document nodes of said generic document model tree, each encoded vector being comprised of a set of vector elements that each store a copy of one of said symbol conversion module value codes for said structured document node to which said encoded vector is assigned, each value code being stored at an index position in said encoded vector that corresponds to a particular structured document that conforms to said generic document model tree; and
linking said one or more structured document nodes having associated untagged data to a corresponding one of said encoded vectors;
whereby said encoded vector set represents encoded document storage for a set of structured documents that conform to said generic document model tree, with said node data for each node of a structured document being determinable by consulting each of said encoded vectors at an index position corresponding to said structured document and using said value code stored at said index position to consult said symbol conversion module and identify said node data that is associated with said value code.
Dependent claims: 20
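Claim 19 adds a step that links each node having untagged data to its encoded vector. A minimal sketch of such linking follows, assuming a simple in-memory tree; the names `ModelNode`, `link_vectors`, and `codes_for_document`, and the path-keyed dictionary of vectors, are hypothetical and not drawn from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelNode:
    """Hypothetical node of the generic document model tree."""
    name: str
    children: list = field(default_factory=list)
    vector: Optional[list] = None  # link to an encoded vector, if the node has data


def link_vectors(node, vectors, parent_path=""):
    """Attach each encoded vector to the tree node whose path matches its key."""
    path = f"{parent_path}/{node.name}"
    if path in vectors:
        node.vector = vectors[path]
    for child in node.children:
        link_vectors(child, vectors, path)


def codes_for_document(node, index, parent_path="", out=None):
    """Walk the tree and collect the value code stored for one document index."""
    out = {} if out is None else out
    path = f"{parent_path}/{node.name}"
    if node.vector is not None:
        out[path] = node.vector[index]
    for child in node.children:
        codes_for_document(child, index, path, out)
    return out


# Hypothetical model: <book> is structural only; <title> and <author> hold data.
root = ModelNode("book", [ModelNode("title"), ModelNode("author")])
vectors = {"/book/title": [0, 1], "/book/author": [0, 0]}
link_vectors(root, vectors)
print(codes_for_document(root, 1))
```

In this sketch the structural node carries no link, which mirrors the claim language limiting the linking step to nodes that have associated untagged data.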
Specification