Indexing Structured Documents
Abstract
Methods and apparatus, including computer program products, for indexing structured documents. A method includes identifying a structured document in a file system for indexing, the structured document having an identifier and at least one indexing-property; extracting at least one index-value from the structured document in accordance with a pre-defined extraction rule set; and storing the at least one index-value with the identifier in an index-value data structure.
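The three steps named in the abstract (identify a document, extract index-values under a rule set, store them with the document identifier) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen for illustration only: `StructuredDocument`, `extract_property`, `index_document`, and the use of a plain dictionary as the index-value data structure are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class StructuredDocument:
    """Hypothetical stand-in for a structured document in a file system."""
    identifier: str
    properties: Dict[str, str] = field(default_factory=dict)  # indexing-properties


# Assumption: an extraction rule is a function from a document to an optional value.
ExtractionRule = Callable[[StructuredDocument], Optional[str]]


def extract_property(name: str) -> ExtractionRule:
    """Build a rule that extracts the indexing-property called `name`, if present."""
    def rule(doc: StructuredDocument) -> Optional[str]:
        return doc.properties.get(name)
    return rule


def index_document(doc: StructuredDocument,
                   rule_set: List[ExtractionRule],
                   index: Dict[str, List[str]]) -> None:
    """Apply every rule to the document and store each extracted value
    under the document identifier."""
    for rule in rule_set:
        value = rule(doc)
        if value is not None:
            index.setdefault(doc.identifier, []).append(value)


# Example: only "title" matches a rule, so one index-value is stored.
rule_set = [extract_property("title"), extract_property("author")]
index: Dict[str, List[str]] = {}
doc = StructuredDocument("doc-001", {"title": "Pump manual", "pages": "12"})
index_document(doc, rule_set, index)
print(index)  # {'doc-001': ['Pump manual']}
```

A dictionary keyed by document identifier is only one possible shape for the index-value data structure; a relational table would serve equally well in this sketch.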
23 Claims
1. A system comprising:
a user interface device; and
one or more server computers operable to interact with the user interface device and to:
apply a pre-defined rule set to a plurality of versions of a first document in a plurality of structured documents to extract one or more index values, the pre-defined rule set including a plurality of rules, each rule having a distinct rule identifier, each extracted index value being extracted by a rule in the pre-defined rule set, wherein one or more versions of the first document is concurrently accessible to a plurality of users for collaborative authoring; and
for each extracted index value, store in an index-value data structure the extracted index-value, the rule identifier of the rule that extracted the index value, and information identifying the first document and the respective version of the first document from which the index-value was extracted.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
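As an illustration of the record that claim 1 recites (extracted value, rule identifier, document identity, and version), the following is a minimal sketch under stated assumptions. The names `Rule`, `IndexEntry`, and `index_versions`, the dictionary representation of a document version, and the list used as the index-value data structure are all hypothetical and not drawn from the patent; collaborative access to versions is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a distinct identifier plus an extraction function."""
    rule_id: str
    extract: Callable[[Dict[str, str]], Optional[str]]


@dataclass(frozen=True)
class IndexEntry:
    """One stored record: the value, the rule that produced it,
    and the document and version it came from."""
    value: str
    rule_id: str
    document_id: str
    version: str


def index_versions(document_id: str,
                   versions: Dict[str, Dict[str, str]],
                   rules: List[Rule]) -> List[IndexEntry]:
    """Apply every rule to every version of one document."""
    rule_ids = [rule.rule_id for rule in rules]
    if len(set(rule_ids)) != len(rule_ids):
        raise ValueError("each rule must have a distinct rule identifier")

    entries: List[IndexEntry] = []
    for version, content in versions.items():
        for rule in rules:
            value = rule.extract(content)
            if value is not None:
                entries.append(IndexEntry(value, rule.rule_id, document_id, version))
    return entries


# Example: two versions of one document, two rules.
rules = [
    Rule("R1", lambda content: content.get("title")),
    Rule("R2", lambda content: content.get("status")),
]
versions = {
    "v1": {"title": "Draft spec"},
    "v2": {"title": "Final spec", "status": "approved"},
}
entries = index_versions("doc-001", versions, rules)
for entry in entries:
    print(entry)
```

Because each entry carries both the rule identifier and the version, a lookup can ask, for instance, which versions produced a value under rule `R2` without re-reading the documents.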
12. A system comprising:
a user interface device; and
one or more server computers operable to interact with the user interface device and to:
apply a pre-defined rule set to each indexable document in a plurality of structured documents, including to apply the pre-defined rule set to a plurality of versions of an indexable document in the plurality of structured documents, to extract one or more index values, the pre-defined rule set including a plurality of rules, each rule having a distinct rule identifier, each extracted index value being extracted by a rule in the pre-defined rule set, wherein one or more versions of the indexable document is concurrently accessible to a plurality of users for collaborative authoring; and
for each extracted index value, store in an index-value data structure the extracted index-value, the rule identifier of the rule that extracted the index value, and information identifying the respective indexable document and a respective version of the respective indexable document from which the index-value was extracted.
Dependent claims: 13, 14, 15
16. An article comprising:
a storage medium having stored thereon instructions that when executed by a server computer result in the following:
applying a pre-defined rule set to each indexable document in a plurality of structured documents, including applying the pre-defined rule set to a plurality of versions of an indexable document in the plurality of structured documents, to extract one or more index values, the pre-defined rule set including a plurality of rules, each rule having a distinct rule identifier, each extracted index value being extracted by a rule in the pre-defined rule set, wherein one or more versions of the indexable document is concurrently accessible to a plurality of users for collaborative authoring; and
for each extracted index value, storing in an index-value data structure the extracted index-value, the rule identifier of the rule that extracted the index value, and information identifying the respective indexable document and a respective version of the respective indexable document from which the index-value was extracted.
Dependent claims: 17, 18, 19
20. A computer program product, tangibly stored on a machine readable medium, for indexing structured documents, comprising instructions operable to cause a server computer to:
apply a pre-defined rule set to each indexable document in a plurality of structured documents, including instructions operable to cause the server computer to apply the pre-defined rule set to a plurality of versions of an indexable document in the plurality of structured documents, to extract one or more index values, the pre-defined rule set including a plurality of rules, each rule having a distinct rule identifier, each extracted index value being extracted by a rule in the pre-defined rule set, wherein one or more versions of the indexable document is concurrently accessible to a plurality of users for collaborative authoring; and
for each extracted index value, store in an index-value data structure the extracted index-value, the rule identifier of the rule that extracted the index value, and information identifying the respective indexable document and a respective version of the respective indexable document from which the index-value was extracted.
Dependent claims: 21, 22, 23
Specification