Method, apparatus, and computer program product for searching structured document
Abstract
A condition generating unit generates a hierarchical-type search condition including a search target structure ID and a search result structure ID. A first acquiring unit acquires an object ID corresponding to the search target structure ID to which a vocabulary index is not attached. A candidate generating unit generates a candidate of the search result in which an acquired object ID is associated with the search key as a first constraint condition. A second acquiring unit acquires a search result structure ID complying with a structure constraint. A result acquiring unit acquires an object corresponding to the object ID satisfying the first constraint condition.
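The hierarchical-type search condition produced by the condition generating unit can be pictured as a small record. The following is a minimal sketch, not the patented implementation: the path-to-ID table `STRUCTURE_IDS`, the class names, and the single `"ancestor"` constraint are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical structure IDs: one ID per distinct element path (assumption).
STRUCTURE_IDS = {"/book": 1, "/book/title": 2, "/book/author": 3}


@dataclass(frozen=True)
class TargetTerm:
    structure_id: int  # search target structure ID
    search_key: str    # search key associated with that structure ID


@dataclass(frozen=True)
class HierarchicalCondition:
    result_structure_id: int  # search result structure ID
    targets: tuple            # TargetTerm entries
    constraint: str           # structure constraint between target and result


def generate_condition(result_path, keyed_paths):
    """Associate each search key with its target structure ID and
    record the structure ID to be returned as the search result."""
    targets = tuple(TargetTerm(STRUCTURE_IDS[path], key) for path, key in keyed_paths)
    # Assumed constraint: the result structure is an ancestor of each target.
    return HierarchicalCondition(STRUCTURE_IDS[result_path], targets, "ancestor")


# "Return book elements whose title contains the key XML."
cond = generate_condition("/book", [("/book/title", "XML")])
```

Keeping the search key attached to the target structure ID, rather than resolving it immediately, is what allows the later units to defer the key check.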
18 Claims
1. An apparatus for searching a structured document, comprising:
a structured-document storing unit that stores therein structured-document information having a hierarchized logical structure, the structured-document information including an object corresponding to a structure element and an object ID for identifying the object, the structure element being a unit of the logical structure and identified by a structure ID;
a structure-index storing unit that stores therein a structure index in which the structure ID is associated with the object ID;
a vocabulary-index storing unit that stores therein a vocabulary index in which a vocabulary ID for identifying a vocabulary included in the structured-document information is associated with the object ID;
a structure-information storing unit that stores therein structure information on the structure element, the structure information including discrimination information indicating whether the vocabulary index is attached to the structure ID;
a condition generating unit that associates a search key included in an input search condition with the structure ID that is a search target of the search key, generates a hierarchical-type search condition including, as a unit of a hierarchical structure, a search target structure ID that is a structure ID corresponding to the search key and a search result structure ID that is a structure ID to be acquired as a search result for the search condition, the hierarchical-type search condition defining a structure constraint regarding the hierarchical structure to be satisfied between the search target structure ID and the search result structure ID;
a first acquiring unit that acquires, from the structure-index storing unit, the object ID corresponding to the search target structure ID that is associated with the discrimination information indicating that the vocabulary index is not attached to the structure ID from among search target structure IDs included in the hierarchical-type search condition;
a candidate generating unit that associates the search key, as a first constraint condition, with the object ID acquired by the first acquiring unit, and generates a candidate of the search result including the object ID that is associated with the first constraint condition;
a second acquiring unit that acquires the search result structure ID complying with the structure constraint defined in the hierarchical-type search condition, with respect to the search target structure ID corresponding to the object ID included in the candidate generated by the candidate generating unit; and
a result acquiring unit that acquires, from the structured-document storing unit, the object corresponding to the object ID satisfying the first constraint condition from among object IDs corresponding to acquired search result structure IDs.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16.
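The four storing units recited in claim 1 can be modeled as plain in-memory maps. This is a sketch under stated assumptions: the `Stores` class, its field names, whitespace tokenization, and the sample IDs are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Stores:
    # Structured-document store: object ID -> (structure ID, parent object ID, text)
    objects: dict = field(default_factory=dict)
    # Structure index: structure ID -> list of object IDs
    structure_index: dict = field(default_factory=dict)
    # Vocabulary index: vocabulary ID -> set of object IDs
    vocabulary_index: dict = field(default_factory=dict)
    # Structure information: structure ID -> whether a vocabulary index is attached
    vocab_attached: dict = field(default_factory=dict)
    # Hypothetical helper table: word -> vocabulary ID
    vocab_ids: dict = field(default_factory=dict)

    def add(self, oid, sid, parent, text, index_vocabulary):
        self.objects[oid] = (sid, parent, text)
        self.structure_index.setdefault(sid, []).append(oid)
        # Discrimination information is kept per structure ID, not per object.
        self.vocab_attached[sid] = index_vocabulary
        if index_vocabulary:
            for word in text.lower().split():
                vid = self.vocab_ids.setdefault(word, len(self.vocab_ids))
                self.vocabulary_index.setdefault(vid, set()).add(oid)


stores = Stores()
stores.add(10, 1, None, "", False)           # book element
stores.add(11, 2, 10, "XML search", False)   # title: no vocabulary index
stores.add(12, 3, 10, "Alice", True)         # author: vocabulary index attached
```

In this toy data, a search key aimed at the title structure cannot be answered from the vocabulary index, which is the case the first acquiring unit is claimed to handle.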
17. A method of searching a structured document, comprising:
storing structured-document information having a hierarchized logical structure in a structured-document storing unit, the structured-document information including an object corresponding to a structure element and an object ID for identifying the object, the structure element being a unit of the logical structure and identified by a structure ID;
storing a structure index, in which the structure ID is associated with the object ID, in a structure-index storing unit;
storing a vocabulary index, in which a vocabulary ID for identifying a vocabulary included in the structured-document information is associated with the object ID, in a vocabulary-index storing unit;
storing structure information on the structure element in a structure-information storing unit, the structure information including discrimination information indicating whether the vocabulary index is attached to the structure ID;
generating, by associating a search key included in an input search condition with the structure ID that is a search target of the search key, a hierarchical-type search condition including, as a unit of a hierarchical structure, a search target structure ID that is a structure ID corresponding to the search key and a search result structure ID that is a structure ID to be acquired as a search result for the search condition, the hierarchical-type search condition defining a structure constraint regarding the hierarchical structure to be satisfied between the search target structure ID and the search result structure ID;
acquiring, from the structure-index storing unit, the object ID corresponding to the search target structure ID that is associated with the discrimination information indicating that the vocabulary index is not attached to the structure ID from among search target structure IDs included in the hierarchical-type search condition;
generating, by associating the search key, as a first constraint condition, with the object ID acquired at the acquiring, a candidate of the search result including the object ID that is associated with the first constraint condition;
acquiring the search result structure ID complying with the structure constraint defined in the hierarchical-type search condition, with respect to the search target structure ID corresponding to the object ID included in the candidate generated at the generating; and
acquiring, from the structured-document storing unit, the object corresponding to the object ID satisfying the first constraint condition from among object IDs corresponding to acquired search result structure IDs.
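The acquiring and generating steps of claim 17 can be traced on a toy document. The sketch below is one possible reading of the claim language, not the patented implementation: it assumes the structure constraint is "the result structure is an ancestor of the target structure", that the first constraint condition is a case-insensitive substring match checked only when the object is fetched, and that all table names and IDs are hypothetical.

```python
# Object ID -> (structure ID, parent object ID, text). Hypothetical sample data.
OBJECTS = {
    10: (1, None, ""), 11: (2, 10, "XML search"),
    20: (1, None, ""), 21: (2, 20, "Relational databases"),
}
STRUCTURE_INDEX = {1: [10, 20], 2: [11, 21]}  # structure ID -> object IDs
VOCAB_ATTACHED = {1: False, 2: False}         # discrimination information
PARENT_STRUCTURE = {2: 1}                     # structure ID -> parent structure ID


def search(target_sid, key, result_sid):
    # Acquiring (first): only targets without a vocabulary index are handled here.
    if VOCAB_ATTACHED[target_sid]:
        raise NotImplementedError("vocabulary-index path is not sketched")
    target_oids = STRUCTURE_INDEX[target_sid]

    # Generating candidates: attach the key as a not-yet-verified constraint.
    candidates = [(oid, key) for oid in target_oids]

    # Acquiring (second): check the structure constraint at structure-ID level.
    sid = target_sid
    while sid is not None and sid != result_sid:
        sid = PARENT_STRUCTURE.get(sid)
    if sid is None:
        return []  # result structure is not an ancestor of the target

    # Acquiring the result: verify the constraint against the stored object,
    # then climb to the enclosing object with the result structure ID.
    results = []
    for oid, constraint in candidates:
        if constraint.lower() not in OBJECTS[oid][2].lower():
            continue
        node = oid
        while node is not None and OBJECTS[node][0] != result_sid:
            node = OBJECTS[node][1]
        if node is not None and node not in results:
            results.append(node)
    return results


hits = search(2, "XML", 1)  # book objects whose title contains "XML"
```

With this sample data, `hits` holds object ID 10 only; object 21 is enumerated as a candidate through the structure index but is dropped once its text is checked against the key.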
18. A computer program product comprising a computer-usable medium having computer-readable program codes embodied in the medium that when executed cause a computer to execute:
storing structured-document information having a hierarchized logical structure in a structured-document storing unit, the structured-document information including an object corresponding to a structure element and an object ID for identifying the object, the structure element being a unit of the logical structure and identified by a structure ID;
storing a structure index, in which the structure ID is associated with the object ID, in a structure-index storing unit;
storing a vocabulary index, in which a vocabulary ID for identifying a vocabulary included in the structured-document information is associated with the object ID, in a vocabulary-index storing unit;
storing structure information on the structure element in a structure-information storing unit, the structure information including discrimination information indicating whether the vocabulary index is attached to the structure ID;
generating, by associating a search key included in an input search condition with the structure ID that is a search target of the search key, a hierarchical-type search condition including, as a unit of a hierarchical structure, a search target structure ID that is a structure ID corresponding to the search key and a search result structure ID that is a structure ID to be acquired as a search result for the search condition, the hierarchical-type search condition defining a structure constraint regarding the hierarchical structure to be satisfied between the search target structure ID and the search result structure ID;
acquiring, from the structure-index storing unit, the object ID corresponding to the search target structure ID that is associated with the discrimination information indicating that the vocabulary index is not attached to the structure ID from among search target structure IDs included in the hierarchical-type search condition;
generating, by associating the search key, as a first constraint condition, with the object ID acquired at the acquiring, a candidate of the search result including the object ID that is associated with the first constraint condition;
acquiring the search result structure ID complying with the structure constraint defined in the hierarchical-type search condition, with respect to the search target structure ID corresponding to the object ID included in the candidate generated at the generating; and
acquiring, from the structured-document storing unit, the object corresponding to the object ID satisfying the first constraint condition from among object IDs corresponding to acquired search result structure IDs.
Specification