System and method for the indexing and retrieval of semantically annotated data using an ontology-based information retrieval model
Abstract
System and method for the indexing and retrieval of semantically annotated information units from a collection of semantically annotated indexed information units in response to a query using an ontology-based IR model. The retrieval method comprises: receiving a semantically annotated query with semantic annotations to individuals or classes within a determined populated base ontology; embedding the semantically annotated query, as a set of weighted-mentions to individuals or classes within the populated base ontology, in a semantic representation space of an ontology-based metric space IR model; obtaining the representation in the semantic representation space for every indexed information unit; computing the Hausdorff distance between the space representation of the query and the space representation of all the indexed information units of the collection; and retrieving and ranking the relevant information units based on the computed Hausdorff distance.
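The ranking step summarized in the abstract can be illustrated with a minimal sketch. It assumes the classical symmetric Hausdorff distance between two non-empty sets of ontology nodes and ignores the mention weights, since this section does not state how those weights enter the distance. The names `directed_hausdorff`, `hausdorff`, `rank_units`, and `toy_distance` are hypothetical, and the toy metric (absolute difference between integers) merely stands in for the ontology-based distance.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Node = Hashable
Distance = Callable[[Node, Node], float]


def directed_hausdorff(a: Set[Node], b: Set[Node], d: Distance) -> float:
    """Largest distance from any node in `a` to its nearest node in `b`.

    Both sets are assumed to be non-empty.
    """
    return max(min(d(x, y) for y in b) for x in a)


def hausdorff(a: Set[Node], b: Set[Node], d: Distance) -> float:
    """Classical symmetric Hausdorff distance between two node sets."""
    return max(directed_hausdorff(a, b, d), directed_hausdorff(b, a, d))


def rank_units(
    query: Set[Node], units: Dict[str, Set[Node]], d: Distance
) -> List[Tuple[str, float]]:
    """Score every indexed unit against the query and sort by increasing distance."""
    scored = [(unit_id, hausdorff(query, nodes, d)) for unit_id, nodes in units.items()]
    return sorted(scored, key=lambda pair: pair[1])


def toy_distance(x: int, y: int) -> float:
    """Stand-in metric for illustration only (not the ontology distance)."""
    return float(abs(x - y))


toy_units = {"u1": {1, 2}, "u2": {2, 5}, "u3": {10}}
ranking = rank_units({1, 2}, toy_units, toy_distance)
```

Here smaller distances rank first, so a unit annotated with exactly the query's nodes is returned at the top with distance 0.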
53 Claims
1. A computer-implemented method for retrieving semantically relevant information units from a collection of semantically annotated indexed information units in response to a query, the method comprising:
receiving, by a computer system, a semantically annotated query, the semantically annotated query including a set of semantic annotations to individuals or classes within a determined populated base ontology;
embedding, by the computer system, the semantically annotated query in a semantic representation space of an ontology-based IR model that uses a metric space for the representation of the indexed information units, the semantically annotated query being embedded as a set of weighted-mentions to individuals or classes within the populated base ontology;
obtaining, by the computer system, the representation in the semantic representation space for every indexed information unit of the collection;
computing, by the computer system, the Hausdorff distance between the space representation of the query and the space representation of all the indexed information units of the collection, wherein the Hausdorff distance is based on the weighted distance of the metric space defined as the shortest IC-based weighted-path between two ontology nodes, wherein the weighted distance of the metric space is the sum of IC-based weights for all the edges along the shortest weighted-path joining the ontology nodes;
retrieving and ranking, by the computer system, the relevant information units based on the computed Hausdorff distance, wherein the weights of the edges are defined by the information-content value of the joint probability P(ci|cj) between any child concept ci and its parent concept cj, and the joint probability P(ci|cj) is
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 32, 33, 34
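The weighted distance recited in claim 1, the sum of edge weights along the shortest weighted path joining two ontology nodes, can be sketched with Dijkstra's algorithm. This is only an illustration under stated assumptions: edges are treated as traversable in both directions (so a path may climb to a common ancestor and descend again), the edge weights are non-negative, and the toy ontology and its weights are made up. The names `add_edge`, `weighted_distance`, and `toy` are hypothetical.

```python
import heapq
import itertools
import math
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def add_edge(graph: Graph, child: Hashable, parent: Hashable, weight: float) -> None:
    """Register a child-parent edge; assumed traversable in both directions."""
    graph.setdefault(child, []).append((parent, weight))
    graph.setdefault(parent, []).append((child, weight))


def weighted_distance(graph: Graph, source: Hashable, target: Hashable) -> float:
    """Sum of edge weights along the shortest weighted path (Dijkstra).

    Returns math.inf when no path joins the two nodes.
    """
    best = {source: 0.0}
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(0.0, next(tie), source)]
    while heap:
        dist, _, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, next(tie), neighbour))
    return math.inf


# Toy ontology with made-up weights (illustrative values, not from the patent).
toy: Graph = {}
add_edge(toy, "Animal", "Thing", 1.0)
add_edge(toy, "Plant", "Thing", 1.0)
add_edge(toy, "Dog", "Animal", 2.0)
add_edge(toy, "Cat", "Animal", 1.5)
```

With these weights, `weighted_distance(toy, "Dog", "Cat")` follows Dog, Animal, Cat and sums 2.0 + 1.5.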
16. A semantic search system for retrieving semantically relevant information units from a collection of semantically annotated indexed information units in response to a query, the semantic search system comprising a processor and a memory coupled with and readable by the processor and storing a set of instructions which, when executed by the processor, causes the processor to:
receive a semantically annotated query, the semantically annotated query including a set of semantic annotations to individuals or classes within a determined populated base ontology;
embed the semantically annotated query in a semantic representation space of an ontology-based IR model that uses a metric space for the representation of the indexed information units, the semantically annotated query being embedded as a set of weighted-mentions to individuals or classes within the populated base ontology;
obtain the representation in the semantic representation space for every indexed information unit of the collection;
compute the Hausdorff distance between the space representation of the query and the space representation of all the indexed information units of the collection, wherein the Hausdorff distance is based on the weighted distance of the metric space defined as the shortest IC-based weighted-path between two ontology nodes, wherein the weighted distance of the metric space is the sum of IC-based weights for all the edges along the shortest weighted-path joining the ontology nodes;
retrieve and rank the relevant information units based on the computed Hausdorff distance, wherein the weights of the edges are defined by the information-content value of the joint probability P(ci|cj) between any child concept ci and its parent concept cj.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 38
31. A computer-readable memory for retrieving semantically relevant information units from a collection of semantically annotated indexed information units in response to a query, the computer-readable memory comprising a set of instructions stored therein which, when executed by a processor, causes the processor to:
receive a semantically annotated query, the semantically annotated query including a set of semantic annotations to individuals or classes within a determined populated base ontology;
embed the semantically annotated query in a semantic representation space of an ontology-based IR model that uses a metric space for the representation of the indexed information units, the semantically annotated query being embedded as a set of weighted-mentions to individuals or classes within the populated base ontology;
obtain the representation in the semantic representation space for every indexed information unit of the collection;
compute the Hausdorff distance between the space representation of the query and the space representation of all the indexed information units of the collection, wherein the Hausdorff distance is based on the weighted distance of the metric space defined as the shortest IC-based weighted-path between two ontology nodes, wherein the weighted distance of the metric space is the sum of IC-based weights for all the edges along the shortest weighted-path joining the ontology nodes;
retrieve and rank the relevant information units based on the computed Hausdorff distance, wherein the weights of the edges are defined by the information-content value of the joint probability P(ci|cj) between any child concept ci and its parent concept cj, or by the information-content value of the child concept ci minus the information-content value of the parent concept cj.
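Claim 31 recites two alternative edge-weight definitions. The sketch below compares them under two assumptions that this section does not spell out: that the information content of a probability p is IC = −log2(p), and that every occurrence of a child concept also counts as an occurrence of its parent, so that P(ci|cj) = P(ci)/P(cj). Under those assumptions both definitions yield the same weight. The function names and the probabilities are hypothetical.

```python
import math


def information_content(p: float) -> float:
    """IC of a probability, assumed here to be -log2(p)."""
    return -math.log2(p)


def weight_from_conditional(p_child_given_parent: float) -> float:
    """First alternative: IC of P(ci|cj)."""
    return information_content(p_child_given_parent)


def weight_from_ic_difference(p_child: float, p_parent: float) -> float:
    """Second alternative: IC(ci) - IC(cj)."""
    return information_content(p_child) - information_content(p_parent)


# Illustrative probabilities (made up): the parent covers half of the
# annotations and the child covers one eighth of them.
p_parent = 0.5
p_child = 0.125
p_conditional = p_child / p_parent  # 0.25, assuming the child implies the parent

w_conditional = weight_from_conditional(p_conditional)
w_difference = weight_from_ic_difference(p_child, p_parent)
```

Both weights equal 2 bits in this example, because −log2(P(ci)/P(cj)) expands to IC(ci) − IC(cj).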
35. A computer-implemented method for retrieving semantically relevant information units from a collection of semantically annotated indexed information units in response to a query, the method comprising:
receiving, by a computer system, a semantically annotated query, the semantically annotated query including a set of semantic annotations to individuals or classes within a determined populated base ontology;
embedding, by the computer system, the semantically annotated query in a semantic representation space of an ontology-based IR model that uses a metric space for the representation of the indexed information units, the semantically annotated query being embedded as a set of weighted-mentions to individuals or classes within the populated base ontology;
obtaining, by the computer system, the representation in the semantic representation space for every indexed information unit of the collection;
computing, by the computer system, the Hausdorff distance between the space representation of the query and the space representation of all the indexed information units of the collection, wherein the Hausdorff distance is based on the weighted distance of the metric space defined as the shortest IC-based weighted-path between two ontology nodes, wherein the weighted distance of the metric space is the sum of IC-based weights for all the edges along the shortest weighted-path joining the ontology nodes;
retrieving and ranking, by the computer system, the relevant information units based on the computed Hausdorff distance, wherein the weights of the edges are defined by the information-content value of the child concept ci minus the information-content value of the parent concept cj, w(eij) = IC(ci) − IC(cj), and the weights of the edges w(eij) is w(eij) = log2|children(cj)|, wherein |children(ci)| is the number of direct child concepts;
Dependent claims: 36, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53
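The weight w(eij) = log2|children(cj)| in claim 35 can be read as the information content of a uniform choice among the direct children of cj, that is, P(ci|cj) = 1/|children(cj)|. That reading is an interpretation, not something stated in this section. A minimal sketch, with a hypothetical function name and a made-up taxonomy:

```python
import math
from typing import Dict, List, Tuple


def uniform_edge_weights(children: Dict[str, List[str]]) -> Dict[Tuple[str, str], float]:
    """Assign log2(number of direct children of the parent) to each (child, parent) edge."""
    weights: Dict[Tuple[str, str], float] = {}
    for parent, kids in children.items():
        if not kids:
            continue  # leaf concept: no outgoing child edges
        weight = math.log2(len(kids))
        for kid in kids:
            weights[(kid, parent)] = weight
    return weights


# Toy taxonomy (illustrative only).
toy_children = {
    "Thing": ["Animal", "Plant"],
    "Animal": ["Dog", "Cat", "Bird", "Fish"],
    "Plant": [],
}
toy_weights = uniform_edge_weights(toy_children)
```

In this toy taxonomy the edges under Thing weigh 1 bit and the edges under Animal weigh 2 bits. Note that a parent with a single direct child would receive weight log2(1) = 0 under this formula.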
Specification