Method and apparatus for semantic search of schema repositories
Abstract
Mechanisms for searching XML repositories for semantically related schemas from a variety of structured metadata sources, including web services, XSD documents and relational tables, in databases and Internet applications. A search is formulated as a problem of computing a maximum matching in pairwise bipartite graphs formed from query and repository schemas. The edges of such a bipartite graph capture the semantic similarity between corresponding attributes of the schema based on their name and type semantics. Tight upper and lower bounds are also derived on the maximum matching that can be used for fast ranking of matchings whilst still maintaining specified levels of precision and recall. Schema indexing is performed by ‘attribute hashing’, in which matching schemas of a database are found by indexing using query attributes, performing lower bound computations for maximum matching and recording peaks in the resulting histogram of hits.
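As an illustration only, the bipartite formulation in the abstract can be sketched as follows. The sketch assumes that each attribute is a hypothetical `(name, type)` pair, that an edge exists when the types are equal and the Jaccard overlap of the name tokens reaches a threshold, and that the matching is computed with a standard augmenting-path search. None of these choices is taken from the specification; the names `tokens`, `similar` and `max_matching` are invented for the example.

```python
import re


def tokens(name):
    """Split an attribute name such as 'customerName' or 'cust_id' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {p.lower() for p in parts}


def similar(a, b, threshold=0.5):
    """Assumed edge test: same type and enough shared name tokens (Jaccard overlap)."""
    (name_a, type_a), (name_b, type_b) = a, b
    if type_a != type_b:
        return False
    ta, tb = tokens(name_a), tokens(name_b)
    if not ta or not tb:
        return False
    return len(ta & tb) / len(ta | tb) >= threshold


def max_matching(query, repo, threshold=0.5):
    """Maximum-cardinality matching between query and repository attributes."""
    # Adjacency list of the bipartite graph: query index -> similar repository indices.
    adj = [[j for j, r in enumerate(repo) if similar(q, r, threshold)] for q in query]
    match_of_repo = [-1] * len(repo)  # repository index -> matched query index

    def augment(i, seen):
        # Try to match query attribute i, re-routing earlier matches if needed.
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of_repo[j] == -1 or augment(match_of_repo[j], seen):
                match_of_repo[j] = i
                return True
        return False

    size = sum(augment(i, set()) for i in range(len(query)))
    pairs = {query[i][0]: repo[j][0] for j, i in enumerate(match_of_repo) if i != -1}
    return size, pairs


query = [("customerName", "string"), ("customer", "string")]
repo = [("customer_name", "string"), ("name", "string")]
size, pairs = max_matching(query, repo)
print(size, pairs)
```

In this example `customer` can only pair with `customer_name`, so the search re-routes `customerName` to `name` and both query attributes end up matched. Dividing the matching size by the number of query attributes gives one possible normalized score for ranking.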
21 Claims
1. A method of finding repository schema similar to a query schema in repositories of metadata via semantic search, comprising the steps of:
parsing said query schema to extract query words;
parsing at least one of said repository schema to extract repository words;
determining a match if a given proportion of said query words match a said repository word;
retaining each said repository schema in which at least one said match is found as a retained repository schema;
establishing a semantic matching for each said retained repository schema in which a given proportion of said query words matches a said repository word;
ranking each said semantic matching to determine a rank of said semantic matching; and
returning each said retained repository schema as a candidate if said rank of said semantic matching is greater than a predetermined value.
8. A computer readable medium having computer executable instructions for performing steps to find repository schema similar to a query schema in repositories of metadata via semantic search, comprising:
computer readable program code parsing said query schema to extract query words;
computer readable program code parsing at least one of said repository schema to extract repository words;
computer readable program code determining a match if a given proportion of said query words match a said repository word;
computer readable program code retaining each said repository schema in which at least one said match is found as a retained repository schema;
computer readable program code establishing a semantic matching for each said retained repository schema in which a given proportion of said query words matches a said repository word;
computer readable program code ranking each said semantic matching to determine a rank of said semantic matching; and
computer readable program code returning each said retained repository schema as a candidate if said rank of said semantic matching is greater than a predetermined value.
15. An apparatus for finding repository schema similar to a query schema in repositories of metadata via semantic search, comprising:
means for parsing said query schema to extract query words;
means for parsing at least one of said repository schema to extract repository words;
means for determining a match if a given proportion of said query words match a said repository word;
means for retaining each said repository schema in which at least one said match is found as a retained repository schema;
means for establishing a semantic matching for each said retained repository schema in which a given proportion of said query words matches a said repository word;
means for ranking each said semantic matching to determine a rank of said semantic matching; and
means for returning each said retained repository schema as a candidate if said rank of said semantic matching is greater than a predetermined value.
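
For orientation, the steps recited in the independent claims (parse, match, retain, establish a matching, rank, return) can be sketched as below. This is one possible reading, not the claimed implementation: schemas are assumed to be plain lists of attribute names, the word-level filter is an assumed overlap ratio, and a greedy one-to-one pairing on shared words stands in for the semantic matching. The names `attribute_words`, `search`, `min_overlap` and `min_rank` are hypothetical.

```python
import re


def attribute_words(name):
    """Parse an attribute name into lowercase words (camelCase and snake_case)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {p.lower() for p in parts}


def search(query, repository, min_overlap=0.5, min_rank=0.5):
    """Return (schema_id, rank) pairs for repository schemas resembling the query."""
    # Step 1: parse the query schema to extract query words.
    query_words = set().union(*(attribute_words(a) for a in query))
    if not query_words:
        return []
    candidates = []
    for schema_id, attrs in repository.items():
        # Step 2: parse the repository schema to extract repository words.
        repo_words = set().union(*(attribute_words(a) for a in attrs))
        # Steps 3 and 4: retain the schema only if enough query words match.
        overlap = len(query_words & repo_words) / len(query_words)
        if overlap < min_overlap:
            continue
        # Step 5: greedy one-to-one pairing, an assumed stand-in for the
        # semantic matching described in the abstract.
        unused = list(attrs)
        matched = 0
        for q in query:
            qw = attribute_words(q)
            best = max(unused, key=lambda r: len(qw & attribute_words(r)), default=None)
            if best is not None and qw & attribute_words(best):
                unused.remove(best)
                matched += 1
        # Step 6: rank the matching by the fraction of query attributes paired.
        rank = matched / len(query)
        # Step 7: return the schema as a candidate if the rank exceeds the threshold.
        if rank > min_rank:
            candidates.append((schema_id, rank))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


repository = {
    "orders": ["custName", "orderDate", "amount"],
    "invoices": ["customer_name", "invoice_date", "vat"],
    "employees": ["employee_name", "hire_date", "salary"],
}
results = search(["customer_name", "order_date", "total_amount"], repository)
print(results)
```

With these sample schemas, `employees` is dropped by the word filter, while `orders` and `invoices` are retained and ranked. The cheap word-level filter runs before the more expensive pairing step, which mirrors the order in which the claims recite the steps.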
Specification