METHOD AND APPARATUS FOR PROCESSING ELECTRONIC DATA
Abstract
A system (100) for generating a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure (e.g. a database schema) and a second representation of a set of concepts or of a data structure (e.g. an ontology), each representation comprising a plurality of complex representational elements (e.g. tables in a database schema and concepts in an ontology) each of which may itself include a number of associated subordinate representational elements (e.g. columns/fields of a table in a database schema and attributes of a concept in an ontology). The system (100) includes a semantic similarity calculation module (134) for calculating a semantic similarity measure between a subordinate element of the first representation and each of the subordinate elements in the second representation and a mapping generation module (137) for generating a mapping between the subordinate element of the first representation and one of the subordinate elements of the second representation selected in dependence upon the calculated semantic similarity measures between the subordinate elements.
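The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `token_overlap` and `generate_mapping`, the example column and attribute names, the JSON output format, and the use of token overlap as a stand-in similarity measure are all assumptions made for the example.

```python
import json
from typing import Callable, Dict, List


def token_overlap(a: str, b: str) -> float:
    """Placeholder similarity (an assumption): Jaccard overlap of name tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)


def generate_mapping(
    source_fields: List[str],
    target_attrs: List[str],
    similarity: Callable[[str, str], float] = token_overlap,
) -> Dict[str, dict]:
    """Map each source subordinate element to the best-scoring target element."""
    mapping = {}
    for field in source_fields:
        # Score the field against every subordinate element of the target.
        scores = {attr: similarity(field, attr) for attr in target_attrs}
        # Select the target in dependence upon the calculated measures.
        best = max(scores, key=scores.get)
        mapping[field] = {"target": best, "score": scores[best]}
    return mapping


# Hypothetical columns of a database table and attributes of an ontology concept.
schema_columns = ["customer_name", "postal_code"]
ontology_attributes = ["name", "code", "birth_date"]

mapping = generate_mapping(schema_columns, ontology_attributes)
# Serialise the mapping as a computer readable document (JSON is an assumption).
mapping_json = json.dumps(mapping, indent=2)
```

Passing the similarity measure as a parameter keeps the selection logic independent of how the measure is calculated, so a different measure can be substituted without changing `generate_mapping`.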
9 Claims
1. A method of generating a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure and a second representation of a set of concepts or of a data structure, each representation comprising a plurality of complex representational elements each of which may itself include a number of associated subordinate representational elements, the method comprising:
calculating a semantic similarity measure between a subordinate element of the first representation and each of the subordinate elements in the second representation; and
generating a mapping between the subordinate element of the first representation and one of the subordinate elements of the second representation selected in dependence upon the calculated semantic similarity measures between the subordinate elements;
wherein calculation of a semantic similarity measure includes:
using a linked top ontology data structure comprising a plurality of concept nodes arranged to form a top ontology, the top ontology being a partial subset of a full ontology having at least twice as many nodes as the top ontology, the nodes in the top ontology being selected from the full ontology based on their ancestral closeness to a root node and/or their ancestral remoteness from a leaf node of the full ontology, the linked top ontology further comprising a plurality of pre-processed vocabulary terms each of which is linked to one or more of the nodes in the top ontology, the linked top ontology data structure being used as follows:
the names of the subordinate elements between whom a semantic similarity is to be calculated being compared with the vocabulary terms and for any vocabulary terms which match the names of the subordinate elements, identifying the top ontology nodes associated with the matched vocabulary terms and comparing the identified top ontology nodes associated with each name of the subordinate elements, and determining a semantic similarity based on the degree of commonality between the top ontology nodes associated with each of the subordinate elements.
Dependent claims: 2, 3, 4, 7, 8, 9.
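The use of the linked top ontology recited in claim 1 might be sketched as below. The claim does not fix a formula for "degree of commonality", so the Jaccard ratio used here is an assumption, as are the vocabulary terms, the node names, and the helper names `top_nodes_for` and `semantic_similarity`.

```python
import re
from typing import Dict, Set

# Hypothetical linked top ontology: each pre-processed vocabulary term is
# linked to one or more top ontology nodes (all names are illustrative).
LINKED_TOP_ONTOLOGY: Dict[str, Set[str]] = {
    "name": {"Identifier", "Label"},
    "title": {"Label"},
    "price": {"Quantity", "Money"},
    "cost": {"Quantity", "Money"},
}


def top_nodes_for(name: str, linked: Dict[str, Set[str]]) -> Set[str]:
    """Collect the top ontology nodes linked to terms matching the element name."""
    # Lower-case the name and split it on any run of non-letter characters.
    tokens = [t for t in re.split(r"[^a-z]+", name.lower()) if t]
    nodes: Set[str] = set()
    for token in tokens:
        # Tokens that match no vocabulary term contribute no nodes.
        nodes |= linked.get(token, set())
    return nodes


def semantic_similarity(
    name_a: str,
    name_b: str,
    linked: Dict[str, Set[str]] = LINKED_TOP_ONTOLOGY,
) -> float:
    """Commonality of the two node sets, measured here as a Jaccard ratio."""
    nodes_a = top_nodes_for(name_a, linked)
    nodes_b = top_nodes_for(name_b, linked)
    union = nodes_a | nodes_b
    return len(nodes_a & nodes_b) / len(union) if union else 0.0


same_nodes = semantic_similarity("unit_price", "cost")
partial = semantic_similarity("name", "title")
unrelated = semantic_similarity("price", "title")
```

In this sketch two differently spelled names score highly when their matched vocabulary terms are linked to the same top ontology nodes, which is the effect the claim attributes to comparing nodes instead of comparing strings.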
5. A system for generating a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure and a second representation of a set of concepts or of a data structure, each representation comprising a plurality of complex representational elements each of which may itself include a number of associated subordinate representational elements, the system including:
a semantic similarity calculation module for calculating a semantic similarity measure between a subordinate element of the first representation and each of the subordinate elements in the second representation; and
a mapping generation module for generating a mapping between the subordinate element of the first representation and one of the subordinate elements of the second representation selected in dependence upon the calculated semantic similarity measures between the subordinate elements;
wherein the system further includes a linked top ontology module storing a linked top ontology data structure which comprises a plurality of concept nodes arranged into a top ontology, the top ontology being a partial subset of a full ontology having at least twice as many nodes as the top ontology, the nodes in the top ontology being selected from the full ontology based on their ancestral closeness to a root node and/or their ancestral remoteness from a leaf node of the full ontology, the linked top ontology data structure further comprising a plurality of pre-processed vocabulary terms each of which is linked to one or more of the nodes in the top ontology; and
wherein the semantic similarity calculation module is operable to compare the names of the subordinate elements between whom a semantic similarity is to be calculated with the vocabulary terms and, for any vocabulary terms which match the names of the subordinate elements, to identify the top ontology nodes associated with the matched vocabulary terms and to compare the identified top ontology nodes associated with each name of the subordinate elements, and to determine a semantic similarity based on the degree of commonality between the top ontology nodes associated with each of the subordinate elements.
Dependent claims: 6.
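One way the top ontology stored by the linked top ontology module could be derived from a full ontology is sketched below. The claims only require selection by ancestral closeness to a root node and/or remoteness from a leaf node; the fixed depth cut-off, the breadth-first traversal, the function name `select_top_ontology`, and the toy ontology are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def select_top_ontology(
    children: Dict[str, List[str]], root: str, max_depth: int
) -> Set[str]:
    """Keep the nodes lying within max_depth parent-child links of the root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not descend below the cut-off depth
        for child in children.get(node, []):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return set(depth)


# Hypothetical full ontology given as parent -> children lists.
FULL_ONTOLOGY = {
    "Thing": ["Entity", "Event"],
    "Entity": ["Person", "Organisation"],
    "Event": ["Meeting"],
    "Person": ["Employee", "Customer"],
    "Organisation": ["Company"],
    "Meeting": ["Conference"],
}

top = select_top_ontology(FULL_ONTOLOGY, "Thing", max_depth=1)
all_nodes = set(FULL_ONTOLOGY) | {c for cs in FULL_ONTOLOGY.values() for c in cs}
# The claims require the full ontology to have at least twice as many nodes.
size_condition_met = len(all_nodes) >= 2 * len(top)
```

A depth cut-off covers only the "closeness to a root node" criterion; a selection based on remoteness from leaf nodes would need the height of each node instead.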
Specification