Federated search of multiple sources with conflict resolution
First Claim
1. A method comprising:
obtaining a set of data ontologies associated with a plurality of heterogeneous data sources;
wherein each heterogeneous data source of the plurality of heterogeneous data sources uses a different data model;
receiving, as input, a selection of a graph comprising a plurality of graph nodes connected by one or more graph edges, wherein the graph has an object-centric data model in which each graph node represents a data object type or a data object property that is described in at least one data ontology of the set of data ontologies and each graph edge represents a data object link that represents a relationship between a pair of graph nodes and that is described in at least one data ontology of the set of data ontologies, wherein each graph edge of the one or more graph edges is selected from a set of available data object links;
transforming the graph into one or more search queries that are executed across the plurality of heterogeneous data sources;
obtaining a first data object and a second data object based on executing the one or more search queries across the plurality of heterogeneous data sources;
generating an intermediate data object based on grouping the first data object with the second data object;
generating a unique identifier for the intermediate data object based on hashing one or more data object properties that uniquely identify the intermediate data object;
determining whether a repository data object that shares the unique identifier is stored in a repository that has a particular data model;
in response to determining that the repository data object is not stored in a repository that has a particular data model, generating a stub data object that is referenced by the unique identifier and is stored in the repository;
resolving the intermediate data object with the stub data object;
deduplicating data associated with the intermediate data object and the stub data object;
storing the deduplicated data in the repository that has the particular data model;
causing display of the first data object and the second data object as a single graph node;
wherein the method is performed by one or more computing devices.
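The identifier, stub, resolution, and deduplication steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory dict standing in for the repository, the choice of SHA-256, the function names `unique_id` and `resolve_and_store`, and the `name`/`dob` key properties are all assumptions made for illustration.

```python
import hashlib
import json

def unique_id(obj, key_props):
    """Hash the properties assumed to uniquely identify the object."""
    # Sort keys so the same property values always yield the same digest.
    payload = json.dumps({k: obj[k] for k in sorted(key_props)}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def resolve_and_store(repository, intermediate, key_props):
    """Resolve an intermediate object against the repository and store deduplicated data."""
    uid = unique_id(intermediate, key_props)
    if uid not in repository:
        # No repository object shares the identifier: create an empty stub.
        repository[uid] = {}
    stored = repository[uid]
    for prop, values in intermediate.items():
        vals = values if isinstance(values, list) else [values]
        existing = stored.setdefault(prop, [])
        for v in vals:
            if v not in existing:  # deduplicate property values
                existing.append(v)
    return uid

# Hypothetical data: the same person arrives twice with different phone numbers.
repo = {}
person = {"name": "Ada Lovelace", "dob": "1815-12-10", "phone": ["555-0100", "555-0100"]}
uid = resolve_and_store(repo, person, ["name", "dob"])
same_uid = resolve_and_store(repo, dict(person, phone=["555-0199"]), ["name", "dob"])
```

In this sketch both calls hash to the same identifier, so the second call resolves against the object created from the stub rather than adding a new repository entry, and the repeated phone number is stored only once.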
Abstract
Methods and apparatuses related to federated search of multiple sources with conflict resolution are disclosed. A method may comprise obtaining a set of data ontologies (e.g., types, properties, and links) associated with a plurality of heterogeneous data sources; receiving a selection of a graph comprising a plurality of graph nodes connected by one or more graph edges; and transforming the graph into one or more search queries across the plurality of heterogeneous data sources. A method may comprise obtaining a first data object as a result of executing a first search query across a plurality of heterogeneous data sources; resolving, based on one or more resolution rules, at least the first data object with a repository data object; and deduplicating data associated with at least the first data object and the repository data object prior to storing the deduplicated data in a repository that has a particular data model.
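The graph-to-query transformation summarized in the abstract might look roughly like the following Python sketch. Everything named here is hypothetical and chosen only for illustration: the `Edge` class, the `ONTOLOGY_LINKS` set, the two source names, and the query shapes produced by `to_sql` and `to_document_query`. The source does not specify any query language or translator design.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source_type: str   # object type of the first graph node
    link: str          # data object link chosen from the ontology
    target_type: str   # object type of the second graph node

# Hypothetical ontology: which links are available between which object types.
ONTOLOGY_LINKS = {("Person", "owns", "Vehicle"), ("Person", "calls", "Phone")}

# Hypothetical per-source translators; each source has its own data model.
def to_sql(edge):
    return (f"SELECT * FROM {edge.source_type.lower()} s "
            f"JOIN {edge.target_type.lower()} t ON s.id = t.{edge.link}_id")

def to_document_query(edge):
    return {"type": edge.source_type, "links": {edge.link: {"type": edge.target_type}}}

TRANSLATORS = {"relational_db": to_sql, "document_store": to_document_query}

def graph_to_queries(edges):
    """Produce one query per (edge, source), rejecting links absent from the ontology."""
    queries = []
    for edge in edges:
        if (edge.source_type, edge.link, edge.target_type) not in ONTOLOGY_LINKS:
            raise ValueError(f"link not in ontology: {edge}")
        for source, translate in TRANSLATORS.items():
            queries.append((source, translate(edge)))
    return queries

queries = graph_to_queries([Edge("Person", "owns", "Vehicle")])
```

Restricting edges to links listed in the ontology mirrors the requirement that each graph edge be selected from a set of available data object links; the one-query-per-source fan-out is just one possible reading of "one or more search queries".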
14 Claims
1. A method comprising:
obtaining a set of data ontologies associated with a plurality of heterogeneous data sources;
wherein each heterogeneous data source of the plurality of heterogeneous data sources uses a different data model;
receiving, as input, a selection of a graph comprising a plurality of graph nodes connected by one or more graph edges, wherein the graph has an object-centric data model in which each graph node represents a data object type or a data object property that is described in at least one data ontology of the set of data ontologies and each graph edge represents a data object link that represents a relationship between a pair of graph nodes and that is described in at least one data ontology of the set of data ontologies, wherein each graph edge of the one or more graph edges is selected from a set of available data object links;
transforming the graph into one or more search queries that are executed across the plurality of heterogeneous data sources;
obtaining a first data object and a second data object based on executing the one or more search queries across the plurality of heterogeneous data sources;
generating an intermediate data object based on grouping the first data object with the second data object;
generating a unique identifier for the intermediate data object based on hashing one or more data object properties that uniquely identify the intermediate data object;
determining whether a repository data object that shares the unique identifier is stored in a repository that has a particular data model;
in response to determining that the repository data object is not stored in a repository that has a particular data model, generating a stub data object that is referenced by the unique identifier and is stored in the repository;
resolving the intermediate data object with the stub data object;
deduplicating data associated with the intermediate data object and the stub data object;
storing the deduplicated data in the repository that has the particular data model;
causing display of the first data object and the second data object as a single graph node;
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
one or more processors; and
one or more non-transitory storage media storing instructions which, when executed by the one or more processors, cause:
obtaining a set of data ontologies associated with a plurality of heterogeneous data sources;
wherein each heterogeneous data source of the plurality of heterogeneous data sources uses a different data model;
receiving, as input, a selection of a graph comprising a plurality of graph nodes connected by one or more graph edges, wherein the graph has an object-centric data model in which each graph node represents a data object type or a data object property that is described in at least one data ontology of the set of data ontologies and each graph edge represents a data object link that represents a relationship between a pair of graph nodes and that is described in at least one data ontology of the set of data ontologies, wherein each graph edge of the one or more graph edges is selected from a set of available data object links;
transforming the graph into one or more search queries that are executed across the plurality of heterogeneous data sources;
obtaining a first data object and a second data object based on executing the one or more search queries across the plurality of heterogeneous data sources;
generating an intermediate data object based on grouping the first data object with the second data object;
generating a unique identifier for the intermediate data object based on hashing one or more data object properties that uniquely identify the intermediate data object;
determining whether a repository data object that shares the unique identifier is stored in a repository that has a particular data model;
in response to determining that the repository data object is not stored in a repository that has a particular data model, generating a stub data object that is referenced by the unique identifier and is stored in the repository;
resolving the intermediate data object with the stub data object;
deduplicating data associated with the intermediate data object and the stub data object;
storing the deduplicated data in the repository that has the particular data model;
causing display of the first data object and the second data object as a single graph node.
Dependent claims: 9, 10, 11, 12, 13, 14
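The grouping and single-node display steps can be sketched in Python as follows. This is an illustrative assumption, not the claimed implementation: the helper names `group_objects` and `as_graph_node`, the `_source` property, and the sample records are hypothetical.

```python
def group_objects(first, second):
    """Merge two result objects into one intermediate object, keeping every distinct value."""
    merged = {}
    for obj in (first, second):
        for prop, value in obj.items():
            values = merged.setdefault(prop, [])
            if value not in values:
                values.append(value)
    return merged

def as_graph_node(intermediate, label_prop="name"):
    """Build a single display node; the label is the first value of label_prop."""
    return {"label": intermediate[label_prop][0], "properties": intermediate}

# Hypothetical results for the same company returned by two different sources.
first = {"name": "Acme Corp", "city": "Denver", "_source": "crm"}
second = {"name": "Acme Corp", "phone": "555-0100", "_source": "billing"}
node = as_graph_node(group_objects(first, second))
```

Here the two source records collapse into one node labeled with the shared name, while the per-source values remain available as lists under each property.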
Specification