Two-phase construction of data graphs from disparate inputs
First Claim
1. A computer system comprising:
memory storing a first source data graph in a first identifier space, the first identifier space uniquely identifying items in the first source;
memory storing a reconciled version of a second source data graph, the second source data graph being in a second identifier space and the reconciled version of the second source data graph being in a third identifier space, the second identifier space uniquely identifying items in the second source and the third identifier space uniquely identifying items to the computer system, wherein the first identifier space, the second identifier space, and the third identifier space differ from each other;
memory storing a master evidence file that maps the first identifier space to the third identifier space and the second identifier space to the third identifier space;
at least one processor; and
memory storing instructions that, when executed by the at least one processor, cause the system to:
generate a reconciled version of the first source data graph by substituting identifiers in the first source data graph with identifiers in the third identifier space using the master evidence file,
store the reconciled version of the first source data graph, and
generate a combined data graph from the reconciled version of the first source data graph and the reconciled version of the second source data graph, the combined data graph being available for querying.
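The identifier substitution recited in this claim can be illustrated with a minimal sketch. It is not taken from the specification: the in-memory dictionary standing in for the master evidence file, the function names `reconcile` and `combine`, the source names, and the sample identifiers are all hypothetical. The sketch also assumes that predicates are left unchanged and that values with no mapping (such as literals) pass through as-is.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical stand-in for the master evidence file:
# (source name, local identifier) -> identifier in the third (system-wide) space.
MasterEvidence = Dict[Tuple[str, str], str]


def reconcile(source: str, graph: List[Triple], evidence: MasterEvidence) -> List[Triple]:
    """Substitute a source's local identifiers with system-wide identifiers."""
    def to_global(value: str) -> str:
        # Assumption: values without a mapping (e.g. literals) are kept unchanged.
        return evidence.get((source, value), value)

    return [(to_global(s), p, to_global(o)) for s, p, o in graph]


def combine(*reconciled_graphs: List[Triple]) -> List[Triple]:
    """Concatenate reconciled graphs, keeping the first copy of each triple."""
    seen = set()
    combined = []
    for graph in reconciled_graphs:
        for triple in graph:
            if triple not in seen:
                seen.add(triple)
                combined.append(triple)
    return combined


evidence: MasterEvidence = {
    ("source_a", "a:17"): "g:1001",
    ("source_a", "a:42"): "g:1002",
    ("source_b", "b:x9"): "g:1001",
}

# First source graph, still in its own identifier space.
graph_a = [("a:17", "acted_in", "a:42")]
# Second source graph, already reconciled and stored.
reconciled_b = [("g:1001", "born_in", "1956")]

reconciled_a = reconcile("source_a", graph_a, evidence)
combined_graph = combine(reconciled_a, reconciled_b)
```

Because both reconciled graphs use the same identifier space, the two triples about `g:1001` end up attached to the same item in `combined_graph` even though the sources named that item differently.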
Abstract
Some implementations generate multiple views of a combined data graph from disparate data graph sources in two phases. A first phase may convert each source data graph into a reconciled data graph and a second phase may generate a combined data graph from the various reconciled data graphs. For example, a method may include generating a reconciled data graph for each of a plurality of source data graphs and determining selected sources identified by a graph view file. The selected sources may be a subset of the plurality of sources represented by the source data graphs. The method may also include generating a combined data graph using the reconciled data graphs that correspond with the selected sources, and generating search results using the combined data graph.
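As a rough illustration of the second phase described in the abstract, the sketch below builds one view of a combined graph from only the sources that a graph view file selects. The dictionary layout of the view, the names `build_view` and `search`, and the sample data are hypothetical; the abstract does not specify a file format or a query interface.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def build_view(reconciled: Dict[str, List[Triple]], selected_sources: List[str]) -> Set[Triple]:
    """Combine only the reconciled graphs whose sources the view selects."""
    combined: Set[Triple] = set()
    for name in selected_sources:
        combined.update(reconciled[name])
    return combined


def search(graph: Set[Triple], subject: str) -> List[Triple]:
    """Toy query: every triple about one subject, in a stable order."""
    return sorted(t for t in graph if t[0] == subject)


# Output of the first phase: one reconciled graph per source (sample data).
reconciled_graphs = {
    "films": [("g:1", "name", "Ada"), ("g:1", "directed", "g:5")],
    "books": [("g:1", "name", "Ada"), ("g:1", "author_of", "g:7")],
    "internal": [("g:1", "salary", "100")],
}

# Hypothetical parsed graph view file selecting a subset of the sources.
public_view = {"name": "public_view", "sources": ["films", "books"]}

combined = build_view(reconciled_graphs, public_view["sources"])
results = search(combined, "g:1")
```

Since the per-source reconciled graphs are kept separately, a different view can be produced by changing only the list of selected sources, without repeating the first phase.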
21 Claims
1. A computer system comprising:
memory storing a first source data graph in a first identifier space, the first identifier space uniquely identifying items in the first source;
memory storing a reconciled version of a second source data graph, the second source data graph being in a second identifier space and the reconciled version of the second source data graph being in a third identifier space, the second identifier space uniquely identifying items in the second source and the third identifier space uniquely identifying items to the computer system, wherein the first identifier space, the second identifier space, and the third identifier space differ from each other;
memory storing a master evidence file that maps the first identifier space to the third identifier space and the second identifier space to the third identifier space;
at least one processor; and
memory storing instructions that, when executed by the at least one processor, cause the system to:
generate a reconciled version of the first source data graph by substituting identifiers in the first source data graph with identifiers in the third identifier space using the master evidence file,
store the reconciled version of the first source data graph, and
generate a combined data graph from the reconciled version of the first source data graph and the reconciled version of the second source data graph, the combined data graph being available for querying.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer-implemented method comprising:
generating, using at least one processor, a first reconciled data graph from a first source data graph using a master evidence file, the first reconciled data graph including a first set of triples, wherein the first reconciled data graph is in a global identifier space and the first source data graph is in a first local identifier space, the master evidence file mapping the first local identifier space to the global identifier space and a second local identifier space to the global identifier space;
generating, using the at least one processor, a second reconciled data graph from a second source data graph using the master evidence file, wherein the second reconciled data graph includes a second set of triples and the first source data graph differs from the second source data graph, wherein the second reconciled data graph is in the global identifier space and the second source data graph is in the second local identifier space, the first local identifier space differing from the second local identifier space;
generating a combined data graph from the first reconciled data graph and the second reconciled data graph on a periodic basis by:
appending the second set of triples to the first set of triples,
identifying a first triple in the first set of triples that matches a second triple in the second set of triples,
updating a source attribute for the second triple to reflect a value for the first source, and
deleting the first triple; and
making the combined data graph available for querying.
Dependent claims: 12, 13, 14
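The combining steps recited in claim 11 (append, find a match, update the source attribute, delete the duplicate) can be sketched as follows. This is an illustrative reading only: the `SourcedTriple` class, the representation of the source attribute as a set of source names, and the `combine` function are hypothetical. The sketch assumes that two triples match when their subject, predicate, and object are equal and that the second set contains no internal duplicates, and it leaves out the periodic scheduling.

```python
from dataclasses import dataclass, field, replace
from typing import List, Set, Tuple


@dataclass
class SourcedTriple:
    subject: str
    predicate: str
    obj: str
    # Assumption: the "source attribute" is modeled as a set of source names.
    sources: Set[str] = field(default_factory=set)

    def key(self) -> Tuple[str, str, str]:
        return (self.subject, self.predicate, self.obj)


def combine(first: List[SourcedTriple], second: List[SourcedTriple]) -> List[SourcedTriple]:
    # Work on copies of the second set so the caller's triples are not mutated.
    second_copies = [replace(t, sources=set(t.sources)) for t in second]
    second_by_key = {t.key(): t for t in second_copies}

    # Step 1: append the second set of triples to the first set.
    appended = list(first) + second_copies

    kept = []
    for index, triple in enumerate(appended):
        from_first = index < len(first)
        # Step 2: identify a first-set triple that matches a second-set triple.
        match = second_by_key.get(triple.key()) if from_first else None
        if match is not None:
            # Step 3: update the second triple's source attribute for the first source.
            match.sources |= triple.sources
            # Step 4: delete the first triple by not carrying it forward.
            continue
        kept.append(triple)
    return kept


first_set = [
    SourcedTriple("g:1", "name", "Ada", {"source_a"}),
    SourcedTriple("g:1", "born_in", "g:9", {"source_a"}),
]
second_set = [
    SourcedTriple("g:1", "name", "Ada", {"source_b"}),
    SourcedTriple("g:1", "directed", "g:5", {"source_b"}),
]

combined = combine(first_set, second_set)
```

In this example the shared `name` triple appears once in `combined`, carrying both `source_a` and `source_b` in its source attribute, so the provenance of the fact is preserved after the duplicate is removed.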
15. A method comprising:
generating a reconciled version of a first source data graph in a first identifier space by substituting identifiers in the first source data graph with identifiers in a third identifier space using a master evidence file, the first identifier space uniquely identifying items in the first source, and the third identifier space uniquely identifying items to a computer system, and the master evidence file mapping the first identifier space to the third identifier space and a second identifier space to the third identifier space;
generating a reconciled version of a second source data graph, the second source data graph being in the second identifier space and the reconciled version of the second source data graph being in the third identifier space, the second identifier space uniquely identifying items in the second source, wherein the first identifier space, the second identifier space, and the third identifier space differ from each other;
generating a combined data graph from the reconciled version of the first source data graph and the reconciled version of the second source data graph; and
making the combined data graph available for querying.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification