SYSTEM AND METHOD FOR DATA MANAGEMENT IN LARGE DATA NETWORKS
Abstract
A system and method for storing an input data network, in the form of a graph, are provided. The system includes a master node and a plurality of slave nodes. The master node is operable to receive the data network in the form of a graph, the graph including a plurality of vertices connected by edges; calculate a probability of co-retrieval for each of the plurality of vertices; and assign each of the plurality of vertices to one of the plurality of compute nodes based on the calculated probability of co-retrieval. Another method and system are provided for converting a dataset into a graph-based index and storing the index on disk. Respective systems and methods of querying such data networks are also provided.
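The assignment step summarized above can be illustrated with a minimal sketch. The abstract does not say how the probability of co-retrieval is computed or how it drives placement, so everything below is an assumption made for illustration: the probabilities are taken as a given input keyed by vertex pair, and the hypothetical function `assign_vertices` uses a simple greedy heuristic with a per-node capacity. It is not the method of the patent.

```python
from collections import defaultdict


def assign_vertices(vertices, co_retrieval, num_nodes, capacity):
    """Greedily place each vertex on the compute node that already holds
    the vertices it is most likely to be retrieved with.

    Hypothetical helper: `co_retrieval` maps a vertex pair (u, v) to an
    assumed co-retrieval probability; the source does not define this input.
    """
    if num_nodes * capacity < len(vertices):
        raise ValueError("total capacity is smaller than the vertex count")

    # Symmetric lookup: vertex -> {neighbor: probability}.
    affinity = defaultdict(dict)
    for (u, v), p in co_retrieval.items():
        affinity[u][v] = p
        affinity[v][u] = p

    assignment = {}
    load = [0] * num_nodes
    for vertex in vertices:
        # Score each node by the summed probability of its placed neighbors.
        scores = [0.0] * num_nodes
        for neighbor, p in affinity[vertex].items():
            if neighbor in assignment:
                scores[assignment[neighbor]] += p
        # Only nodes with free capacity are eligible.
        candidates = [n for n in range(num_nodes) if load[n] < capacity]
        # Highest score wins; ties go to the lighter, then lower-numbered node.
        best = max(candidates, key=lambda n: (scores[n], -load[n], -n))
        assignment[vertex] = best
        load[best] += 1
    return assignment


if __name__ == "__main__":
    probabilities = {("a", "b"): 0.9, ("c", "d"): 0.8, ("b", "c"): 0.1}
    placement = assign_vertices(["a", "b", "c", "d"], probabilities,
                                num_nodes=2, capacity=2)
    print(placement)  # {'a': 0, 'b': 0, 'c': 1, 'd': 1}
```

In this toy input the strongly co-retrieved pairs (a, b) and (c, d) end up on the same node, so a query touching both vertices of a pair would need to contact only one compute node. A greedy pass like this depends on vertex order and is not guaranteed to find the best placement.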
4 Claims
1. A method of distributing a data network, in the form of a graph, across a plurality of compute nodes, the method comprising:
receiving the data network in the form of a graph, the graph including a plurality of vertices connected by edges;
calculating a probability of co-retrieval for each of the plurality of vertices; and
assigning each of the plurality of vertices to one of the plurality of compute nodes based on the calculated probability of co-retrieval.
2. A system for storing an input data network, in the form of a graph, the system comprising:
a master node; and
a plurality of slave nodes, wherein the master node is operable to:
receive the data network in the form of a graph, the graph including a plurality of vertices connected by edges;
calculate a probability of co-retrieval for each of the plurality of vertices; and
assign each of the plurality of vertices to one of the plurality of compute nodes based on the calculated probability of co-retrieval.
3. A computer-readable storage medium storing instructions for enabling a computer to implement a method of distributing a data network, in the form of a graph, across a plurality of compute nodes, the method comprising:
receiving the data network in the form of a graph, the graph including a plurality of vertices connected by edges;
calculating a probability of co-retrieval for each of the plurality of vertices; and
assigning each of the plurality of vertices to one of the plurality of compute nodes based on the calculated probability of co-retrieval.
4. A method of answering a query for a data network stored across a plurality of compute nodes, the method comprising:
marking a constant vertex in the query and transmitting the query to one of the plurality of compute nodes storing the constant vertex, the query including the constant vertex and at least one variable vertex connected by an edge label;
determining vertices connected to the marked constant vertex and preparing a candidate substitution list for the at least one variable vertex from the determined vertices by using the edge label connecting the at least one variable vertex and the constant vertex, the candidate substitution list including candidate vertices for the at least one variable vertex; and
preparing an answer to the query based on the candidate substitution list.
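The query steps recited in claim 4 can be sketched for the simplest case of one constant vertex, one edge label, and one variable vertex. The storage layout is an assumption made only for this illustration: each compute node is modeled as a list of directed `(source, label, target)` edges held on the node that stores the source vertex, and `answer_query`, `assignment`, and `node_stores` are hypothetical names that do not come from the claims.

```python
def answer_query(query, assignment, node_stores):
    """Answer a single-pattern query (constant, edge_label, variable).

    Hypothetical layout: `assignment` maps a vertex to its compute node id,
    and `node_stores` maps a node id to its local (source, label, target) edges.
    """
    constant, edge_label, variable = query

    # Mark the constant vertex and route the query to the node storing it.
    node_id = assignment[constant]
    local_edges = node_stores[node_id]

    # Determine the vertices connected to the marked constant vertex.
    connected = [(label, target)
                 for (source, label, target) in local_edges
                 if source == constant]

    # Candidate substitution list: connected vertices reached via the edge label.
    candidates = [target for (label, target) in connected
                  if label == edge_label]

    # Prepare the answer as one variable binding per candidate vertex.
    return [{variable: candidate} for candidate in candidates]


if __name__ == "__main__":
    assignment = {"alice": 0, "bob": 0, "carol": 1}
    node_stores = {
        0: [("alice", "knows", "bob"),
            ("alice", "knows", "carol"),
            ("alice", "worksAt", "acme"),
            ("bob", "knows", "carol")],
        1: [("carol", "knows", "alice")],
    }
    print(answer_query(("alice", "knows", "?x"), assignment, node_stores))
    # [{'?x': 'bob'}, {'?x': 'carol'}]
```

A query with several variable vertices would need to combine the candidate lists of its patterns; that step is outside this sketch.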
Specification