Distributed processing of streaming data records
Abstract
Representative embodiments of a distributed processing method facilitate interactive analytics of streaming data records by: receiving the data records at a plurality of distributed computational nodes; establishing and storing dimensions corresponding to attributes of the data records; parsing the streaming data records to identify matches to at least one of the dimensions and, based thereon, reducing the number of data records to create a targeted subset of the data; re-distributing the targeted subsets of the streaming data records among the distributed computational nodes in accordance with the dimensions stored on the nodes; updating a database storing measures of the dimensions in accordance with the targeted subsets of the streaming data records; and using the database to respond to a query based on measures associated with one or more of the dimensions.
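The flow summarized in the abstract resembles a map, combine, and shuffle pipeline. The following is a minimal single-process sketch of that reading, not the patented implementation. The function names, the field names (`country`, `device`, `bytes`), the use of summation as the measure, and the rule that assigns each dimension to a node are all assumptions made for illustration.

```python
from collections import defaultdict


def to_key_values(record, dimensions, measure):
    """Emit one ((dimension, value), measure) pair per tracked dimension found in the record."""
    return [((d, record[d]), record[measure]) for d in dimensions if d in record]


def combine(pairs):
    """Sum the values of pairs that share an identical key (a local, per-node combine)."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def redistribute(per_node_totals, dimensions):
    """Route each key to the node that owns its dimension and merge the partial sums.

    Assumption: dimension i is owned by node i modulo the node count.
    """
    owner = {d: i % len(per_node_totals) for i, d in enumerate(dimensions)}
    merged = [defaultdict(float) for _ in per_node_totals]
    for totals in per_node_totals:
        for (dim, val), amount in totals.items():
            merged[owner[dim]][(dim, val)] += amount
    return [dict(m) for m in merged]


def build_measures(records_per_node, dimensions, measure):
    """Run the per-node steps, then collect every node's result into one measure store."""
    local = [
        combine(kv for rec in recs for kv in to_key_values(rec, dimensions, measure))
        for recs in records_per_node
    ]
    measures = {}
    for node_result in redistribute(local, dimensions):
        measures.update(node_result)
    return measures
```

In this sketch a query is a dictionary lookup such as `measures[("device", "phone")]`. Each node only ever holds keys for the dimensions it owns after redistribution, which is one way to read the size reduction the abstract describes.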
11 Citations
19 Claims
1. A method of providing interactive information related to streaming data records using an OLAP cube, the method comprising steps of:
(i) sending, to a client in response to a request from said client, information related to a first set of predetermined dimensions stored in the OLAP cube, the first set of predetermined dimensions being derived from information in a first subset of the streaming data records;
(ii) receiving a new request from the client, the new request comprising a second set of predetermined dimensions having at least one dimension not present in the first set of predetermined dimensions;
(iii) parsing information in the streaming data records to identify a second subset of information relevant to the second set of predetermined dimensions;
(iv) receiving only the second subset of the information in the streaming data records at a plurality of distributed computational nodes, each node comprising a processor and a storage element;
(v) converting, at each computational node, a portion of the subset of the information in the received streaming data records into key-value pairs;
(vi) parsing, at each computational node, the converted key-value pairs of the subset of the information in the received streaming data records received at each said computational node to (i) identify matches of the keys to at least one predetermined dimension of the second set of predetermined dimensions and (ii) based thereon, combine the key-value pairs having identical keys;
(vii) re-distributing the keys of the converted subset of the received streaming data records among the distributed computational nodes in accordance with the second set of predetermined dimensions stored on the computational nodes, wherein each distributed computational node receives the key corresponding to one of the second set of predetermined dimensions, thereby reducing a size of the portion of the subset of information in the received streaming data records received at each computational node;
(viii) updating the OLAP cube to delete measures associated with only the first set of predetermined dimensions and to store measures associated with the second set of predetermined dimensions by collecting data from the computational nodes; and
(ix) sending, to the client in response to a request from said client, information related to the second set of predetermined dimensions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
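Step (viii) of claim 1 can be illustrated with a short sketch. It assumes, purely for illustration, that the cube is a dictionary keyed by `(dimension, value)` pairs and that measures for dimensions shared by both sets are overwritten by the newly collected data. The function name `update_cube` and the sample dimensions are hypothetical and do not come from the patent.

```python
def update_cube(cube, first_dims, second_dims, collected):
    """Return a cube refreshed for the second set of dimensions.

    cube and collected map (dimension, value) -> measure.
    Measures whose dimension appears only in the first set are deleted;
    collected measures are stored only for dimensions in the second set.
    """
    first, second = set(first_dims), set(second_dims)
    stale = first - second  # dimensions associated with only the first set
    updated = {key: m for key, m in cube.items() if key[0] not in stale}
    updated.update({key: m for key, m in collected.items() if key[0] in second})
    return updated


cube = {("country", "US"): 15, ("device", "phone"): 17}
collected = {("device", "phone"): 20, ("browser", "firefox"): 4, ("os", "linux"): 9}
new_cube = update_cube(cube, ["country", "device"], ["device", "browser"], collected)
```

Here `country` belongs only to the first set, so its measures are dropped; `os` was never requested, so it is not stored even though a node reported it.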
12. A system for providing interactive information related to streaming data records using an OLAP cube, the system comprising:
an interactive cube manipulator for (i) sending, to a client in response to a request from said client, information related to a first set of predetermined dimensions stored in the OLAP cube, the first set of predetermined dimensions being derived from information in a first subset of the streaming data records and (ii) receiving a new request from the client, the new request comprising a second set of predetermined dimensions having at least one dimension not present in the first set of predetermined dimensions;
a data reorganizer for:
parsing information in the streaming data records to identify a second subset of information relevant to the second set of predetermined dimensions;
converting, at each computational node, a portion of the subset of the information in the received streaming data records into key-value pairs;
parsing, at each computational node, the converted key-value pairs of the subset of the information in the received streaming data records received at each said computational node to (i) identify matches of the keys to at least one predetermined dimension of the second set of predetermined dimensions and (ii) based thereon, combine the key-value pairs having identical keys; and
re-distributing the keys of the converted subset of the received streaming data records among the distributed computational nodes in accordance with the second set of predetermined dimensions stored on the computational nodes, wherein each distributed computational node receives the key corresponding to one of the second set of predetermined dimensions, thereby reducing a size of the portion of the subset of information in the received streaming data records received at each computational node;
a data collector for receiving only the second subset of the information in the streaming data records at a plurality of distributed computational nodes, each node comprising a processor and a storage element; and
a cube constructor for updating the OLAP cube to delete measures associated with only the first set of predetermined dimensions and to store measures associated with the second set of predetermined dimensions by collecting data from the computational nodes,
wherein the interactive cube manipulator is further configured for sending, to the client in response to a request from said client, information related to the second set of predetermined dimensions.
Dependent claims: 13, 14, 15, 16
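The data collector of claim 12 receives only the information relevant to the requested dimensions. One way to picture this is a projection step followed by distribution to nodes, sketched below. The round-robin assignment, the function names `project` and `collect`, and the sample fields are assumptions; the claim does not specify how records are assigned to nodes.

```python
def project(record, dimensions, measure):
    """Keep only the fields for the requested dimensions, plus the measure field."""
    keep = set(dimensions) | {measure}
    return {field: value for field, value in record.items() if field in keep}


def collect(stream, dimensions, measure, node_count):
    """Deliver projected records to nodes in round-robin order.

    Records that carry none of the requested dimensions are skipped,
    so no node receives data outside the relevant subset.
    """
    nodes = [[] for _ in range(node_count)]
    delivered = 0
    for record in stream:
        slim = project(record, dimensions, measure)
        if not any(d in slim for d in dimensions):
            continue  # nothing relevant to the requested dimensions
        nodes[delivered % node_count].append(slim)
        delivered += 1
    return nodes
```

Projecting before distribution means the unrequested fields never travel to the nodes, which keeps per-node input small before any key-value conversion takes place.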
17. A system for providing interactive information related to streaming data records using an OLAP, the system comprising:
a user interface for (i) sending, to a client in response to a request from said client, information related to a first set of predetermined dimensions stored in the OLAP cube, the first set of predetermined dimensions being derived from information in a first subset of the streaming data records and (ii) receiving a new request from the client, the new request comprising a second set of predetermined dimensions having at least one dimension not present in the first set of predetermined dimensions;
a processor for:
parsing information in the streaming data records to identify a second subset of information relevant to the second set of predetermined dimensions;
converting, at each computational node, a portion of the subset of the information in the received streaming data records into key-value pairs;
parsing, at each computational node, the converted key-value pairs of the subset of the information in the received streaming data records received at each said computational node to (i) identify matches of the keys to at least one predetermined dimension of the second set of predetermined dimensions and (ii) based thereon, combine the key-value pairs having identical keys; and
re-distributing the keys of the converted subset of the received streaming data records among the distributed computational nodes in accordance with the second set of predetermined dimensions stored on the computational nodes, wherein each distributed computational node receives the key corresponding to one of the second set of predetermined dimensions, thereby reducing a size of the portion of the subset of information in the received streaming data records received at each computational node;
a plurality of computing nodes, each node comprising a processor and a local storage device, wherein each node receives only the second subset of the information in the streaming data records; and
a network connecting the plurality of computing nodes, the network distributing and re-distributing the data records amongst the plurality of computing nodes,
wherein the processor is further configured to update the OLAP cube to delete measures associated with only the first set of predetermined dimensions and to store measures associated with the second set of predetermined dimensions by collecting data from the computational nodes; and
the user interface is further configured to send, to the client in response to a request from said client, information related to the second set of predetermined dimensions.
Dependent claims: 18, 19
Specification