Distributed processing of streaming data records

  • US 8,738,650 B2
  • Filed: 05/23/2013
  • Issued: 05/27/2014
  • Est. Priority Date: 05/22/2012
  • Status: Active Grant
First Claim

1. A method for distributed processing of streaming data records, the method comprising:

  • parsing information in a received stream of data records to thereby identify a subset of the information relevant to a set of predetermined dimensions;

    receiving only the subset of the information in the received streaming data records at a plurality of distributed computational nodes based at least in part on workloads of the distributed computational nodes, properties of the streaming data records, or a random assignment, each node comprising a processor and a storage element;

    converting, at each node, a portion of the subset of the information in the received streaming data records into key-value pairs;

    parsing, at each node, the converted key-value pairs of the subset of the information in the received streaming data records received at each said node to (i) identify matches of the keys to at least one predetermined dimension and (ii) based thereon, combine the key-value pairs having identical keys;

    re-distributing the keys of the converted subset of the received streaming data records among the distributed computational nodes in accordance with the predetermined dimensions stored on the nodes, wherein each distributed computational node receives the key corresponding to the predetermined dimension, thereby reducing a size of the portion of the subset of information in the received streaming data records received at each node;

    updating a database storing measures of the dimensions by collecting data from the computational nodes in accordance with the parsed and re-distributed streaming data records; and

    using the database to respond to a query based on measures associated with one or more of the dimensions.
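The steps recited in claim 1 resemble a parse, distribute, combine, re-distribute, and aggregate pipeline. The following is only an illustrative single-process sketch of that flow, not the patented implementation. The dimension names, the node count, the round-robin assignment, the use of a simple count as the "measure", and every function name are assumptions introduced for this example.

```python
from collections import defaultdict

# Hypothetical predetermined dimensions and node count (assumptions).
DIMENSIONS = ("country", "browser")
NUM_NODES = 3


def parse_record(record):
    """Keep only the fields relevant to the predetermined dimensions."""
    return {d: record[d] for d in DIMENSIONS if d in record}


def assign_node(index):
    """Round-robin stand-in for workload-, property-, or random-based assignment."""
    return index % NUM_NODES


def to_key_values(subset):
    """Convert one parsed subset into ((dimension, value), 1) pairs."""
    return [((dim, value), 1) for dim, value in subset.items()]


def combine(pairs):
    """Combine key-value pairs that share an identical key (local pre-aggregation)."""
    combined = defaultdict(int)
    for key, count in pairs:
        combined[key] += count
    return dict(combined)


def owner_of(key):
    """Node assumed to hold the dimension that this key belongs to."""
    return DIMENSIONS.index(key[0]) % NUM_NODES


def process(stream):
    # 1. Parse the stream and hand only the relevant subset to the nodes.
    inboxes = [[] for _ in range(NUM_NODES)]
    for i, record in enumerate(stream):
        subset = parse_record(record)
        if subset:
            inboxes[assign_node(i)].append(subset)

    # 2. At each node, convert to key-value pairs and combine identical keys.
    local = []
    for inbox in inboxes:
        pairs = [kv for subset in inbox for kv in to_key_values(subset)]
        local.append(combine(pairs))

    # 3. Re-distribute keys so each node receives the keys of its dimension.
    owned = [defaultdict(int) for _ in range(NUM_NODES)]
    for partial in local:
        for key, count in partial.items():
            owned[owner_of(key)][key] += count

    # 4. Collect the per-node results into a database of measures.
    database = {}
    for node in owned:
        database.update(node)
    return database


def query(database, dimension):
    """Answer a query with the measures stored for one dimension."""
    return {value: n for (dim, value), n in database.items() if dim == dimension}
```

In this sketch, calling `process` on a list of dictionaries and then `query(database, "country")` returns the per-value counts for that dimension. Combining locally before the re-distribution step is what keeps the amount of data exchanged between nodes small.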
