
Generating data streams from pre-existing data sets

  • US 10,162,672 B2
  • Filed: 03/30/2016
  • Issued: 12/25/2018
  • Est. Priority Date: 03/30/2016
  • Status: Active Grant
First Claim

1. A system for processing data items within a data source via an on-demand code execution environment, the system comprising:

  • a non-transitory data store configured to implement a backlog cache indicating data items, from the data source, that have been identified for processing at the on-demand code execution environment as backlog items;

    one or more processors, in communication with the non-transitory data store, configured to:

    retrieve, for a set of data items within the data source, time data indicating points in time at which individual data items from the set of data items were created or modified within the data source;

    determine, from the time data, an estimated modification frequency for the data source, the estimated modification frequency indicating an estimated frequency at which data items within the data source are created or modified;

    obtain a threshold period of time;

    utilize the estimated modification frequency for the data source, the time data, and an anticipated rate of processing of data items at the on-demand code execution system to establish a demarcation time for the data source that is expected to result in a completion, within the threshold period of time, of processing of data items created or modified in the data source after the demarcation time, wherein data items created or modified in the data source prior to the demarcation time are considered backlogged data items, and wherein the set of data items includes at least one data item created or modified in the data source after the demarcation time;

    enqueue within the backlog cache a first set of data items, from the data store, that were created or modified in the data source prior to the demarcation time;

    iteratively submit data stream calls to the on-demand code execution environment, the data stream calls requesting that the on-demand code execution environment process, by execution of a task, data items from the data source that were created or modified after the demarcation time; and

    while data stream calls are submitted to the on-demand code execution environment, submit backlog calls to the on-demand code execution environment, the backlog calls requesting that the on-demand code execution environment process, by execution of the task, data items from the backlog cache.
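
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of plain numeric timestamps in seconds, and the sizing rule are all assumptions made for illustration. The sketch assumes that the stream portion must satisfy `stream_count <= (processing_rate - frequency) * threshold_seconds`, so that existing stream items can be processed within the threshold after the rate of new arrivals is set aside. It also treats an item whose timestamp equals the demarcation time as a stream item.

```python
import math
from collections import deque
from typing import Iterable, Iterator, List, Sequence, Tuple


def estimate_modification_frequency(timestamps: Sequence[float]) -> float:
    """Average number of items created or modified per second (assumed estimator)."""
    ordered = sorted(timestamps)
    if len(ordered) < 2 or ordered[-1] <= ordered[0]:
        raise ValueError("need at least two distinct timestamps")
    return (len(ordered) - 1) / (ordered[-1] - ordered[0])


def choose_demarcation_time(
    timestamps: Sequence[float],
    processing_rate: float,
    threshold_seconds: float,
) -> float:
    """Pick a demarcation time so the stream portion fits the threshold.

    Hypothetical rule: capacity left after absorbing new arrivals is
    (processing_rate - frequency) items per second, so at most that
    rate times the threshold may already be waiting in the stream portion.
    """
    ordered = sorted(timestamps)
    frequency = estimate_modification_frequency(ordered)
    spare_rate = processing_rate - frequency
    if spare_rate <= 0:
        raise ValueError("processing rate must exceed modification frequency")
    budget = math.floor(spare_rate * threshold_seconds)
    if budget < 1:
        raise ValueError("threshold too short to process any existing item")
    # The newest `budget` items form the stream; the oldest of them marks the boundary.
    stream_count = min(budget, len(ordered))
    return ordered[-stream_count]


def partition(
    timestamps: Iterable[float], demarcation: float
) -> Tuple[List[float], List[float]]:
    """Split items into (backlog, stream) around the demarcation time."""
    backlog = [t for t in timestamps if t < demarcation]
    stream = [t for t in timestamps if t >= demarcation]
    return backlog, stream


def interleave_calls(
    stream_items: Iterable[float], backlog_items: Iterable[float]
) -> Iterator[Tuple[str, float]]:
    """Yield (kind, item) calls, draining the backlog alongside stream calls."""
    backlog = deque(backlog_items)
    for item in stream_items:
        yield ("stream", item)
        if backlog:
            yield ("backlog", backlog.popleft())
    while backlog:
        yield ("backlog", backlog.popleft())


# Usage: eleven items, one modified every second.
times = [float(t) for t in range(11)]
demarcation = choose_demarcation_time(times, processing_rate=3.0, threshold_seconds=2.0)
backlog, stream = partition(times, demarcation)
calls = list(interleave_calls(stream, backlog))
```

In this sketch one backlog call follows each stream call. The claim requires only that backlog calls be submitted while data stream calls are submitted, so any other interleaving ratio would serve equally well as an illustration.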
