Systems and/or methods for statistical online analysis of large and potentially heterogeneous data sets

  • US 9,122,786 B2
  • Filed: 09/14/2012
  • Issued: 09/01/2015
  • Est. Priority Date: 09/14/2012
  • Status: Active Grant
First Claim

1. A method of analyzing the behavior and parameters of a cache in a computer system over a temporal range of analysis, the method comprising:

  • receiving, over a first stream, notifications indicating that respective cache operations have been performed in connection with respective elements and the cache, each said operation having an operation type, the operation type being designated as one of an insert, update, or remove operation for the respective element; and

    for each received notification where a selected element attribute of interest is available therein:

    extracting, from the respective notification, information regarding a key of the respective element, the respective selected element attribute of interest, the respective operation type, and respective timestamp(s) associated with the respective operation; and

    computing value and validity distribution models using the extracted information;

    wherein the computing of the value distribution model, in connection with a given notification and an associated given element, comprises:

    updating a temporal buffer of inserted and not yet removed and/or updated elements to include an entry for the given element, the temporal buffer defining a range of elements to be considered in the computing of the value distribution model; and

    calculating a value distribution for the selected attribute of interest based on elements in the temporal buffer; and

    wherein the computing of the validity model, in connection with a given notification and an associated given element, comprises:

    ignoring the given notification when the given element has an insert operation type;

    calculating a validity value for the given element as a difference between first and second timestamps, for remove operation types, the first timestamp indicating when the given element was removed and the second timestamp indicating when the given element was inserted, and for update operation types, the first timestamp indicating when an old element was removed and the given element was inserted and the second timestamp indicating when the old element was inserted;

    ignoring the given notification and the given element when the validity value is greater than a window size corresponding to the temporal range of analysis; and

    when the validity value is less than or equal to the window size:

    determining a temporal partition of the temporal range of analysis into which the attribute of interest associated with the given element falls; and

    publishing an event to a second stream, the event indicating the validity value and the determined temporal partition; and

    running a query on the second stream in order to derive summary statistics for validity values in the partitions.
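
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Notification` shape, the `CacheAnalyzer` class, the `WINDOW_SIZE` and `PARTITION_SIZE` values, and the use of count/mean/max as the "summary statistics" are all hypothetical choices made for illustration. In particular, the sketch assumes that the temporal partition is chosen from the operation timestamp relative to an assumed window start, and it models the "second stream" as a plain in-memory list rather than an event-processing engine.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

WINDOW_SIZE = 60.0     # hypothetical temporal range of analysis (seconds)
PARTITION_SIZE = 20.0  # hypothetical width of one temporal partition (seconds)


@dataclass
class Notification:      # hypothetical shape of a first-stream notification
    key: str
    op: str              # "insert", "update", or "remove"
    value: float         # selected element attribute of interest
    timestamp: float     # when the cache operation was performed


class CacheAnalyzer:
    def __init__(self, window_start=0.0):
        self.window_start = window_start
        self.buffer = {}         # key -> (value, inserted_at) for live elements
        self.second_stream = []  # published (partition, validity) events

    def on_notification(self, n):
        old = self.buffer.get(n.key)  # entry being removed or replaced, if any
        # Value model: the temporal buffer holds only elements that were
        # inserted and not yet removed or superseded by an update.
        if n.op == "remove":
            self.buffer.pop(n.key, None)
        else:
            self.buffer[n.key] = (n.value, n.timestamp)
        # Validity model: insert notifications are ignored.
        if n.op == "insert" or old is None:
            return
        validity = n.timestamp - old[1]  # removal/update time minus insert time
        if validity > WINDOW_SIZE:
            return                       # outlived the analysis window: ignored
        # Assumption: partition index derived from the operation timestamp.
        partition = int((n.timestamp - self.window_start) // PARTITION_SIZE)
        self.second_stream.append((partition, validity))

    def value_distribution(self):
        # A deliberately simple stand-in for a value distribution model.
        values = [v for v, _ in self.buffer.values()]
        return {"count": len(values), "mean": mean(values)} if values else {"count": 0}

    def validity_summary(self):
        # Stand-in for the query over the second stream.
        by_partition = defaultdict(list)
        for partition, validity in self.second_stream:
            by_partition[partition].append(validity)
        return {p: {"count": len(v), "mean": mean(v), "max": max(v)}
                for p, v in sorted(by_partition.items())}


analyzer = CacheAnalyzer()
for n in [
    Notification("a", "insert", 10.0, 0.0),
    Notification("b", "insert", 30.0, 5.0),
    Notification("a", "update", 20.0, 12.0),   # validity 12, partition 0
    Notification("b", "remove", 30.0, 25.0),   # validity 20, partition 1
    Notification("a", "remove", 20.0, 100.0),  # validity 88 > window: ignored
]:
    analyzer.on_notification(n)

summary = analyzer.validity_summary()
```

In this sketch, only the update of `a` and the removal of `b` reach the second stream; the final removal of `a` is dropped because its validity exceeds the assumed 60-second window. A real deployment would presumably replace the list and the `validity_summary` method with a continuous query over an event stream.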
