REAL TIME ANALYTICS USING HYBRID HISTOGRAMS
Abstract
A system, method and program product for processing a stream of data events using hybrid histograms. A system is provided that includes: a hybrid histogram that provides a running statistical summary of the stream of data events, wherein the hybrid histogram includes a plurality of percentile ranges, a set of boundary values that separate the percentile ranges, and a count associated with each of the percentile ranges; a histogram processing system for identifying a percentile range from the plurality of percentile ranges into which a new data event value falls, and for incrementing the count associated with the identified percentile range; a periodic boundary recalculation system for periodically recalculating the boundary values such that each percentile range includes a substantially similar number of associated counts; and an analysis system that analyzes the hybrid histogram.
26 Claims
1. A system for processing a stream of data events, comprising:
a hybrid histogram that provides a running statistical summary of the stream of data events, wherein the hybrid histogram includes a plurality of percentile ranges, a set of boundary values that separate the percentile ranges, and a count associated with each of the percentile ranges;
a histogram processing system for identifying a percentile range from the plurality of percentile ranges into which a new data event value falls, and for incrementing the count associated with the identified percentile range;
a boundary recalculation system for periodically recalculating the boundary values such that each percentile range includes a substantially similar number of associated counts; and
an analysis system that analyzes the hybrid histogram.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A computer program product stored on a computer readable medium, which when executed, processes a stream of data events, the program product comprising:
computer program code configured for updating a hybrid histogram data object that stores a running statistical summary of the stream of data events, wherein the hybrid histogram data object includes a plurality of percentile ranges, a set of boundary values that separate the percentile ranges, and a count associated with each of the percentile ranges;
computer program code configured for identifying a percentile range from the plurality of percentile ranges into which a new data event value falls, and for incrementing the count associated with the identified percentile range;
computer program code configured for periodically recalculating the boundary values such that each percentile range includes a substantially similar number of associated counts; and
computer program code configured for analyzing the hybrid histogram data object.
(Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20)
21. A method of processing a stream of data events, comprising:
providing a hybrid histogram that stores a running statistical summary of the stream of data events, wherein the hybrid histogram includes a plurality of percentile ranges, a set of boundary values that separate the percentile ranges, and a count associated with each of the percentile ranges;
obtaining a new data event value;
identifying a percentile range from the plurality of percentile ranges into which the new data event value falls;
incrementing the count associated with the identified percentile range;
periodically recalculating the boundary values such that each percentile range includes a substantially similar number of associated counts; and
analyzing the hybrid histogram.
(Dependent claims: 22, 23, 24, 25, 26)
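The steps recited in claim 21 can be illustrated with a minimal Python sketch. This is not the patented implementation: the class name `HybridHistogram`, the `recalc_every` period, the tracked `low`/`high` limits, and above all the recalculation rule (linear interpolation that assumes values are spread evenly inside each range, followed by resetting every count to the average) are assumptions made only for illustration. The claims themselves do not specify how the boundaries are recalculated.

```python
from bisect import bisect_right


class HybridHistogram:
    """Illustrative sketch; all names and the recalculation rule are assumptions."""

    def __init__(self, boundaries, low, high, recalc_every=100):
        self.boundaries = list(boundaries)   # sorted values separating the ranges
        self.low, self.high = low, high      # assumed overall value limits
        self.counts = [0.0] * (len(self.boundaries) + 1)  # one count per range
        self.recalc_every = recalc_every     # assumed recalculation period
        self.seen = 0

    def add(self, value):
        """Identify the range a new value falls into and increment its count."""
        self.low = min(self.low, value)
        self.high = max(self.high, value)
        self.counts[bisect_right(self.boundaries, value)] += 1
        self.seen += 1
        if self.seen % self.recalc_every == 0:
            self.recalculate()

    def recalculate(self):
        """Move the boundaries so each range holds roughly total / n counts.

        Assumed rule: values are treated as uniformly spread inside each
        range, so new boundaries are found by linear interpolation.
        """
        total = sum(self.counts)
        if total == 0:
            return
        n = len(self.counts)
        target = total / n
        edges = [self.low] + self.boundaries + [self.high]
        new_boundaries = []
        cum, i = 0.0, 0  # counts accumulated below range i
        for k in range(1, n):
            goal = k * target
            # Skip ranges that lie entirely below the k-th target count.
            while i < n - 1 and cum + self.counts[i] < goal:
                cum += self.counts[i]
                i += 1
            frac = (goal - cum) / self.counts[i] if self.counts[i] else 0.0
            new_boundaries.append(edges[i] + frac * (edges[i + 1] - edges[i]))
        self.boundaries = new_boundaries
        self.counts = [target] * n

    def ranges(self):
        """Return (lower, upper, count) per range for downstream analysis."""
        edges = [self.low] + self.boundaries + [self.high]
        return [(edges[j], edges[j + 1], c) for j, c in enumerate(self.counts)]


hist = HybridHistogram([10, 20, 30], low=0, high=40, recalc_every=1000)
for value in [1, 2, 3, 4, 5, 6, 7, 8]:
    hist.add(value)
hist.recalculate()
print(hist.boundaries)  # [2.5, 5.0, 7.5]
```

In this example all eight values land in the lowest range, so the recalculation pulls the three boundaries down into that range and leaves each of the four ranges with a count of 2. A real system would also need a policy for values outside the initial limits and for how often to recalculate; both are left as simple assumptions here.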
Specification