Segment data visibility and management in a distributed database of time stamped records
First Claim
1. A non-transitory computer readable medium comprising computer executable instructions stored thereon to cause one or more processors to perform data storage and retrieval operations from a computer memory configured according to a distributed database of time stamped records collected into data segments, each data segment including data from a data source collected over a time interval, each data segment associated to a creation time that the data segment was created, each data segment stored on one of a plurality of query nodes, the operations comprising:
- building a timeline data structure for the data source and for a timeline view interval, the building of the timeline data structure comprising:
  - identifying data segments that include data from the data source that was collected over a time interval included in the timeline view interval;
  - identifying, among the identified data segments, overlapping data segments that include overlapping portions of data collected over an overlapping time interval;
  - selecting the overlapping portion that is included in the overlapping segment having the most recent creation time; and
  - building the timeline data structure with the selected overlapping portion and with portions of the identified data segments that do not overlap with any portion of any other of the identified data segments.
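The timeline-building steps of claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Segment` class, its field names, the integer timestamps, and the choice to clip segments that only partly intersect the view interval are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # All field names are hypothetical, chosen for this sketch only.
    source: str    # data source the segment belongs to
    start: int     # start of the collection interval (inclusive)
    end: int       # end of the collection interval (exclusive)
    created: int   # creation time of the segment
    node: str      # query node that stores the segment


def build_timeline(segments, source, view_start, view_end):
    """Return ordered (start, end, segment) entries for the view interval."""
    # Identify segments for this source that intersect the view interval
    # (assumption: partially covered segments are clipped, not dropped).
    candidates = [
        s for s in segments
        if s.source == source and s.start < view_end and s.end > view_start
    ]
    # Every segment boundary inside the view is a potential change of owner.
    cuts = sorted(
        {view_start, view_end}
        | {t for s in candidates for t in (s.start, s.end)
           if view_start < t < view_end}
    )
    timeline = []
    for lo, hi in zip(cuts, cuts[1:]):
        covering = [s for s in candidates if s.start <= lo and s.end >= hi]
        if not covering:
            continue  # no segment holds data for this slice
        # For overlapping portions, the most recently created segment wins.
        winner = max(covering, key=lambda s: s.created)
        if timeline and timeline[-1][2] is winner and timeline[-1][1] == lo:
            # Extend the previous entry instead of emitting a new one.
            timeline[-1] = (timeline[-1][0], hi, winner)
        else:
            timeline.append((lo, hi, winner))
    return timeline


older = Segment("web", 0, 10, created=1, node="node-1")
newer = Segment("web", 5, 15, created=2, node="node-2")
later = Segment("web", 20, 30, created=3, node="node-3")
entries = build_timeline([older, newer, later], "web", 0, 30)
print([(lo, hi, seg.node) for lo, hi, seg in entries])
# [(0, 5, 'node-1'), (5, 15, 'node-2'), (20, 30, 'node-3')]
```

In this example the overlapping portion 5 to 10 is served from the newer segment, while the non-overlapping portion 0 to 5 of the older segment stays visible.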
Abstract
A distributed database of time stamped records can be used to store time series data, such as events occurring on the Internet, in segments of data that contain events for different time intervals. The volume of events occurring on the Internet introduces a “Big Data” variable that makes collections of data sets so large and complex that they are difficult to manage. Disclosed are systems and methods to manage segments of a distributed database of time stamped records so that segments have an optimal size (for storage and performance reasons, among others) and so that data is properly visible when different segments contain data for overlapping time periods.
15 Claims
1. A non-transitory computer readable medium comprising computer executable instructions stored thereon to cause one or more processors to perform data storage and retrieval operations from a computer memory configured according to a distributed database of time stamped records collected into data segments, each data segment including data from a data source collected over a time interval, each data segment associated to a creation time that the data segment was created, each data segment stored on one of a plurality of query nodes, the operations comprising:
- building a timeline data structure for the data source and for a timeline view interval, the building of the timeline data structure comprising:
  - identifying data segments that include data from the data source that was collected over a time interval included in the timeline view interval;
  - identifying, among the identified data segments, overlapping data segments that include overlapping portions of data collected over an overlapping time interval;
  - selecting the overlapping portion that is included in the overlapping segment having the most recent creation time; and
  - building the timeline data structure with the selected overlapping portion and with portions of the identified data segments that do not overlap with any portion of any other of the identified data segments.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer readable medium comprising computer executable instructions stored thereon to cause one or more processing units to perform data storage and retrieval operations from a computer memory configured according to a distributed database of time stamped records collected into data segments, the operations comprising:
- determining, from among the data segments, a plurality of merger segments to merge into a single merged segment based on at least one of a size of each of the merger segments, a size of a resulting merged segment, and machine resources providing infrastructure to the distributed database of time stamped records, each merger segment comprising segment data in the form of at least one of a dimension and a metric;
- determining at least one overlapping dimension included in every one of the plurality of merger segments to merge;
- combining merger segment data for each of the at least one overlapping dimension;
- determining at least one non-overlapping dimension that is not included in at least one of the plurality of merger segments to merge; and
- assigning a null value for each non-overlapping dimension.
Dependent claims: 10, 11
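The dimension handling in claim 9 can be illustrated with a small sketch. It assumes, hypothetically, that a segment is a list of row dictionaries with a `timestamp`, a single `count` metric, and one key per dimension. It also assumes one reading of the null-assignment step: rows from a segment that lacks a dimension receive `None` for it, while rows that carry the dimension keep their values. The name `SchemaSegment` is invented for this example.

```python
from dataclasses import dataclass


@dataclass
class SchemaSegment:
    # Hypothetical layout: dimension names plus rows keyed by those names.
    dimensions: tuple
    rows: list


def merge_segments(segments):
    """Merge segments; return (merged, overlapping dims, non-overlapping dims)."""
    all_dims = set().union(*(s.dimensions for s in segments))
    # Overlapping dimensions appear in every merger segment.
    overlapping = set(segments[0].dimensions).intersection(
        *(s.dimensions for s in segments[1:])
    )
    # Non-overlapping dimensions are missing from at least one segment.
    non_overlapping = all_dims - overlapping

    merged_rows = []
    for seg in segments:
        for row in seg.rows:
            merged = {"timestamp": row["timestamp"], "count": row["count"]}
            for dim in overlapping:
                merged[dim] = row[dim]
            for dim in non_overlapping:
                # None stands in for the null value where the dimension is absent.
                merged[dim] = row.get(dim)
            merged_rows.append(merged)
    merged_rows.sort(key=lambda r: r["timestamp"])
    merged = SchemaSegment(tuple(sorted(all_dims)), merged_rows)
    return merged, sorted(overlapping), sorted(non_overlapping)
```

Merging a segment with dimensions `country` and `browser` into one that only has `country` would, under these assumptions, yield `country` as the overlapping dimension and `browser` as the non-overlapping one.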
12. A method for storing and retrieving data from a computer memory, comprising:
- configuring said computer memory according to a distributed database of time stamped records collected into data segments;
- determining, from among the data segments, a plurality of merger segments to merge into a single merged segment based on at least one of a size of each of the merger segments, a size of a resulting merged segment, and machine resources providing infrastructure to the distributed database of time stamped records, wherein each merger segment comprises segment data in the form of at least one of a dimension and a metric;
- determining at least one overlapping dimension included in every one of the plurality of merger segments to merge;
- combining merger segment data for each of the at least one overlapping dimension;
- determining at least one non-overlapping dimension that is not included in at least one of the plurality of merger segments to merge;
- assigning a null value for each non-overlapping dimension; and
- associating the merged segment with a time stamp corresponding to a time that the plurality of merger segments was merged into the single merged segment.
Dependent claims: 13, 14, 15
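One way to picture the size-based selection and the merge time stamp of claim 12 is the greedy sketch below. The claim does not prescribe a selection policy, so the thresholds, the greedy strategy, and every name here (`plan_merge`, `small_threshold`, `max_merged_size`) are assumptions made purely for illustration; machine resources are not modeled.

```python
import time


def plan_merge(sizes, small_threshold, max_merged_size, now=None):
    """Pick small segments to merge and stamp the planned merged segment.

    sizes maps a segment id to its size in bytes (hypothetical units).
    Returns None when fewer than two segments qualify.
    """
    chosen, total = [], 0
    for seg_id, size in sizes.items():
        if size >= small_threshold:
            continue  # segment is already large enough on its own
        if total + size > max_merged_size:
            continue  # adding it would make the merged segment too large
        chosen.append(seg_id)
        total += size
    if len(chosen) < 2:
        return None
    return {
        "members": chosen,
        "size": total,
        # Time stamp corresponding to when the segments were merged.
        "merged_at": time.time() if now is None else now,
    }


plan = plan_merge({"a": 10, "b": 500, "c": 20, "d": 90}, 100, 100, now=1000)
print(plan)
# {'members': ['a', 'c'], 'size': 30, 'merged_at': 1000}
```

Recording the merge time on the merged segment is what would later let a newer merged segment take precedence over the older segments it replaces when their time intervals overlap.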
Specification