FILE TABLE INDEX AGGREGATE STATISTICS
First Claim
1. A storage system, comprising:
a stream layer comprising a plurality of storage nodes for storing user data; and
a partition layer comprising a plurality of table servers each configured to manage data storage in assigned partitions and to create a file table index for each partition, wherein the file table index comprises a tree structure having leaf pages having data sorted in key order and one or more levels of parent pages above the leaf pages, wherein the parent pages comprise indexing keys and pointers to one or more child pages, and wherein statistics are stored with the leaf page data and aggregated statistics are stored with the pointers in the parent pages.
Abstract
Embodiments provide a method to collect aggregate information or usage data quickly and efficiently with minimal lag. Additionally, the system can use this aggregate information internally for improved load balancing, better data placement, optimization, and enhanced debugging. The system can quickly look at aggregate information across a huge amount of data and drill down cheaply because the aggregate information is generated using existing processes. Aggregated statistics storage and collection may be built on top of an LSM tree used to store a persistent index for a cloud storage system. The statistics may also represent the result of an operation (e.g., max, min, sum, average) on selected parameter(s) or attribute(s) of stored data. Aggregate statistics values may be efficiently maintained during index merge and garbage collection processes or any other index management. As delta LSM trees are merged into a base LSM tree, the aggregates are updated in delta fashion.
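The delta-style update described in the abstract relies on aggregates that can be combined without rescanning the underlying data. The following is a minimal sketch of that idea, not the patented implementation: the `Agg` class, its fields, and `summarize` are hypothetical names introduced here, and the sketch assumes insert-only deltas (a deletion would require recomputing min and max from the affected data).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agg:
    """Mergeable summary of one numeric attribute (hypothetical helper)."""
    count: int = 0
    total: float = 0.0
    lo: float = float("inf")
    hi: float = float("-inf")

    @staticmethod
    def of(value: float) -> "Agg":
        # Summary of a single stored value.
        return Agg(1, value, value, value)

    def merge(self, other: "Agg") -> "Agg":
        # Combine two summaries without touching the underlying rows.
        return Agg(
            self.count + other.count,
            self.total + other.total,
            min(self.lo, other.lo),
            max(self.hi, other.hi),
        )

    @property
    def average(self):
        # The average is derived from sum and count, so it merges for free.
        return self.total / self.count if self.count else None


def summarize(values) -> Agg:
    """Fold a sequence of values into one summary."""
    agg = Agg()
    for v in values:
        agg = agg.merge(Agg.of(v))
    return agg


# Assumed scenario: a base tree already summarized, plus a small delta tree.
base = summarize([10, 20, 30])
delta = summarize([5, 45])
merged = base.merge(delta)  # updated aggregate, computed from the delta only
```

Because `merge` is associative, the result of folding a delta into the base matches a full recomputation over all values, which is what lets the aggregates be maintained as a by-product of index merges.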
20 Claims
1. A storage system, comprising:
a stream layer comprising a plurality of storage nodes for storing user data; and
a partition layer comprising a plurality of table servers each configured to manage data storage in assigned partitions and to create a file table index for each partition, wherein the file table index comprises a tree structure having leaf pages having data sorted in key order and one or more levels of parent pages above the leaf pages, wherein the parent pages comprise indexing keys and pointers to one or more child pages, and wherein statistics are stored with the leaf page data and aggregated statistics are stored with the pointers in the parent pages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer-implemented method for aggregating statistics in a distributed storage system, comprising:
creating a file table index for data stored in the distributed storage system, the file table index comprising a tree structure having leaf pages with data sorted in key order and one or more levels of parent pages above the leaf pages, the parent pages comprise keys and pointers to one or more child pages;
storing statistics for each key in the leaf pages;
storing aggregated statistics with the pointers in the parent pages.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
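The tree structure recited in the claims can be illustrated with a small in-memory sketch. This is an assumption-laden illustration rather than the claimed system: `LeafPage`, `ParentPage`, `ChildPointer`, `build_index`, `range_sum`, and the `fanout` parameter are all hypothetical names, the statistic is a single integer per key, and the aggregate is a plain sum.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class LeafPage:
    rows: List[Tuple[str, int]]  # (key, statistic), sorted by key

    def low(self) -> str:
        return self.rows[0][0]

    def high(self) -> str:
        return self.rows[-1][0]

    def total(self) -> int:
        return sum(stat for _, stat in self.rows)


@dataclass
class ChildPointer:
    index_key: str                           # smallest key under this pointer
    child: Union["LeafPage", "ParentPage"]
    aggregate: int                           # sum of statistics in the subtree


@dataclass
class ParentPage:
    pointers: List[ChildPointer]

    def low(self) -> str:
        return self.pointers[0].index_key

    def high(self) -> str:
        return self.pointers[-1].child.high()

    def total(self) -> int:
        # Uses the stored aggregates only; children are not read.
        return sum(p.aggregate for p in self.pointers)


def build_index(rows, fanout: int = 2) -> ParentPage:
    """Build leaf pages in key order, then parent levels bottom-up."""
    if not rows:
        raise ValueError("at least one row is required")
    rows = sorted(rows)
    level = [LeafPage(rows[i:i + fanout]) for i in range(0, len(rows), fanout)]
    while True:
        level = [
            ParentPage([ChildPointer(c.low(), c, c.total())
                        for c in level[i:i + fanout]])
            for i in range(0, len(level), fanout)
        ]
        if len(level) == 1:
            return level[0]


def range_sum(page, lo: str, hi: str) -> int:
    """Sum statistics for keys in [lo, hi], descending only where needed."""
    if isinstance(page, LeafPage):
        return sum(stat for key, stat in page.rows if lo <= key <= hi)
    result = 0
    for p in page.pointers:
        child = p.child
        if child.high() < lo or child.low() > hi:
            continue                      # subtree entirely outside the range
        if lo <= child.low() and child.high() <= hi:
            result += p.aggregate         # subtree fully covered: use aggregate
        else:
            result += range_sum(child, lo, hi)
    return result


root = build_index([("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)])
whole = root.total()                  # 15, read from the root's pointers alone
partial = range_sum(root, "b", "d")   # 9, descends only into boundary pages
```

Storing the aggregate next to each pointer means a query over a fully covered subtree is answered at the parent page, and only pages that straddle the range boundary are read.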
Specification