Systems and methods for data storage and retrieval using virtual data sets
First Claim
1. A computer implemented method for storing data sets, the computer system comprising at least one processor, memory and a data store, the method comprising:
(A) providing a data set information store storing information regarding a plurality of data sets, including information specifying whether each respective data set is realized in the data store;
(B) providing a relation store in the memory for storing a plurality of algebraic relations between the data sets;
(C) receiving a plurality of statements wherein each statement requests at least one of the data sets;
(D) composing a plurality of algebraic relations between data sets from the plurality of statements;
(E) storing the plurality of algebraic relations composed from the plurality of statements in the relation store;
(F) establishing a criteria for virtualization of data sets in the data set information store;
(G) identifying at least one data set that is realized in the data store and meets the criteria for virtualization;
(H) determining that the plurality of algebraic relations stored in the relation store includes at least one algebraic relation defining the identified data set based upon at least one other data set that is realized in the data store, wherein the at least one other data set is different than the identified data set and the algebraic relation comprises a respective first expression including a symbolic representation of at least the identified data set, a respective second expression including a symbolic representation of at least the one other data set that is realized in the data store, and a relational operator symbolically defining a mathematical relationship between the respective first expression and the respective second expression;
(I) removing the identified data set from the data store;
(J) changing the information regarding the identified data set in the data set information store to indicate that the identified data set is not realized in the data store;
(K) composing a plurality of collections of algebraic relations defining a requested data set, wherein the algebraic relation defining the identified data set is used to compose at least one of the collections of algebraic relations;
(L) applying an optimization criteria to select one of the collections of algebraic relations to calculate the requested data set; and
(M) using the selected collection of algebraic relations to calculate the requested data set.
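For illustration only, steps (G) through (M) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names Catalog, Relation, Plan and OPS are hypothetical, data sets are modeled as frozensets, the criteria for virtualization is reduced to a minimum size, and the optimization criteria is assumed to be the total size of the realized data sets that a collection of relations must read.

```python
import itertools
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataSetInfo:            # one entry of the data set information store
    name: str
    size: int
    realized: bool = True

@dataclass(frozen=True)
class Relation:               # reads as: target = operator(operands)
    target: str
    operator: str
    operands: tuple

@dataclass(frozen=True)
class Plan:                   # one collection of relations, kept as a tree
    name: str
    cost: int
    relation: Optional[Relation] = None
    inputs: tuple = ()

# Hypothetical operator table; only two set operators are modeled here.
OPS = {
    "union": lambda *sets: frozenset().union(*sets),
    "intersection": lambda first, *rest: frozenset(first).intersection(*rest),
}

class Catalog:
    def __init__(self):
        self.info = {}        # data set information store
        self.relations = []   # relation store
        self.store = {}       # data store holding realized data sets

    def add_realized(self, name, values):
        self.store[name] = frozenset(values)
        self.info[name] = DataSetInfo(name, len(self.store[name]))

    def add_relation(self, target, operator, *operands):
        self.relations.append(Relation(target, operator, operands))

    def virtualize(self, min_size):
        """Steps (G)-(J): remove realized sets that other realized sets define."""
        removed = []
        for name, meta in self.info.items():
            if not meta.realized or meta.size < min_size:
                continue
            for rel in self.relations:
                if rel.target == name and all(
                    op != name and op in self.info and self.info[op].realized
                    for op in rel.operands
                ):
                    del self.store[name]      # step (I)
                    meta.realized = False     # step (J)
                    removed.append(name)
                    break
        return removed

    def plans(self, name, seen=()):
        """Step (K): yield every candidate plan that can produce `name`."""
        meta = self.info[name]
        if meta.realized:
            yield Plan(name, meta.size)       # read directly from the store
            return
        if name in seen:                      # guard against cyclic relations
            return
        for rel in self.relations:
            if rel.target != name:
                continue
            options = [list(self.plans(op, seen + (name,))) for op in rel.operands]
            for inputs in itertools.product(*options):
                yield Plan(name, sum(p.cost for p in inputs), rel, inputs)

    def run(self, plan):
        if plan.relation is None:
            return self.store[plan.name]
        return OPS[plan.relation.operator](*(self.run(p) for p in plan.inputs))

    def calculate(self, name):
        """Steps (L)-(M): pick the cheapest plan and evaluate it."""
        return self.run(min(self.plans(name), key=lambda p: p.cost))

catalog = Catalog()
catalog.add_realized("A", {1, 2, 3, 4})
catalog.add_realized("B", {1, 2})
catalog.add_realized("C", {3, 4})
catalog.add_realized("D", {1, 2, 3, 4, 5, 6})
catalog.add_realized("E", {1, 2, 3, 4, 9})
catalog.add_relation("A", "union", "B", "C")          # A = B ∪ C
catalog.add_relation("A", "intersection", "D", "E")   # A = D ∩ E
removed = catalog.virtualize(min_size=4)
result = catalog.calculate("A")
```

In this sketch data set A is removed from the data store because a stored relation defines it from realized data sets, and a later request for A is answered from B and C because that collection reads fewer elements than the one built on D and E.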
Abstract
Systems and methods for storing and accessing data using virtual data sets. Data sets may be removed from a data store and defined by algebraic relations between other data sets that are realized in the data store. A flag may be set to indicate that the data set is virtual. Criteria may be established for determining when a data set should be virtualized. For example, the criteria may be based on the size of the data set, the number of times it has been referenced and/or the frequency with which the data set has been accessed in the data store. A data set may also be optimized by partitioning the data set into subsets. The original data set may then be removed from the data store. An algebraic relation may be composed that defines the data set based on the subsets realized in the data store. The algebraic relation for the virtual data set may be used for optimizing access to other data sets even though the virtual data set is not realized.
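As a hedged illustration of the criteria described in the abstract, the sketch below combines the three named factors (size, reference count and access frequency) into a single test. The class Usage, the function meets_virtualization_criteria and all threshold values are hypothetical. The direction of each comparison (large, rarely referenced, rarely accessed) is an assumption made for this example; the abstract states only that the criteria may be based on these factors.

```python
from dataclasses import dataclass

@dataclass
class Usage:
    size_bytes: int            # size of the realized data set
    reference_count: int       # number of times it has been referenced
    accesses_per_day: float    # how often it is read from the data store

def meets_virtualization_criteria(
    usage,
    min_size_bytes=1_000_000,  # assumed threshold, not from the source
    max_references=2,          # assumed threshold, not from the source
    max_accesses_per_day=0.5,  # assumed threshold, not from the source
):
    """Return True when a data set is large and seldom used (assumed policy)."""
    return (
        usage.size_bytes >= min_size_bytes
        and usage.reference_count <= max_references
        and usage.accesses_per_day <= max_accesses_per_day
    )

large_and_idle = Usage(size_bytes=5_000_000, reference_count=1, accesses_per_day=0.1)
small_set = Usage(size_bytes=10, reference_count=1, accesses_per_day=0.1)
large_and_busy = Usage(size_bytes=5_000_000, reference_count=1, accesses_per_day=20.0)
```

Under these assumed thresholds only the first data set would be a candidate for virtualization.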
17 Claims
11. A computer implemented method for storing data sets, the computer system comprising at least one processor, memory and a data store, the method comprising:
providing a data set information store storing information regarding a plurality of data sets, including information specifying whether each respective data set is realized in the data store;
providing a relation store in the memory for storing a plurality of algebraic relations between the data sets;
receiving a plurality of statements wherein each statement requests at least one of the data sets;
composing a plurality of algebraic relations between data sets from the plurality of statements;
storing the plurality of algebraic relations composed from the plurality of statements in the relation store;
selecting at least one data set from the data set information store that is realized in the data store;
adding data sets to the data set information store that are subsets of the selected data set and realizing the added data sets in the data store;
adding an algebraic relation to the relation store that defines the selected data set based on the added data sets, wherein the algebraic relation comprises a respective first expression including a symbolic representation of at least the selected data set, a respective second expression including at least a symbolic representation of each of the added data sets, and a relational operator symbolically defining a mathematical relationship between the respective first expression and the respective second expression;
removing the selected data set from the data store;
changing the information regarding the selected data set in the data set information store to indicate that the selected data set is not realized in the data store;
composing a plurality of collections of algebraic relations defining a requested data set, wherein the algebraic relation defining the selected data set is used to compose at least one of the collections of algebraic relations;
applying an optimization criteria to select one of the collections of algebraic relations to calculate the requested data set; and
using the selected collection of algebraic relations to calculate the requested data set.
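The partitioning steps of claim 11 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the claimed method itself: the function names, the plain dictionaries standing in for the data store and the data set information store, the tuple form of a relation and the round-robin split are all hypothetical choices made for the example.

```python
def partition_and_virtualize(store, realized, relations, name, n_parts):
    """Split `name` into subsets, record name = union(subsets), drop `name`."""
    rows = sorted(store[name])
    part_names = []
    for i in range(n_parts):
        part = f"{name}_part{i}"              # hypothetical naming scheme
        store[part] = frozenset(rows[i::n_parts])
        realized[part] = True                 # the added subsets are realized
        part_names.append(part)
    # Relation stored as (first expression, operator, second expression).
    relations.append((name, "union", tuple(part_names)))
    del store[name]                           # remove the selected data set
    realized[name] = False                    # mark it as not realized
    return part_names

def calculate(store, realized, relations, name):
    """Return a data set, rebuilding it from a union relation if virtual."""
    if realized[name]:
        return store[name]
    for target, operator, operands in relations:
        if target == name and operator == "union":
            parts = (calculate(store, realized, relations, o) for o in operands)
            return frozenset().union(*parts)
    raise KeyError(f"no relation defines {name!r}")

store = {"T": frozenset(range(10))}   # data store
realized = {"T": True}                # data set information store
relations = []                        # relation store
parts = partition_and_virtualize(store, realized, relations, "T", 3)
rebuilt = calculate(store, realized, relations, "T")
```

After the call, T exists only as a relation over its three realized subsets, and a request for T is answered by evaluating that relation.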
15. A computer implemented method for storing data sets, the computer system comprising at least one processor, memory and a data store, the method comprising:
providing a relation store in the memory for storing algebraic relations between data sets;
providing at least a first data set, a second data set and a third data set stored in the data store, wherein the second data set and the third data set are each different than the first data set;
providing a data set information store for storing information regarding a plurality of data sets, including information indicating that the first data set, the second data set and the third data set are realized in the data store;
receiving a plurality of statements wherein each statement requests at least one of the data sets;
composing a plurality of algebraic relations between data sets from the plurality of statements;
storing the plurality of algebraic relations composed from the plurality of statements in the relation store;
composing an algebraic relation that defines the first data set using at least the second data set and the third data set, wherein the algebraic relation comprises a respective first expression including a symbolic representation of the first data set, a respective second expression including at least a symbolic representation of the second data set and a symbolic representation of the third data set, and a relational operator symbolically defining a mathematical relationship between the respective first expression and the respective second expression;
adding the algebraic relation to the relation store;
removing the first data set from the data store after the algebraic relation has been added to the relation store;
changing the information regarding the first data set in the data set information store to indicate that the first data set is not realized in the data store;
composing a plurality of collections of algebraic relations defining a requested data set, wherein the algebraic relation defining the first data set is used to compose at least one of the collections of algebraic relations;
applying an optimization criteria to select one of the collections of algebraic relations to calculate the requested data set; and
using the selected collection of algebraic relations to calculate the requested data set.
Specification