Tiering with pluggable storage system for parallel query engines
9 Assignments
0 Petitions
Abstract
A method, article of manufacture, and apparatus for managing data. In some embodiments, this includes determining a usage level of a file, wherein the file is stored in a first storage system, moving the file to a second storage system based on the determined usage level of the file, updating location information in a catalog based on the movement of the file, and performing at least a portion of a query on the file after updating location information in the catalog.
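The four steps in the abstract (determine usage, move, update the catalog, query) can be sketched as follows. This is a minimal illustration, not the patented implementation: `StorageSystem`, `Catalog`, `tier_file`, `query_line_count`, the access-count threshold, and the rule "low usage moves to the colder system" are all assumptions introduced here, since the abstract only says the move is based on the usage level.

```python
from dataclasses import dataclass, field


@dataclass
class StorageSystem:
    """Hypothetical stand-in for one storage system (tier)."""
    name: str
    files: dict = field(default_factory=dict)  # native path -> file contents


@dataclass
class Catalog:
    """Maps a file's universal address to (storage system name, native path)."""
    locations: dict = field(default_factory=dict)

    def update(self, address, system_name, path):
        self.locations[address] = (system_name, path)


def tier_file(address, access_count, catalog, systems, cold_system, threshold=10):
    """Move a rarely used file to cold_system and record the move in the catalog.

    access_count stands in for the "usage level"; threshold is an assumed
    policy value that the abstract does not specify.
    """
    system_name, path = catalog.locations[address]
    if access_count >= threshold or system_name == cold_system:
        return False  # file is used often enough, or is already on the cold tier
    data = systems[system_name].files.pop(path)
    systems[cold_system].files[path] = data
    catalog.update(address, cold_system, path)
    return True


def query_line_count(address, catalog, systems):
    """Toy query: count lines, locating the file through the catalog."""
    system_name, path = catalog.locations[address]
    return len(systems[system_name].files[path].splitlines())


systems = {"fast": StorageSystem("fast"), "archive": StorageSystem("archive")}
systems["fast"].files["/data/log.txt"] = "a\nb\nc"
catalog = Catalog()
catalog.update("/universal/log.txt", "fast", "/data/log.txt")

moved = tier_file("/universal/log.txt", access_count=2, catalog=catalog,
                  systems=systems, cold_system="archive")
lines = query_line_count("/universal/log.txt", catalog, systems)
```

The query runs only after the catalog has been updated, so it finds the file at its new location without the caller knowing that a move took place.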
143 Citations
27 Claims
1. A computer implemented method for managing data, comprising:
determining, by a processor circuitry, a usage level of a file, wherein the file is stored in a first storage system;
moving, by the processor circuitry, the file to a second storage system based on the determined usage level of the file;
updating, by a catalog service, location information in a catalog based on the movement of the file, wherein the catalog stores location information indicating a location for a plurality of files located at a plurality of storage systems corresponding to a plurality of namespaces, wherein the catalog is accessed by a universal node that is configured to interface with the first storage system and the second storage system, and wherein the universal node provides, to a client, a universal namespace across the first storage system and the second storage system so as to collectively present the plurality of namespaces as the universal namespace, wherein the file is accessible via the universal namespace such that the client accesses the file at a same address of the universal namespace regardless of whether the file is moved from the first storage system to the second storage system; and
performing at least a portion of a query on the file after updating location information in the catalog, wherein the universal node comprises a universal job tracker that tracks a status of one or more jobs corresponding to the query.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
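The universal namespace recited in claim 1 means a client keeps using one address while the file's physical location changes. A minimal sketch of that behaviour, assuming a hypothetical `UniversalNode` class with an in-memory catalog; the storage system names `hdfs-a` and `object-b` and all paths are illustrative only:

```python
class UniversalNode:
    """Hypothetical facade that presents several namespaces as one."""

    def __init__(self):
        self._stores = {}   # storage system name -> {native path: bytes}
        self._catalog = {}  # universal address -> (storage system name, native path)

    def add_storage(self, name):
        self._stores[name] = {}

    def put(self, address, system, native_path, data):
        self._stores[system][native_path] = data
        self._catalog[address] = (system, native_path)

    def move(self, address, target_system, target_path):
        """Relocate the file, then update its catalog entry."""
        system, native_path = self._catalog[address]
        data = self._stores[system].pop(native_path)
        self._stores[target_system][target_path] = data
        self._catalog[address] = (target_system, target_path)

    def locate(self, address):
        """Return the current (storage system, native path) for an address."""
        return self._catalog[address]

    def read(self, address):
        """Clients pass only the universal address, never a native path."""
        system, native_path = self._catalog[address]
        return self._stores[system][native_path]


node = UniversalNode()
node.add_storage("hdfs-a")
node.add_storage("object-b")
node.put("/u/sales/2013.csv", "hdfs-a", "/warehouse/sales/2013.csv", b"id,amount\n1,10\n")

before = node.read("/u/sales/2013.csv")
node.move("/u/sales/2013.csv", "object-b", "bucket/sales/2013.csv")
after = node.read("/u/sales/2013.csv")
```

The same address, `/u/sales/2013.csv`, resolves before and after the move; only the catalog entry behind it changes.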
15. A system for managing data, comprising a processor circuitry configured to:
determine a usage level of a file, wherein the file is stored in a first storage system;
move the file to a second storage system based on the determined usage level of the file;
update, by a catalog service, location information in a catalog based on the movement of the file, wherein the catalog stores location information indicating a location for a plurality of files located at a plurality of storage systems corresponding to a plurality of namespaces, wherein the catalog is accessed by a universal node that is configured to interface with the first storage system and the second storage system, and wherein the universal node provides, to a client, a universal namespace across the first storage system and the second storage system so as to collectively present the plurality of namespaces as the universal namespace, wherein the file is accessible via the universal namespace such that the client accesses the file at a same address of the universal namespace regardless of whether the file is moved from the first storage system to the second storage system; and
perform at least a portion of a query on the file after updating location information in the catalog, wherein the universal node comprises a universal job tracker that tracks a status of one or more jobs corresponding to the query.
Dependent claims: 16, 17, 18, 19, 20, 21
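The universal job tracker in the independent claims tracks the status of the one or more jobs that make up a query. The claims do not say how per-job statuses combine into a query-level status, so the roll-up rule below (any failure fails the query, all done means done, all pending means pending, otherwise running) is an assumption, as are the names `JobStatus` and `UniversalJobTracker` and the job identifiers:

```python
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


class UniversalJobTracker:
    """Hypothetical tracker for the jobs that one query fans out into."""

    def __init__(self):
        self._jobs = {}  # (query_id, job_id) -> JobStatus

    def submit(self, query_id, job_id):
        self._jobs[(query_id, job_id)] = JobStatus.PENDING

    def report(self, query_id, job_id, status):
        if (query_id, job_id) not in self._jobs:
            raise KeyError(f"unknown job {job_id!r} for query {query_id!r}")
        self._jobs[(query_id, job_id)] = status

    def query_status(self, query_id):
        """Roll the per-job statuses up into one status for the query."""
        statuses = [s for (q, _), s in self._jobs.items() if q == query_id]
        if not statuses:
            raise KeyError(f"unknown query {query_id!r}")
        if any(s is JobStatus.FAILED for s in statuses):
            return JobStatus.FAILED
        if all(s is JobStatus.DONE for s in statuses):
            return JobStatus.DONE
        if all(s is JobStatus.PENDING for s in statuses):
            return JobStatus.PENDING
        return JobStatus.RUNNING


tracker = UniversalJobTracker()
tracker.submit("q1", "scan-first-storage")
tracker.submit("q1", "scan-second-storage")
status_initial = tracker.query_status("q1")

tracker.report("q1", "scan-first-storage", JobStatus.DONE)
status_partial = tracker.query_status("q1")

tracker.report("q1", "scan-second-storage", JobStatus.DONE)
status_final = tracker.query_status("q1")
```

Keying jobs by query lets a single tracker on the universal node report on work running against both storage systems.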
22. A computer program product for processing data, comprising a non-transitory computer readable medium having program instructions implemented by a processor circuitry when executed, the program instructions comprising instructions for:
determining a usage level of a file, wherein the file is stored in a first storage system;
moving the file to a second storage system based on the determined usage level of the file;
updating, by a catalog service, location information in a catalog based on the movement of the file, wherein the catalog stores location information indicating a location for a plurality of files located at a plurality of storage systems corresponding to a plurality of namespaces, wherein the catalog is accessed by a universal node that is configured to interface with the first storage system and the second storage system, and wherein the universal node provides, to a client, a universal namespace across the first storage system and the second storage system so as to collectively present the plurality of namespaces as the universal namespace, wherein the file is accessible via the universal namespace such that the client accesses the file at a same address of the universal namespace regardless of whether the file is moved from the first storage system to the second storage system; and
performing at least a portion of a query on the file after updating location information in the catalog, wherein the universal node comprises a universal job tracker that tracks a status of one or more jobs corresponding to the query.
Dependent claims: 23, 24, 25, 26, 27
Specification