
Pluggable storage system for parallel query engines across non-native file systems

  • US 9,984,083 B1
  • Filed: 03/29/2013
  • Issued: 05/29/2018
  • Est. Priority Date: 02/25/2013
  • Status: Active Grant
First Claim

1. A method for managing data, comprising:

    receiving, by one or more processors, a query from a client via one or more networks;

    based on the received query, analyzing a catalog, which stores mappings of file names and file locations, for location information, wherein the catalog is associated with a universal namenode that provides a single namespace for accessing a plurality of files stored across a plurality of storage systems, and wherein the location information stored in connection with the catalog indicates a storage system on which a file is located among the plurality of storage systems;

    based on the analysis, determining, by one or more processors, a first storage system of the plurality of storage systems, an associated first file system, an associated first protocol translator to use in connection with communication with the first storage system, a second storage system of the plurality of storage systems, an associated second file system, and an associated second protocol translator to use in connection with communication with the second storage system;

    identifying, by one or more processors, a first data and a second data, wherein the first data is stored on the first storage system, and the second data is stored on the second storage system, and wherein a first portion of the query is performed on the first storage system and a second portion of the query is performed on the second storage system, wherein the first storage system is different from the second storage system, and wherein a first protocol used in connection with communication with the first storage system is different from a second protocol used in connection with communication with the second storage system;

    running, by one or more processors, a first job on the first data using the associated first protocol translator, wherein the first job is not a native job of the first file system; and

    running, by one or more processors, a second job on the second data using the associated second protocol translator, wherein the second job is not a native job of the second file system.
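
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation and does not define the claim's scope. It assumes a hypothetical in-memory `Catalog` standing in for the catalog associated with the universal namenode, and hypothetical per-protocol translator functions standing in for the protocol translators. All names (`Location`, `Catalog`, `make_translator`, `run_query`, `proto_a`, `proto_b`) and the sample data are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Location:
    """Where a file lives: which storage system, and which protocol reaches it."""
    storage_system: str
    protocol: str
    path: str


class Catalog:
    """Hypothetical catalog mapping file names in one namespace to locations."""

    def __init__(self, entries: Dict[str, Location]) -> None:
        self._entries = dict(entries)

    def locate(self, file_name: str) -> Location:
        return self._entries[file_name]


# In this sketch a "protocol translator" is a function that reads the records
# at a location using that storage system's own protocol.
Translator = Callable[[Location], List[int]]


def make_translator(backing_store: Dict[str, List[int]]) -> Translator:
    """Build a translator over an in-memory dict that stands in for a storage system."""
    def read(location: Location) -> List[int]:
        return list(backing_store[location.path])
    return read


def run_query(
    file_names: List[str],
    catalog: Catalog,
    translators: Dict[str, Translator],
    job: Callable[[List[int]], int],
) -> Dict[str, int]:
    """Look up each file, pick the translator for its protocol, run the job on its data."""
    results: Dict[str, int] = {}
    for name in file_names:
        location = catalog.locate(name)              # analyze the catalog
        translator = translators[location.protocol]  # determine the translator
        results[name] = job(translator(location))    # run the job on that data
    return results


# Two storage systems reached through two different (hypothetical) protocols.
catalog = Catalog({
    "sales": Location("system_a", "proto_a", "/a/sales"),
    "returns": Location("system_b", "proto_b", "/b/returns"),
})
translators = {
    "proto_a": make_translator({"/a/sales": [10, 20, 30]}),
    "proto_b": make_translator({"/b/returns": [1, 2]}),
}

totals = run_query(["sales", "returns"], catalog, translators, sum)
print(totals)
```

In this sketch the job (`sum`) is supplied by the query engine rather than by either storage system, loosely mirroring the claim's requirement that each job is not a native job of the file system it runs against.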

  • 9 Assignments