
System and method for operating a big-data platform

  • US 9,582,528 B2
  • Filed: 05/05/2016
  • Issued: 02/28/2017
  • Est. Priority Date: 11/10/2011
  • Status: Active Grant
First Claim

1. A method for operating a big-data platform comprising:

  • at a data analysis platform, receiving discrete client data;

    storing the client data in a network accessible distributed storage system that includes:

    storing the client data in a real-time storage system in a row format;

    merging the client data into a columnar-based distributed archive storage system;

    identifying a merge status for the client data merged into the archive storage system, wherein the merge status indicates a redundancy of client data between the real-time storage system and the archive storage system;

    receiving a data query through a query interface; and

    processing the data query by selectively interfacing with the client data from the real-time storage system and archive storage system, according to a data mapping and reduction process, wherein the real-time storage system and the archive storage system are different, wherein processing the data query comprises:

    (i) converting the single data query from a relational database-type query format to a first converted query format compatible with the real-time storage system,

    (ii) converting the single data query from the relational database-type query format to a second converted query format compatible with the archive storage system,

    (iii) cooperatively querying the real-time storage system and the archive storage system by distributing, in parallel, the first converted query over the real-time storage system and the second converted query over the archive storage system,

    (iv) using the merge status and timestamps of the client data in the real-time storage system and the archive storage system to skip client data from either the real-time storage system or the archive storage system if the skipped data is accounted for in the other of the real-time storage system or the archive storage system, and

    (v) retrieving a single cohesive query result that incorporates real-time data and archive data returned from the first converted query and the second converted query, respectively,

    wherein merging the client data into a columnar-based distributed archive storage system comprises storing the client data in the archive storage system in a columnar format, and

    wherein interfacing with the client data from the archive storage system comprises:

    converting, by using a query processing cluster, at least a portion of the data query to the mapping process and the reduction process; and

    executing the mapping process and the reduction process by using the query processing cluster.
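
The interplay between the row-format real-time store, the columnar archive, and the merge status in steps (iii) to (v) can be illustrated with a small in-memory sketch. This is not the patented implementation: the class `HybridStore`, the field `merged_through`, and the method `query_sum` are hypothetical names chosen for illustration. The sketch also assumes that timestamps are unique and arrive in increasing order, so that a single timestamp watermark can stand in for the merge status, and it uses a plain sum over two partial results as a stand-in for the mapping and reduction processes.

```python
from dataclasses import dataclass, field


@dataclass
class HybridStore:
    """Toy model (hypothetical): row-format real-time store plus columnar archive."""
    # Real-time store: one dict per row (row format).
    realtime: list = field(default_factory=list)
    # Archive store: one list per column (columnar format).
    archive: dict = field(default_factory=lambda: {"ts": [], "value": []})
    # Merge status: highest timestamp already copied into the archive.
    # Assumes unique, increasing timestamps (an assumption of this sketch).
    merged_through: int = -1

    def ingest(self, ts: int, value: int) -> None:
        """Receive one discrete client record into the real-time store."""
        self.realtime.append({"ts": ts, "value": value})

    def merge(self) -> None:
        """Copy not-yet-archived rows into the archive and update the merge status."""
        for row in sorted(self.realtime, key=lambda r: r["ts"]):
            if row["ts"] > self.merged_through:
                self.archive["ts"].append(row["ts"])
                self.archive["value"].append(row["value"])
                self.merged_through = row["ts"]

    def query_sum(self, since: int) -> int:
        """Sum `value` for rows with ts >= since, counting each row exactly once."""
        # Archive side: scan the two columns together.
        archive_part = sum(
            v
            for ts, v in zip(self.archive["ts"], self.archive["value"])
            if ts >= since
        )
        # Real-time side: skip rows the merge status says are already archived.
        realtime_part = sum(
            r["value"]
            for r in self.realtime
            if r["ts"] >= since and r["ts"] > self.merged_through
        )
        # Combine the two partial results into one cohesive answer.
        return archive_part + realtime_part


store = HybridStore()
store.ingest(1, 10)
store.ingest(2, 20)
store.merge()          # rows 1 and 2 now exist in both stores
store.ingest(3, 30)    # row 3 exists only in the real-time store
print(store.query_sum(since=0))  # 60, not 90: merged rows are not double-counted
```

Rows 1 and 2 remain in the real-time store after the merge, which is the redundancy the merge status records. The query reads them from the archive and skips them on the real-time side, so the combined result counts each record once.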
