Method and apparatus for accelerated data translation using record layout detection
First Claim
1. A method for low latency and high throughput data translation using record layout detection, the method comprising:
- a processor receiving a plurality of records, the records comprising record data content arranged in a first format, but exhibiting a plurality of different record layouts for the first format, wherein the processor comprises a processing pipeline that is controllable to translate records to a second format from any of the different record layouts for the first format, and wherein the processing pipeline is deployed on at least one of a reconfigurable logic device, a graphics processing unit (GPU), a multi-core processor, and a cell processor;
the processor analyzing the record data content with respect to a plurality of conditions corresponding to the different record layouts to determine the record layouts for the records from among the different record layouts, wherein the processor includes a plurality of data analysis components arranged in parallel, and wherein the analyzing step, for each of a plurality of the records, comprises:
the processor processing a record through the data analysis components in parallel, each of a plurality of the data analysis components (1) listening for data of interest in the record based on a byte offset from a start of record, (2) testing the record data content in the data of interest against a corresponding predicate condition to determine whether the corresponding predicate condition has been met, and (3) outputting data indicative of whether the tested record data content satisfies the corresponding predicate condition, wherein the corresponding predicate conditions for the data analysis components in the aggregate serve as criteria for determining whether the record exhibits any of the different record layouts;
based on the output data from the data analysis components, the processor determining whether the record exhibits one of the different record layouts; and
in response to a determination that the record exhibits one of the different record layouts, the processor generating data that associates the record with its determined record layout;
defining the byte offsets for data of interest with respect to data analysis components based on data in a configuration table;
the processor streaming the records and data indicative of their determined record layouts through the processing pipeline;
controlling the processing pipeline based on the data indicative of the determined record layouts such that the controlled processing pipeline is configured to translate record data content arranged in the determined record layouts to the second format; and
the controlled processing pipeline translating the records by simultaneously performing a plurality of translation tasks on different portions of the streaming record data content to arrange the record data content in the second format.
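The detection steps recited in the claim can be pictured with a small software sketch. This is only an illustration under stated assumptions, not the patented implementation: the claim targets hardware such as reconfigurable logic or GPUs, whereas the sketch uses a Python thread pool as a stand-in for parallel components. The names `Analyzer`, `CONFIG_TABLE`, `LAYOUT_RULES`, `detect_layout`, and `tag_records`, and the sample "header" and "trade" layouts, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Analyzer:
    """One data analysis component: a byte window plus a predicate."""
    offset: int                          # byte offset from the start of record
    length: int                          # width of the data of interest
    predicate: Callable[[bytes], bool]   # condition tested on that window

    def test(self, record: bytes) -> bool:
        window = record[self.offset:self.offset + self.length]
        # A record too short to contain the window cannot satisfy the predicate.
        return len(window) == self.length and self.predicate(window)


# Hypothetical configuration table: (offset, length, predicate) per component.
CONFIG_TABLE = [
    (0, 2, lambda b: b == b"HD"),
    (0, 2, lambda b: b == b"TR"),
    (2, 4, lambda b: b.isdigit()),
]
ANALYZERS = [Analyzer(*row) for row in CONFIG_TABLE]

# Hypothetical combining logic: layout name -> analyzers that must all fire.
LAYOUT_RULES = {
    "header": (0,),
    "trade": (1, 2),
}


def detect_layout(record: bytes, pool: ThreadPoolExecutor) -> Optional[str]:
    """Run every analyzer on the record, then combine their outputs."""
    flags = list(pool.map(lambda a: a.test(record), ANALYZERS))
    for name, required in LAYOUT_RULES.items():
        if all(flags[i] for i in required):
            return name
    return None  # record exhibits none of the known layouts


def tag_records(records):
    """Associate each record with its detected layout (or None)."""
    with ThreadPoolExecutor(max_workers=len(ANALYZERS)) as pool:
        return [(rec, detect_layout(rec, pool)) for rec in records]
```

For example, `tag_records([b"HD2024", b"TR0042ACME"])` pairs the first record with `"header"` and the second with `"trade"`. Keeping the offsets and predicates in a table, separate from the combining rules, mirrors the claim's split between the data analysis components and the layout determination that uses their outputs.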
1 Assignment
0 Petitions
Abstract
Various methods and apparatuses are described for performing high speed translations of data. In an example embodiment, record layout detection can be performed for data. In another example embodiment, data pivoting prior to field-specific data processing can be performed.
343 Citations
64 Claims
1. A method for low latency and high throughput data translation using record layout detection (recited in full under First Claim above).
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 62, 63, 64)
30. An apparatus for low latency and high throughput data translation using record layout detection, the apparatus comprising:
- a processor that includes a processing pipeline, wherein the processing pipeline is deployed on at least one of a reconfigurable logic device, a graphics processing unit (GPU), a multi-core processor, and a cell processor;
wherein the processor further comprises a plurality of data analysis components and a logic component;
the processor configured to receive a plurality of records, the records comprising record data content arranged in the first format, but exhibiting a plurality of different record layouts for the first format;
wherein the processor is further configured to perform record layout detection for each of a plurality of the records via a plurality of operations that (1) process a record through a plurality of the data analysis components in parallel, each of a plurality of the data analysis components configured to (i) listen for data of interest in the record based on a byte offset from a start of record, (ii) test the record data content in the data of interest against a corresponding predicate condition to determine whether the corresponding predicate condition has been met, and (iii) output data indicative of whether the tested record data content satisfies the corresponding predicate condition, wherein the corresponding predicate conditions for the data analysis components in the aggregate serve as criteria for determining whether the record exhibits any of the different record layouts, (2) based on the output data from the data analysis components, determine via the logic component whether the record exhibits one of the different record layouts, and (3) in response to a determination that the record exhibits one of the different record layouts, generate data that associates the record with its determined record layout;
wherein the processor is further configured to define the byte offsets for data of interest with respect to data analysis components based on data in a configuration table;
wherein the processor is further configured to stream the records and data indicative of their determined record layouts through the processing pipeline; and
wherein the processing pipeline is controllable based on the data indicative of the determined record layouts such that the controlled processing pipeline is configured to translate the streaming records by simultaneously performing a plurality of translation tasks on different portions of the streaming record data content having the determined record layouts to arrange the record data content in the second format.
- View Dependent Claims (31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61)
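The layout-controlled translation stage might be sketched as follows. This is a rough illustration, not the claimed apparatus. It assumes a fixed-width first format and a comma-delimited second format, neither of which the claims specify, and it uses a thread pool to stand in for translation tasks that run at the same time on different portions of a record. `FIELD_MAPS`, `translate`, and `translate_stream` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-layout field maps: (field name, byte offset, length, converter).
# The detected layout selects which map controls the translation of a record.
FIELD_MAPS = {
    "header": [
        ("date", 2, 4, lambda b: b.decode("ascii")),
    ],
    "trade": [
        ("qty", 2, 4, lambda b: str(int(b))),
        ("symbol", 6, 4, lambda b: b.decode("ascii").strip()),
    ],
}


def translate(record: bytes, layout: str, pool: ThreadPoolExecutor) -> str:
    """Convert each field of the record as its own task, then join the results."""
    def convert(spec):
        _name, offset, length, converter = spec
        return converter(record[offset:offset + length])

    # pool.map yields results in field order even though tasks may overlap.
    return ",".join([layout, *pool.map(convert, FIELD_MAPS[layout])])


def translate_stream(tagged_records):
    """Stream (record, layout) pairs through the layout-controlled translator."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        for record, layout in tagged_records:
            if layout in FIELD_MAPS:  # skip records with no recognized layout
                yield translate(record, layout, pool)
```

Feeding it `[(b"HD2024", "header"), (b"TR0042ACME", "trade")]` yields one delimited line per record. The layout tag travels with each record, so the translator never has to re-inspect the record content to decide how to parse it.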
Specification