Normalizing data for fast superscalar processing
First Claim
1. A computer system for storing and processing data in a manner that encourages parallel processing by one or more superscalar processors, the system comprising:
a processor and memory configured to execute software instructions;
a data storage component configured to store database data persistently between sessions of use of the system;
a data normalization component configured to retrieve data stored by the data storage component and to load the retrieved data into memory in a normalized data representation that allows fast superscalar processing, wherein the normalized data representation is a fixed size value and is sized such that it can be processed by the one or more superscalar processors as a single instruction;
an operation manager configured to manage requests to perform database operations on stored database data;
a batch assembly component configured to identify batches of data that have control flow and data independence such that the batch includes parallelizable operations;
an outlier identification component configured to identify data values in a batch of data that cannot be performed by a fast processing path that performs efficient superscalar processing;
a fast operation component configured to provide instructions to a superscalar processor in a manner that allows parallel execution of the instructions by multiple functional units of the superscalar processor;
a slow operation component configured to perform database operations on data within a batch that is not stored in the normalized data representation; and
a result processing component configured to gather results from the fast operation component and slow operation component and return the results to an operation requestor.
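The flow recited in the claim (batch assembly, outlier identification, a fast path, a slow path, and result gathering) can be illustrated with a small sketch. This is not the patented implementation. The function names, the choice of 64-bit signed integers as the normalized form, and the use of `Decimal` for the slow path are all assumptions made for illustration. Python does not expose superscalar execution directly; the point is only that the fast loop has no per-element type checks or branches on data, which is the property a compiled equivalent would exploit.

```python
from decimal import Decimal

# Assumption: the normalized representation is a 64-bit signed integer.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def fits_normalized(value):
    """Hypothetical check: True if the value fits the fixed-size form."""
    return type(value) is int and INT64_MIN <= value <= INT64_MAX


def identify_outliers(left, right):
    """Return positions where either operand is not in normalized form."""
    return [
        i
        for i, (a, b) in enumerate(zip(left, right))
        if not (fits_normalized(a) and fits_normalized(b))
    ]


def add_batch(left, right):
    """Add two equal-length columns, routing outliers to a slow path."""
    outliers = set(identify_outliers(left, right))
    results = [None] * len(left)

    # Fast path: every element is independent of the others, and no
    # type inspection happens inside the loop. Overflow handling of the
    # 64-bit sum is omitted from this sketch.
    fast_positions = [i for i in range(len(left)) if i not in outliers]
    for i in fast_positions:
        results[i] = left[i] + right[i]

    # Slow path: general-purpose arithmetic for values that were not
    # stored in the normalized representation.
    for i in outliers:
        results[i] = Decimal(left[i]) + Decimal(right[i])

    # Result gathering: fast and slow results share one ordered list.
    return results
```

For example, `add_batch([1, 2], [10, 2**70])` would send the first pair through the fast path and the second pair, whose right operand exceeds 64 bits, through the slow path.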
Abstract
A data normalization system is described herein that represents multiple data types that are common within database systems in a normalized form that can be processed uniformly to achieve faster processing of data on superscalar CPU architectures. The data normalization system includes changes to internal data representations of a database system as well as functional processing changes that leverage normalized internal data representations for a high density of independently executable CPU instructions. Because most data in a database is small, a majority of data can be represented by the normalized format. Thus, the data normalization system allows for fast superscalar processing in a database system in a variety of common cases, while maintaining compatibility with existing data sets.
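As a rough illustration of what a normalized form might look like, the sketch below maps several common database types onto one fixed 8-byte value and reports when a value does not fit. Everything here is an assumption for illustration: the abstract does not specify the encoding, so the 64-bit width, the fixed decimal scale `SCALE`, the use of ordinal day numbers for dates, and the name `normalize` are hypothetical. The sketch also assumes that the column type is known from the schema, so no type tag is stored in the value.

```python
import struct
from datetime import date
from decimal import Decimal

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
SCALE = 4  # hypothetical fixed number of decimal places


def normalize(value):
    """Return an 8-byte normalized form, or None if the value does not fit.

    A None result means the value would stay in its original
    representation and be handled outside the fast path.
    """
    if isinstance(value, bool):
        n = int(value)
    elif isinstance(value, int):
        n = value
    elif type(value) is date:
        n = value.toordinal()  # days since 0001-01-01, counted from 1
    elif isinstance(value, Decimal):
        if not value.is_finite():
            return None
        scaled = value.scaleb(SCALE)
        if scaled != scaled.to_integral_value():
            return None  # more decimal places than SCALE allows
        n = int(scaled)
    else:
        return None  # e.g. strings or blobs in this sketch

    if not INT64_MIN <= n <= INT64_MAX:
        return None
    return struct.pack("<q", n)  # always exactly 8 bytes
```

Under these assumptions, small integers, dates, and short decimals all share one fixed-size layout, while large or unusual values fall back to their existing representation, which matches the abstract's point about maintaining compatibility with existing data sets.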
11 Citations
15 Claims
1. Independent claim, reproduced in full under First Claim. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15.
Specification