Normalizing data for fast superscalar processing
Abstract
A data normalization system is described herein that represents multiple data types that are common within database systems in a normalized form that can be processed uniformly to achieve faster processing of data on superscalar CPU architectures. The data normalization system includes changes to internal data representations of a database system as well as functional processing changes that leverage normalized internal data representations for a high density of independently executable CPU instructions. Because most data in a database is small, a majority of data can be represented by the normalized format. Thus, the data normalization system allows for fast superscalar processing in a database system in a variety of common cases, while maintaining compatibility with existing data sets.
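The idea of one uniform representation for several data types can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a signed 64-bit integer as the normalized form, and the names `to_fixed` and `count_below` are hypothetical. Values that do not fit return `None`, so a caller can keep them in their original form, in line with the abstract's note on compatibility with existing data sets.

```python
from array import array
from datetime import date

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def to_fixed(value):
    """Map a small value onto one signed 64-bit integer (assumed format).

    Returns None when the value cannot be represented, so the caller
    can fall back to the original, non-normalized representation.
    """
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        return int(value)
    if isinstance(value, int):
        return value if INT64_MIN <= value <= INT64_MAX else None
    if isinstance(value, date):
        return value.toordinal()         # days since 0001-01-01
    return None                          # e.g. long strings stay as they are

def count_below(column, limit):
    """One type-agnostic loop over fixed-width values."""
    return sum(1 for v in column if v < limit)

# Two columns of different logical types share one physical layout.
ages = array("q", [to_fixed(v) for v in [31, 47, 19]])
hired = array("q", [to_fixed(d) for d in [date(2001, 5, 1), date(2015, 1, 9)]])

young = count_below(ages, to_fixed(40))
early = count_below(hired, to_fixed(date(2010, 1, 1)))
```

Because every element has the same width and the loop body contains no per-type branching, a compiled equivalent of `count_below` is the kind of code a superscalar CPU can execute with many independent instructions in flight. The Python version only shows the data layout, not the speedup.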
18 Claims
1. A computer-implemented method for normalizing an in-memory representation of stored data for faster superscalar processing, the method comprising:
accessing stored data that includes multiple columns, each column having a data type;
selecting a column in the accessed data to determine an appropriate in-memory representation;
determining a data type of the selected column;
determining whether row data associated with the selected column can be normalized based at least in part on the determined data type of the selected column;
determining a fixed size value for a normalized data representation, wherein the fixed sized value is sized such that it can be processed by a superscalar processor as a single instruction; and
upon determining that the row data can be normalized, converting the row data associated with the selected column into the normalized data representation, wherein the normalized data representation is a format that allows performing parallel processing of multiple instances of the normalized data representation using the superscalar processor and that represents multiple data types as the determined fixed size value that can be processed by the superscalar processor in the single instruction,
wherein the preceding steps are performed by at least one processor.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
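The steps recited in claim 1 can be sketched per column as follows. This is only an illustration under stated assumptions: the `Column` type, the `CONVERTERS` table, the type names, and the choice of an 8-byte width are all hypothetical and do not come from the specification.

```python
import struct
from array import array
from dataclasses import dataclass
from datetime import date

# Hypothetical table: declared column type -> converter to an integer.
# A type missing from the table is treated as not normalizable.
CONVERTERS = {
    "bool": int,
    "int": int,
    "date": date.toordinal,
}

@dataclass
class Column:
    name: str
    data_type: str
    rows: list

def fixed_size_bytes():
    """Assumed fixed size: one 64-bit word (standard size of struct 'q')."""
    return struct.calcsize("=q")

def normalize_column(column):
    """Return the column as fixed-width integers, or None if it cannot be normalized."""
    # Determine the data type of the selected column.
    convert = CONVERTERS.get(column.data_type)
    # Decide whether the row data can be normalized, based first on the type.
    if convert is None:
        return None
    # Determine the fixed size used by the normalized representation.
    width = fixed_size_bytes()
    limit = 1 << (8 * width - 1)
    values = [convert(v) for v in column.rows]
    # The decision also depends on the data: every value must fit the width.
    if not all(-limit <= v < limit for v in values):
        return None
    # Convert the row data into the normalized representation.
    return array("q", values)

hired = normalize_column(Column("hired", "date", [date(2020, 1, 1)]))
flags = normalize_column(Column("active", "bool", [True, False]))
bio = normalize_column(Column("bio", "varchar", ["long free text"]))
huge = normalize_column(Column("huge", "int", [2**64]))
```

Here `bio` and `huge` come back as `None`, which stands for leaving the column in its existing representation. NULL handling is omitted from this sketch.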
10. A computer-readable storage device comprising instructions for controlling a computer system to normalize an in-memory representation of stored data for faster superscalar processing, wherein the instructions, when executed, cause a processor to perform actions comprising:
accessing stored data that includes multiple columns, each column having a data type;
selecting a column in the accessed data to determine an appropriate in-memory representation;
determining a data type of the selected column;
determining whether row data associated with the selected column can be normalized based at least in part on the determined data type of the selected column;
determining a fixed size value for a normalized data representation, wherein the fixed sized value is sized such that it can be processed by a superscalar processor as a single instruction; and
upon determining that the row data can be normalized, converting the row data associated with the selected column into the normalized data representation, wherein the normalized data representation is a format that allows performing parallel processing of multiple instances of the normalized data representation using the superscalar processor and that represents multiple data types as the determined fixed size value that can be processed by the superscalar processor in the single instruction,
wherein the preceding steps are performed by at least one processor.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification