Processing array data on SIMD multi-core processor architectures
First Claim
1. A computer-implemented method for generating a single instruction, multiple data (SIMD) data structure tailored for processing fast Fourier transforms (FFTs) on a SIMD multi-core processor architecture, comprising:
- receiving input data, wherein the input data is a matrix having m rows and n columns;
converting the input data into a SIMD format to produce converted data for simultaneous processing of s rows of the input data, wherein s is a power of two, wherein m and n are greater than and divisible by s, wherein the converted data includes a sequence of blocks, wherein each block includes s consecutive rows of the input data that are interleaved such that a set of each first element of the s consecutive rows immediately precedes a set of each second element of the s consecutive rows in terms of sequential memory addresses and such that the set of each second element of the s consecutive rows immediately precedes a set of each third element of the s consecutive rows in terms of sequential memory addresses, to produce s interleaved rows; and
storing the converted data in sequential memory addresses, wherein a SIMD operation, comprising an FFT, is performed on the converted data.
1 Assignment
0 Petitions
Accused Products
Abstract
Techniques are disclosed for converting data into a format tailored for efficient multidimensional fast Fourier transforms (FFTS) on single instruction, multiple data (SIMD) multi-core processor architectures. The technique includes converting data from a multidimensional array stored in a conventional row-major order into SIMD format. Converted data in SIMD format consists of a sequence of blocks, where each block interleaves s rows such that SIMD vector processors may operate on s rows simultaneously. As a result, the converted data in SIMD format enables smaller-sized 1D FFTs to be optimized in SIMD multi-core processor architectures.
-
Citations
19 Claims
-
1. A computer-implemented method for generating a single instruction, multiple data (SIMD) data structure tailored for processing fast Fourier transforms (FFTs) on a SIMD multi-core processor architecture, comprising:
-
receiving input data, wherein the input data is a matrix having m rows and n columns; converting the input data into a SIMD format to produce converted data for simultaneous processing of s rows of the input data, wherein s is a power of two, wherein m and n are greater than and divisible by s, wherein the converted data includes a sequence of blocks, wherein each block includes s consecutive rows of the input data that are interleaved such that a set of each first element of the s consecutive rows immediately precedes a set of each second element of the s consecutive rows in terms of sequential memory addresses and such that the set of each second element of the s consecutive rows immediately precedes a set of each third element of the s consecutive rows in terms of sequential memory addresses, to produce s interleaved rows; and storing the converted data in sequential memory addresses, wherein a SIMD operation, comprising an FFT, is performed on the converted data. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
-
-
10. A computer readable storage medium containing a program which, when executed, performs an operation for generating a single instruction, multiple data (SIMD) data structure tailored for processing fast Fourier transforms (FFTs) on a SIMD multi-core processor architecture, comprising:
-
receiving input data, wherein the input data is a matrix having m rows and n columns; converting the input data into a SIMD format by operation of one or more computer processors when executing the program, to produce converted data for simultaneous processing of s rows of the input data, wherein s is a power of two, wherein m and n are greater than and divisible by s, wherein the converted data includes a sequence of blocks, wherein each block includes s consecutive rows of the input data that are interleaved such that a set of each first element of the s consecutive rows immediately precedes a set of each second element of the s consecutive rows in terms of sequential memory addresses and such that the set of each second element of the s consecutive rows immediately precedes a set of each third element of the s consecutive rows in terms of sequential memory addresses, to produce s interleaved rows; and storing the converted data in sequential memory addresses, wherein a SIMD operation, comprising an FFT, is performed on the converted data. - View Dependent Claims (11, 12, 13)
-
-
14. A system, comprising:
-
a processor; and a memory containing a program, which when executed by the processor is configured to perform an operation for generating a single instruction, multiple data (SIMD) data structure tailored for processing fast Fourier transforms (FFTs) on a SIMD multi-core processor architecture, comprising; receiving input data, wherein the input data is a matrix having m rows and n columns; converting the input data into a SIMD format to produce converted data for simultaneous processing of s rows of the input data, wherein s is a power of two, wherein m and n are greater than and divisible by s, wherein the converted data includes a sequence of blocks, wherein each block includes s consecutive rows of the input data that are interleaved such that a set of each first element of the s consecutive rows immediately precedes a set of each second element of the s consecutive rows in terms of sequential memory addresses and such that the set of each second element of the s consecutive rows immediately precedes a set of each third element of the s consecutive rows in terms of sequential memory addresses, to produce s interleaved rows; and storing the converted data in sequential memory addresses, wherein a SIMD operation, comprising an FFT, is performed on the converted data. - View Dependent Claims (15, 16, 17)
-
-
18. A computer-implemented method for generating a single instruction, multiple data (SIMD) data structure tailored for processing fast Fourier transforms (FFTs) on a SIMD multi-core processor architecture, comprising:
-
receiving input data, wherein the input data is a matrix having m rows and n columns, wherein the matrix includes b blocks, and wherein each block includes s consecutive rows, wherein s is a power of two, wherein m and n are greater than and divisible by s; and copying the b blocks into sequential memory addresses to generate the SIMD data structure, wherein a SIMD operation, comprising an FFT, is performed on the generated SIMD data structure, wherein copying a block comprises; copying, into sequential memory addresses, each first element of the s consecutive rows of the block, immediately followed by each second element of the s consecutive rows of the block, immediately followed by each third element of the s consecutive rows of the block, followed by remaining elements of the s consecutive rows in a similar manner, ending with last elements of the s consecutive rows of the block. - View Dependent Claims (19)
-
Specification