Apparatus and method for implementing adjacent, non-unit stride memory access patterns utilizing SIMD instructions
First Claim
1. A method comprising:
analyzing a source program to detect vectorizable loops having one or more serial code statements that collectively perform adjacent, non-unit stride memory access; and
vectorizing serial code statements of each detected loop to perform adjacent, non-unit stride memory access utilizing SIMD instructions.
Abstract
An apparatus and method for implementing adjacent, single non-unit stride memory access patterns are described. In one embodiment, the method includes compiler analysis of a source program to detect vectorizable loops having serial code statements that collectively perform adjacent, non-unit stride memory access. Once a vectorizable loop containing code statements that collectively perform adjacent, non-unit stride memory access is detected, the compiler vectorizes the serial code statements of the detected loop to perform the adjacent, non-unit stride memory access utilizing SIMD instructions. The compiler repeats this analysis and vectorization for each vectorizable loop within the source program code.
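To make the access pattern in the abstract concrete, the sketch below shows one plausible reading of it; it is not taken from the specification. The serial loop contains two statements that each write with stride 2 (offsets 0 and 1), so together they touch every adjacent element of `out`. The second function shows the general shape a vectorized version could take: unit-stride block loads, an interleaving shuffle, and unit-stride block stores. The function names, the 4-lane width, and the use of plain arrays as stand-ins for vector registers are all assumptions made for illustration; a real compiler would emit actual SIMD load, shuffle, and store instructions instead.

```c
#include <assert.h>

#define LANES 4 /* assumed vector width: four floats, as in a 128-bit register */

/* Serial form: two stride-2 statements that collectively write
   adjacent memory (out[0], out[1], out[2], ...). */
static void interleave_serial(const float *re, const float *im,
                              float *out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        out[2 * i]     = re[i]; /* stride 2, offset 0 */
        out[2 * i + 1] = im[i]; /* stride 2, offset 1 */
    }
}

/* Vector-shaped form (hypothetical): small arrays emulate vector registers
   so the sketch stays portable. */
static void interleave_blocked(const float *re, const float *im,
                               float *out, int n)
{
    assert(n >= 0);
    int i = 0;
    for (; i + LANES <= n; i += LANES) {
        float vre[LANES], vim[LANES]; /* two "registers" of inputs */
        float lo[LANES], hi[LANES];   /* two "registers" of interleaved output */

        /* Two unit-stride block loads. */
        for (int k = 0; k < LANES; k++) {
            vre[k] = re[i + k];
            vim[k] = im[i + k];
        }

        /* Interleave the low halves and the high halves of the inputs. */
        for (int k = 0; k < LANES / 2; k++) {
            lo[2 * k]     = vre[k];
            lo[2 * k + 1] = vim[k];
            hi[2 * k]     = vre[k + LANES / 2];
            hi[2 * k + 1] = vim[k + LANES / 2];
        }

        /* Two unit-stride block stores covering out[2*i .. 2*i + 2*LANES - 1]. */
        for (int k = 0; k < LANES; k++) {
            out[2 * i + k]         = lo[k];
            out[2 * i + LANES + k] = hi[k];
        }
    }

    /* Scalar remainder for trip counts that are not a multiple of LANES. */
    for (; i < n; i++) {
        out[2 * i]     = re[i];
        out[2 * i + 1] = im[i];
    }
}
```

Both functions produce the same contents in `out`. The point of the second form is that every memory operation in the main loop is contiguous, with the stride-2 placement handled by a register-level rearrangement rather than by scattered scalar stores.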
95 Citations
30 Claims
1. A method comprising:
analyzing a source program to detect vectorizable loops having one or more serial code statements that collectively perform adjacent, non-unit stride memory access; and
vectorizing serial code statements of each detected loop to perform adjacent, non-unit stride memory access utilizing SIMD instructions.
11. A computer readable storage medium including program instructions that direct a computer to perform one or more operations when executed by a processor, the one or more operations comprising:
analyzing a source program to detect vectorizable loops having one or more serial code statements that collectively perform adjacent, non-unit stride memory access; and
vectorizing serial code statements of each detected loop to perform adjacent, non-unit stride memory access utilizing SIMD instructions.
21. A system, comprising:
a processor having circuitry to execute instructions;
a system interface coupled to the processor, the system interface to receive source programs, and to provide target optimize programs once compiled from the source program;
a storage device coupled to the processor, having sequences of compiler instructions stored therein, which when executed by the processor cause the processor to;
analyze a source program to detect vectorizable loops having one or more serial code statements that collectively perform adjacent, non-unit stride memory access, and vectorize serial code statements of each detected loop to perform adjacent, non-unit stride memory access utilizing SIMD instructions.
Specification