Data processing system with register store/load utilizing data packing/unpacking
Abstract
A data processing system (e.g., microprocessor 30) for packing register data while storing it to memory and unpacking data read from memory while loading it into registers using single processor instructions. The system comprises a memory (42) and a central processing unit core (44) with at least one register file (76). The core is responsive to a load instruction (e.g., LDW_BH[U] instruction 184) to retrieve at least one data word from memory and parse the data word over selected parts of at least two data registers in the register file. The core is responsive to a store instruction (e.g., STBH_W instruction 198) to concatenate data from selected parts of at least two data registers into at least one data word and save the data word to memory. The number of data registers is greater than the number of data words parsed into or concatenated from the data registers. Both memory storage space and central processor unit resources are utilized efficiently when working with packed data. A single store or load instruction can perform all of the tasks that used to take several instructions, while at the same time conserving memory space.
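The load and store behavior summarized in the abstract can be pictured with a short Python sketch. It is an illustration only: the function names `load_unpack` and `store_pack` are hypothetical and do not reproduce the LDW_BH[U] or STBH_W encodings, and the sketch assumes 32-bit words and registers, byte numbering from the least significant byte, and zero extension.

```python
def load_unpack(word):
    """Model an unpacking load: one 32-bit word -> two 32-bit registers.

    Each byte of the word lands in the low byte of a 16-bit half-word,
    so four packed bytes occupy two registers after the load.
    """
    b = [(word >> (8 * i)) & 0xFF for i in range(4)]
    reg0 = b[0] | (b[1] << 16)  # bytes 0 and 1 -> low/high half-words
    reg1 = b[2] | (b[3] << 16)  # bytes 2 and 3 -> low/high half-words
    return reg0, reg1


def store_pack(reg0, reg1):
    """Model a packing store: two 32-bit registers -> one 32-bit word.

    Only the low byte of each half-word is kept and concatenated.
    """
    lows = [reg0 & 0xFF, (reg0 >> 16) & 0xFF,
            reg1 & 0xFF, (reg1 >> 16) & 0xFF]
    word = 0
    for i, byte in enumerate(lows):
        word |= byte << (8 * i)
    return word


# 0x44332211 -> (0x00220011, 0x00440033), and back again
assert store_pack(*load_unpack(0x44332211)) == 0x44332211
```

Without such instructions, each of the shifts and masks above would be a separate instruction after a plain word load or before a plain word store.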
39 Claims
-
1. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising at least one register file with a plurality of registers, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a load instruction to retrieve data words from said memory and parse said data words over selected parts of two data registers in said at least one register file, said parse comprising interleaved unpacking the lower and higher half-words of each of said two data words into corresponding pairs of data registers.
-
-
2. The data processing system of claim 1 wherein said load instruction selects sign or zero extend for the parsed data in said at least two data registers.
-
3. The data processing system of claim 1 wherein said at least one register file is two register files, and one pair of said corresponding pairs of data registers is located in one register file and the other pair is located in the other register file.
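One plausible reading of claims 1 to 3 is sketched below in Python. The names `extend16` and `load_halfwords_interleaved` are hypothetical, and the interleaving order (lower half-words of both words in one register pair, higher half-words in the other) is an assumption rather than something the claim text fixes. Words and registers are assumed to be 32 bits wide.

```python
def extend16(half, signed):
    """Extend a 16-bit half-word to 32 bits, by sign or by zero."""
    half &= 0xFFFF
    if signed and half & 0x8000:
        return half | 0xFFFF0000
    return half


def load_halfwords_interleaved(word_lo, word_hi, signed=False):
    """Unpack two words into two register pairs, one half-word per register.

    pair0 receives the lower half-words of both words and pair1 the
    higher half-words (assumed interleaving order).
    """
    pair0 = (extend16(word_lo, signed), extend16(word_hi, signed))
    pair1 = (extend16(word_lo >> 16, signed), extend16(word_hi >> 16, signed))
    return pair0, pair1
```

Under claim 3, `pair0` and `pair1` would sit in different register files; the sketch simply returns them as tuples.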
-
4. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising at least one register file with a plurality of registers, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a load instruction to retrieve data words from said memory and parse said data words over selected parts of two data registers in said at least one register file, said parse comprising unpacking the bytes of each at least one data word into the lower and higher half-words of each of a pair of data registers.
-
-
5. The data processing system of claim 4 wherein said at least one data word is two data words, and said parse comprises unpacking eight bytes from said two data words into corresponding pairs of data registers.
-
6. The data processing system of claim 5 wherein said unpacking of said bytes of said data words is interleaved.
-
7. The data processing system of claim 6 wherein said at least one register file is two register files, and one pair of said corresponding pairs of data registers is located in one register file and the other pair is located in the other register file.
-
8. The data processing system of claim 4 wherein said at least one register file is two register files, and said pair of data registers are an even/odd pair in the same data register file.
-
9. The data processing system of claim 4 wherein said at least one register file is two register files, one of said pair of data registers is located in one register file and the other is located in the other register file, and each of said pair of data registers has the same relative register number.
-
10. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising at least one register file with a plurality of registers, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a store instruction to concatenate data from selected parts of two data registers into one data word and save said one data word to said memory, said concatenate packing the lower bytes of the lower and higher half-words of each of said two data registers into said at least one data word.
-
-
11. The data processing system of claim 10 wherein said two data registers are an even/odd register pair.
-
12. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising at least one register file with a plurality of registers, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a store instruction to concatenate data from selected parts of four data registers into two data words and save said two data words to said memory, said concatenate packing the lower bytes of the lower and higher half-words of each of said four data registers into said two data words.
-
-
13. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising a plurality of A functional units, a plurality of B functional units, an A register file with a plurality of A registers accessed by corresponding register numbers, each A register capable of serving as a source or destination for any A functional unit, a B register file with a plurality of B registers accessed by corresponding register numbers, each B register capable of serving as a source or destination for any B functional unit, a first cross path connected to said A register file and said B functional units permitting any one A register file to be a source for at least one B functional unit, a second cross path connected to said B register file and said A functional units permitting any one B register file to be a source for at least one A functional unit, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a load instruction to retrieve at least one data word from said memory and parse said at least one data word over selected parts of two data registers including an A register having a first register access number and a second B register having said first register access number.
-
-
14. The data processing system of claim 13, wherein said at least one data word comprises a lower data word and a higher data word, and said parse comprises unpacking said lower data word into said A register and said higher data word into said B data register.
-
15. The data processing system of claim 13, wherein said at least one data word comprises a single data word, and said parse comprises unpacking first and second bytes of said single data word into corresponding lower and higher half words of said A register and third and fourth bytes of said single data word into corresponding lower and higher half words of said B register.
-
16. The data processing system of claim 15, wherein said load instruction selects sign or zero extend for the parsed data in said A and B registers.
-
17. The data processing system of claim 13, wherein said at least one data word comprises a single data word, and said parse comprises unpacking first and third bytes of said single data word into corresponding lower and higher half words of said A register and second and fourth bytes of said single data word into corresponding lower and higher half words of said B register.
-
18. The data processing system of claim 17, wherein said load instruction selects sign or zero extend for the parsed data in said A and B registers.
-
19. The data processing system of claim 13, wherein said at least one data word comprises a single data word, and said parse comprises unpacking a lower half word of said single data word into a lower half word of said A register and a higher half word of said single data word into a lower half word of said B register.
-
20. The data processing system of claim 17, wherein said load instruction selects sign or zero extend for the parsed data in said A and B registers.
-
21. The data processing system of claim 13, wherein said at least one data word comprises a single data word, and said concatenate comprises packing first and third bytes of said A register into a lower half word of said single data word and first and third bytes of said B register into a higher half word of said single data word.
-
22. The data processing system of claim 13, wherein said parse comprises unpacking a lower half word of a lower data word into a lower half word of said first A register, a higher half word of said lower data word into a lower half word of said second A register, a lower half word of a higher data word into a lower half word of said first B register and a higher half word of said higher data word into a lower half of said second B register.
-
23. The data processing system of claim 22, wherein said load instruction selects sign or zero extend for the parsed data in said first and second A registers and said first and second B registers.
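The pairing of an A register and a B register that share one register access number (claim 13), combined with the interleaved byte layout of claim 17, might be modeled as follows. The class `RegisterFiles`, the function `load_bytes_interleaved`, and the file size of 16 registers are all assumptions made for illustration; "first byte" is taken to mean the least significant byte, and zero extension is assumed.

```python
class RegisterFiles:
    """Two register files, A and B, addressed by the same register numbers."""

    def __init__(self, size=16):  # size is an assumed value
        self.a = [0] * size
        self.b = [0] * size


def load_bytes_interleaved(regs, n, word):
    """Unpack one word into A[n] and B[n].

    Bytes 1 and 3 go to the lower and higher half-words of A[n];
    bytes 2 and 4 go to the lower and higher half-words of B[n].
    """
    b1, b2, b3, b4 = [(word >> (8 * i)) & 0xFF for i in range(4)]
    regs.a[n] = b1 | (b3 << 16)
    regs.b[n] = b2 | (b4 << 16)


regs = RegisterFiles()
load_bytes_interleaved(regs, 5, 0x44332211)
# regs.a[5] is now 0x00330011 and regs.b[5] is 0x00440022
```

A single register number in the instruction is enough to name both destinations, since the A and B registers use the same number.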
-
24. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising a plurality of A functional units, a plurality of B functional units, an A register file with a plurality of A registers accessed by corresponding register numbers, each A register capable of serving as a source or destination for any A functional unit, a B register file with a plurality of B registers accessed by corresponding register numbers, each B register capable of serving as a source or destination for any B functional unit, a first cross path connected to said A register file and said B functional units permitting any one A register file to be a source for at least one B functional unit, a second cross path connected to said B register file and said A functional units permitting any one B register file to be a source for at least one A functional unit, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a store instruction to concatenate data from selected parts of two data registers including an A register having a first register access number and a second B register having said first register access number into at least one data word.
-
-
25. The data processing system of claim 24, wherein said at least one data word comprises a lower data word and a higher data word, and said concatenate comprises packing said A register into said lower data word and said B register into said higher data word.
-
26. The data processing system of claim 24, wherein said at least one data word comprises a single data word, and said concatenate comprises packing a first byte of said A register into a first byte of said single data word, a third byte of said A register into a third byte of said single data word, a first byte of said B register into a second byte of said single data word and a third byte of said B register into a fourth byte of said single data word.
-
27. The data processing system of claim 24, wherein said at least one data word comprises a single data word, and said concatenate comprises packing a lower half word of said A register into a lower half word of said single data word and a lower half word of said B register into a higher half word of said single data word.
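The two single-word packing layouts of claims 26 and 27 can be sketched in Python as shown here. The function names are hypothetical; registers and words are assumed to be 32 bits wide, with "first byte" meaning the least significant byte.

```python
def store_bytes_interleaved(a_reg, b_reg):
    """Claim 26 layout: A bytes 1 and 3 -> word bytes 1 and 3,
    B bytes 1 and 3 -> word bytes 2 and 4."""
    a1 = a_reg & 0xFF
    a3 = (a_reg >> 16) & 0xFF
    b1 = b_reg & 0xFF
    b3 = (b_reg >> 16) & 0xFF
    return a1 | (b1 << 8) | (a3 << 16) | (b3 << 24)


def store_halfwords(a_reg, b_reg):
    """Claim 27 layout: lower half-word of A -> lower half-word of the word,
    lower half-word of B -> higher half-word of the word."""
    return (a_reg & 0xFFFF) | ((b_reg & 0xFFFF) << 16)
```

In both cases the register bits that are not selected are simply dropped, which is what lets two registers fit in one stored word.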
-
28. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising a plurality of A functional units, a plurality of B functional units, an A register file with a plurality of A registers accessed by corresponding register numbers, each A register capable of serving as a source or destination for any A functional unit, a B register file with a plurality of B registers accessed by corresponding register numbers, each B register capable of serving as a source or destination for any B functional unit, a first cross path connected to said A register file and said B functional units permitting any one A register file to be a source for at least one B functional unit, a second cross path connected to said B register file and said A functional units permitting any one B register file to be a source for at least one A functional unit, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a load instruction to retrieve at two data words from said memory and parse said data words over selected parts of four data registers including a first A register having a first even register access number, a second A register having a second odd register access number one more than said first even register access number, a first B register having said first even register access number, and a second B register having said second odd register access number.
-
-
29. The data processing system of claim 28, wherein said parse comprises unpacking a first byte of a lower data word into a lower byte of said first A register, a second byte of said lower data word into a third byte of said first A register, a third byte of said lower data word into a first byte of said second A register, a fourth byte of said lower data word into a third byte of said second A register, a first byte of a higher data word into a first byte of said first B register, a second byte of said higher data word into a third byte of said first B register, a third byte of said higher data word into a first byte of said second B register and a third byte of said higher data word into a third byte of said second B data register.
-
30. The data processing system of claim 29, wherein said load instruction selects sign or zero extend for the parsed data in said first and second A registers and said first and second B registers.
-
31. The data processing system of claim 28, wherein said parse comprises unpacking a first byte of a lower data word into a lower byte of said first A register, a third byte of said lower data word into a third byte of said first A register, a second byte of said lower data word into a first byte of said second A register, a fourth byte of said lower data word into a third byte of said second A register, a first byte of a higher data word into a first byte of said first B register, a third byte of said higher data word into a third byte of said first B register, a second byte of said higher data word into a first byte of said second B register and a third byte of said higher data word into a third byte of said second B data register.
-
32. The data processing system of claim 31, wherein said load instruction selects sign or zero extend for the parsed data in said first and second A registers and said first and second B registers.
-
33. The data processing system of claim 28, wherein said parse comprises unpacking a lower half word of a lower data word into a lower half word of said first A register, a lower half word of a higher data word into a lower half word of said second A register, a higher half word of said lower data word into a lower half word of said first B register and a higher half word of said higher data word into a lower half of said second B register.
-
34. The data processing system of claim 33, wherein said load instruction selects sign or zero extend for the parsed data in said first and second A registers and said first and second B registers.
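The even/odd register addressing of claim 28, together with the byte layout of claim 29 and the sign or zero extension of claim 30, might look like the following Python sketch. All names are hypothetical. The sketch assumes that extension widens each byte to fill its 16-bit half-word, that "first byte" is the least significant byte, and that the last element of claim 29 places the fourth byte of the higher data word, by symmetry with the lower data word.

```python
def extend8(byte, signed):
    """Extend a byte to a 16-bit half-word field, by sign or by zero."""
    byte &= 0xFF
    if signed and byte & 0x80:
        return byte | 0xFF00
    return byte


def unpack_pair(word, signed):
    """One word -> two registers; consecutive bytes share a register."""
    f = [extend8(word >> (8 * i), signed) for i in range(4)]
    return f[0] | (f[1] << 16), f[2] | (f[3] << 16)


def load_quad(a_file, b_file, n, word_lo, word_hi, signed=False):
    """Unpack two words into A[n], A[n+1], B[n], B[n+1], with n even."""
    if n % 2 != 0:
        raise ValueError("first register number must be even")
    a_file[n], a_file[n + 1] = unpack_pair(word_lo, signed)
    b_file[n], b_file[n + 1] = unpack_pair(word_hi, signed)
```

Requiring an even first register number means the odd partner is always `n + 1`, so one register field can identify all four destinations.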
-
35. A data processing system comprising:
-
a memory comprising a plurality of memory locations; and
a central processing unit core comprising a plurality of A functional units, a plurality of B functional units, an A register file with a plurality of A registers accessed by corresponding register numbers, each A register capable of serving as a source or destination for any A functional unit, a B register file with a plurality of B registers accessed by corresponding register numbers, each B register capable of serving as a source or destination for any B functional unit, a first cross path connected to said A register file and said B functional units permitting any one A register file to be a source for at least one B functional unit, a second cross path connected to said B register file and said A functional units permitting any one B register file to be a source for at least one A functional unit, said core connected to said memory for loading data from and storing data to said memory locations, said core responsive to a store instruction to concatenate data from selected parts of four data registers including a first A register having a first even register access number, a second A register having a second odd register access number one more than said first even register access number, a first B register having said first even register access number, and a second B register having said second odd register access number into two data words.
-
-
36. The data processing system of claim 35, wherein said concatenate comprises packing a first byte of said first A register into a first byte of a lower data word, a third byte of said first A register into a second byte of said lower data word, a first byte of said second A register into a third byte of said lower data word, a third byte of said second A register into a fourth byte of said lower data word, a first byte of said first B register into a first byte of a higher data word, a third byte of said first B register into a second byte of said higher data word, a first byte of said second B register into a third byte of said higher data word and a third byte of said second B register into a fourth byte of said higher data word.
-
37. The data processing system of claim 35, wherein said concatenate comprises packing a first byte of said first A register into a first byte of a lower data word, a third byte of said first A register into a third byte of said lower data word, a first byte of said second A register into a second byte of said lower data word, a third byte of said second A register into a fourth byte of said lower data word, a first byte of said first B register into a first byte of a higher data word, a third byte of said first B register into a third byte of said higher data word, a first byte of said second B register into a second byte of said higher data word and a third byte of said second B register into a fourth byte of said higher data word.
-
38. The data processing system of claim 35, wherein said concatenate comprises packing a lower half word of said first A register into a lower half word of a lower data word, a higher half word of said first A register into a higher half word of said lower data word, a lower half word of said first B register into a lower half word of a higher data word and a lower half word of said second B register into a higher half word of said higher data word.
-
39. The data processing system of claim 35, wherein said concatenate comprises packing a lower half word of said first A register into a lower half word of a lower data word, a higher half word of said first A register into a lower half word of a higher data word, a lower half word of said first B register into a higher half word of said lower data word and a lower half word of said second B register into a higher half word of said higher data word.
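The byte orderings of claims 36 and 37 differ only in how the selected bytes of each register pair are arranged in the stored word. A minimal Python sketch, with hypothetical function names and the assumption that "first byte" is the least significant byte of a 32-bit register or word:

```python
def low_bytes(reg):
    """Return the first and third bytes of a 32-bit register."""
    return reg & 0xFF, (reg >> 16) & 0xFF


def join(bytes4):
    """Concatenate four bytes into a word, first byte least significant."""
    word = 0
    for i, byte in enumerate(bytes4):
        word |= byte << (8 * i)
    return word


def store_quad(a0, a1, b0, b1, interleaved=False):
    """Pack four registers into (lower word, higher word).

    interleaved=False follows the claim 36 order, True the claim 37 order.
    """
    def pack(first, second):
        f1, f3 = low_bytes(first)
        s1, s3 = low_bytes(second)
        order = (f1, s1, f3, s3) if interleaved else (f1, f3, s1, s3)
        return join(order)

    return pack(a0, a1), pack(b0, b1)
```

The A register pair always feeds the lower data word and the B register pair the higher data word; only the byte order inside each word changes between the two claims.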
Specification