SM4 acceleration processors, methods, systems, and instructions
Abstract
A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate one or more source packed data operands. The one or more source packed data operands are to have four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values. The processor also includes an execution unit coupled with the decode unit and the plurality of the packed data registers. The execution unit, in response to the instruction, is to store four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination storage location that is to be indicated by the instruction.
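The data flow described in the abstract can be sketched in software as follows. This is only an illustrative model, not the patented circuitry: the function name `sm4_four_rounds` and the parameter `round_transform` are hypothetical, and the round recurrence X[i+4] = X[i] XOR T(X[i+1] XOR X[i+2] XOR X[i+3] XOR rk[i]) is taken from the published SM4 specification rather than from the text above. The sketch also assumes that the "four 32-bit values" are the round keys for the four rounds. The mixing function T (S-box substitution followed by a linear step) is supplied by the caller so that no S-box table has to be reproduced here.

```python
MASK32 = 0xFFFFFFFF


def sm4_four_rounds(prior_results, round_values, round_transform):
    """Model one four-round step: four prior round results in, four new ones out.

    prior_results   -- four 32-bit words X[i]..X[i+3] from the four prior rounds
    round_values    -- four 32-bit words, assumed here to be round keys rk[i]..rk[i+3]
    round_transform -- caller-supplied function T mapping a 32-bit word to a 32-bit word
    """
    if len(prior_results) != 4 or len(round_values) != 4:
        raise ValueError("expected four 32-bit words in each operand")

    words = [w & MASK32 for w in prior_results]
    for rk in round_values:
        # Mix the three most recent words with the round key ...
        mixed = (words[-3] ^ words[-2] ^ words[-1] ^ rk) & MASK32
        # ... and fold the transformed value into the oldest of the four words.
        words.append((words[-4] ^ round_transform(mixed)) & MASK32)

    # The last four words are the results of the four subsequent rounds.
    return words[4:]
```

Calling the function eight times, each time feeding the four returned words back in as `prior_results` with the next four round keys, would cover the 32 rounds that the SM4 specification defines for one block.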
24 Claims
1. A processor comprising:
a plurality of packed data registers;
a decoder to decode an instruction, the instruction to indicate one or more source packed data operands, the one or more source packed data operands to have four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values; and
an execution unit including at least some circuitry coupled with the decoder and coupled with the plurality of the packed data registers, the execution unit, in response to the instruction, to store four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination packed data register of the plurality of packed data registers that is to be indicated by the instruction.
Dependent claims: 2-13.
14. A method in a processor comprising:
receiving an instruction at a decoder of the processor, the instruction indicating one or more source packed data operands, the one or more source packed data operands having four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values; and
accessing at least one of the one or more source packed data operands from a packed data register of the processor; and
storing, with an execution unit of the processor that includes at least some circuitry, four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination packed data register in response to the instruction, the destination packed data register indicated by the instruction.
Dependent claims: 15-20.
21. A system to process instructions comprising:
an interconnect;
a processor coupled with the interconnect, the processor having a set of packed data registers, the processor to receive an instruction that is to indicate one or more source packed data operands, the one or more source packed data operands to have four 32-bit results of four prior cryptographic rounds, and four 32-bit values, the processor, in response to the instruction, to store, with an execution unit of the processor that includes at least some circuitry, four 32-bit results of four immediately subsequent and sequential cryptographic rounds in a destination packed data register of the processor that is to be indicated by the instruction, wherein the cryptographic rounds are those of a cryptographic algorithm that has a non-linear substitution function and a linear substitution function to perform the following operations on a value (B):
B XOR (B <<< 2) XOR (B <<< 10) XOR (B <<< 18) XOR (B <<< 24),
where <<< represents a left rotate and XOR represents an exclusive OR; and
a dynamic random access memory (DRAM) coupled with the interconnect.
Dependent claim: 22.
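The linear substitution function recited in claim 21 can be written out directly. The sketch below is a minimal software rendering, assuming B is a 32-bit word and that the left rotate is a 32-bit circular rotate; the helper names `rotl32` and `linear_substitution` are illustrative and do not come from the claim.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, count):
    """Rotate a 32-bit word left by `count` bit positions (0 <= count < 32)."""
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def linear_substitution(b):
    """Compute B XOR (B <<< 2) XOR (B <<< 10) XOR (B <<< 18) XOR (B <<< 24)."""
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)


# A single set bit spreads to five bit positions.
print(hex(linear_substitution(0x00000001)))  # 0x1040405
```

Because the function uses only rotates and XOR, it is linear over XOR: applying it to `a ^ b` gives the same word as XOR-ing the results for `a` and `b` separately.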
23. An article of manufacture comprising a non-transitory machine-readable storage medium, the non-transitory machine-readable storage medium storing instructions including an instruction,
the instruction to indicate four 32-bit round keys of four prior key expansion rounds and four 32-bit key generation constants of a cryptographic algorithm, wherein the cryptographic algorithm defines system parameter segments including, if expressed in hexadecimal notation, a3b1bac6, 56aa3350, 677d9197, and b27022dc, and the instruction if executed by a machine is to cause the machine to perform operations comprising:
storing, with an execution unit of the machine that includes at least some hardware, a result packed data in a destination packed data register of the machine that is to be indicated by the instruction, the result packed data to include four 32-bit round keys of four immediately subsequent and sequential SM4 key expansion rounds.
Dependent claim: 24.
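The key expansion step recited in claim 23 can be modeled along the same lines. The sketch below is a hedged illustration only. The four system parameter segments are the ones listed in the claim. Everything else is taken from the published SM4 specification and is an assumption relative to the claim text: the key-schedule linear step B XOR (B <<< 13) XOR (B <<< 23), the recurrence rk[i] = K[i] XOR T'(K[i+1] XOR K[i+2] XOR K[i+3] XOR CK[i]), and the byte formula for the key generation constants. All function names are hypothetical, and the S-box substitution is supplied by the caller as `substitute`.

```python
MASK32 = 0xFFFFFFFF

# System parameter segments listed in claim 23 (XOR-ed into the cipher key
# before the first key expansion round, per the SM4 specification).
FK = (0xA3B1BAC6, 0x56AA3350, 0x677D9197, 0xB27022DC)


def rotl32(value, count):
    """Rotate a 32-bit word left by `count` bit positions (0 <= count < 32)."""
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def key_generation_constant(i):
    """Build CK[i] from four bytes, byte j being ((4*i + j) * 7) mod 256."""
    word = 0
    for j in range(4):
        word = (word << 8) | (((4 * i + j) * 7) % 256)
    return word


def key_linear(b):
    """Key-schedule linear step assumed from the SM4 specification."""
    return b ^ rotl32(b, 13) ^ rotl32(b, 23)


def sm4_four_key_rounds(prior_keys, constants, substitute):
    """Return the four round keys that follow the four given prior round keys.

    substitute -- caller-supplied 32-bit S-box substitution (hypothetical hook)
    """
    keys = [k & MASK32 for k in prior_keys]
    for ck in constants:
        mixed = (keys[-3] ^ keys[-2] ^ keys[-1] ^ ck) & MASK32
        keys.append((keys[-4] ^ key_linear(substitute(mixed))) & MASK32)
    return keys[4:]
```

For the first call, the specification forms the four starting words by XOR-ing each 32-bit word of the 128-bit cipher key with the corresponding `FK` word; eight calls with constants `key_generation_constant(0)` through `key_generation_constant(31)` would then yield all 32 round keys.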
Specification