Dynamic precision management for integer deep learning primitives
First Claim
1. A graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising:
a compute unit including a hardware logic unit having dynamic precision fixed-point logic;
a decode unit to decode an instruction for execution by the compute unit, the instruction to cause the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and
a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation, the dynamic precision manager to adjust the precision of the compute operation to prevent an arithmetic overflow.
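The overflow-avoidance behavior claimed above can be illustrated with a small sketch: an accumulator starts at a narrow fixed-point width and is widened before a sum would exceed its range. The function name, widths, and widening policy below are illustrative assumptions, not taken from the patent.

```python
INT16_MAX = 2**15 - 1
INT32_MAX = 2**31 - 1

def accumulate_with_dynamic_precision(values):
    """Accumulate fixed-point products, widening the accumulator
    precision before an overflow would otherwise occur.
    Returns (accumulated_value, final_precision_in_bits)."""
    acc = 0
    precision = 16  # start with a narrow accumulator
    for v in values:
        limit = INT16_MAX if precision == 16 else INT32_MAX
        if precision == 16 and abs(acc + v) > limit:
            precision = 32  # widen instead of overflowing
        acc += v
    return acc, precision
```

A real compute unit would make this decision in hardware per operation; the sketch only shows the control idea of trading precision for range on demand.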
1 Assignment
0 Petitions
Abstract
One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic; a decode unit to decode an instruction for execution by the compute unit, the instruction to cause the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation, the dynamic precision manager to adjust the precision of the compute operation to prevent an arithmetic overflow.
7 Citations
20 Claims
1. A graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising:
a compute unit including a hardware logic unit having dynamic precision fixed-point logic;
a decode unit to decode an instruction for execution by the compute unit, the instruction to cause the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and
a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation, the dynamic precision manager to adjust the precision of the compute operation to prevent an arithmetic overflow.
Dependent claims: 2-8
9. A data processing system comprising:
one or more processors including at least one graphics processor, the at least one graphics processor including:
a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and
a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation on the set of dynamic fixed-point tensors, the dynamic precision manager to prevent an arithmetic overflow during the compute operation.
Dependent claims: 10-14
15. An electronic device comprising:
one or more processors including at least one graphics processor, the at least one graphics processor including:
a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and
a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation on the set of dynamic fixed-point tensors, the dynamic precision manager to prevent an arithmetic overflow during the compute operation,
wherein to perform a matrix arithmetic operation on the set of dynamic fixed-point tensors includes to:
receive an input tensor associated with the matrix arithmetic operation;
divide the input tensor into multiple blocks, the multiple blocks having different fixed-point precisions;
determine a shared exponent for each of the multiple blocks;
convert each of the multiple blocks into a dynamic fixed-point format using the shared exponent for each block;
store metadata for the multiple blocks to indicate a data format and shared exponent for the multiple blocks; and
perform the matrix arithmetic operation on the divided dynamic fixed-point input tensors.
Dependent claims: 16-20
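Claim 15 enumerates a block-wise conversion flow: divide the input tensor into blocks, determine a shared exponent per block, convert each block to a fixed-point format using that exponent, and store per-block metadata. The following is a minimal sketch of that flow under illustrative assumptions; the block size, mantissa width, and all function names are hypothetical and do not come from the patent.

```python
import math

def to_block_dynamic_fixed_point(tensor, block_size=4, mantissa_bits=8):
    """Split a 1-D tensor into blocks, derive a shared exponent per block,
    and quantize each element to a signed fixed-point mantissa.
    Returns (quantized_blocks, per_block_metadata)."""
    blocks, metadata = [], []
    for i in range(0, len(tensor), block_size):
        block = tensor[i:i + block_size]
        max_abs = max(abs(x) for x in block)
        # shared exponent chosen so the largest element fits the mantissa range
        exp = math.floor(math.log2(max_abs)) + 1 if max_abs > 0 else 0
        factor = 2 ** (mantissa_bits - 1 - exp)
        blocks.append([round(x * factor) for x in block])
        # metadata records the data format and shared exponent, as in the claim
        metadata.append({"format": f"s{mantissa_bits}", "shared_exponent": exp})
    return blocks, metadata

def dequantize(blocks, metadata, mantissa_bits=8):
    """Invert the conversion using the stored per-block metadata."""
    out = []
    for q, meta in zip(blocks, metadata):
        factor = 2 ** (mantissa_bits - 1 - meta["shared_exponent"])
        out.extend(v / factor for v in q)
    return out
```

Because each block carries its own exponent, blocks with small magnitudes keep fine resolution while blocks with large magnitudes extend range, which is the motivation for per-block (rather than per-tensor) scaling.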
Specification