LOW RANK MATRIX COMPRESSION
First Claim
1. An apparatus comprising a processor to:
detect one or more independent rows of a matrix comprising weights of a neural network;
determine a scalar associated with each of the one or more independent rows of the matrix;
encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
apply delta compression to compress the encoded weight data; and
store the encoded weight data in a memory.
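The steps recited in the claim can be illustrated with a minimal sketch. It is one possible reading, not the patented implementation, and rests on several assumptions that the claim does not state: weights are already integers (for example, after quantization), every dependent row is an exact scalar multiple of a single retained row (a simplification of general linear dependence), and delta compression means storing differences between consecutive values. All function names are hypothetical.

```python
from fractions import Fraction


def find_scalar(row, base):
    """Return s with row == s * base, or None if no such scalar exists."""
    scalar = None
    for r, b in zip(row, base):
        if b == 0:
            if r != 0:
                return None  # base has a zero where row does not
            continue
        s = Fraction(r, b)
        if scalar is None:
            scalar = s
        elif s != scalar:
            return None  # ratios disagree, so row is not a multiple of base
    return scalar  # stays None when base is all zeros


def encode_rows(matrix):
    """Keep independent rows; describe every row as (base index, scalar).

    Assumption: a row counts as dependent only if it is a scalar multiple
    of one retained row.
    """
    bases, encoded = [], []
    for row in matrix:
        for i, base in enumerate(bases):
            s = find_scalar(row, base)
            if s is not None:
                encoded.append((i, s))
                break
        else:
            # No retained row explains this one, so keep it verbatim.
            bases.append(list(row))
            encoded.append((len(bases) - 1, Fraction(1)))
    return bases, encoded


def delta_encode(values):
    """Store each value as its difference from the previous one."""
    out, prev = [], 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def decode_rows(bases, encoded):
    """Rebuild the full matrix from retained rows and per-row scalars."""
    return [[s * w for w in bases[i]] for i, s in encoded]


weights = [[1, 2, 3], [2, 4, 6], [0, 1, 1], [3, 6, 9], [0, -2, -2]]
bases, encoded = encode_rows(weights)
compressed = [delta_encode(b) for b in bases]  # what would be stored
restored = decode_rows([delta_decode(c) for c in compressed], encoded)
```

In this example only two of the five rows are stored in full; the other three are represented by a row index and a scalar, and the stored rows are delta-encoded before being written to memory.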
Abstract
In an example, an apparatus comprises logic, at least partially including hardware logic, to implement a lossy compression algorithm which utilizes a data transform and quantization process to compress data in a convolutional neural network (CNN) layer. Other embodiments are also disclosed and claimed.
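As a rough illustration of what a "data transform and quantization process" can look like, the sketch below applies a one-level unnormalized Haar transform followed by uniform quantization. The abstract does not name a transform or a quantizer, so both choices, the step size, and all function names are assumptions made for illustration only.

```python
def haar_forward(x):
    """One level of an unnormalized Haar transform on an even-length list."""
    avg = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    diff = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return avg + diff


def haar_inverse(c):
    """Invert haar_forward: each (average, difference) pair yields two samples."""
    half = len(c) // 2
    out = []
    for a, d in zip(c[:half], c[half:]):
        out.extend([a + d, a - d])
    return out


def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Map integer levels back to approximate coefficient values."""
    return [v * step for v in levels]


layer_weights = [0.50, 0.52, -0.25, -0.27]  # hypothetical CNN layer weights
step = 0.05                                  # assumed quantization step
levels = quantize(haar_forward(layer_weights), step)
approx = haar_inverse(dequantize(levels, step))
max_error = max(abs(a - w) for a, w in zip(approx, layer_weights))
```

The compression is lossy because small difference coefficients round to zero; with this transform the reconstruction error of each weight is at most one quantization step.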
210 Citations
20 Claims
1. An apparatus comprising a processor to:
detect one or more independent rows of a matrix comprising weights of a neural network;
determine a scalar associated with each of the one or more independent rows of the matrix;
encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
apply delta compression to compress the encoded weight data; and
store the encoded weight data in a memory.
Dependent claims: 2, 3, 4, 5
6. (canceled)
7. (canceled)
8. (canceled)
9. (canceled)
10. (canceled)
11. A method, comprising:
detecting one or more independent rows of a matrix comprising weights of a neural network;
determining a scalar associated with each of the one or more independent rows of the matrix;
encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
implementing a delta compression algorithm to compress the encoded weight data; and
storing the encoded weight data in a memory.
Dependent claims: 12, 13, 14, 15
16. An electronic device comprising:
a computer readable memory; and
a processor communicatively coupled to the computer readable memory to:
detect one or more independent rows of a matrix comprising weights of a neural network;
determine a scalar associated with each of the one or more independent rows of the matrix;
encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
implement a delta compression algorithm to compress the encoded weight data; and
store the encoded weight data in the computer readable memory.
Dependent claims: 17, 18, 19, 20
Specification