PROCESSING METHOD AND ACCELERATING DEVICE
Abstract
The present disclosure provides a processing device including a coarse-grained pruning unit configured to perform coarse-grained pruning on weights of a neural network to obtain pruned weights, and an operation unit configured to train the neural network according to the pruned weights. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window and, when the M weights meet a preset condition, to set all or part of the M weights to 0. The processing device reduces both memory access and the amount of computation, thereby accelerating processing and reducing energy consumption.
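The sliding-window selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `coarse_prune`, the 2-D weight layout, the window and stride parameters, and in particular the "preset condition" (mean absolute value of the window below a threshold) are all assumptions chosen for illustration, since the text above does not define them.

```python
from typing import List

Matrix = List[List[float]]


def coarse_prune(weights: Matrix, win_h: int, win_w: int,
                 stride_h: int, stride_w: int, threshold: float) -> Matrix:
    """Zero every window whose mean absolute weight is below `threshold`.

    The threshold test stands in for the unspecified "preset condition".
    """
    rows, cols = len(weights), len(weights[0])
    pruned = [row[:] for row in weights]  # copy, so the input is not modified
    for top in range(0, rows - win_h + 1, stride_h):
        for left in range(0, cols - win_w + 1, stride_w):
            # The M = win_h * win_w weights selected by the window, read from
            # the original matrix so earlier zeroing cannot affect later windows.
            block = [weights[r][c]
                     for r in range(top, top + win_h)
                     for c in range(left, left + win_w)]
            mean_abs = sum(abs(w) for w in block) / len(block)
            if mean_abs < threshold:  # assumed preset condition
                for r in range(top, top + win_h):
                    for c in range(left, left + win_w):
                        pruned[r][c] = 0.0
    return pruned


weights = [
    [0.01, -0.02, 0.90, 0.80],
    [0.03, 0.00, -0.70, 0.60],
    [0.50, 0.40, 0.02, -0.01],
    [-0.60, 0.30, 0.01, 0.02],
]
# 2x2 window, stride 2: the top-left and bottom-right blocks are near zero.
pruned = coarse_prune(weights, 2, 2, 2, 2, threshold=0.1)
```

Zeroing whole windows rather than individual weights leaves the sparsity in regular blocks, which is what lets hardware skip both the memory access and the computation for a pruned block.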
20 Claims
1. A data compression method, comprising:
performing coarse-grained pruning on weights of a neural network, which includes:
selecting M weights from the neural network through a sliding window, and setting all or part of the M weights to 0 when the M weights meet a preset condition, where M is a positive integer greater than 0;
performing a first retraining on the neural network, where the weight which has been set to 0 in the retraining process remains 0; and
quantizing the weights of the neural network, which includes:
grouping the weights of the neural network;
performing a clustering operation on each group of weights by using a clustering algorithm, computing a center weight of each class, and replacing all the weights in each class by the center weights.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14
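The quantization step of claim 1 (group the weights, cluster each group, replace each weight by its class center) might be sketched as follows. The claim names no particular clustering algorithm or grouping rule, so this sketch assumes plain 1-D k-means and one group per layer. The names `kmeans_1d`, `quantize_group`, and `quantize`, the number of classes `k`, and the evenly spaced initial centers are hypothetical choices.

```python
from typing import Dict, List


def kmeans_1d(values: List[float], k: int, iters: int = 20) -> List[float]:
    """Return k center weights found by 1-D k-means (an assumed algorithm)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers evenly spaced over the value range.
    centers = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            buckets[nearest].append(v)
        # The center weight of a class is the mean of its members;
        # an empty class keeps its previous center.
        centers = [sum(b) / len(b) if b else c
                   for b, c in zip(buckets, centers)]
    return centers


def quantize_group(values: List[float], k: int) -> List[float]:
    """Replace every weight in one group by the center of its class."""
    centers = kmeans_1d(values, k)
    return [min(centers, key=lambda c: abs(v - c)) for v in values]


def quantize(groups: Dict[str, List[float]], k: int) -> Dict[str, List[float]]:
    """Quantize each group independently (here: one group per layer)."""
    return {name: quantize_group(vals, k) for name, vals in groups.items()}


groups = {
    "fc1": [0.10, 0.12, 0.90, 0.88, 0.11],
    "fc2": [-1.0, -0.9, 1.0, 1.1],
}
quantized = quantize(groups, k=2)
```

After quantization each group holds at most `k` distinct values, so a weight can be stored as a small class index plus a per-group table of centers. The sketch does not treat pruned zeros specially; whether zeroed weights are excluded from clustering is not stated in the claim.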
15. A data compression device, comprising:
a memory configured to store an operation instruction; and
a processor configured to:
perform coarse-grained pruning on weights of a neural network, which includes:
selecting M weights from the neural network through a sliding window, and set all or part of the M weights to 0 when the M weights meet a preset condition, where M is a positive integer greater than 0;
performing a first retraining on the neural network, where the weight which has been set to 0 in the retraining process remains 0; and
quantize the weights of the neural network, wherein the processor is further configured to:
group the weights of the neural network;
perform a clustering operation on each group of weights by using a clustering algorithm, compute a center weight of each class, and replace all the weights in each class by the center weights.
Dependent claims: 13, 16, 17, 18, 19
20. An electronic device, comprising:
a data compression device that includes:
a memory configured to store an operation instruction; and
a processor configured to:
perform coarse-grained pruning on weights of a neural network, which includes:
selecting M weights from the neural network through a sliding window, and set all or part of the M weights to 0 when the M weights meet a preset condition, where M is a positive integer greater than 0;
performing a first retraining on the neural network, where the weight which has been set to 0 in the retraining process remains 0; and
quantize the weights of the neural network, wherein the processor is further configured to:
group the weights of the neural network;
perform a clustering operation on each group of weights by using a clustering algorithm, compute a center weight of each class, and replace all the weights in each class by the center weights.
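The retraining constraint recited in the independent claims (a weight set to 0 stays 0) is commonly realized with a fixed binary mask applied at every update. The sketch below assumes that approach; the claims do not say how the constraint is enforced. The helper names `make_mask` and `masked_sgd_step`, the plain gradient-descent update, and the toy squared-error objective are all hypothetical.

```python
from typing import List


def make_mask(pruned: List[float]) -> List[int]:
    """Record once, right after pruning, which weights were set to 0."""
    return [0 if w == 0.0 else 1 for w in pruned]


def masked_sgd_step(weights: List[float], grads: List[float],
                    mask: List[int], lr: float) -> List[float]:
    """One gradient-descent update that leaves masked weights at exactly 0."""
    return [w - lr * g if m else 0.0
            for w, g, m in zip(weights, grads, mask)]


# Toy objective (an assumption): minimize sum((w_i - t_i)^2).
weights = [0.0, 0.5, 0.0, -0.3]   # positions 0 and 2 were pruned
targets = [1.0, 1.0, 1.0, 1.0]
mask = make_mask(weights)

for _ in range(100):
    grads = [2.0 * (w - t) for w, t in zip(weights, targets)]
    weights = masked_sgd_step(weights, grads, mask, lr=0.1)
```

Building the mask once, rather than re-deriving it from the current weights at each step, avoids freezing a surviving weight that merely passes through 0 during training.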
Specification