APPARATUS AND METHOD FOR COMPRESSION CODING FOR ARTIFICIAL NEURAL NETWORK
Abstract
A compression coding apparatus for an artificial neural network includes a memory interface unit, an instruction cache, a controller unit, and a computing unit. The computing unit is configured to perform the corresponding operations on data from the memory interface unit according to instructions from the controller unit. The computing unit mainly performs a three-step operation: step one multiplies the input neurons by the weight data; step two performs adder-tree computation, adding the weighted output neurons obtained in step one level by level via an adder tree, or adds a bias to the output neurons to obtain biased output neurons; step three applies an activation function to obtain the final output neurons. The present disclosure also provides a method for compression coding of a multi-layer neural network.
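The three-step operation of the computing unit can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function names, the sigmoid activation, and the choice to apply the bias after the adder tree (rather than as an alternative to it) are assumptions made only for illustration.

```python
import math


def adder_tree_sum(values):
    """Sum values pairwise, level by level, the way an adder tree would."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        # Each tree level adds neighbouring pairs of the previous level.
        next_level = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd element passes through unchanged
        level = next_level
    return level[0]


def sigmoid(x):
    """Assumed activation function; the abstract does not name one."""
    return 1.0 / (1.0 + math.exp(-x))


def compute_output_neuron(inputs, weights, bias=0.0, activation=sigmoid):
    # Step one: multiply each input neuron by its weight.
    products = [x * w for x, w in zip(inputs, weights)]
    # Step two: reduce the products through the adder tree, then add the bias
    # (assumption: both are applied; bias defaults to zero).
    total = adder_tree_sum(products) + bias
    # Step three: apply the activation function to get the final output neuron.
    return activation(total)


if __name__ == "__main__":
    print(compute_output_neuron([1.0, 2.0], [0.5, -0.25]))
```

Writing the reduction as explicit tree levels mirrors the hardware structure; in software a plain `sum` gives the same result up to floating-point rounding order.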
24 Claims
1. A neural network processor, comprising:
    a floating-point number converter configured to
        receive one or more first weight values of a first bit length and first input neuron data, and
        convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and
    a computing unit configured to
        receive the first input neuron data from the neuron data cache,
        calculate first output neuron data based on the first input neuron data and the second weight values, and
        calculate one or more weight gradients to update the one or more first weight values.
9. A method for compression coding in a neural network, comprising:
    receiving one or more first weight values of a first bit length;
    receiving first input neuron data;
    converting the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length;
    calculating first output neuron data based on the first input neuron data and the second weight values; and
    calculating one or more weight gradients to update the one or more first weight values.
14. A neural network processor, comprising:
    a floating-point number converter configured to
        receive one or more first weight values of a first bit length and first input neuron data, and
        convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and
    a computing unit configured to
        receive the first input neuron data from the neuron data cache,
        calculate first output neuron data based on the first input neuron data and the second weight values, and
        calculate one or more weight gradients based on the second weight values.
20. An apparatus for neural network processing, comprising:
    an input/output (I/O) interface configured to exchange data with peripheral devices;
    a central processing unit (CPU) configured to process the data received via the I/O interface;
    a neural network processor configured to process at least a portion of the received data, wherein the neural network processor includes:
        a floating-point number converter configured to:
            receive one or more first weight values of a first bit length and first input neuron data, and
            convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length, and
        a computing unit configured to:
            receive the first input neuron data,
            calculate the first output neuron data based on the first input neuron data and the second weight values, and
            calculate one or more weight gradients to update the one or more first weight values; and
    a memory configured to store the first weight values and input data that includes the first neuron data.
24-27. (canceled)
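The method of claim 9 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims do not fix the bit lengths, the neuron model, the loss, or the update rule. Here the first weights are ordinary Python floats (64-bit), the second weights are IEEE 754 half-precision (16-bit) values, the neuron is a single linear unit, the loss is squared error, and the update is plain gradient descent. The names `to_float16` and `training_step` are hypothetical.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Uses the struct format character 'e' (binary16); values outside the
    half-precision range raise OverflowError.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


def training_step(first_weights, inputs, target, learning_rate=0.1):
    # Convert the first (long) weights to second (short) weights.
    second_weights = [to_float16(w) for w in first_weights]
    # Calculate the output neuron from the inputs and the short weights
    # (assumption: a single linear neuron).
    output = sum(x * w for x, w in zip(inputs, second_weights))
    # Calculate weight gradients of the assumed loss 0.5 * (output - target)^2.
    error = output - target
    gradients = [error * x for x in inputs]
    # Apply the gradients to the first (long) weights, not the short copies.
    updated = [w - learning_rate * g for w, g in zip(first_weights, gradients)]
    return output, gradients, updated


if __name__ == "__main__":
    out, grads, new_weights = training_step([0.5, 0.25], [1.0, 2.0], target=2.0)
    print(out, grads, new_weights)
```

Keeping the long weights as the update target means small gradient steps are not lost to the rounding of the short format, while the forward pass still uses the compressed values.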
Specification