Apparatus and method for compression coding for artificial neural network

US 10,402,725 B2
Filed: 07/20/2018
Issued: 09/03/2019
Est. Priority Date: 01/20/2016
Status: Active Grant


Abstract

A compression coding apparatus for an artificial neural network includes a memory interface unit, an instruction cache, a controller unit, and a computing unit. The computing unit operates on data from the memory interface unit according to instructions from the controller unit, in three main steps: step one multiplies the input neurons by the weight data; step two adds the weighted output neurons from step one level by level through an adder tree, or adds a bias to the output neurons to obtain biased output neurons; step three applies an activation function to obtain the final output neurons. The present disclosure also provides a method for compression coding of a multi-layer neural network.
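The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `adder_tree` and `neuron_output`, the default `tanh` activation, and the choice to apply both the adder tree and an optional bias (the abstract says "or") are assumptions made for illustration.

```python
import math


def adder_tree(values):
    """Sum values level by level, pairing neighbours as an adder tree would."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        # Each level adds adjacent pairs, halving the number of partial sums.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An unpaired last element passes through to the next level.
            nxt.append(level[-1])
        level = nxt
    return level[0]


def neuron_output(inputs, weights, bias=0.0, activation=math.tanh):
    """Hypothetical single-neuron pipeline following the three steps."""
    products = [x * w for x, w in zip(inputs, weights)]  # step one: multiply
    total = adder_tree(products) + bias                  # step two: adder tree (+ bias)
    return activation(total)                             # step three: activation
```

A hardware adder tree performs each level's additions in parallel; the loop above only mirrors the order in which partial sums are combined.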

23 Claims

1. A neural network processor, comprising:
- a floating-point number converter configured to receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and
  
  a computing unit configured to receive the first input neuron data from the neuron data cache, calculate first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients to update the one or more first weight values.
- - 2. The neural network processor of claim 1, further comprising:
    - a weight cache configured to store the one or more first weight values, and a neuron data cache configured to store the first input neuron data.
  - 3. The neural network processor of claim 1, wherein the computing unit includes:
    - one or more multipliers configured to multiply a first portion of the first neuron data by a second portion of the first neuron data;
      
      one or more adders configured to add multiplication results output from the one or more multipliers; and
      
      one or more activation units configured to apply an activation function to addition results output from the one or more adders.
  - 4. The neural network processor of claim 3, wherein the computing unit includes:
    - one or more pooling components configured to perform a pooling operation to activation results output from the activation units.
  - 5. The neural network processor of claim 1, wherein the computing unit is further configured to calculate the one or more weight gradients based on output gradients; and
    - update the first weight values based on the calculated weight gradients.
  - 6. The neural network processor of claim 1, wherein the floating-point number converter is further configured to convert the first input neuron data to second input neuron data of the second bit length; and
    - convert the first output neuron data to second output neuron data of the second bit length.
  - 7. The neural network processor of claim 1, wherein each of the first weight values includes a first sign bit, a first exponent field, and a first mantissa field; and
    - each of the second weight values includes a second mantissa field, wherein a bit length of the second mantissa field is less than a bit length of the first mantissa field.
  - 8. The neural network processor of claim 1, further comprising a controller unit configured to receive an instruction and decode the instruction into one or more micro-instructions;
    - a direct memory access module configured to exchange data with an external memory;
      
      an instruction cache configured to store instructions received from the direct memory access module; and
      
      an output data cache configured to store the first output neuron data and the second output neuron data.
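As an illustration of the converter in claims 1 and 7, the sketch below shortens the mantissa field of an IEEE 754 single-precision value. The claims do not fix a format or a rounding mode, so the 32-bit source format, keeping the sign bit and the 8-bit exponent unchanged, truncation toward zero, and the name `truncate_mantissa` are all assumptions. The result is returned as an ordinary Python float, so the sketch models only the numerical effect, not the reduced storage.

```python
import struct


def truncate_mantissa(value, kept_bits):
    """Zero the low (23 - kept_bits) mantissa bits of a float32 value.

    Assumption: sign and exponent fields are preserved and the dropped
    mantissa bits are simply truncated (no rounding).
    """
    if not 0 <= kept_bits <= 23:
        raise ValueError("kept_bits must be between 0 and 23")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = 23 - kept_bits
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    # Reinterpret the masked pattern as a float32 again.
    (shortened,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return shortened


# Keeping 7 mantissa bits gives a 16-bit value (1 sign + 8 exponent + 7 mantissa).
print(truncate_mantissa(3.14159, 7))  # 3.140625
```

With `kept_bits=7` the retained fields total 16 bits, half the original 32, which satisfies the claim's condition that the second bit length is less than the first.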

9. A method for compression coding in a neural network, comprising:
- receiving one or more first weight values of a first bit length;
  
  receiving first input neuron data;
  
  converting the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length;
  
  calculating first output neuron data based on the first input neuron data and the second weight values; and
  
  calculating one or more weight gradients to update the one or more first weight values.
- - 10. The method of claim 9, further comprising:
    - multiplying a first portion of the first neuron data by a second portion of the first neuron data;
      
      adding multiplication results output from the one or more multipliers;
      
      applying an activation function to addition results output from the one or more adders; and
      
      performing a pooling operation to activation results output from the activation units.
  - 11. The method of claim 9, further comprising:
    - calculating the one or more weight gradients based on output gradients; and
      
      updating the first weight values based on the calculated weight gradients.
  - 12. The method of claim 9, further comprising:
    - converting the first input neuron data to second input neuron data of the second bit length; and
      
      converting the first output neuron data to second output neuron data of the second bit length.
  - 13. The method of claim 9, wherein each of the first weight values includes a first sign bit, a first exponent field, and a first mantissa field; and
    - each of the second weight values includes a second mantissa field, wherein a bit length of the second mantissa field is less than a bit length of the first mantissa field.
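The method of claims 9 and 11 can be read as a training step in which the shortened weights drive the forward computation while the gradients update the original, longer weights. The sketch below illustrates that reading for a single linear neuron. The squared-error loss, the plain gradient-descent update, the learning rate, and the names `to_short` and `train_step` are assumptions, not details from the claims.

```python
import struct


def to_short(value, kept_bits=7):
    """Assumed converter: truncate a float32 mantissa to kept_bits bits."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    mask = (0xFFFFFFFF << (23 - kept_bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def train_step(first_weights, inputs, target, lr=0.1):
    """One hypothetical step: forward with short weights, update long ones."""
    # Convert the first (long) weight values to second (short) weight values.
    second_weights = [to_short(w) for w in first_weights]
    # Calculate output neuron data from the inputs and the short weights.
    output = sum(x * w for x, w in zip(inputs, second_weights))
    # Output gradient of the assumed loss 0.5 * (output - target) ** 2.
    output_grad = output - target
    # Weight gradients derived from the output gradient.
    weight_grads = [output_grad * x for x in inputs]
    # Update the first weight values, which keep their full bit length.
    updated = [w - lr * g for w, g in zip(first_weights, weight_grads)]
    return updated, output
```

Keeping the update on the longer weights means that small gradient steps are not lost to the truncation applied before each forward pass.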

14. A neural network processor, comprising:
- a floating-point number converter configured to receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and
  
  a computing unit configured to receive the first input neuron data from the neuron data cache, calculate first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients based on the second weight values.
- - 15. The neural network processor of claim 14, further comprising:
    - a weight cache configured to store the one or more first weight values, and a neuron data cache configured to store the first input neuron data.
  - 16. The neural network processor of claim 14, wherein the computing unit includes:
    - one or more multipliers configured to multiply a first portion of the first neuron data by a second portion of the first neuron data;
      
      one or more adders configured to add multiplication results output from the one or more multipliers;
      
      one or more activation units configured to apply an activation function to addition results output from the one or more adders; and
      
      one or more pooling components configured to perform a pooling operation to activation results output from the activation units.
  - 17. The neural network processor of claim 14, wherein the computing unit is further configured to calculate the one or more weight gradients based on output gradients and the second weight values.
  - 18. The neural network processor of claim 14, wherein the floating-point number converter is further configured to convert the first input neuron data to second input neuron data of the second bit length; and
    - convert the first output neuron data to second output neuron data of the second bit length.
  - 19. The neural network processor of claim 14, wherein each of the first weight values includes a first sign bit, a first exponent field, and a first mantissa field; and
    - each of the second weight values includes a second mantissa field, wherein a bit length of the second mantissa field is less than a bit length of the first mantissa field.

20. An apparatus for neural network processing, comprising an input/output (I/O) interface configured to exchange data with peripheral devices;
- a central processing unit (CPU) configured to process the data received via the I/O interface;
  
  a neural network processor configured to process at least a portion of the received data, wherein the neural network processor includes:
  
  a floating-point number converter configured to receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length, and
  
  a computing unit configured to receive the first input neuron data, calculate the first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradient to update the one or more first weight values; and
  
  a memory configured to store the first weight values and input data that includes the first neuron data.
- - 21. The apparatus of claim 20, wherein the computing unit is further configured to calculate the one or more weight gradients based on output gradients; and
    - update the first weight values based on the calculated weight gradients.
  - 22. The apparatus of claim 20, wherein the floating-point number converter is further configured to convert the first input neuron data to second input neuron data of the second bit length; and
    - convert the first output neuron data to second output neuron data of the second bit length.
  - 23. The apparatus of claim 20, wherein each of the first weight values includes a first sign bit, a first exponent field, and a first mantissa field; and
    - each of the second weight values includes a second mantissa field, wherein a bit length of the second mantissa field is less than a bit length of the first mantissa field.


Current Assignee: Cambricon Technologies Corporation Limited
Original Assignee: Cambricon Technologies Corporation Limited
Inventors: Chen, Tianshi; Liu, Shaoli; Guo, Qi; Chen, Yunji
Primary Examiner(s): Vincent, David R
Application Number: US16/041,160
Publication Number: US 20180330239A1
Time in Patent Office: 410 Days
Field of Search: 706/15, 706/45

CPC Class Codes
- G06F 2207/4824: Neural networks
- G06F 7/4876: Multiplying
- G06N 3/00: Computing arrangements base...
- G06N 3/063: using electronic means
- G06N 3/084: Backpropagation, e.g. using...
