APPARATUS AND METHOD FOR COMPRESSION CODING FOR ARTIFICIAL NEURAL NETWORK

Abstract
A compression coding apparatus for an artificial neural network, including a memory interface unit, an instruction cache, a controller unit, and a computing unit, wherein the computing unit is configured to perform corresponding operations on data from the memory interface unit according to instructions of the controller unit. The computing unit mainly performs a three-step operation: step one is to multiply the input neurons by the weight data; step two is to perform an adder tree computation and add the weighted output neurons obtained in step one level by level via the adder tree, or add a bias to the output neurons to obtain biased output neurons; step three is to perform an activation function operation to obtain the final output neurons. The present disclosure also provides a method for compression coding of a multilayer neural network.
53 Claims
 1-27. (canceled)
 28. A neural network processor, comprising:
a floating-point number converter configured to: receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and a computing unit configured to: receive the first input neuron data, and calculate first output neuron data based on the first input neuron data and the second weight values.  View Dependent Claims (29, 30, 31, 32, 33, 34, 35, 36, 37, 38)
 39. A method for processing neural network data, comprising:
receiving, by a floating-point number converter, one or more first weight values of a first bit length and first input neuron data; converting, by the floating-point number converter, the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; receiving, by a computing unit, the first input neuron data; and calculating, by the computing unit, first output neuron data based on the first input neuron data and the second weight values.  View Dependent Claims (40, 41, 42, 43, 44, 45, 46, 47, 48, 49)
 50. An apparatus for neural network processing, comprising:
an input/output (I/O) interface configured to exchange data with peripheral devices; a central processing unit (CPU) configured to process the data received via the I/O interface; a neural network processor configured to process at least a portion of the received data, wherein the neural network processor includes: a floating-point number converter configured to: receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length, and a computing unit configured to: receive the first input neuron data, calculate the first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients based on the second weight values; and a memory configured to store the first weight values and input data that includes the first neuron data.  View Dependent Claims (51, 52, 53)
Specification
The present disclosure relates to the field of artificial neural network processing technology, and specifically to an apparatus and method for compression coding for an artificial neural network. Particularly, it relates to execution units for performing the artificial neural network algorithm, or devices comprising these execution units, and to execution units and devices for multilayer artificial neural network operation, the reverse propagation training algorithm, and its compression coding.
A multilayer artificial neural network is widely used in fields such as pattern recognition, image processing, function approximation, and optimized computation. Particularly, as studies on the reverse propagation training algorithm and the pre-training algorithm have deepened in recent years, the multilayer artificial neural network has attracted more and more attention from academia and industry due to its higher recognition accuracy and better parallelizability.
With the surge in computation and memory access in artificial neural networks, the prior art generally utilizes a general processor to process multilayer artificial neural network operations, training algorithms, and their compression coding; the above algorithms are supported by utilizing a general register file and general functional components to execute general instructions. One of the disadvantages of using a general processor is that the low computing performance of a single general processor cannot meet the performance needs of multilayer artificial neural network operations. Meanwhile, if multiple general processors work concurrently, the intercommunication between them limits performance. In addition, a general processor needs to transcode a multilayer artificial neural network operation into a long sequence of operation and access instructions, and this front-end transcoding causes relatively high power consumption. Another known method for supporting multilayer artificial neural network operations, training algorithms, and their compression coding is to use a graphics processing unit (GPU). This method supports the above algorithms by using a general register file and a general stream processing unit to execute general SIMD instructions. Since the GPU is specifically designed for graphics and image computing and scientific calculation, it provides no specific support for multilayer artificial neural network operations; thus a lot of front-end transcoding is still required to perform such operations, and as a result large additional costs are incurred. Besides, the GPU has only a relatively small on-chip cache, so the model data (weights) of a multilayer artificial neural network need to be carried repeatedly from off-chip, and off-chip bandwidth becomes the main performance bottleneck while causing huge power consumption.
The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
The present disclosure presents examples of techniques for compression coding. An example apparatus may include a floating-point number converter configured to receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and a computing unit configured to receive the first input neuron data from the neuron data cache, calculate first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients to update the one or more first weight values.
An example method may include receiving one or more first weight values of a first bit length; receiving first input neuron data; converting the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; calculating first output neuron data based on the first input neuron data and the second weight values; and calculating one or more weight gradients to update the one or more first weight values.
Another example apparatus may include a floating-point number converter configured to receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length; and a computing unit configured to receive the first input neuron data from the neuron data cache, calculate first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients based on the second weight values.
An example system in which compression coding may be implemented may include an input/output (I/O) interface configured to exchange data with peripheral devices; a central processing unit (CPU) configured to process the data received via the I/O interface; a neural network processor configured to process at least a portion of the received data, wherein the neural network processor includes: a floating-point number converter configured to receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length, and a computing unit configured to receive the first input neuron data, calculate the first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients to update the one or more first weight values; and a memory configured to store the first weight values and input data that includes the first neuron data.
Another example system in which compression coding may be implemented may include an input/output (I/O) interface configured to exchange data with peripheral devices; a central processing unit (CPU) configured to process the data received via the I/O interface; a neural network processor configured to process at least a portion of the received data, wherein the neural network processor includes: a floating-point number converter configured to: receive one or more first weight values of a first bit length and first input neuron data, and convert the one or more first weight values to one or more second weight values of a second bit length, wherein the second bit length is less than the first bit length, and a computing unit configured to: receive the first input neuron data, calculate the first output neuron data based on the first input neuron data and the second weight values, and calculate one or more weight gradients based on the second weight values; and a memory configured to store the first weight values and input data that includes the first neuron data.
To the accomplishment of the foregoing and related ends, the one or more aspects comprise the features herein after fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative features of the one or more aspects. These features are indicative, however, of but a few of the various ways in which the principles of various aspects may be employed, and this description is intended to include all such aspects and their equivalents.
Various aspects are now described with reference to the drawings. In the following description, for purpose of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It may be evident, however, that such aspect(s) may be practiced without these specific details.
Neural networks are a family of models for a broad range of emerging machine learning and pattern recognition applications. Neural network techniques are conventionally executed on general-purpose processors such as the Central Processing Unit (CPU) and the General-purpose Graphics Processing Unit (GPGPU). However, general-purpose processors may be limited to computing floating-point numbers of a single format. The capability of processing only a single format of floating-point numbers may lead to unnecessary accuracy while increasing power consumption and memory usage.
The neural network processor 106 may further include an instruction cache 206 and a controller unit 208. The instruction cache 206 may refer to one or more storage devices configured to store instructions received from the CPU 104. The controller unit 208 may be configured to read the instructions from the instruction cache 206 and decode the instructions.
Upon receiving the decoded instructions from the controller unit 208, the input data from the neuron data cache 212, and the weight values from the weight cache 214, the computing unit 210 may be configured to calculate one or more groups of output data based on the weight values and the input data in a forward propagation process. In some other examples, the computing unit 210 may be configured to calculate weight gradients to update weight values and/or bias gradients to update one or more bias values in a backward propagation process.
The computing unit 210 may further include one or more multipliers configured to multiply the modified input data by the modified weight values to generate one or more weighted input data, one or more adders configured to add the one or more weighted input data to generate a total weighted value and add a bias value to the total weighted value to generate a biased value, and an activation processor configured to perform an activation function on the biased value to generate a group of output data.
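The multiplier, adder, and activation stages described above can be illustrated with a minimal Python sketch. This is a software stand-in for the hardware computing unit, not the disclosed circuit; the sigmoid activation is one of the activation functions named later in the disclosure, and the function name is our own:

```python
import math

def forward_layer(inputs, weights, bias):
    """Illustrative model of the computing unit's pipeline: multipliers,
    an adder tree, a bias adder, and an activation step."""
    weighted = [x * w for x, w in zip(inputs, weights)]  # multipliers
    total = sum(weighted)                                # adder tree
    biased = total + bias                                # add bias value
    return 1.0 / (1.0 + math.exp(-biased))               # activation (sigmoid)
```

For example, forward_layer([1.0, 2.0], [0.5, 0.5], -1.5) produces a biased value of 0 and therefore an activation of 0.5.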
The generated output data may be temporarily stored in an output data cache 216 and may be further transmitted to the memory 108 via a direct memory access (DMA) module 204.
In some examples, the neural network processor 106 may further include a floating-point number converter 218 for converting weight values of different bit lengths respectively for the forward propagation process and the backward propagation process. For example, the floating-point number converter 218 may be configured to convert first weight values that are stored in the weight cache 214 into second weight values. The bit length of the second weight values is less than the bit length of the first weight values.
As depicted, the floating-point number converter 218 may be configured to receive one or more first weight values of a first bit length from an input device 116 or from the weight cache 214. The first weight values may be formatted as floating-point numbers.
The one or more first weight values may be converted by the floating-point number converter 218 to one or more second weight values of a second bit length for further computing. In some examples, the second weight values may also be formatted as floating-point numbers.
In some examples, the first weight values may be received from the input device 116 via a bus 318. Data may be transmitted and/or received via the bus 318 to and from other components that are not shown in
In other words, each of the first weight values, when stored as a series of bits, may include a first sign bit, a first exponent field, and a first mantissa field. The first sign bit may refer to a bit that indicates the sign of the corresponding first weight value and may be assigned with a value of 0 or 1. The first exponent field may refer to a number of bits that store a value of the exponent of the corresponding first weight value. The bit length of the first exponent field may be referred to as K1 hereinafter. The first mantissa field may refer to a number of bits that store a value of the mantissa of the corresponding first weight value. The bit length of the first mantissa field may be referred to as K2 hereinafter. In an example of the IEEE 754 standard (single type), K1 and K2 may be respectively set to 8 and 23.
In some aspects, an exponent bit length and a base value may be received from the input device 116 via the bus 318 along with the first weight values. The exponent bit length (may be referred to as “N”) may be calculated based on the first exponent fields of the first weight values. For example, the exponent bit length N may be calculated based on a maximum value and a minimum value of the first exponent fields, e.g., in accordance with a formula:
N = log2((E_max − E_min)/2), in which E_max refers to the maximum value of the first exponent fields, and E_min refers to the minimum value of the first exponent fields.
The base value (may be referred to as "A") may be similarly calculated based on the maximum value and the minimum value of the first exponent fields, e.g., in accordance with another formula: A = (E_max − E_min)/2, in which E_max refers to the maximum value of the first exponent fields and E_min refers to the minimum value of the first exponent fields.
In these aspects, the exponent bit length N and the base value A may be calculated outside the floating-point number converter 218 prior to being transmitted from the input device 116. In some other aspects, the exponent bit length N and the base value A may be temporarily stored in the weight cache 214 after being received via the bus 318 and may be further retrieved by the floating-point number converter 218 at a later time for converting. In some other aspects, subsequent to receiving the first weight values, the floating-point number converter 218 may be configured to calculate the exponent bit length N and the base value A similarly based on the maximum value and the minimum value of the first exponent fields.
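Under the formulas above, the computation of N and A may be sketched as follows. Reading the exponent bit length as N = log2((E_max − E_min)/2) rounded up to a whole number of bits is our assumption, as are the integer division for A and the guard for degenerate ranges; the helper name is hypothetical:

```python
import math

def exponent_params(first_exponent_fields):
    """Hypothetical helper computing the exponent bit length N and the
    base value A from a group of first exponent fields."""
    e_max = max(first_exponent_fields)
    e_min = min(first_exponent_fields)
    a = (e_max - e_min) // 2                  # A = (E_max - E_min) / 2
    span = max(e_max - e_min, 2)              # guard log2 for tiny ranges
    n = math.ceil(math.log2(span / 2)) or 1   # N = log2((E_max - E_min) / 2)
    return n, a
```

For first exponent fields spanning 120 to 136, for example, this gives A = 8 and N = 3.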
According to the present aspects, the floating-point number converter 218 may further include a coder 308, a decoder 310, a configuration register (CFG) 312, and a base address register 314. The CFG 312 and the base address register 314 may respectively refer to portions of an onboard memory integrated in the floating-point number converter 218 and, thus, may provide direct access for the coder 308 and the decoder 310. The CFG 312 may be configured to store the exponent bit length N, and the base address register 314 may be configured to store the base value A.
After the floating-point number converter 218 receives the first weight values, the coder 308 may be configured to calculate one or more second weight values. Similar to the first weight values, each of the second weight values may be represented as (−1)^S2 × (1 + M2) × 2^E2, in which S2 denotes the sign of the corresponding second floating-point number, M2 denotes the mantissa of the corresponding second floating-point number, and E2 denotes the exponent of the corresponding second floating-point number. Each of the second weight values may respectively correspond to one of the first weight values. That is, each second floating-point number may be calculated based on a corresponding first weight value and other parameters of the group of first weight values. However, a bit length of the second weight values (e.g., a total number of bits of each second floating-point number) or a bit length of a mantissa of the second weight values may be different from that of the first weight values. In some aspects, the bit length of the second weight values may be preset to a fixed value, e.g., 16, 32, 64, 128, etc., which may be less than the bit length of the first weight values.
That is, each of the second weight values, as a series of bits, may include a second sign bit, a second exponent field, and a second mantissa field. Similar to the first sign bit, the second sign bit may refer to a bit that indicates the sign of the corresponding second weight value and may be assigned with a value of 0 or 1. The second exponent field may refer to one or more bits that store a value of the exponent of the corresponding second weight value. The second mantissa field may refer to one or more bits that store a value of the mantissa of the corresponding second weight value. In some other examples, the bit length of the second mantissa field of the second weight values may be less than that of the first mantissa field of the first weight values.
To calculate the second weight values, the coder 308 may be configured to determine the bit lengths of the second exponent field and the second mantissa field. For example, the coder 308 may be configured to determine the bit length of the second exponent field to be the same as the calculated exponent bit length N. The bit length of the second mantissa field may be determined by the coder 308 in accordance with a formula: L2 = C − N − 1, in which L2 denotes the bit length of the second mantissa field of the corresponding second floating-point number, N denotes the exponent bit length, and C denotes the preset bit length of the second floating-point numbers.
Further, the coder 308 may be configured to determine the respective values of the second sign bit, the second exponent field, and the second mantissa field. In some aspects, the coder 308 may be configured to assign the second sign bit the same value as the first sign bit. The value of the second exponent field may be calculated by the coder 308 based on a corresponding first exponent field, an exponent bias of the first weight values, and the base value A stored in the base address register 314. The exponent bias of the first weight values is determined by the format standard of the first floating-point numbers. For example, if the first weight values are in compliance with the IEEE 754 standard (single type), the exponent bias of the weight values may be set to 127 according to the IEEE 754 standard. The value of the second exponent field may be determined in accordance with the following example formula: E2 = E1 − B + A, in which E2 denotes the value of the second exponent field, E1 denotes the value of the first exponent field, B denotes the exponent bias of the first floating-point numbers, and A denotes the base value.
Further to the aspects, the coder 308 may be configured to determine the value of the second mantissa field. As the bit length of the second mantissa field may have been determined, the coder 308 may be configured to select one or more most significant bits (MSBs) of the corresponding first mantissa field to be the value of the second mantissa field. The number of MSBs may be the determined bit length of the second mantissa field, e.g., C − N − 1, in which N denotes the exponent bit length and C denotes the preset bit length of the second weight values. In an example in which the first weight values comply with IEEE 754 (single type) and the second weight values comply with IEEE 754-2008 (half type), the bit length of the first mantissa field may be set to 23 and the bit length of the second mantissa field may be set to 10. In this example, the coder 308 may be configured to select the 10 MSBs of the first mantissa field and assign the 10 MSBs to be the value of the second mantissa field.
When the respective values and bit lengths of different fields of the second weight values are determined, the second weight values are calculated; that is, the first weight values are converted to the second weight values.
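The field-by-field conversion just described (S2 = S1, E2 = E1 − B + A, and truncation of the mantissa to its L2 = C − N − 1 most significant bits) can be sketched as follows. The default parameter values (N = 5, A = 15, C = 16, bias 127, K2 = 23, matching the single-to-half example above) and the function name are illustrative assumptions:

```python
def convert_weight(s1, e1, m1, k2=23, bias=127, n=5, a=15, c=16):
    """Hypothetical coder step: convert the fields of a first weight
    value (sign s1, exponent field e1, mantissa field m1 of k2 bits)
    into the fields of a second weight value of preset bit length c."""
    s2 = s1                # second sign bit equals the first sign bit
    e2 = e1 - bias + a     # E2 = E1 - B + A
    l2 = c - n - 1         # L2 = C - N - 1 mantissa bits
    m2 = m1 >> (k2 - l2)   # keep the L2 most significant bits of m1
    return s2, e2, m2
```

With the defaults, a 23-bit single-precision mantissa field is truncated to 10 bits, as in the IEEE 754-2008 half-precision example above.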
In some aspects, the calculated second floating-point numbers may be further transmitted by the floating-point number converter 218 to the computing unit 210 for the forward propagation process or the backward propagation process. The processes may include operations such as multiplication, addition, etc., as further described in detail below.
The forward propagation computation of multilayer artificial neural networks according to embodiments of the present disclosure comprises operations in two or more layers. For each layer, a dot product operation may be performed on an input vector and a weight vector, and an output neuron may be obtained from the result through an activation function. The activation function may be a sigmoid function, tanh function, relu function, softmax function, etc.
As depicted, the example computing process may be performed from the i-th layer to the (i+1)-th layer. The term "layer" here may refer to a group of operations, rather than a logic or a physical layer. A triangular-shaped operator (A as shown in
The forward propagation process may start from input neuron data received at the i^{th }layer (e.g., input neuron data 452A). Hereinafter, input neuron data may refer to the input data at each layer of operations, rather than the input data of the entire neural network. Similarly, output neuron data may refer to the output data at each layer of operations, rather than the output data of the entire neural network.
The received input neuron data 452A may be multiplied or convolved by one or more weight values 452C (e.g., the second weight values generated by the floating-point number converter 218). The results of the multiplication or convolution may be transmitted as output neuron data 454A. The output neuron data 454A may be transmitted to the next layer (e.g., the (i+1)-th layer) as input neuron data 456A. The forward propagation process may be shown as the solid lines in
The backward propagation process may start from the last layer of the forward propagation process. For example, the backward propagation process may include the process from the (i+1)-th layer to the i-th layer. During the process, the input data gradients 456B may be transmitted to the i-th layer as output gradients 454B. The output gradients 454B may then be multiplied or convolved by the input neuron data 452A to generate weight gradients 452D.
In some examples, the computing unit 210 may be further configured to update the first weight values stored in the weight cache 214 based on the generated weight gradients 452D. The updated first weight values may be converted to updated second weight values by the floating-point number converter 218. The updated second weight values may be transmitted to the computing unit 210 for future processing at the i-th layer.
In some other examples, the computing unit 210 may continue to use the weight values 452C for the processing at the i-th layer without updating the first weight values in the weight cache 214.
Additionally, the output gradients 454B may be multiplied by the weight values 452C to generate input data gradients 452B. The backward propagation process may be shown as the dotted lines in
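The two backward-path products described above (output gradients times input neuron data yielding weight gradients, and output gradients times weight values yielding input data gradients) can be sketched for a single fully connected layer. The outer-product formulation and the function name are our assumptions for illustration:

```python
def backward_layer(output_grads, input_neurons, weights):
    """Hypothetical single-layer backward step: 'weights' is a matrix
    with one row per output neuron and one column per input neuron."""
    # weight gradient: each output gradient times each input neuron
    weight_grads = [[g * x for x in input_neurons] for g in output_grads]
    # input data gradient: output gradients times the weight values
    input_grads = [
        sum(g * row[i] for g, row in zip(output_grads, weights))
        for i in range(len(input_neurons))
    ]
    return weight_grads, input_grads
```

For a single output gradient of 1.0 with inputs [2.0, 3.0] and weights [[0.5, 0.25]], the weight gradients are [[2.0, 3.0]] and the input data gradients are [0.5, 0.25].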
At block 502, method 500 may include receiving, by the weight cache 214, one or more first weight values of a first bit length. The weight cache 214 may be configured to transmit the one or more first weight values to the floating-point number converter 218.
At block 504, method 500 may include receiving, by the neuron data cache 212, first input neuron data. In some examples, the input neuron data may include one or more vectors as the input at a layer of the neural network. The first input neuron data may be of the first bit length. The neuron data cache 212 may be configured to transmit the first input neuron data to the computing unit 210.
At block 506, method 500 may include converting the one or more first weight values to one or more second weight values of a second bit length. The second bit length may be less than the first bit length. As described in detail in accordance with
At block 508, method 500 may include calculating first output neuron data based on the first input neuron data and the second weight values. That is, the computing unit 210 may be configured to multiply or convolve the second weight values (e.g., weight values 452C) with the input neuron data 452A to generate output neuron data 454A.
At block 510, method 500 may include calculating one or more weight gradients to update the one or more first weight values. In some examples, the computing unit 210 may be further configured to calculate the weight gradients. For example, the backward propagation process may include the process from the (i+1)-th layer to the i-th layer. During the process, the input data gradients 456B may be transmitted to the i-th layer as output gradients 454B. The output gradients 454B may then be multiplied or convolved by the input neuron data 452A by the computing unit 210 to generate the weight gradients 452D.
Additionally, during the forward propagation process, the computing unit 210 or the components included therein (e.g., the multipliers, the adders, the activation processor, etc.) may be further configured to perform one or more operations including multiplying a first portion of the first neuron data by a second portion of the first neuron data, adding multiplication results output from the one or more multipliers, applying an activation function to addition results output from the one or more adders, and performing a pooling operation on activation results output from the activation processor. The pooling operation may refer to any of the pooling operations, e.g., max pooling (MAXPOOLING) or average pooling (AVGPOOLING).
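The pooling step named above can be sketched for one pooling window. This plain-Python illustration and its function name are our own, not part of the disclosure:

```python
def pool(window, mode="max"):
    """Apply max pooling or average pooling to one window of
    activation results."""
    if mode == "max":
        return max(window)            # MAXPOOLING
    return sum(window) / len(window)  # AVGPOOLING
```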
Further, the computing unit 210 may be configured to update the first weight values based on the calculated weight gradients (e.g., weight gradients 452D).
Further still, the floating-point number converter 218 may be configured to convert the first input neuron data (e.g., input neuron data 452A) to second input neuron data of the second bit length. That is, the input neuron data may be transmitted from the computing unit 210 to the floating-point number converter 218 and be converted to one or more floating-point numbers of a bit length equal to the bit length of the second weight values. The conversion of the input neuron data may be implemented in accordance with the process described in
Similarly, the floating-point number converter 218 may be configured to convert the first output neuron data to second output data of the second bit length. That is, the output neuron data (e.g., output neuron data 454A) may be transmitted from the computing unit 210 to the floating-point number converter 218 and be converted to one or more floating-point numbers of a bit length equal to the bit length of the second weight values. The conversion of the output neuron data may be implemented in accordance with the process described in
It is understood that the specific order or hierarchy of steps in the processes disclosed is an illustration of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged. Further, some steps may be combined or omitted. The accompanying method claims present elements of the various steps in a sample order, and are not meant to be limited to the specific order or hierarchy presented.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean "one and only one" unless specifically so stated, but rather "one or more." Unless specifically stated otherwise, the term "some" refers to one or more. All structural and functional equivalents to the elements of the various aspects described herein that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims. No claim element is to be construed as a means plus function unless the element is expressly recited using the phrase "means for."
Moreover, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or.” That is, unless specified otherwise, or clear from the context, the phrase “X employs A or B” is intended to mean any of the natural inclusive permutations. That is, the phrase “X employs A or B” is satisfied by any of the following instances: X employs A; X employs B; or X employs both A and B. In addition, the articles “a” and “an” as used in this application and the appended claims should generally be construed to mean “one or more” unless specified otherwise or clear from the context to be directed to a singular form.