NEURAL NETWORK UNIT WITH NEURAL PROCESSING UNITS DYNAMICALLY CONFIGURABLE TO PROCESS MULTIPLE DATA SIZES

US 20170103302A1
Filed: 04/05/2016
Published: 04/13/2017
Est. Priority Date: 10/08/2015
Status: Active Grant

Abstract

A neural network unit. A register holds an indicator that specifies narrow and wide configurations. A first memory holds rows of 2N/N narrow/wide weight words in the narrow/wide configuration. A second memory holds rows of 2N/N narrow/wide data words in the narrow/wide configuration. An array of neural processing units (NPU) is configured as 2N/N narrow/wide NPUs and to receive the 2N/N narrow/wide weight words of rows from the first memory and to receive the 2N/N narrow/wide data words of rows from the second memory in the narrow/wide configuration. In the narrow configuration, the 2N NPUs perform narrow arithmetic operations on the 2N narrow weight words and the 2N narrow data words received from the first and second memories. In the wide configuration, the N NPUs perform wide arithmetic operations on the N wide weight words and the N wide data words received from the first and second memories.
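
As a rough illustration of the narrow and wide configurations summarized above, the following Python sketch reinterprets the same memory row as either 2N narrow or N wide words and performs one operation per lane. The 8-bit and 16-bit widths, the little-endian signed packing, the choice of multiply-accumulate as the arithmetic operation, and the names `split_row` and `multiply_accumulate` are assumptions made for illustration only; the source states the word counts per row but not these details.

```python
from typing import List

# Assumed widths: the claims say only that wide words may be twice as wide as narrow words.
NARROW_BITS = 8
WIDE_BITS = 16


def split_row(row: bytes, wide: bool) -> List[int]:
    """Interpret one memory row as N wide words or 2N narrow words (signed, little-endian)."""
    step = (WIDE_BITS if wide else NARROW_BITS) // 8
    return [
        int.from_bytes(row[i:i + step], "little", signed=True)
        for i in range(0, len(row), step)
    ]


def multiply_accumulate(weight_row: bytes, data_row: bytes,
                        acc: List[int], wide: bool) -> List[int]:
    """One step per lane: acc += weight * data, with the lane count set by the indicator."""
    weights = split_row(weight_row, wide)
    data = split_row(data_row, wide)
    return [a + w * d for a, w, d in zip(acc, weights, data)]


if __name__ == "__main__":
    weight_row = bytes([1, 2, 3, 4])   # N = 2, so the row holds 4 narrow or 2 wide words
    data_row = bytes([1, 1, 1, 1])
    print(multiply_accumulate(weight_row, data_row, [0] * 4, wide=False))
    print(multiply_accumulate(weight_row, data_row, [0] * 2, wide=True))
```

The same row bytes feed both modes; only the `wide` flag, standing in for the indicator held in the register, changes how many lanes are active and how wide each word is.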

20 Claims

1. A neural network unit, comprising:
- a register that holds an indicator that specifies narrow and wide configurations;
  
  a first memory that holds rows of 2N narrow or N wide weight words when the indicator indicates the narrow or wide configuration, respectively;
  
  a second memory that holds rows of 2N narrow or N wide data words when the indicator indicates the narrow or wide configuration, respectively; and
  
  an array of neural processing units (NPU), the array configured as 2N narrow or N wide NPUs and configured to receive the 2N narrow or N wide weight words of rows from the first memory and configured to receive the 2N narrow or N wide data words of rows from the second memory when the indicator indicates the narrow or wide configuration, respectively;
  
  when the indicator indicates the narrow configuration, the 2N NPUs are configured to perform narrow arithmetic operations on the 2N narrow weight words and the 2N narrow data words received from the first and second memories; and
  
  when the indicator indicates the wide configuration, the N NPUs are configured to perform wide arithmetic operations on the N wide weight words and the N wide data words received from the first and second memories.
  - 2. The neural network unit of claim 1, further comprising:
    - the wide words are twice the width of the narrow words.
  - 3. The neural network unit of claim 1, further comprising:
    - the NPUs perform integer arithmetic operations on the weight words and the data words.
  - 4. The neural network unit of claim 1, further comprising:
    - a second register programmable to specify a location of a binary point of the weight words and the data words.
  - 5. The neural network unit of claim 1, each of the NPUs comprising:
    - an arithmetic unit that performs the arithmetic operations on the weight words and data words received from the first and second memories;
      
      a multiplexed register having an output received by the arithmetic unit and by the multiplexed register of an adjacent NPU;
      
      when the indicator indicates the narrow configuration, the multiplexed registers of the 2N NPUs collectively selectively operate as a 2N-word rotater for a row of the 2N narrow data words received from the second memory; and
      
      when the indicator indicates the wide configuration, the multiplexed registers of the N NPUs collectively selectively operate as an N-word rotater for a row of the N wide data words received from the second memory.
  - 6. The neural network unit of claim 1, further comprising:
    - each of the NPUs comprises an arithmetic unit that performs arithmetic operations on the weight words and data words received from the first and second memories and an accumulator that accumulates results of the arithmetic unit into an accumulated value; and
      
      a plurality of activation function units that perform activation functions on the accumulated values to generate 2N narrow or N wide results when the indicator indicates the narrow or wide configuration, respectively; and
      
      when the indicator indicates the narrow or wide configuration, the 2N narrow or N wide results, respectively, are written back to the first or second memory.
  - 7. The neural network unit of claim 1, further comprising:
    - a program memory that holds program instructions; and
      
      a sequencer that fetches the program instructions from the program memory and executes them to control the neural network unit to perform the arithmetic operations.
  - 8. The neural network unit of claim 7, further comprising:
    - the program memory is writable by architectural instructions of an instruction set architecture of a processor that comprises the neural network unit.
  - 9. The neural network unit of claim 1, further comprising:
    - the first and second memories are writable by architectural instructions of an instruction set architecture of a processor that comprises the neural network unit.
  - 10. The neural network unit of claim 1, further comprising:
    - N is at least 512.
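
The rotater behavior recited in claim 5 can be sketched in Python as follows. This is a minimal model under stated assumptions: each multiplexed register is one list element, one rotation step moves every word to the neighboring lane, and each lane multiplies its current word by a weight and accumulates the product. The rotation direction, the multiply-accumulate operation, and the names `rotate_once` and `rotate_and_accumulate` are hypothetical and not taken from the source.

```python
from typing import List, Sequence


def rotate_once(words: List[int]) -> List[int]:
    """Every lane takes the word held by its neighbor (direction is an assumption)."""
    return words[-1:] + words[:-1]


def rotate_and_accumulate(data_row: Sequence[int],
                          weight_rows: Sequence[Sequence[int]]) -> List[int]:
    """Load one data row into the lane registers, then for each weight row
    accumulate weight * register per lane and rotate the registers by one lane."""
    registers = list(data_row)
    accumulators = [0] * len(registers)
    for weights in weight_rows:
        accumulators = [a + w * d for a, w, d in zip(accumulators, weights, registers)]
        registers = rotate_once(registers)
    return accumulators


if __name__ == "__main__":
    # Three lanes stand in for 2N narrow or N wide NPUs; the model is the same either way.
    print(rotate_and_accumulate([1, 2, 3], [[1, 1, 1]] * 3))
```

In this model the lane count is simply the length of the data row, so the same loop serves as a 2N-word rotater in the narrow configuration and an N-word rotater in the wide configuration.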

11. A neural network unit, comprising:
- a register that holds an indicator that specifies first and second configurations;
  
  an array of neural processing units (NPU), the array configured as 2N or N NPUs when the indicator indicates the first or second configuration, respectively, each NPU having:
  
  an accumulator having an output;
  
  an arithmetic unit having first, second and third inputs and that performs an operation thereon to generate a result to store in the accumulator, the first input receives the output of the accumulator;
  
  a weight input that is received by the second input to the arithmetic unit; and
  
  a multiplexed register having first, second and third data inputs, an output received by the third input to the arithmetic unit, and a control input that controls selection of the first, second and third data inputs; and
  
  the output of the multiplexed register is also received by the third data input of the multiplexed register and by the second data input of the multiplexed register of an adjacent NPU, the multiplexed registers of the N NPUs collectively operate as an N-word rotater when the control input specifies the second data input, and the multiplexed registers of the N NPUs collectively operate as a 2N-word rotater when the control input specifies the third data input;
  
  a first memory that holds W rows of 2N or N weight words and provides the 2N or N weight words of a row of the W rows to the corresponding weight inputs of the 2N or N NPUs when the indicator indicates the first or second configuration, respectively; and
  
  a second memory that holds D rows of 2N or N data words and provides the 2N or N data words of a row of the D rows to the corresponding first data inputs of the multiplexed register of the 2N or N NPUs when the indicator indicates the first or second configuration, respectively.
  - 12. The neural network unit of claim 11, further comprising:
    - when the indicator indicates the first configuration;
      
      the 2N data and weight words are narrow words;
      
      the 2N accumulators are narrow accumulators; and
      
      the 2N NPUs are configured to perform arithmetic operations on narrow data and weight words; and
      
      when the indicator indicates the second configuration;
      
      the N data and weight words are wide words;
      
      the N accumulators are wide accumulators; and
      
      the N NPUs are configured to perform arithmetic operations on wide data and weight words.
  - 13. The neural network unit of claim 12, further comprising:
    - the wide words are twice the width in bits of the narrow words.
  - 14. The neural network unit of claim 11, further comprising:
    - a plurality of activation function units that perform activation functions on the results stored in the accumulator to generate 2N narrow or N wide results when the indicator indicates the first or second configuration, respectively; and
      
      when the indicator indicates the first or second configuration, the 2N narrow or N wide results, respectively, are written back to the first or second memory.
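
The activation and write-back step of claim 14 might look like the following Python sketch. Everything specific here is an assumption for illustration: the activation function is taken to be ReLU, results are saturated to 8 bits (narrow) or 16 bits (wide), and the memory is modeled as a list of rows. The names `relu`, `saturate`, and `activate_and_write_back` are hypothetical.

```python
from typing import List


def relu(x: int) -> int:
    """Assumed activation function; the source does not name a specific one."""
    return x if x > 0 else 0


def saturate(x: int, bits: int) -> int:
    """Clamp x to the signed range representable in the given number of bits."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, x))


def activate_and_write_back(accumulators: List[int], memory: List[List[int]],
                            row: int, wide: bool) -> None:
    """Apply the activation to each accumulated value and store the narrow or
    wide results as one row of the target memory."""
    bits = 16 if wide else 8   # assumed word widths
    memory[row] = [saturate(relu(a), bits) for a in accumulators]


if __name__ == "__main__":
    memory = [[0] * 4, [0] * 4]
    activate_and_write_back([-5, 3, 200, 127], memory, row=1, wide=False)
    print(memory)
```

The accumulator list has 2N entries in the first configuration and N entries in the second, so the written row has the matching number of result words.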

15. A method for operating a neural network unit having an array of neural processing units (NPU) configurable as 2N narrow or N wide NPUs and to receive the 2N narrow or N wide weight words of rows from the first memory and to receive the 2N narrow or N wide data words of rows from the second memory when the indicator indicates the narrow or wide configuration, respectively, the method comprising:
- holding, by a register, an indicator that specifies narrow and wide configurations;
  
  holding, by a first memory, rows of 2N narrow or N wide weight words when the indicator indicates the narrow or wide configuration, respectively;
  
  holding, by a second memory, rows of 2N narrow or N wide data words when the indicator indicates the narrow or wide configuration, respectively; and
  
  performing, by the 2N NPUs when the indicator indicates the narrow configuration, narrow arithmetic operations on the 2N narrow weight words and the 2N narrow data words received from the first and second memories; and
  
  performing, by the N NPUs when the indicator indicates the wide configuration, wide arithmetic operations on the N wide weight words and the N wide data words received from the first and second memories.
  - 16. The method of claim 15, further comprising:
    - said performing the narrow/wide arithmetic operations comprises performing integer arithmetic operations on the weight words and the data words.
  - 17. The method of claim 15, further comprising:
    - programming a second register to specify a location of a binary point of the weight words and the data words.
  - 18. The method of claim 15, further comprising:
    - performing, by an arithmetic unit of each of the NPUs, the arithmetic operations on the weight words and data words received from the first and second memories;
      
      each of the NPUs includes a multiplexed register having an output received by the arithmetic unit and by the multiplexed register of an adjacent NPU;
      
      collectively selectively operating, by the multiplexed registers of the 2N NPUs, as a 2N-word rotater for a row of the 2N narrow data words received from the second memory, when the indicator indicates the narrow configuration; and
      
      collectively selectively operating, by the multiplexed registers of the N NPUs, as an N-word rotater for a row of the N wide data words received from the second memory, when the indicator indicates the wide configuration.
  - 19. The method of claim 15, further comprising:
    - each of the NPUs comprises an arithmetic unit that performs arithmetic operations on the weight words and data words received from the first and second memories and an accumulator that accumulates results of the arithmetic unit into an accumulated value; and
      
      performing, by a plurality of activation function units, activation functions on the accumulated values to generate 2N narrow or N wide results when the indicator indicates the narrow or wide configuration, respectively; and
      
      writing back the 2N narrow or N wide results, respectively, to the first or second memory when the indicator indicates the narrow or wide configuration.
  - 20. The method of claim 15, further comprising:
    - holding, by a program memory, program instructions; and
      
      fetching, by a sequencer, the program instructions from the program memory and executing them to control the neural network unit to perform the arithmetic operations.
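
Claims 16 and 17 combine integer arithmetic with a programmable binary point. A minimal Python sketch of that idea, assuming a single `frac_bits` value plays the role of the second register and applies to both weight and data words, is shown below. The function names and the truncating right shift after multiplication are illustrative assumptions, not details from the source.

```python
def to_fixed(value: float, frac_bits: int) -> int:
    """Encode a real value as an integer word with frac_bits bits after the binary point."""
    return round(value * (1 << frac_bits))


def to_float(word: int, frac_bits: int) -> float:
    """Decode an integer word back to a real value."""
    return word / (1 << frac_bits)


def fixed_multiply(weight: int, data: int, frac_bits: int) -> int:
    """Integer multiply: the raw product carries 2 * frac_bits fractional bits,
    so shift right once to return to the programmed binary point."""
    return (weight * data) >> frac_bits


if __name__ == "__main__":
    frac_bits = 4                      # hypothetical binary point setting
    w = to_fixed(1.5, frac_bits)
    d = to_fixed(2.0, frac_bits)
    print(to_float(fixed_multiply(w, d, frac_bits), frac_bits))
```

Changing `frac_bits` trades range for precision without changing the integer datapath, which is one plausible reason to make the binary point programmable.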

Current Assignee: Via Alliance Semiconductor Co., Ltd.
Original Assignee: Via Alliance Semiconductor Co., Ltd.
Inventors: HENRY, G. GLENN; PARKS, TERRY

Granted Patent: US 10,353,860 B2
CPC Class Codes

G06F 1/10: Distribution of clock signa...
G06F 15/82: data or demand driven
G06F 7/483: Computations with numbers r...
G06F 7/49947: Rounding
G06F 9/3001: Arithmetic instructions
G06F 9/30029: Logical and Boolean instruc...
G06F 9/30032: Movement instructions, e.g....
G06F 9/3004: to perform operations on me...
G06F 9/30098: Register arrangements
G06F 9/30101: Special purpose registers
G06F 9/30189: according to execution mode...
G06F 9/321: Program or instruction coun...
G06F 9/38: Concurrent instruction exec...
G06F 9/3836: Instruction issuing, e.g. d...
G06F 9/3854: Instruction completion, e.g...
G06F 9/3867: using instruction pipelines
G06F 9/3877: using a slave processor, e....
G06F 9/3893: controlled in tandem, e.g. ...
G06F 9/44505: Configuring for program ini...
G06N 3/04: Architecture, e.g. intercon...
G06N 3/044: Recurrent networks, e.g. Ho...
G06N 3/045: Combinations of networks
G06N 3/063: using electronic means
G06N 3/065: Analogue means
G06N 3/08: Learning methods
G06N 3/088: Non-supervised learning, e....
