Method to Map Convolutional Layers of Deep Neural Network on a Plurality of Processing Elements with SIMD Execution Units, Private Memories, and Connected as a 2D Systolic Processor Array

US 20200134105A1
Filed: 10/31/2018
Published: 04/30/2020
Est. Priority Date: 10/31/2018
Status: Active Application

Abstract

A method for improving performance of a predefined Deep Neural Network (DNN) convolution processing on a computing device includes inputting parameters, as input data into a processor on a computer that formalizes a design space exploration of a convolution mapping, on a predefined computer architecture that will execute the predefined convolution processing. The parameters are predefined as guided by a specification for the predefined convolution processing to be implemented by the convolution mapping and by a microarchitectural specification for the processor that will execute the predefined convolution processing. The processor calculates performance metrics for executing the predefined convolution processing on the computing device, as functions of the predefined parameters, as proxy estimates of performance of different possible design choices to implement the predefined convolution processing.
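As an illustration of what "performance metrics as functions of the predefined parameters" could look like, the following is a minimal sketch and not the method disclosed in the application. All class names, field names, and the choice of metric (an ideal cycle count) are assumptions made for this example.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class ConvSpec:
    """Convolution parameters (hypothetical field names, not from the application)."""
    in_channels: int
    out_channels: int
    out_height: int
    out_width: int
    kernel_h: int
    kernel_w: int


@dataclass(frozen=True)
class ArchSpec:
    """Microarchitectural parameters (hypothetical field names)."""
    num_pes: int      # number of processing elements
    simd_lanes: int   # multiply-accumulates each PE can issue per cycle


def total_macs(conv: ConvSpec) -> int:
    """Multiply-accumulate operations needed for one convolution layer."""
    return (conv.in_channels * conv.out_channels
            * conv.out_height * conv.out_width
            * conv.kernel_h * conv.kernel_w)


def ideal_cycles(conv: ConvSpec, arch: ArchSpec) -> int:
    """Proxy estimate: cycle lower bound assuming full utilization and no stalls."""
    return ceil(total_macs(conv) / (arch.num_pes * arch.simd_lanes))
```

Because the estimate is a closed-form function of the two specifications, it can be evaluated for many candidate design choices without simulating or running the hardware.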

20 Claims

1. A method for improving performance of a predefined Deep Neural Network (DNN) convolution processing on a computing device, the method comprising:
- inputting parameters as input data into a processor on a computer that formalizes a design space exploration of a convolution mapping on a predefined computer architecture that will execute the predefined convolution processing, wherein the parameters are predefined as guided by a specification for the predefined convolution processing to be implemented by the convolution mapping and by a microarchitectural specification for the processor that will execute the predefined convolution processing; and
  
  calculating, by the processor, performance metrics for executing the predefined convolution processing on the computing device, as functions of the predefined parameters, as proxy estimates of performance of different possible design choices to implement the predefined convolution processing.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9):
  - 2. The method of claim 1, wherein the possible convolution mappings are mappings onto a predetermined accelerator architecture configuration.
  - 3. The method of claim 2, further comprising determining an optimal configuration for implementing the predefined convolution processing.
  - 4. The method of claim 1, further comprising:
    - receiving input data defining one or more constraints; and
      
      identifying invalid convolution mapping options based on the constraints.
  - 5. The method of claim 1, further comprising determining an optimal convolution mapping.
  - 6. The method of claim 1, as implemented on a computer different from the computing device that will execute the predefined convolution processing.
  - 7. The method of claim 6, as implemented on one of:
    - a server remote from the computing device; and
      
      as a cloud service.
  - 8. The method of claim 1, as implemented as a software tool on the computing device that will execute the predefined convolution processing.
  - 9. The method of claim 1, as embodied as a set of machine-readable instructions on a non-transitory memory device.
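Claims 4 and 5 describe discarding invalid mapping options based on constraints and then determining an optimal mapping. A minimal sketch of that flow follows. The tiling scheme, the memory constraint, the cost function, and every name here are assumptions for illustration only, not taken from the application.

```python
from itertools import product
from math import ceil


def enumerate_tilings(out_channels, in_channels, max_tile=64):
    """Yield candidate (tile_out, tile_in) pairs; powers of two only (an assumption)."""
    sizes = [2 ** k for k in range(7) if 2 ** k <= max_tile]
    for t_out, t_in in product(sizes, sizes):
        if t_out <= out_channels and t_in <= in_channels:
            yield t_out, t_in


def weight_bytes(t_out, t_in, kernel=3, bytes_per_weight=2):
    """Weight storage one tile needs (hypothetical 3x3 kernel, 16-bit weights)."""
    return t_out * t_in * kernel * kernel * bytes_per_weight


def is_valid(t_out, t_in, pe_memory_bytes):
    """Constraint check: the tile's weights must fit in the private memory."""
    return weight_bytes(t_out, t_in) <= pe_memory_bytes


def proxy_cost(t_out, t_in, out_channels, in_channels):
    """Proxy metric: number of tile passes over the layer; fewer is better."""
    return ceil(out_channels / t_out) * ceil(in_channels / t_in)


def best_mapping(out_channels, in_channels, pe_memory_bytes):
    """Drop invalid options, then return the lowest-cost one (None if none are valid)."""
    candidates = [m for m in enumerate_tilings(out_channels, in_channels)
                  if is_valid(*m, pe_memory_bytes)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda m: proxy_cost(*m, out_channels, in_channels))
```

Several tilings can tie on the proxy cost; this sketch simply returns the first one encountered, whereas a fuller tool would presumably break ties with additional metrics.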

10. A method for exploring a design space for mapping convolutional layers of a Deep Neural Network (DNN) onto a plurality of processing elements connected as a 2-dimensional (2D) systolic processor array, the method comprising:
- inputting parameter values into a processor on a computer from a microarchitecture specification that defines configuration aspects of the processing elements;
  
  inputting parameter values into the processor from a specification that defines a convolutional processing; and
  
  calculating, by the processor, performance metrics for executing the convolution processing on the 2D systolic processor array, as functions of the predefined parameters, as proxy estimates of performance of different possible design choices to implement the predefined convolution processing.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 18):
  - 11. The method of claim 10, further comprising determining an optimal configuration for implementing the predefined convolution processing.
  - 12. The method of claim 10, further comprising:
    - receiving data for one or more constraints; and
      
      identifying invalid convolution mapping options based on the constraints.
  - 13. The method of claim 10, further comprising determining an optimal convolution mapping.
  - 14. The method of claim 10, as implemented on a computer different from a computing device comprising the 2D systolic processor array that will execute the predefined convolution processing.
  - 15. The method of claim 10, as implemented on a computer different from a computing device comprising the 2D systolic processor array that will execute the predefined convolution processing.
  - 16. The method of claim 15, as implemented on one of:
    - a server remote from the computing device; and
      
      as a cloud service.
  - 17. The method of claim 10, as implemented as a software tool on a computing device comprising the 2D systolic processor array that will execute the predefined convolution processing.
  - 18. The method of claim 10, as embodied as a set of machine-readable instructions on a non-transitory memory device.
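For the 2D systolic array of claim 10, one plausible proxy metric is array utilization when two loop dimensions of the convolution are spread across the array's rows and columns. The sketch below ranks such assignments. The utilization formula and all names are assumptions for illustration and are not drawn from the application.

```python
from math import ceil


def array_utilization(rows, cols, dim_on_rows, dim_on_cols):
    """Fraction of PE slots doing useful work over all passes of a rows x cols array."""
    passes_r = ceil(dim_on_rows / rows)
    passes_c = ceil(dim_on_cols / cols)
    useful = dim_on_rows * dim_on_cols
    available = passes_r * rows * passes_c * cols
    return useful / available


def rank_assignments(rows, cols, dims):
    """Rank every ordered pair of distinct dimensions mapped to (rows, cols).

    dims maps a dimension name to its size. Returns a list of
    (utilization, row_dim_name, col_dim_name), best first.
    """
    results = []
    for r_name, r_size in dims.items():
        for c_name, c_size in dims.items():
            if r_name == c_name:
                continue
            util = array_utilization(rows, cols, r_size, c_size)
            results.append((util, r_name, c_name))
    return sorted(results, reverse=True)
```

For example, `rank_assignments(8, 4, {"out_channels": 16, "in_channels": 6, "out_width": 4})` places the assignment of `out_channels` to rows and `out_width` to columns first, because both sizes divide evenly into the array dimensions.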

19. An apparatus, comprising:
- a processor; and
  
  a memory device accessible by the processor, the memory device storing a set of instructions that permit the processor to execute a method of optimizing a mapping of convolutional layers of a Deep Neural Network (DNN) onto a plurality of processing elements connected as a 2-dimensional (2D) systolic processor array, the method comprising:
  
  inputting parameter values into a processor on a computer from a microarchitecture specification that defines configuration aspects of the processing elements;
  
  inputting parameter values into the processor from a specification that defines a convolution processing;
  
  calculating, by the processor, performance metrics for executing the convolution processing on the 2D systolic processor array, as functions of the predefined parameters, as proxy estimates of performance of different possible design choices to implement the convolution processing;
  
  inputting one or more constraints that permit the processor to eliminate invalid design choices; and
  
  determining an optimal mapping onto the 2D systolic processor array for the convolution processing.
- Dependent claims (20):
  - 20. The apparatus of claim 19, wherein the method is implemented as a software tool that automatically configures an optimal configuration for performing the convolution processing.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Chen, Chia-Yu; Choi, Jungwook; Gopalakrishnan, Kailash; Srinivasan, Vijayalakshmi; Venkataramani, Swagath; Zhang, Jintao

Application Number: US16/177,017
Publication Number: US 20200134105A1
CPC Class Codes

G06F 2111/04   Constraint-based CAD

G06F 30/3323   using formal methods, e.g. ...

G06N 3/04   Architecture, e.g. intercon...

G06N 3/044   Recurrent networks, e.g. Ho...

G06N 3/045   Combinations of networks

G06N 3/063   using electronic means

G06N 3/082   modifying the architecture,...

G06N 3/105   Shells for specifying net l...
