Method and apparatus for hardware-accelerated machine learning
Abstract
A multi-functional data processing pipeline for use with machine learning is disclosed. The multi-functional pipeline may comprise a plurality of pipelined data processing engines, the plurality of pipelined data processing engines being configured to perform processing operations, and the pipelined data processing engines can include correlation logic. The multi-functional pipeline can be configured to controllably activate or deactivate each of the pipelined data processing engines in the pipeline in response to control instructions and thereby define a function for the pipeline, each pipeline function being the combined functionality of each activated pipelined data processing engine in the pipeline. In example embodiments, such pipelines can be used to accelerate convolutional layers in machine-learning technology such as convolutional neural networks.
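The activate/deactivate behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the `Engine` and `Pipeline` classes, the engine names, and the choice to have a deactivated engine forward its input unchanged are all hypothetical and chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Set


class Engine:
    """One pipeline stage; the class and its fields are hypothetical."""

    def __init__(self, name: str, op: Callable[[float], float]) -> None:
        self.name = name
        self.op = op
        self.active = False  # set by the control instruction

    def process(self, stream: Iterable[float]) -> Iterator[float]:
        for item in stream:
            # Assumption: a deactivated engine stays in the pipeline
            # and forwards data unchanged.
            yield self.op(item) if self.active else item


class Pipeline:
    def __init__(self, engines: List[Engine]) -> None:
        self.engines = engines

    def configure(self, active_names: Set[str]) -> None:
        """Apply a control instruction: activate exactly the named engines."""
        for engine in self.engines:
            engine.active = engine.name in active_names

    def run(self, stream: Iterable[float]) -> List[float]:
        # Chain every engine, active or not, in pipeline order.
        for engine in self.engines:
            stream = engine.process(stream)
        return list(stream)


pipeline = Pipeline([
    Engine("double", lambda x: 2 * x),
    Engine("add_one", lambda x: x + 1),
    Engine("negate", lambda x: -x),
])

pipeline.configure({"double", "negate"})
first = pipeline.run([1, 2, 3])   # double, then negate

pipeline.configure({"add_one"})
second = pipeline.run([1, 2, 3])  # only add_one acts
```

The same three engines yield two different pipeline functions depending on which ones the control instruction activates, which mirrors the "combined functionality of each activated engine" language.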
636 Citations
50 Claims
1. A machine-learning apparatus comprising:
- a feature extractor for a convolutional neural network, wherein the feature extractor is deployed on a member of the group consisting of (1) a reconfigurable logic device, (2) a graphics processing unit (GPU), and (3) a chip multi-processor (CMP), wherein the member comprises a plurality of data processing engines arranged as a multi-functional pipeline through which data is streamed, the pipelined data processing engines configured for operation in parallel with each other;
- each pipelined data processing engine being configured to (1) receive streaming data and perform a processing operation on the received streaming data, and (2) be responsive to a control instruction that defines whether that pipelined data processing engine is an activated data processing engine or a deactivated data processing engine, wherein an activated data processing engine is configured to perform its processing operation on streaming data received thereby, and wherein a deactivated data processing engine remains in the pipeline but does not perform its processing operation on streaming data received thereby, the multi-functional pipeline thereby being configured to provide a plurality of different pipeline functions in response to control instructions that are configured to selectively activate and deactivate the pipelined data processing engines, each pipeline function being the combined functionality of each activated pipelined data processing engine in the pipeline at a given time;
- wherein each of a plurality of the data processing engines is configured as a convolution engine that convolves first data with second data via correlation logic;
- wherein each of another plurality of the data processing engines is configured as a data reduction engine that performs a data reduction operation on data received thereby; and
- wherein the multi-functional pipeline is configured to activate a plurality of the convolution engines and a plurality of the data reduction engines at the same time in response to control instructions in order to configure the multi-functional pipeline as the feature extractor for the convolutional neural network.
Dependent claims: 2-20
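Claim 1 pairs convolution engines that work "via correlation logic" with data reduction engines. A minimal sketch of that pairing follows, assuming one-dimensional data and assuming max pooling as the data reduction operation (the claim does not name a specific reduction). All function names are hypothetical. Convolution is computed here by correlating with the reversed kernel; many neural-network implementations skip the reversal and apply the correlation directly.

```python
from typing import List, Sequence


def correlate(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Sliding dot product over every full overlap of kernel and signal."""
    n, k = len(signal), len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Convolution expressed as correlation with the reversed kernel."""
    return correlate(signal, list(kernel)[::-1])


def max_pool(values: Sequence[float], width: int) -> List[float]:
    """Assumed data reduction: keep the maximum of each full block."""
    return [
        max(values[i:i + width])
        for i in range(0, len(values) - width + 1, width)
    ]


def extract_features(
    signal: Sequence[float],
    kernels: Sequence[Sequence[float]],
    pool_width: int,
) -> List[float]:
    """Alternate convolution and reduction stages, one pair per kernel."""
    features = list(signal)
    for kernel in kernels:
        features = max_pool(convolve(features, kernel), pool_width)
    return features


features = extract_features([1, 2, 3, 4, 5, 6, 7, 8], [[1, 0, -1], [1, 1]], 2)
```

Each kernel in `kernels` stands in for one activated convolution engine, and each `max_pool` call for one activated data reduction engine.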
21. A machine-learning method comprising:
- selectively activating a plurality of data processing engines in a multi-functional pipeline in response to a control instruction to define a feature extractor for a convolutional neural network, the multi-functional pipeline being resident on a member of the group consisting of (1) a reconfigurable logic device, (2) a graphics processing unit (GPU), and (3) a chip multi-processor (CMP);
- wherein the multi-functional pipeline comprises a plurality of data processing engines through which data is streamed, the pipelined data processing engines configured for operation in parallel with each other, each pipelined data processing engine being configured to (1) receive streaming data and perform a processing operation on the received streaming data, and (2) be responsive to the control instruction that defines whether that pipelined data processing engine is an activated data processing engine or a deactivated data processing engine;
- wherein an activated data processing engine is configured to perform its processing operation on streaming data received thereby;
- wherein a deactivated data processing engine remains in the pipeline but does not perform its processing operation on streaming data received thereby, the multi-functional pipeline thereby being configured to provide a plurality of different pipeline functions in response to the control instructions that are configured to selectively activate and deactivate the pipelined data processing engines, each pipeline function being the combined functionality of each activated pipelined data processing engine in the pipeline at a given time;
- wherein each of a plurality of the data processing engines is configured as a convolution engine that convolves first data with second data via correlation logic;
- wherein each of another plurality of the data processing engines is configured as a data reduction engine that performs a data reduction operation on data received thereby; and
- wherein a plurality of the selectively activated data processing engines comprise a plurality of the convolution engines and a plurality of the data reduction engines;
- streaming data into a first activated convolution engine in the multi-functional pipeline, the streaming data comprising (1) input data to be classified via the convolutional neural network as the first data and (2) weight data as the second data; and
- the activated pipelined data processing engines in the multi-functional pipeline performing their data processing operations on data received thereby to perform feature extraction on the input data as part of the convolutional neural network.
Dependent claims: 22-27
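The streaming steps of claim 21 can be sketched with Python generators, where each stage consumes one sample at a time and emits results as soon as they are ready. This is an illustrative model only: the function names are hypothetical, the weights are passed as an argument rather than streamed alongside the input as the claim recites, and a block-wise maximum is an assumed data reduction operation.

```python
from collections import deque
from typing import Iterable, Iterator, Sequence


def stream_correlate(
    samples: Iterable[float], weights: Sequence[float]
) -> Iterator[float]:
    """Emit one correlation output per sample once the window is full."""
    window: deque = deque(maxlen=len(weights))
    for sample in samples:
        window.append(sample)  # oldest sample drops out automatically
        if len(window) == len(weights):
            yield sum(w * v for w, v in zip(weights, window))


def stream_max_reduce(values: Iterable[float], width: int) -> Iterator[float]:
    """Assumed reduction: emit the maximum of each full block of `width`."""
    block = []
    for value in values:
        block.append(value)
        if len(block) == width:
            yield max(block)
            block = []  # a trailing partial block is discarded


input_data = iter([0, 1, 4, 9, 16, 25])  # first data: input to classify
weight_data = [-1, 0, 1]                 # second data: weights

correlated = stream_correlate(input_data, weight_data)
features = list(stream_max_reduce(correlated, 2))
```

Because both stages are generators, the reduction stage begins consuming correlation outputs before the input stream is exhausted, which is the software analogue of pipelined engines operating in parallel on streaming data.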
28. A machine-learning apparatus comprising:
- a member of the group consisting of (1) a reconfigurable logic device, (2) a graphics processing unit (GPU), and (3) a chip multi-processor (CMP), wherein the member comprises a plurality of data processing engines arranged as a multi-functional pipeline through which data is streamed, the pipelined data processing engines configured for operation in parallel with each other;
- each pipelined data processing engine being configured to (1) receive streaming data and perform a processing operation on the received streaming data, and (2) be responsive to a control instruction that defines whether that pipelined data processing engine is an activated data processing engine or a deactivated data processing engine, wherein an activated data processing engine is configured to perform its processing operation on streaming data received thereby, and wherein a deactivated data processing engine remains in the pipeline but does not perform its processing operation on streaming data received thereby, the multi-functional pipeline thereby being configured to provide a plurality of different pipeline functions in response to control instructions that are configured to selectively activate and deactivate the pipelined data processing engines, each pipeline function being the combined functionality of each activated pipelined data processing engine in the pipeline at a given time;
- wherein at least one of the data processing engines comprises a convolution engine that serves as a convolutional layer for a convolutional neural network to support machine-learning operations; and
- wherein the multi-functional pipeline is configured to selectively activate a plurality of the data processing engines, including the convolution engine, at the same time in response to control instructions in order to configure the multi-functional pipeline to support machine-learning operations.
Dependent claims: 29-50
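The claims leave the format of a control instruction open. One possible encoding, shown purely as an assumption, is a bit mask with one bit per engine position in the pipeline. The engine names, their order, and the `encode_control` and `decode_control` helpers below are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical engine order: bit i of the control word maps to ENGINE_ORDER[i].
ENGINE_ORDER = ["conv_0", "reduce_0", "conv_1", "reduce_1"]


def encode_control(active: Set[str]) -> int:
    """Build a control word with a 1 bit for each engine to activate."""
    unknown = active - set(ENGINE_ORDER)
    if unknown:
        raise ValueError(f"unknown engines: {sorted(unknown)}")
    return sum(1 << i for i, name in enumerate(ENGINE_ORDER) if name in active)


def decode_control(word: int) -> Dict[str, bool]:
    """Recover the activated/deactivated flag of every engine."""
    return {
        name: bool((word >> i) & 1)
        for i, name in enumerate(ENGINE_ORDER)
    }


word = encode_control({"conv_0", "reduce_0"})
flags = decode_control(word)
```

Under this assumed encoding, a single control word activates several engines at the same time, including at least one convolution engine, while the remaining engines stay in the pipeline in a deactivated state.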
Specification