Tuple encoding aware direct memory access engine for scratchpad enabled multi-core processors
First Claim
1. Electronic circuitry on a single chip comprising:
a first set of electronic circuits;
a second set of electronic circuits;
wherein said electronic circuitry on said single chip is configured for:
in response to a particular memory location being pushed into a first register within a first register space that is accessible by said first set of electronic circuits:
said first set of electronic circuits accessing a descriptor stored at the particular memory location, wherein the descriptor indicates:
a width of a column of tabular data, a number of rows of said column of tabular data, and one or more tabular data manipulation operations to perform on said column of tabular data;
a source memory location for said column of tabular data;
a destination memory location for a data manipulation result of said one or more tabular data manipulation operations; and
the first set of electronic circuits determining, based on the descriptor, control information indicating said one or more tabular data manipulation operations to perform on said column of tabular data;
the first set of electronic circuits transmitting, using a hardware data channel, the control information to a second set of electronic circuits to perform the one or more tabular data manipulation operations;
according to the control information, said second set of electronic circuits retrieving said column of tabular data from said source memory location;
applying said one or more tabular data manipulation operations to said column of tabular data to generate said data manipulation result; and
causing said data manipulation result to be stored at said destination memory location.
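The descriptor-driven flow recited in the claim above can be modeled in software as a rough sketch. Everything named here is an assumption made for illustration: the `Descriptor` field names, the `first_circuits` and `second_circuits` functions, the use of a Python list as the register and a `bytearray` as memory, and the two operations `drop_zeros` and `double`. The claim itself does not name specific operations or encodings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Descriptor:
    column_width: int      # bytes per column element
    num_rows: int          # number of rows in the column
    source: int            # byte offset of the column in memory
    destination: int       # byte offset where the result is stored
    operations: List[str]  # names of operations to apply, in order


# Hypothetical operations, chosen only for illustration.
OPERATIONS: Dict[str, Callable[[List[int]], List[int]]] = {
    "drop_zeros": lambda col: [v for v in col if v != 0],
    "double": lambda col: [v * 2 for v in col],
}


def first_circuits(register: List[int], descriptors: Dict[int, Descriptor]) -> dict:
    """Take the pushed memory location, read the descriptor, derive control info."""
    desc = descriptors[register.pop(0)]
    return {
        "width": desc.column_width,
        "rows": desc.num_rows,
        "src": desc.source,
        "dst": desc.destination,
        "ops": list(desc.operations),
    }


def second_circuits(control: dict, memory: bytearray) -> int:
    """Retrieve the column, apply the operations, store the result."""
    w, src, dst = control["width"], control["src"], control["dst"]
    column = [
        int.from_bytes(memory[src + i * w: src + (i + 1) * w], "little")
        for i in range(control["rows"])
    ]
    for name in control["ops"]:
        column = OPERATIONS[name](column)
    for i, value in enumerate(column):
        memory[dst + i * w: dst + (i + 1) * w] = value.to_bytes(w, "little")
    return len(column)  # number of result rows written


# Usage: a 4-row column of 2-byte values at offset 0, result stored at offset 32.
memory = bytearray(64)
for i, value in enumerate([3, 0, 5, 0]):
    memory[i * 2:(i + 1) * 2] = value.to_bytes(2, "little")

descriptors = {0x100: Descriptor(2, 4, 0, 32, ["drop_zeros", "double"])}
register = [0x100]  # a core "pushes" the descriptor's memory location
control = first_circuits(register, descriptors)
rows_written = second_circuits(control, memory)
```

In this model the control information is passed as a plain dictionary. In the claimed circuitry it travels over a hardware data channel between the two sets of circuits.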
Abstract
Techniques provide for hardware-accelerated movement of tabular data between main memory and an on-chip data movement system that comprises multiple core processors that operate on the tabular data. The tabular data is moved to or from the scratch pad memories of the core processors. While the data is in flight, it may be manipulated by data manipulation operations. The data movement system includes multiple data movement engines, each dedicated to moving and transforming tabular data from main memory to a subset of the core processors. Each data movement engine is coupled to an internal memory that stores data (e.g., a bit vector) that dictates how data manipulation operations are performed on tabular data moved from main memory to the memories of a core processor, or to and from other memories. The internal memory of each data movement engine is private to that data movement engine. Tabular data is efficiently copied between internal memories of the data movement system via a copy ring that is coupled to the internal memories of the data movement system and/or to a data movement engine. Also, a data movement engine internally broadcasts data to other data movement engines, which then transfer the data to respective core processors. Partitioning may also be performed by the hardware of the data movement system, so that data is partitioned “in flight”. The data movement system also generates a column of row identifiers (RIDs). A row identifier is a number treated as identifying the position of a row or element within a column.
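Two of the ideas in the abstract, a bit vector that steers a data manipulation operation and a generated column of row identifiers, can be illustrated with a small sketch. The function names `gather_with_bit_vector` and `generate_rids`, and the choice of a gather (filter) as the operation, are assumptions made for this example. The abstract does not specify which operation the bit vector controls or whether RIDs start at zero.

```python
from typing import List


def generate_rids(num_rows: int, start: int = 0) -> List[int]:
    """Produce one row identifier per row: the row's position in the column."""
    return list(range(start, start + num_rows))


def gather_with_bit_vector(column: List[int], bit_vector: List[int]) -> List[int]:
    """Keep only the rows whose bit is set (assumed gather semantics)."""
    return [value for value, keep in zip(column, bit_vector) if keep]


# Usage: filter a column and its RID column with the same bit vector,
# so each surviving value still carries its original row position.
column = [10, 20, 30, 40]
bit_vector = [1, 0, 1, 1]

rids = generate_rids(len(column))
kept_values = gather_with_bit_vector(column, bit_vector)
kept_rids = gather_with_bit_vector(rids, bit_vector)
```

Filtering the RID column alongside the data column is one way a consumer could map each result row back to its source row.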
21 Claims
1. Electronic circuitry on a single chip comprising:
a first set of electronic circuits;
a second set of electronic circuits;
wherein said electronic circuitry on said single chip is configured for:
in response to a particular memory location being pushed into a first register within a first register space that is accessible by said first set of electronic circuits:
said first set of electronic circuits accessing a descriptor stored at the particular memory location, wherein the descriptor indicates:
a width of a column of tabular data, a number of rows of said column of tabular data, and one or more tabular data manipulation operations to perform on said column of tabular data;
a source memory location for said column of tabular data;
a destination memory location for a data manipulation result of said one or more tabular data manipulation operations; and
the first set of electronic circuits determining, based on the descriptor, control information indicating said one or more tabular data manipulation operations to perform on said column of tabular data;
the first set of electronic circuits transmitting, using a hardware data channel, the control information to a second set of electronic circuits to perform the one or more tabular data manipulation operations;
according to the control information, said second set of electronic circuits retrieving said column of tabular data from said source memory location;
applying said one or more tabular data manipulation operations to said column of tabular data to generate said data manipulation result; and
causing said data manipulation result to be stored at said destination memory location.
7. Electronic circuitry on a single chip, said electronic circuitry being configured for partitioning columns of rows among co-processors by:
for each data descriptor of a first set of data descriptors, copying a respective column of said columns that is at a source memory to an intermediate memory;
wherein each data descriptor of said data descriptors specifies:
a width of the respective column of said each data descriptor;
a number of rows;
a respective source memory location for said respective column;
a destination memory location within said intermediate memory;
for a second descriptor that specifies a particular algorithm, generating, according to the particular algorithm, a column of core processor identifiers that are each indexed to a respective row of said rows and that identify a respective core processor of said core processors;
for each core partitioning descriptor of a set of core partitioning descriptors, copying each row of a respective column of said columns from said intermediate memory to a scratch pad memory of the core processor identified by the respective core processor identifier indexed to said each row, said respective core processor being indexed to said each row in said column of core processor identifiers;
wherein each core partitioning descriptor of said set of core partitioning descriptors specifies:
a width of the respective column of said each core partitioning descriptor;
a number of rows;
a respective source memory location in said intermediate memory for the respective column of each core partitioning descriptor;
a destination memory location.
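The partitioning steps recited in claim 7 can be sketched in software as follows. This is a simplified model under stated assumptions: the intermediate memory and scratch pad memories are plain Python lists, the function names are hypothetical, and the "particular algorithm" is taken to be a modulo of a key column, which the claim does not specify.

```python
from typing import Dict, List


def copy_to_intermediate(source: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Step 1: copy each column from source memory into intermediate memory."""
    return {name: list(column) for name, column in source.items()}


def generate_core_ids(key_column: List[int], num_cores: int) -> List[int]:
    """Step 2: build a column of core identifiers, one per row.

    Modulo on the key is an assumed stand-in for the claimed algorithm.
    """
    return [key % num_cores for key in key_column]


def partition_column(column: List[int], core_ids: List[int],
                     num_cores: int) -> List[List[int]]:
    """Step 3: copy each row to the scratch pad of the core indexed to that row."""
    scratch_pads: List[List[int]] = [[] for _ in range(num_cores)]
    for value, core_id in zip(column, core_ids):
        scratch_pads[core_id].append(value)
    return scratch_pads


# Usage: two columns of four rows, partitioned across two cores.
source_memory = {"key": [7, 2, 9, 4], "amount": [70, 20, 90, 40]}
intermediate = copy_to_intermediate(source_memory)
core_ids = generate_core_ids(intermediate["key"], num_cores=2)
key_parts = partition_column(intermediate["key"], core_ids, num_cores=2)
amount_parts = partition_column(intermediate["amount"], core_ids, num_cores=2)
```

Because every column is partitioned with the same column of core identifiers, all values belonging to one row land in the scratch pad of the same core.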
9. Electronic circuitry on a single chip comprising:
a plurality of core processors;
a plurality of DMEMs (direct memories);
a plurality of first blocks of circuitry, wherein each core processor of said plurality of core processors is connected to a respective DMEM of said plurality of DMEMs and a respective first block of said plurality of first blocks of circuitry that is connected to the respective DMEM of said each core processor;
a plurality of second blocks of circuitry;
for each separate subset of multiple core processors of said plurality of core processors, a respective second block of said plurality of second blocks is connected to the respective first block of each core processor of said each separate subset of multiple core processors;
wherein a third block of circuitry is connected to each of said plurality of second blocks;
wherein said particular core processor is connected to a particular first block of said plurality of first blocks, wherein said particular first block is connected to a particular second block of said plurality of second blocks, wherein said particular second block of said plurality of second blocks is connected to said third block, wherein said particular register is accessible to said particular core processor and said particular first block connected to said particular core processor;
wherein said electronic circuitry is configured for, in response to a particular core processor pushing a particular memory location onto a particular register:
said particular first block to access a descriptor stored at the particular memory location, wherein the descriptor indicates one or more tabular data manipulation operations to perform on a column of tabular data, wherein the descriptor includes a plurality of separate fields that include:
a field specifying a width of the column of tabular data;
a field specifying a number of rows of said column of tabular data;
a field specifying a source memory location for said column of tabular data; and
a field specifying a destination memory location for a data manipulation result of said one or more tabular data manipulation operations;
the particular first block to transmit control information of the descriptor to said third block via said particular second block;
the third block to retrieve, based on the control information of the descriptor, the column of tabular data from the source memory location;
the third block to perform, based on the control information, the one or more tabular data manipulation operations on the column of the tabular data to generate the data manipulation result;
the third block to transmit via said particular second block, based on the control information, the data manipulation result to the particular first block;
the first block to cause the data manipulation result to be stored at said destination memory location within the respective DMEM of said particular core processor.
Specification