
Multithreaded programmable processor and system with partitioned operations

  • US 7,987,344 B2
  • Filed: 01/16/2004
  • Issued: 07/26/2011
  • Est. Priority Date: 08/16/1995
  • Status: Expired due to Fees
First Claim

1. A programmable processor comprising:

    a data path capable of transmitting data;

    an external interface operable to receive data from an external source and communicate the received data over the data path;

    a register file containing a plurality of registers each having a register width, the register file coupled to the data path and configured to support processing of a plurality of threads and to store a plurality of multiple-bit data elements in partitioned fields, each of the multiple-bit data elements having an elemental width smaller than the register width;

    an execution unit coupled to the data path, the execution unit configured to execute a plurality of instruction streams from the plurality of threads in a multistage pipeline such that the multistage pipeline is capable of including instructions from different ones of the instruction streams in different stages of the multistage pipeline, each instruction stream including a single arithmetic instruction that specifies an arithmetic operation to cause multiple instances of the arithmetic operation to be performed, each instance of the arithmetic operation to be performed using a different one of the plurality of multiple-bit data elements in partitioned fields of at least one of the registers to produce a catenated result, the single arithmetic instruction causing a plurality of multiple-bit data elements in partitioned fields to be read in parallel from a register included in the register file, and causing the catenated result to be written in parallel to one of the registers included in the register file; and

    wherein each of the multiple-bit data elements has an elemental width, and the data path has a data path width multiple times greater than the elemental width, to allow multiple-bit data elements used for the multiple instances of the arithmetic operation to be transmitted in parallel from the register file to the execution unit, and wherein the execution unit is operable to receive, in parallel, multiple-bit data elements for the multiple instances of the arithmetic operation and execute the multiple instances of the single arithmetic instruction to produce the catenated result.
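
As an illustration only, the partitioned arithmetic recited in the claim can be emulated in software. The sketch below assumes a 64-bit register, addition as the arithmetic operation, and wraparound on overflow; the claim fixes none of these. The names REGISTER_WIDTH, partitioned_add and unpack are hypothetical and do not come from the patent. The claim describes hardware that reads and writes the fields in parallel, whereas this sketch only models the resulting values using ordinary integer operations.

```python
REGISTER_WIDTH = 64  # assumed register width in bits (not specified by the claim)


def partitioned_add(a: int, b: int, element_width: int) -> int:
    """Add corresponding fields of two register values; no carry crosses a field."""
    if element_width <= 0 or REGISTER_WIDTH % element_width != 0:
        raise ValueError("element width must evenly divide the register width")
    lanes = REGISTER_WIDTH // element_width
    # Mask with the top bit of every field set (0x8000800080008000 for 16-bit fields).
    high = sum(1 << (i * element_width + element_width - 1) for i in range(lanes))
    low = ((1 << REGISTER_WIDTH) - 1) ^ high
    # Add the lower bits of each field; any carry lands in that field's own top bit.
    partial = (a & low) + (b & low)
    # Combine the top bits with XOR so a carry never enters the neighboring field.
    return partial ^ ((a ^ b) & high)


def unpack(value: int, element_width: int) -> list[int]:
    """Split a register value into its fields, least significant field first."""
    mask = (1 << element_width) - 1
    lanes = REGISTER_WIDTH // element_width
    return [(value >> (i * element_width)) & mask for i in range(lanes)]


if __name__ == "__main__":
    a = 0x0001FFFF7FFF0010
    b = 0x0001000100010020
    # Four 16-bit additions performed by one call; the results are catenated.
    print(hex(partitioned_add(a, b, 16)))
```

With 16-bit fields, the second field from the top (0xFFFF + 0x0001) wraps to 0x0000 without disturbing its neighbor, which is the sense in which each instance of the operation uses a different data element and the outputs form one catenated result.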
