Method and system for converting a single-threaded software program into an application-specific supercomputer

  • US 10,146,516 B2
  • Filed: 09/06/2016
  • Issued: 12/04/2018
  • Est. Priority Date: 11/15/2011
  • Status: Active Grant
First Claim

1. A method, implemented by a compiler, to create an incomplete butterfly sub-network, where r >= 2 is a radix of the incomplete butterfly sub-network and is a power of two; and

    where a number of input ports m >= 1 of the incomplete butterfly sub-network is not a power of r, and/or a number of output ports n >= 1 of the incomplete butterfly sub-network is not a power of r; and

    where the incomplete butterfly sub-network is obtained from a corresponding complete butterfly sub-network consisting of multiplexers, buffers, and wires, where a number of input ports and a number of output ports of the corresponding complete butterfly sub-network are both equal to r^d, where d is a smallest integer that makes r^d greater than or equal to a maximum of m and n, by:

    retaining, in the incomplete butterfly sub-network, only multiplexers, buffers, and wires of the corresponding complete butterfly sub-network required for routing packets from a first m input ports of the corresponding complete butterfly sub-network to a first n output ports of the corresponding complete butterfly sub-network, and deleting any remaining multiplexers, buffers, and wires of the corresponding complete butterfly sub-network; and

    where the compiler automatically translates a single-threaded software program code fragment into a partitioned application-specific supercomputer functionally equivalent to the single-threaded software program code fragment, in part by creating one or more customized incomplete butterfly sub-networks for scalable message communication between hardware components of the partitioned application-specific supercomputer, where each customized incomplete butterfly sub-network among the one or more customized incomplete butterfly sub-networks has a minimum number of input ports, a minimum number of output ports, and a minimum number of payload bits per port for reducing area, power, and message communication latency.
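
The pruning step recited in the claim can be illustrated with a minimal Python sketch. It is not the patented compiler's implementation: the function names (`smallest_depth`, `route`, `incomplete_butterfly`), the modelling of each multiplexer and buffer as a `(stage, position)` node, and the choice to fix one base-r digit per stage starting with the most significant digit are all assumptions made for illustration. The sketch computes d as the smallest integer with r^d >= max(m, n), traces the unique path from each of the first m inputs to each of the first n outputs, and keeps only the nodes and wires that lie on at least one such path.

```python
def smallest_depth(r, m, n):
    """Smallest integer d >= 0 such that r**d >= max(m, n)."""
    d, size = 0, 1
    while size < max(m, n):
        size *= r
        d += 1
    return d


def route(src, dst, r, d):
    """Positions visited at stages 0..d when routing src -> dst.

    Assumed routing rule: stage s overwrites base-r digit (d - 1 - s)
    of the current position with the same digit of the destination.
    """
    path = [src]
    pos = src
    for stage in range(d):
        weight = r ** (d - 1 - stage)
        old_digit = (pos // weight) % r
        new_digit = (dst // weight) % r
        pos += (new_digit - old_digit) * weight
        path.append(pos)
    return path


def incomplete_butterfly(r, m, n):
    """Return (d, nodes, wires) retained for the first m inputs and n outputs.

    nodes: set of (stage, position) pairs, standing in for multiplexers/buffers.
    wires: set of (stage, from_position, to_position) triples.
    Anything in the complete r**d-port network not in these sets is deleted.
    """
    d = smallest_depth(r, m, n)
    nodes, wires = set(), set()
    for src in range(m):
        for dst in range(n):
            path = route(src, dst, r, d)
            for stage, pos in enumerate(path):
                nodes.add((stage, pos))
            for stage in range(d):
                wires.add((stage, path[stage], path[stage + 1]))
    return d, nodes, wires


if __name__ == "__main__":
    r, m, n = 2, 3, 3
    d, nodes, wires = incomplete_butterfly(r, m, n)
    full_ports = r ** d
    print("d =", d, "complete ports =", full_ports)
    print("nodes kept:", len(nodes), "of", full_ports * (d + 1))
    print("wires kept:", len(wires), "of", full_ports * r * d)
```

Under these assumptions, a radix-2 network with m = 3 inputs and n = 3 outputs is derived from a 4-port complete butterfly (d = 2), and the pruned network keeps 10 of the 12 nodes and 12 of the 16 wires.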
