QR decomposition in an integrated circuit device
First Claim
1. Circuitry for performing QR decomposition of an input matrix, said circuitry comprising:
- a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; and
a second datapath for performing an inverse square root operation and multiplication operations on output of said summer of said first datapath and on said inverse square root operation;
wherein;
on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; and
on a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations.
1 Assignment
0 Petitions
Accused Products
Abstract
Circuitry speeds up the QR decomposition of a matrix. The circuitry can be provided in a fixed logic device, or can be configured into a programmable integrated circuit device such as a programmable logic device. This implementation performs Gram-Schmidt orthogonalization with no dependencies between iterations. QR decomposition of a matrix can be performed by processing entire columns at once as a vector operation. Data dependencies within and between matrix columns are removed, as later functions dependent on an earlier result may be generated from partial results somewhere in the datapath, rather than from an earlier completed result. Different passes through the matrix are timed so that different computations requiring the same functional units arrive at different time slots. After the Q matrix has been calculated, the R matrix may be calculated from the Q matrix by taking its transpose and multiplying the transpose by the original input matrix.
344 Citations
29 Claims
-
1. Circuitry for performing QR decomposition of an input matrix, said circuitry comprising:
-
a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; and a second datapath for performing an inverse square root operation and multiplication operations on output of said summer of said first datapath and on said inverse square root operation;
wherein;on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; and on a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A method of configuring a programmable integrated circuit device as circuitry for performing QR decomposition of an input matrix, said method comprising:
-
configuring logic of said programmable integrated circuit device as a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; and configuring logic of said programmable integrated circuit device as a second datapath for performing inverse square root operation and multiplication operation on output of said summer of said first datapath;
wherein;on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm of said first column by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; and on a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. - View Dependent Claims (10, 11, 12, 13, 14, 15)
-
-
16. A programmable integrated circuit device comprising:
-
logic configurable as a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; and logic configurable as a second datapath for performing inverse square root operation and multiplication operation on output of said summer of said first datapath;
wherein, when said logic configurable as a first datapath and said logic configurable as a second datapath are configured, respectively, as said first and second datapath;on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm of said first column by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; and on a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. - View Dependent Claims (17, 18, 19, 20, 21, 22)
-
-
23. A machine-readable data storage medium encoded with machine-executable instructions for configuring a programmable integrated circuit device as circuitry for performing QR decomposition of an input matrix, said instructions comprising:
-
instructions to configure logic of said programmable integrated circuit device as a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; and instructions to configure logic of said programmable integrated circuit device as a second datapath for performing inverse square root operation and multiplication operation on output of said summer of said first datapath;
wherein;on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column and multiplies said inverse norm of said first column by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; and on a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. - View Dependent Claims (24, 25, 26, 27, 28, 29)
-
Specification