Matrix decomposition in an integrated circuit device
First Claim
1. Matrix decomposition circuitry for triangulating an input matrix to create a resultant matrix having a plurality of resultant matrix elements on a diagonal, and having a further plurality of resultant matrix elements arranged in columns below said resultant matrix elements on said diagonal, and having no nonzero matrix elements in said columns above said resultant matrix elements on said diagonal, said matrix decomposition circuitry comprising:
- an inner product computation path including a plurality of multipliers and adders for computing respective inner products of row vectors of said resultant matrix corresponding to respective elements of said input matrix, and a subtractor for subtracting each said respective inner product from said respective element of said input matrix to output respective inner product difference elements corresponding to said respective elements of said input matrix;
an inverse square root multiplication path including a first storage element for latching a particular one of said respective inner product difference elements, inverse square root circuitry that computes an inverse square root of said particular one of said respective inner product difference elements, and a multiplier that multiplies said inverse square root by an output of said inner product computation path; and
a pipeline stage between said inner product computation path and said inverse square root multiplication path;
whereby;
each of said respective inner product difference elements is multiplied by said inverse square root of said particular one of said respective inner product difference elements to output a respective one of resultant matrix elements.
1 Assignment
0 Petitions
Accused Products
Abstract
Circuitry speeds up the Cholesky decomposition of a matrix. The circuitry can be provided in a fixed logic device, or can be configured into a programmable integrated circuit device such as a programmable logic device. The circuitry implements the following equation:
When any lij term is calculated this way, the latency in calculating the ljj term in the denominator has little or no effect on the lij term calculation. And if the calculations are properly pipelined, once the pipeline is filled, a new term can be output on each clock cycle or every few clock cycles.
-
Citations
12 Claims
-
1. Matrix decomposition circuitry for triangulating an input matrix to create a resultant matrix having a plurality of resultant matrix elements on a diagonal, and having a further plurality of resultant matrix elements arranged in columns below said resultant matrix elements on said diagonal, and having no nonzero matrix elements in said columns above said resultant matrix elements on said diagonal, said matrix decomposition circuitry comprising:
-
an inner product computation path including a plurality of multipliers and adders for computing respective inner products of row vectors of said resultant matrix corresponding to respective elements of said input matrix, and a subtractor for subtracting each said respective inner product from said respective element of said input matrix to output respective inner product difference elements corresponding to said respective elements of said input matrix; an inverse square root multiplication path including a first storage element for latching a particular one of said respective inner product difference elements, inverse square root circuitry that computes an inverse square root of said particular one of said respective inner product difference elements, and a multiplier that multiplies said inverse square root by an output of said inner product computation path; and a pipeline stage between said inner product computation path and said inverse square root multiplication path;
whereby;each of said respective inner product difference elements is multiplied by said inverse square root of said particular one of said respective inner product difference elements to output a respective one of resultant matrix elements. - View Dependent Claims (2, 3)
-
-
4. A method of configuring a programmable integrated circuit device as matrix decomposition circuitry for triangulating an input matrix to create a resultant matrix having a plurality of resultant matrix elements on a diagonal, and having a further plurality of resultant matrix elements arranged in columns below said resultant matrix elements on said diagonal, and having no nonzero matrix elements in said columns above said resultant matrix elements on said diagonal, said method comprising:
-
configuring logic of said programmable integrated circuit device as an inner product computation path including a plurality of multipliers and adders for computing respective inner products of row vectors of said resultant matrix corresponding to respective elements of said input matrix, and a subtractor for subtracting each said respective inner product from said respective element of said input matrix to output respective inner product difference elements corresponding to said respective elements of said input matrix; configuring logic of said programmable integrated circuit device as an inverse square root multiplication path including a first storage element for latching a particular one of said respective inner product difference elements, inverse square root circuitry that computes an inverse square root of said particular one of said respective inner product difference elements, and a multiplier that multiplies said inverse square root by an output of said inner product computation path; and configuring logic of said programmable integrated circuit device as a pipeline stage between said inner product computation path and said inverse square root multiplication path;
whereby;said configured logic multiplies each of said respective inner product difference elements by said inverse square root of said particular one of said respective inner product difference elements to output a respective one of resultant matrix elements. - View Dependent Claims (5, 6)
-
-
7. A programmable integrated circuit device comprising:
-
logic configurable as an inner product computation path including a plurality of multipliers and adders for computing respective inner products of row vectors of said resultant matrix corresponding to respective elements of said input matrix, and a subtractor for subtracting each said respective inner product from said respective element of said input matrix to output respective inner product difference elements corresponding to said respective elements of said input matrix; logic configurable as an inverse square root multiplication path including a first storage element for latching a particular one of said respective inner product difference elements, inverse square root circuitry that computes an inverse square root of said particular one of said respective inner product difference elements, and a multiplier that multiplies said inverse square root by an output of said inner product computation path; and logic configurable as a pipeline stage between said inner product computation path and said inverse square root multiplication path;
whereby;said logic is configurable to multiply each of said respective inner product difference elements by said inverse square root of said particular one of said respective inner product difference elements to output a respective one of resultant matrix elements. - View Dependent Claims (8, 9)
-
-
10. A machine-readable data storage medium encoded with machine-executable instructions for configuring a programmable integrated circuit device as matrix decomposition circuitry for triangulating an input matrix to create a resultant matrix having a plurality of resultant matrix elements on a diagonal, and having a further plurality of resultant matrix elements arranged in columns below said resultant matrix elements on said diagonal, and having no nonzero matrix elements in said columns above said resultant matrix elements on said diagonal, said instructions comprising:
-
instructions to configure logic of said programmable integrated circuit device as an inner product computation path including a plurality of multipliers and adders for computing respective inner products of row vectors of said resultant matrix corresponding to respective elements of said input matrix, and a subtractor for subtracting each said respective inner product from said respective element of said input matrix to output respective inner product difference elements corresponding to said respective elements of said input matrix; instructions to configure logic of said programmable integrated circuit device as an inverse square root multiplication path including a first storage element for latching a particular one of said respective inner product difference elements, inverse square root circuitry that computes an inverse square root of said particular one of said respective inner product difference elements, and a multiplier that multiplies said inverse square root by an output of said inner product computation path; and instructions to configure logic of said programmable integrated circuit device as a pipeline stage between said inner product computation path and said inverse square root multiplication path;
whereby;said instructions cause logic to be configured to multiply each of said respective inner product difference elements by said inverse square root of said particular one of said respective inner product difference elements to output a respective one of resultant matrix elements. - View Dependent Claims (11, 12)
-
Specification