Computer Implemented Method and Program for Fast Estimation of Matrix Characteristic Values
1 Assignment
0 Petitions
Accused Products
Abstract
A term-by-document (or part-by-collection) matrix can be used to index documents (or collections) for information retrieval applications. Reducing the rank of the indexing matrix can further reduce the complexity of information retrieval. A method for index matrix rank reduction can involve computing a singular value decomposition and then retaining singular values based on the singular values corresponding to singular values of multiple topics. The expected singular values corresponding to a topic can be determined using the roots of a specially formed characteristic polynomial. The coefficients of the special characteristic polynomial can be based on computing the determinants of a Gram matrix of term (or part) probabilities, a method of recursion, or a method of recursion further weighted by the probability of document (or collection) lengths.
10 Citations
71 Claims
-
1-31. -31. (canceled)
-
32. A program product including a computer readable computer program encoded in a storage medium, the computer program executing an algorithm for computing a characteristic value of a matrix, comprising steps of:
-
providing a first matrix; providing a probability model by sampling from the first matrix; and extrapolating from the probability model a characteristic value of the first matrix. - View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51)
-
33. The program product according to claim 32, wherein the characteristic value is a singular value of the first matrix.
-
34. The program product according to claim 32, wherein the characteristic value is an eigenvalue of the first matrix.
-
35. The program product according to claim 32, wherein the probability model comprises:
-
a probability distribution corresponding to a set of elements; a probability distribution corresponding to a set of sample lengths; and a probability of a sample from the probability model.
-
-
36. The program product according to claim 32, wherein the step of extrapolating from the probability model comprises the steps of:
-
constructing a polynomial corresponding to the probability model; and extrapolating from the polynomial to obtain the characteristic value of the first matrix.
-
-
37. The program product according to claim 36, wherein the step of extrapolating from the polynomial comprises the steps of:
-
finding a root of the polynomial for the probability model of the first matrix; and extrapolating from the root of the polynomial to obtain the characteristic value of the first matrix.
-
-
38. The program product according to claim 37, wherein the step of extrapolating from the root of the polynomial comprises the steps of:
-
multiplying the root by an expected number of samples from the probability model; and taking the square root of the resulting value to obtain the characteristic value.
-
-
39. The program product according to claim 36, wherein the step of constructing the polynomial comprises the steps of:
-
computing one or more coefficients ci; and dividing each computed coefficient ci by i! to obtain the polynomial wherein each of the non-computed coefficients is set to zero and t is from the probability model.
-
-
40. The program product according to claim 39, wherein the step of computing a coefficient ci of the polynomial comprises computing a probabilistically weighted value
-
l c i ( l ) prob ( l ) comprising an additional coefficient ci(l) corresponding to a length l and probability prob(l) from the probability model to obtain the coefficient ci.
-
-
41. The program product according to claim 40, wherein the step of computing the coefficient ci(l) of the polynomial for the length l comprises computing a determinant of a Gram matrix |BiTBi|, wherein Bi is a matrix whose i columns are i copies of a column vector of probabilities from the probability model that depend on l, to obtain the coefficient ci(l).
-
42. The program product according to claim 40, wherein the step of computing the coefficient ci(l) of the polynomial for the length l comprises using a recursive formula to obtain the coefficient ci(l).
-
43. The program product according to claim 42, wherein the recursive formula comprises
-
( l ) = ∑ j = 0 n ( n j ) ( - 1 ) j ( j ! ) a j + 1 c n - j ( l ) based on the length l, coefficients cn−
j(l) with c0(l)=1, and traces of powers of a second matrix M wherein an=trace(Mn).
-
-
44. The program product according to claim 43, wherein the second matrix M comprises expected values of products of pairs of probabilities according to the corresponding length l from the probability model.
-
45. The program product according to claim 43, wherein the second matrix M comprises
-
j wherein pi and pj and l are from the probability model.
-
-
46. The program product according to claim 43, wherein the second matrix M comprises
-
( 1 - p i ) l - 2 ( 1 - p j ) l + 4 ( 1 - p i - p j ) l , i ≠ j wherein pi and pj and l are from the probability model.
-
-
47. The program product according to claim 43, wherein the second matrix M comprises
-
[ j 1 + … + j t = l ] log j i log j j ( l j 1 … j t ) p 1 j 1 … p t j t wherein pi and pj and l are from the probability model.
-
-
48. The program product according to claim 43, wherein the second matrix M comprises
-
( l - 1 ) p i 2 + lp i ) , i = j l ( l - 1 ) p i p j , i ≠ j wherein pi and pj and l are from the probability model.
-
-
49. The program product according to claim 43, wherein the second matrix M comprises
-
[ j 1 + … + j t = l ] j i j j ( l j 1 … j t ) p 1 j 1 … p t j t wherein pi and pj and l are from the probability model.
-
-
50. The program product according to claim 43, wherein the second matrix M comprises
-
( t - 1 ) l , i = j t l - 2 ( t - 1 ) l + ( t - 2 ) l t l = ∑ k = 0 t - 3 ( t - 3 k ) ( t - k - 1 ) ! S ( l + 1 , t - k ) t l , i ≠ j wherein t and l are from the probability model and S(n,k) are the Stirling numbers of the second kind.
-
-
51. The program product according to claim 43, wherein the step of computing traces of powers of the second matrix M comprises summing j+1 powers of eigenvalues of the second matrix M to obtain the value aj+1 for use in the recursive formula.
-
33. The program product according to claim 32, wherein the characteristic value is a singular value of the first matrix.
-
-
52. A program product including a computer readable computer program encoded in a storage medium, the computer program executing an algorithm for computing an expected characteristic value of matrices created from a probability model comprising the steps of:
-
providing a probability model; and extrapolating from the probability model an expected characteristic value of matrices created from the probability model. - View Dependent Claims (53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71)
-
53. The program product according to claim 52, wherein the characteristic value is a singular value.
-
54. The program product according to claim 52, wherein the characteristic value is an eigenvalue.
-
55. The program product according to claim 52, wherein the probability model comprises:
-
a probability distribution corresponding to a set of elements; a probability distribution corresponding to a set of sample lengths; and a probability of a sample from the probability model.
-
-
56. The program product according to claim 52, wherein the step of extrapolating from the probability model comprises the steps of:
-
constructing a polynomial corresponding to the probability model; and extrapolating from the polynomial to obtain the characteristic value.
-
-
57. The program product according to claim 56, wherein the step of extrapolating from the polynomial comprises the steps of:
-
finding a root of the polynomial for the probability model; and extrapolating from the root of the polynomial to obtain the characteristic value.
-
-
58. The program product according to claim 57, wherein the step of extrapolating from the root of the polynomial comprises the steps of:
-
multiplying the root by an expected number of samples from the probability model; and taking the square root of the resulting value to obtain the characteristic value.
-
-
59. The program product according to claim 56, wherein the step of constructing the polynomial comprises the steps of:
-
computing one or more coefficients ci; and dividing each computed coefficient ci by i! to obtain the polynomial wherein each of the non-computed coefficients is set to zero and t is from the probability model.
-
-
60. The program product according to claim 59, wherein the step of computing a coefficient ci of the polynomial comprises computing a probabilistically weighted value
-
l c i ( l ) prob ( l ) comprising an additional coefficient ci(l) corresponding to a length l and probability prob(l) from the probability model to obtain the coefficient ci.
-
-
61. The program product according to claim 60, wherein the step of computing the coefficient ci(l) of the polynomial for the length l comprises computing a determinant of a Gram matrix |BiTBi|, wherein Bi is a matrix whose i columns are i copies of a column vector of probabilities from the probability model that depend on l, to obtain the coefficient ci(l).
-
62. The program product according to claim 60, wherein the step of computing the coefficient ci(l) of the polynomial for the length l comprises using a recursive formula to obtain the coefficient ci(l).
-
63. The program product according to claim 62, wherein the recursive formula comprises
-
( l ) = ∑ j = 0 n ( n j ) ( - 1 ) j ( j ! ) a j + 1 c n - j ( l ) based on the length l, coefficients cn-j(l) with c0(l)=1, and traces of powers of a second matrix M wherein an=trace(Mn).
-
-
64. The program product according to claim 63, wherein the second matrix M comprises expected values of products of pairs of probabilities according to the corresponding length/from the probability model.
-
65. The program product according to claim 63, wherein the second matrix M comprises
-
j wherein pi and pj and l are from the probability model.
-
-
66. The program product according to claim 63, wherein the second matrix M comprises
-
( 1 - p i ) l - 2 ( 1 - p j ) l + 4 ( 1 - p i - p j ) l , i ≠ j wherein pi and pj and l are from the probability model.
-
-
67. The program product according to claim 63, wherein the second matrix M comprises
-
[ j 1 + … + j t = l ] log j i log j j ( l j 1 … j t ) p 1 j 1 … p t j t wherein pi and pj and l are from the probability model.
-
-
68. The program product according to claim 63, wherein the second matrix M comprises
-
( l - 1 ) p i 2 + lp i ) , i = j l ( l - 1 ) p i p j , i ≠ j wherein pi and pj and l are from the probability model.
-
-
69. The program product according to claim 63, wherein the second matrix M comprises
-
[ j 1 + … + j t = l ] j i j j ( l j 1 … j t ) p 1 j 1 … p t j t wherein pi and pj and l are from the probability model.
-
-
70. The program product according to claim 63, wherein the second matrix M comprises
-
( t - 1 ) l , i = j t l - 2 ( t - 1 ) l + ( t - 2 ) l t l = ∑ k = 0 t - 3 ( t - 3 k ) ( t - k - 1 ) ! S ( l + 1 , t - k ) t l , i ≠ j wherein t and l are from the probability model and S(n,k) are the Stirling numbers of the second kind.
-
-
71. The program product according to claim 63, wherein the step of computing traces of powers of the second matrix M comprises summing j+1 powers of eigenvalues of the second matrix M to obtain the value aj−
- 1 for use in the recursive formula.
-
53. The program product according to claim 52, wherein the characteristic value is a singular value.
-
Specification
- Resources
-
Current AssigneeSelective, Inc.
-
Original AssigneeSelective, Inc.
-
InventorsCanfield, Earl Rodney, Martin, Jacob Gilmore
-
Application NumberUS12/632,062Publication NumberTime in Patent OfficeDaysField of SearchUS Class Current707/750CPC Class CodesG06F 16/31 Indexing; Data structures t...