US 20070233669A2 - Selective Latent Semantic Indexing Method for Information Retrieval Applications

1 Assignment

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

A term-by-document (or part-by-collection) matrix can be used to index documents (or collections) for information retrieval applications. Reducing the rank of the indexing matrix can further reduce the complexity of information retrieval. A method for index matrix rank reduction can involve computing a singular value decomposition and then retaining singular values based on the singular values corresponding to singular values of multiple topics. The expected singular values corresponding to a topic can be determined using the roots of a specially formed characteristic polynomial. The coefficients of the special characteristic polynomial can be based on computing the determinants of a Gram matrix of term (or part) probabilities, a method of recursion, or a method of recursion further weighted by the probability of document (or collection) lengths.

15 Citations

View as Search Results

71 Claims

1. A method for generating a reduced rank matrix approximation for information retrieval, comprising:
- forming a term-by-document matrix A, wherein the elements of the matrix A represent a plurality of terms within a plurality of documents, the documents related to a plurality of topics;
  
  performing a singular value decomposition of the matrix A to identify a plurality of actual singular values each having a corresponding singular vector;
  
  selecting a subset of the actual singular values based on singular values that correspond to at least one desired topic; and
  
  determining a set of singular vectors based on the selected singular values, wherein the singular vectors provide an index for use during information retrieval.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
- - 2. The method of claim 1, further comprising the step of computing a reduced rank matrix based on the selected singular values and the selected singular vectors, the reduced rank matrix providing an index for use during information retrieval.
  - 3. The method of claim 1, further comprising the step of identifying a probabilistic matrix model, the model comprising at least one of a plurality of probabilities representing the number of occurrences of the terms within the documents.
  - 4. The method of claim 1, wherein the selecting step comprises estimating a plurality of singular values corresponding to at least one of the topics and selecting actual singular values of matrix A that correspond to the estimated singular values.
  - 5. The method of claim 4, wherein the selecting step further comprises matching the estimated singular values corresponding to one or more topics with the actual singular values of matrix A.
  - 6. The method of claim 4, wherein the selecting step further comprises selecting a plurality of actual singular values of matrix A that each correspond to at least one of the estimated singular values, the actual singular values corresponding to at least two of the topics.
  - 7. The method of claim 4, wherein the step of estimating singular values corresponding to a topic comprises the steps of:
    - generating characteristic coefficients;
      
      forming a special characteristic polynomial based on the characteristic coefficients; and
      
      solving for the roots of the characteristic polynomial, wherein multiplying the roots by the number of documents related to the topic and then taking a square-root yields the estimated singular values.
  - 8. The method of claim 7, wherein the step of generating characteristic coefficients comprises, for each coefficient:
    - forming a vector representing probabilities of the terms in the documents;
      
      forming a matrix B with copies of the vector as its columns; and
      
      computing the determinant of the Gram matrix given by |B^TB| to generate each coefficient.
  - 9. The method of claim 7, wherein the step of generating characteristic coefficients comprises computing a recursion such that each coefficient is based on those coefficients already computed.
  - 10. The method of claim 9, wherein the step of generating characteristic coefficients further comprises computing a probabilistically weighted average of coefficients based on the probability of document length, the weighted averaging allowing the method to function with documents of non-uniform lengths.

11. A method for generating a reduced rank matrix approximation for information retrieval, comprising:
- forming a part-by-collection matrix A, wherein the elements of the matrix represent the existence of one or more parts within one or more collections;
  
  performing a singular value decomposition of the matrix A to identify a plurality of actual singular values each having a corresponding singular vector;
  
  selecting a subset of the actual singular values based on singular values that correspond to at least one desired probabilistic distribution on parts; and
  
  determining a set of singular vectors based on the selected singular values, wherein the singular vectors provide an index for use during information retrieval.
- View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
- - 12. The method of claim 11, further comprising the step of computing a reduced rank matrix based on the selected singular values and the selected singular vectors, the reduced rank matrix providing an index for use during information retrieval.
  - 13. The method of claim 11, further comprising the step of identifying a probabilistic matrix model, the model comprising one or more probabilities representing the number of occurrences of one or more parts within one or more collections, the collections relating to a probabilistic distribution on parts.
  - 14. The method of claim 11, wherein the selecting step comprises estimating singular values corresponding to at least one of a plurality of probabilistic distributions on parts and selecting actual singular values of matrix A that correspond to the estimated singular values.
  - 15. The method of claim 14, wherein the selecting step further comprises matching the estimated singular values corresponding to at least one of a plurality of probabilistic distributions on parts with the actual singular values of matrix A.
  - 16. The method of claim 14, wherein the selecting step further comprises selecting at least one of the actual singular values of matrix A that substantially equal at least one of the estimated singular values corresponding to at least one of the desired probabilistic distributions on parts.
  - 17. The method of claim 14, wherein the step of estimating singular values corresponding to a probabilistic distribution on parts comprises the steps of:
    - generating characteristic coefficients;
      
      forming a characteristic polynomial; and
      
      solving for the roots of the characteristic polynomial, wherein multiplying the roots by the number of collections related to the probabilistic distribution on parts and then taking a square-root yields the estimated singular values.
  - 18. The method of claim 17, wherein the step of generating characteristic coefficients comprises, for each coefficient:
    - forming a vector representing probabilities of the parts;
      
      forming a matrix B with copies of the vector as its columns; and
      
      computing the determinant of the Gram matrix given by |B^TB| to generate each coefficient.
  - 19. The method of claim 18, wherein the step of generating characteristic coefficients comprises computing a recursion such that each coefficient is based on those coefficients already computed.
  - 20. The method of claim 19, wherein the step of generating characteristic coefficients further comprises computing a probabilistically weighted average of coefficients based on the probability of collection size, the weighted averaging allowing the method to function with collections of non-uniform sizes.

21. A method for reducing the rank of indexing matrix A for information retrieval, comprising:
- determining the singular values of matrix A;
  
  grouping the singular values of matrix A based on their correspondence to each of a plurality of topics; and
  
  computing the reduced rank matrix based on the grouping of the singular values, wherein the reduced rank matrix provides an index for use during information retrieval.
- View Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29)
- - 22. The method of claim 21, wherein the grouping step comprises estimating singular values corresponding to at least one of the topics and selecting actual singular values of matrix A that correspond to the estimated singular values.
  - 23. The method of claim 21, wherein the determining step comprises performing a singular value decomposition of the matrix A to identify a plurality of actual singular values each having a corresponding singular vector.
  - 24. The method of claim 22, wherein the selecting step comprises matching the estimated singular values corresponding to at least one of the topics with the actual singular values of matrix A.
  - 25. The method of claim 21, wherein the computing step further comprises selecting at least one of the singular values grouped to correspond to a desired topic, wherein the selecting step selects the singular values to be computed into the reduced rank matrix to maintain the indexing of the desired topic for use in information retrieval.
  - 26. The method of claim 22, wherein the step of estimating singular values corresponding to a topic comprises the steps of:
    - generating characteristic coefficients;
      
      forming a special characteristic polynomial; and
      
      solving for the roots of the characteristic polynomial, wherein multiplying the roots by the number of documents related to corresponding topics and then taking a square-root yields the estimated singular values.
  - 27. The method of claim 26, wherein the step of generating characteristic coefficients comprises, for each coefficient:
    - forming a vector representing probabilities of the terms;
      
      forming a matrix B with copies of the vector as its columns; and
      
      computing the determinant of the Gram matrix given by |B^TB| to generate each coefficient.
  - 28. The method of claim 26, wherein the step of generating characteristic coefficients comprises computing a recursion such that each subsequent coefficient is based on at least one coefficient already computed.
  - 29. The method of claim 28, wherein the step of generating characteristic coefficients further comprises computing a probabilistically weighted average of coefficients based on the probability of document length, the weighted averaging allowing the method to function with documents of non-uniform lengths.

30. A method for identifying singular values to generate a reduced rank approximation of a term-by-document matrix for use in information retrieval, comprising the step of estimating a plurality of singular values of the matrix, the estimated singular values each corresponding to at least one topic.

31. A method for generating a reduced rank matrix approximation for information retrieval, comprising:
- forming a term-by-document matrix A, wherein the elements of the matrix A represent a plurality of terms within a plurality of documents, each of the documents being related to at least one of plurality of topics;
  
  performing a singular value decomposition of the matrix A to identify a plurality of actual singular values each having a corresponding singular vector;
  
  estimating a plurality of estimated singular values each corresponding to at least one of the topics;
  
  selecting at least one of the actual singular values based on the estimated singular values; and
  
  computing a reduced rank matrix based on the selected singular values, wherein the reduced rank matrix provides an index for use during information retrieval.

32. A program product including a computer readable computer program encoded in a storage medium, the computer program executing an algorithm for computing a characteristic value of a matrix, comprising steps of:
- providing a first matrix;
  
  providing a probability model by sampling from the first matrix; and
  
  extrapolating from the probability model a characteristic value of the first matrix.
- View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51)
- - 33. The program product according to claim 32, wherein the characteristic value is a singular value of the first matrix.
  - 34. The program product according to claim 32, wherein the characteristic value is an eigenvalue of the first matrix.
  - 35. The program product according to claim 32, wherein the probability model comprises:
    - a probability distribution corresponding to a set of elements;
      
      a probability distribution corresponding to a set of sample lengths; and
      
      a probability of a sample from the probability model.
  - 36. The program product according to claim 32, wherein the step of extrapolating from the probability model comprises the steps of:
    - constructing a polynomial corresponding to the probability model; and
      
      extrapolating from the polynomial to obtain the characteristic value of the first matrix.
  - 37. The program product according to claim 36, wherein the step of extrapolating from the polynomial comprises the steps of:
    - finding a root of the polynomial for the probability model of the first matrix; and
      
      extrapolating from the root of the polynomial to obtain the characteristic value of the first matrix.
  - 38. The program product according to claim 37, wherein the step of extrapolating from the root of the polynomial comprises the steps of:
    - multiplying the root by an expected number of samples from the probability model; and
      
      taking the square root of the resulting value to obtain the characteristic value.
  - 39. The program product according to claim 36, wherein the step of constructing the polynomial comprises the steps of:
    - computing one or more coefficients c_i; and
      
      dividing each computed coefficient c_iby i! to obtain the polynomial $λ^{t} - \frac{c_{1}}{1!} λ^{t - 1} + \frac{c_{2}}{2!} λ^{t - 2} - \dots \pm \frac{c_{t}}{t!} λ^{0}$ wherein each of the non-computed coefficients is set to zero and t is from the probability model.
  - 40. The program product according to claim 39, wherein the step of computing a coefficient c_iof the polynomial comprises computing a probabilistically weighted value $c_{i} = \sum$
    - l ⁢
      
      ⁢
      
      c i ⁡
      
      ( l ) ⁢
      
      prob ⁢
      
      ⁢
      
      ( l ) comprising an additional coefficient c_i(l) corresponding to a length l and probability prob(l) from the probability model to obtain the coefficient c_i.
  - 41. The program product according to claim 40, wherein the step of computing the coefficient c_i(l) of the polynomial for the length l comprises computing a determinant of a Gram matrix |B_i^TB|, wherein B_iis a matrix whose i columns are i copies of a column vector of probabilities from the probability model that depend on l, to obtain the coefficient c_i(l).
  - 42. The program product according to claim 40, wherein the step of computing the coefficient c_i(l) of the polynomial for the length l comprises using a recursive formula to obtain the coefficient c_i(l).
  - 43. The program product according to claim 42, wherein the recursive formula comprises $c_{n + 1}$
    - ( l ) = ∑
      
      j = 0 n ⁢
      
      ⁢
      
      ( n j ) ⁢
      
      ( - 1 ) j ⁢
      
      ( j ! ) ⁢
      
      a j + 1 ⁢
      
      c n - j ⁡
      
      ( l ) based on the length l, coefficients c_n−
      
      j(l) with c₀(l)=1, and traces of powers of a second matrix M wherein a_n=trace(Mⁿ).
  - 44. The program product according to claim 43, wherein the second matrix M comprises expected values of products of pairs of probabilities according to the corresponding length l from the probability model.
  - 45. The program product according to claim 43, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} 1 - {(1 - p_{i})}^{l}, \\ 1 - {(1 - p_{i})}^{l} - {(1 - p_{j})}^{l} - {(1 - p_{i} - p_{j})}^{l}, \end{matrix}$
    - i = j i ≠
      
      j wherein p_iand p_jand l are from the probability model.
  - 46. The program product according to claim 43, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} 1, \\ 1 - 2 \end{matrix}$
    - ( 1 - p i ) l - 2 ⁢
      
      ( 1 - p j ) l + 4 ⁢
      
      ( 1 - p i - p j ) l , ⁢
      
      i = j i ≠
      
      j wherein p_iand p_jand l are from the probability model.
  - 47. The program product according to claim 43, wherein the second matrix M comprises $M_{ij} = \sum$
    - [ j 1 + ⋯
      
      + j t = l ] ⁢
      
      ⁢
      
      log ⁢
      
      ⁢
      
      j i ⁢
      
      log ⁢
      
      ⁢
      
      j j ⁡
      
      ( l j 1 ⁢
      
      ⋯
      
      j t ) ⁢
      
      p 1 j 1 ⁢
      
      ⋯
      
      ⁢
      
      p t j t wherein p_iand p_jand l are from the probability model.
  - 48. The program product according to claim 43, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} (l \end{matrix}$
    - ( l - 1 ) ⁢
      
      p i 2 + l ⁢
      
      ⁢
      
      p i ) , l ⁡
      
      ( l - 1 ) ⁢
      
      p i ⁢
      
      p j , ⁢
      
      i = j i ≠
      
      j wherein p_iand p_jand l are from the probability model.
  - 49. The program product according to claim 43, wherein the second matrix M comprises $M_{ij} = \sum$
    - [ j 1 + ⋯
      
      + j t = l ] ⁢
      
      ⁢
      
      j i ⁢
      
      j j ⁡
      
      ( l j 1 ⁢
      
      ⋯
      
      ⁢
      
      j t ) ⁢
      
      p 1 j 1 ⁢
      
      ⋯
      
      ⁢
      
      p t j t wherein p_iand p_jand l are from the probability model.
  - 50. The program product according to claim 43, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} 1 - t^{- l} \end{matrix}$
    - ( t - 1 ) l ,
      
      i = j t l - 2 ⁢
      
      ( t - 1 ) l + ( t - 2 ) l t l = ∑
      
      k = 0 t - 3 ⁢
      
      ⁢
      
      ( t - 3 k ) ⁢
      
      ( t - k - 1 ) ! ⁢
      
      S ⁡
      
      ( l + 1 , t - k )
      
      t l , i ≠
      
      j wherein t and l are from the probability model and S(n,k) are the Stirling numbers of the second kind.
  - 51. The program product according to claim 43, wherein the step of computing traces of powers of the second matrix M comprises summing j+1 powers of eigenvalues of the second matrix M to obtain the value a_j+1for use in the recursive formula.

52. A program product including a computer readable computer program encoded in a storage medium, the computer program executing an algorithm for computing an expected characteristic value of matrices created from a probability model comprising the steps of:
- providing a probability model; and
  
  extrapolating from the probability model an expected characteristic value of matrices created from the probability model.
- View Dependent Claims (53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71)
- - 53. The program product according to claim 52, wherein the characteristic value is a singular value.
  - 54. The program product according to claim 52, wherein the characteristic value is an eigenvalue.
  - 55. The program product according to claim 52, wherein the probability model comprises:
    - a probability distribution corresponding to a set of elements;
      
      a probability distribution corresponding to a set of sample lengths; and
      
      a probability of a sample from the probability model.
  - 56. The program product according to claim 52, wherein the step of extrapolating from the probability model comprises the steps of:
    - constructing a polynomial corresponding to the probability model; and
      
      extrapolating from the polynomial to obtain the characteristic value.
  - 57. The program product according to claim 56, wherein the step of extrapolating from the polynomial comprises the steps of:
    - finding a root of the polynomial for the probability model; and
      
      extrapolating from the root of the polynomial to obtain the characteristic value.
  - 58. The program product according to claim 57, wherein the step of extrapolating from the root of the polynomial comprises the steps of:
    - multiplying the root by an expected number of samples from the probability model; and
      
      taking the square root of the resulting value to obtain the characteristic value.
  - 59. The program product according to claim 56, wherein the step of constructing the polynomial comprises the steps of:
    - computing one or more coefficients c_i; and
      
      dividing each computed coefficient c_iby i! to obtain the polynomial $λ^{t} - \frac{c_{1}}{1!} λ^{t - 1} + \frac{c_{2}}{2!} λ^{t - 2} - \dots \pm \frac{c_{t}}{t!} λ^{0}$ wherein each of the non-computed coefficients is set to zero and t is from the probability model.
  - 60. The program product according to claim 59, wherein the step of computing a coefficient c_iof the polynomial comprises computing a probabilistically weighted value $c_{i} = \sum$
    - l ⁢
      
      ⁢
      
      c i ⁡
      
      ( l ) ⁢
      
      ⁢
      
      prob ⁢
      
      ⁢
      
      ( l ) comprising an additional coefficient c_i(l) corresponding to a length l and probability prob(l) from the probability model to obtain the coefficient c_i.
  - 61. The program product according to claim 60, wherein the step of computing the coefficient c_i(l) of the polynomial for the length l comprises computing a determinant of a Gram matrix |B_i^TB_i|, wherein B_iis a matrix whose i columns are i copies of a column vector of probabilities from the probability model that depend on l, to obtain the coefficient c_i(l).
  - 62. The program product according to claim 60, wherein the step of computing the coefficient c_i(l) of the polynomial for the length l comprises using a recursive formula to obtain the coefficient c_i(l).
  - 63. The program product according to claim 62, wherein the recursive formula comprises $c_{n + 1}$
    - ( l ) = ∑
      
      j = 0 n ⁢
      
      ( n j ) ⁢
      
      ( - 1 ) j ⁢
      
      ( j ! ) ⁢
      
      a j + 1 ⁢
      
      c n - j ⁡
      
      ( l ) based on the length l, coefficients c_n−
      
      j(l) with c₀(l)=1, and traces of powers of a second matrix M wherein a_n=trace(Mⁿ).
  - 64. The program product according to claim 63, wherein the second matrix M comprises expected values of products of pairs of probabilities according to the corresponding length l from the probability model.
  - 65. The program product according to claim 63, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} 1 - {(1 - p_{i})}^{l}, & i = j \\ 1 - {(1 - p_{i})}^{l} - {(1 - p_{j})}^{l} - {(1 - p_{i} - p_{j})}^{l}, & i \neq \end{matrix}$
    - j wherein p_iand p_jand l are from the probability model.
  - 66. The program product according to claim 63, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} 1, & i = j \\ 1 - 2 \end{matrix}$
    - ( 1 - p i ) l - 2 ⁢
      
      ( 1 - p j ) l + 4 ⁢
      
      ( 1 - p i - p j ) l , i ≠
      
      j wherein p_iand p_jand l are from the probability model.
  - 67. The program product according to claim 63, wherein the second matrix M comprises $M_{ij} = \sum$
    - [ j 1 + ⋯
      
      + j t = l ] ⁢
      
      log ⁢
      
      ⁢
      
      j i ⁢
      
      log ⁢
      
      ⁢
      
      j j ⁡
      
      ( l j 1 ⁢
      
      ⁢
      
      ⋯
      
      ⁢
      
      ⁢
      
      j t ) ⁢
      
      p 1 j 1 ⁢
      
      ⁢
      
      ⋯
      
      ⁢
      
      ⁢
      
      p t j t wherein p_iand p_jand l are from the probability model.
  - 68. The program product according to claim 63, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} (l \end{matrix}$
    - ( l - 1 ) ⁢
      
      p i 2 + lp i ) , i = j l ⁡
      
      ( l - 1 ) ⁢
      
      p i ⁢
      
      p j , i ≠
      
      j wherein p_iand p_jand l are from the probability model.
  - 69. The program product according to claim 63, wherein the second matrix M comprises $M_{ij} = \sum$
    - [ j 1 + ⋯
      
      + j t = l ] ⁢
      
      j i ⁢
      
      j j ⁡
      
      ( l j 1 ⁢
      
      ⁢
      
      ⋯
      
      ⁢
      
      ⁢
      
      j t ) ⁢
      
      p 1 j 1 ⁢
      
      ⁢
      
      ⋯
      
      ⁢
      
      ⁢
      
      p t j t wherein p_iand p_jand l are from the probability model.
  - 70. The program product according to claim 63, wherein the second matrix M comprises $M_{ij} = {\begin{matrix} 1 - t^{- l} \end{matrix}$
    - ( t - 1 ) l , i = j t l - 2 ⁢
      
      ( t - 1 ) l + ( t - 2 ) l t l = ∑
      
      k = 0 t - 3 ⁢
      
      ( t - 3 k ) ⁢
      
      ( t - k - 1 ) ! ⁢
      
      S ⁡
      
      ( l + 1 , t - k ) t l , i ≠
      
      j wherein t and l are from the probability model and S(n,k) are the Stirling numbers of the second kind.
  - 71. The program product according to claim 63, wherein the step of computing traces of powers of the second matrix M comprises summing j+1 powers of eigenvalues of the second matrix M to obtain the value a_j+1for use in the recursive formula.

Specification

Selective Latent Semantic Indexing Method for Information Retrieval Applications

First Claim

1 Assignment

0 Petitions

Accused Products

Abstract

15 Citations

71 Claims

Specification

Solutions

Use Cases

Quick Links

Selective Latent Semantic Indexing Method for Information Retrieval Applications

First Claim

1 Assignment

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

15 Citations

71 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links