Method for dynamic context scope selection in hybrid N-gram+LSA language modeling
Abstract
A method and system for dynamic language modeling of a document are described. In one embodiment, a number of local probabilities of a current document are computed and a vector representation of the current document in a latent semantic analysis (LSA) space is determined. In addition, a number of global probabilities based upon the vector representation of the current document in the LSA space are computed. Further, the local probabilities and the global probabilities are combined to produce the language model.
20 Claims
1. A method comprising:
computing a plurality of global probabilities of an input word based on a context having a dynamic scope determined by discounting words observed prior to the input word according to an exponential function, the context represented by a vector in a latent semantic analysis (LSA) space, wherein the vector representation is generated from at least one decomposition matrix of a singular value decomposition of a co-occurrence matrix, W, between M words in a vocabulary V and N documents in a text corpus T and wherein the vector representation $\tilde{v}_q$ at time q is defined as [equation not reproduced], where $n_q$ is the number of words observed up to time q, $n_p$ is the number of words observed up to time p, $i_p$ is the index of the word observed at time p, $\varepsilon_{i_p}$ is the normalized entropy of the word observed at time p within T, $0 < \lambda \le 1$, $u_{i_p}$ is the left singular vector at time p of the singular value decomposition of W, and S is the diagonal matrix of singular values of the singular value decomposition of W;
computing a plurality of local probabilities of the input word; and
combining the local probabilities and the global probabilities to produce a language model probability for the input word.
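For illustration, the exponentially discounted context vector of claim 1 might be computed as sketched below. The claim's defining equation is not reproduced in this text, so the form used here, $\tilde{v}_q = \frac{1}{n_q}\sum_{p=1}^{q}\lambda^{(n_q-n_p)}(1-\varepsilon_{i_p})\,u_{i_p}S^{-1}$, is an assumption assembled from the symbols the claim defines. The function name `context_vector`, its parameters, and the simplification $n_p = p$ (one word observed per time step) are likewise hypothetical.

```python
import numpy as np

def context_vector(word_indices, U, s, entropy, lam=0.98):
    """Exponentially discounted LSA context vector (assumed form, see lead-in).

    word_indices: indices i_p of the words observed so far, oldest first
    U:       (M, R) array of left singular vectors; row i is u_i
    s:       (R,) array of singular values (the diagonal of S)
    entropy: (M,) array of normalized entropies eps_i in [0, 1]
    lam:     decay factor, 0 < lam <= 1
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must satisfy 0 < lam <= 1")
    n_q = len(word_indices)
    if n_q == 0:
        return np.zeros(U.shape[1])
    idx = np.asarray(word_indices)
    # Assumption: n_p = p, so the word at position p is discounted by lam**(n_q - p).
    ages = n_q - np.arange(1, n_q + 1)
    weights = (lam ** ages) * (1.0 - entropy[idx])
    # u_i S^{-1}: divide each selected row of U by the singular values.
    return (weights[:, None] * (U[idx] / s)).sum(axis=0) / n_q
```

Under this assumed form, `lam=1.0` weights every observed word equally, while smaller values shrink the effective scope of the context toward the most recent words.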
3. The method of claim 1, wherein the local probabilities are based on an n-gram paradigm.
4. The method of claim 1, wherein the plurality of local probabilities $\Pr(w_q \mid H_q(t))$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, is defined as [equation not reproduced].
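As an illustration of the n-gram paradigm named in claim 3, the sketch below estimates the probability of a word given its $n-1$ predecessors by relative frequency. The defining equation of claim 4 is not reproduced in this text; the unsmoothed maximum-likelihood estimate and the helper names `train_ngram` and `local_probability` are assumptions made for the example.

```python
from collections import Counter

def train_ngram(tokens, n=2):
    """Count n-grams and the (n-1)-word histories that precede them."""
    spans = range(len(tokens) - n + 1)
    ngrams = Counter(tuple(tokens[i:i + n]) for i in spans)
    histories = Counter(tuple(tokens[i:i + n - 1]) for i in spans)
    return ngrams, histories

def local_probability(word, history, ngrams, histories):
    """Relative-frequency estimate of Pr(word | history); 0.0 for unseen histories."""
    history = tuple(history)
    seen = histories[history]
    return ngrams[history + (word,)] / seen if seen else 0.0
```

A practical system would normally add smoothing or back-off so that unseen n-grams do not receive zero probability.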
5. The method of claim 1, wherein the language model probability $\Pr(w_q \mid \tilde{H}_{q-1})$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, the context $\tilde{d}_{q-1}$, and a vocabulary V, is defined as [equation not reproduced].
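The combining step can be illustrated with a short sketch. The defining equation of claim 5 is not reproduced in this text, so the rule used here (multiply the two estimates for each candidate word, optionally divide out a unigram prior, and renormalize over the vocabulary V) is an assumption, as are the function name `combine` and its parameters.

```python
import numpy as np

def combine(local, global_, prior=None):
    """Combine per-word local and global probabilities over a vocabulary V.

    local, global_: sequences of length |V|, one entry per candidate word
    prior: optional strictly positive unigram probabilities, used to divide
           out the word prior that both models would otherwise count twice
           (an assumption of this sketch)
    """
    local = np.asarray(local, dtype=float)
    scores = local * np.asarray(global_, dtype=float)
    if prior is not None:
        scores = scores / np.asarray(prior, dtype=float)
    total = scores.sum()
    if total == 0.0:
        # The two models share no support: fall back to the local estimate.
        return local
    return scores / total
```

Renormalizing over V keeps the result a proper distribution even though the product of two distributions generally does not sum to one.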
6. A machine-readable medium having executable instructions to cause the machine to perform a method comprising:
computing a plurality of global probabilities of an input word based on a context having a dynamic scope determined by discounting words observed prior to the input word according to an exponential function, the context represented by a vector in a latent semantic analysis (LSA) space, wherein the vector representation is generated from at least one decomposition matrix of a singular value decomposition of a co-occurrence matrix, W, between M words in a vocabulary V and N documents in a text corpus T and wherein the vector representation $\tilde{v}_q$ at time q is defined as [equation not reproduced], where $n_q$ is the number of words observed up to time q, $n_p$ is the number of words observed up to time p, $i_p$ is the index of the word observed at time p, $\varepsilon_{i_p}$ is the normalized entropy of the word observed at time p within T, $0 < \lambda \le 1$, $u_{i_p}$ is the left singular vector at time p of the singular value decomposition of W, and S is the diagonal matrix of singular values of the singular value decomposition of W;
computing a plurality of local probabilities of the input word; and
combining the local probabilities and the global probabilities to produce a language model probability for the input word.
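The singular value decomposition of the co-occurrence matrix W recited in the independent claims can be sketched with NumPy as follows. The use of raw counts as matrix entries, the function name `lsa_decomposition`, and the `rank` parameter (the number R of singular values retained) are assumptions of this example; the claims do not state how the entries of W are weighted.

```python
import numpy as np

def lsa_decomposition(docs, vocab, rank):
    """Build an M x N word-document count matrix W and its rank-R SVD.

    docs:  list of N tokenized documents (lists of strings)
    vocab: list of the M vocabulary words
    Returns W, U (M x R), s (R singular values), V (N x R).
    """
    index = {word: i for i, word in enumerate(vocab)}
    W = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc:
            if token in index:  # out-of-vocabulary tokens are ignored
                W[index[token], j] += 1.0
    # numpy returns W = U @ diag(s) @ Vt with singular values in descending order.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return W, U[:, :rank], s[:rank], Vt[:rank].T
```

Row i of the returned `U` plays the role of the left singular vector $u_i$ for word i, and `s` holds the diagonal of S.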
8. The machine-readable medium of claim 6, wherein the local probabilities are based on an n-gram paradigm.
9. The machine-readable medium of claim 6, wherein the plurality of local probabilities $\Pr(w_q \mid H_q(t))$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, is defined as [equation not reproduced].
10. The machine-readable medium of claim 6, wherein the language model probability $\Pr(w_q \mid \tilde{H}_{q-1})$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, the context $\tilde{d}_{q-1}$, and a vocabulary V, is defined as [equation not reproduced].
17. The apparatus of claim 6, wherein the plurality of global probabilities $\Pr(w_q \mid H_{q-1})$ for the input word $w_q$ given the context $\tilde{d}_{q-1}$ is defined as [equation not reproduced].
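A global probability of this kind could be derived from the closeness between each word and the context in the LSA space, as in the sketch below. The defining equation is not reproduced in this text, so every modeling choice here is an assumption: words are represented as $u_i S^{1/2}$, the context as $\tilde{v} S^{1/2}$, closeness is measured by cosine similarity, and the similarities are shifted to [0, 1] and normalized to sum to one. The function name `global_probabilities` is hypothetical.

```python
import numpy as np

def global_probabilities(U, s, v_context):
    """Map word/context closeness in LSA space to a distribution over V.

    U: (M, R) left singular vectors; s: (R,) singular values;
    v_context: (R,) context vector in the same space.
    """
    root_s = np.sqrt(s)
    words = U * root_s          # word i represented as u_i S^(1/2)
    ctx = v_context * root_s    # context represented as v S^(1/2)
    norms = np.linalg.norm(words, axis=1) * np.linalg.norm(ctx)
    # Cosine similarity, defined as 0 where either vector has zero length.
    cosine = np.divide(words @ ctx, norms, out=np.zeros(len(words)), where=norms > 0)
    # Assumption: shift cosine from [-1, 1] to [0, 1], then normalize to sum to 1.
    scores = (1.0 + cosine) / 2.0
    return scores / scores.sum()
```

Other mappings from similarity to probability are possible; this one is chosen only because it is simple and always yields non-negative scores.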
11. A system comprising:
a processor coupled to a memory through a bus; and
a language modeling process executed from the memory by the processor to cause the processor to compute a plurality of global probabilities of an input word based on a context having a dynamic scope determined by discounting words observed prior to the input word according to an exponential function, the context represented by a vector in a latent semantic analysis (LSA) space, wherein the vector representation is generated from at least one decomposition matrix of a singular value decomposition of a co-occurrence matrix, W, between M words in a vocabulary V and N documents in a text corpus T and wherein the vector representation $\tilde{v}_q$ at time q is defined as [equation not reproduced], where $n_q$ is the number of words observed up to time q, $n_p$ is the number of words observed up to time p, $i_p$ is the index of the word observed at time p, $\varepsilon_{i_p}$ is the normalized entropy of the word observed at time p within T, $0 < \lambda \le 1$, $u_{i_p}$ is the left singular vector at time p of the singular value decomposition of W, and S is the diagonal matrix of singular values of the singular value decomposition of W, compute a plurality of local probabilities of the input word, and combine the local probabilities and the global probabilities to produce a language model probability for the input word.
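The normalized entropy $\varepsilon_i$ of a word within the corpus T, which the independent claims use to weight each observed word, can be computed from the count matrix W. The sketch below assumes the usual definition, $\varepsilon_i = -\frac{1}{\log N}\sum_{j=1}^{N}\frac{c_{i,j}}{t_i}\log\frac{c_{i,j}}{t_i}$, where $c_{i,j}$ is the count of word i in document j and $t_i=\sum_j c_{i,j}$; the claims themselves do not spell this out, and the function name `normalized_entropy` is hypothetical.

```python
import numpy as np

def normalized_entropy(W):
    """Normalized entropy of each word (row) of an M x N count matrix W, N >= 2.

    Returns values in [0, 1]: 0 for a word confined to a single document,
    1 for a word spread evenly over all N documents.
    """
    W = np.asarray(W, dtype=float)
    totals = W.sum(axis=1, keepdims=True)
    # Fraction of each word's occurrences that fall in each document.
    p = np.divide(W, totals, out=np.zeros_like(W), where=totals > 0)
    # Treat 0 * log(0) as 0 by substituting 1 inside the logarithm.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    return -plogp.sum(axis=1) / np.log(W.shape[1])
```

A word with entropy near 1 carries little topical information, which is why a factor of $(1-\varepsilon_{i})$ gives such words less influence on the context.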
13. The system of claim 11, wherein the local probabilities are based on an n-gram paradigm.
14. The system of claim 11, wherein the plurality of local probabilities $\Pr(w_q \mid H_q(t))$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, is defined as [equation not reproduced].
15. The system of claim 11, wherein the language model probability $\Pr(w_q \mid \tilde{H}_{q-1})$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, the context $\tilde{d}_{q-1}$, and a vocabulary V, is defined as [equation not reproduced].
16. An apparatus comprising:
means for computing a plurality of global probabilities of an input word based on a context having a dynamic scope determined by discounting words observed prior to the input word according to an exponential function, the context represented by a vector in a latent semantic analysis (LSA) space, wherein the vector representation is generated from at least one decomposition matrix of a singular value decomposition of a co-occurrence matrix, W, between M words in a vocabulary V and N documents in a text corpus T and wherein the vector representation $\tilde{v}_q$ at time q is defined as [equation not reproduced], where $n_q$ is the number of words observed up to time q, $n_p$ is the number of words observed up to time p, $i_p$ is the index of the word observed at time p, $\varepsilon_{i_p}$ is the normalized entropy of the word observed at time p within T, $0 < \lambda \le 1$, $u_{i_p}$ is the left singular vector at time p of the singular value decomposition of W, and S is the diagonal matrix of singular values of the singular value decomposition of W;
means for computing a plurality of local probabilities of the input word; and
means for combining the local probabilities and the global probabilities to produce a language model probability for the input word.
20. The apparatus of claim 16, wherein the language model probability $\Pr(w_q \mid \tilde{H}_{q-1})$ for the input word $w_q$, given a local context of $n-1$ words, $w_{q-1}, w_{q-2}, \ldots, w_{q-n+1}$, the context $\tilde{d}_{q-1}$, and a vocabulary V, is defined as [equation not reproduced].
Specification