Method for dynamic context scope selection in hybrid N-gram+LSA language modeling

  • US 6,778,952 B2
  • Filed: 09/12/2002
  • Issued: 08/17/2004
  • Est. Priority Date: 03/10/2000
  • Status: Expired due to Term
First Claim

1. A method comprising:

  • computing a plurality of global probabilities of an input word based on a context having a dynamic scope determined by discounting words observed prior to the input word according to an exponential function, the context represented by a vector in a latent semantic analysis (LSA) space, wherein the vector representation is generated from at least one decomposition matrix of a singular value decomposition of a co-occurrence matrix, W, between M words in a vocabulary V and N documents in a text corpus T, and wherein the vector representation $\tilde{v}_q$ at time q is defined as

    $$\tilde{v}_q = \frac{1}{n_q} \sum_{p=1}^{q} \lambda^{(n_q - n_p)} \left(1 - \varepsilon_{i_p}\right) u_{i_p} S^{-1}$$

    where $n_q$ is the number of words observed up to time q, $n_p$ is the number of words observed up to time p, $i_p$ is the index of the word observed at time p, $\varepsilon_{i_p}$ is the normalized entropy of the word observed at time p within T, $0 < \lambda \leq 1$, $u_{i_p}$ is the left singular vector at time p of the singular value decomposition of W, and S is the diagonal matrix of singular values of the singular value decomposition of W;

  • computing a plurality of local probabilities of the input word; and

  • combining the local probabilities and the global probabilities to produce a language model probability for the input word.
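
The context-vector definition in the first claim element can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name `context_vector`, its argument names, the toy co-occurrence matrix, and the entropy values are all hypothetical, and it assumes one word is observed per time step, so that n_p = p and n_q = q.

```python
import numpy as np

def context_vector(word_indices, U, s, entropy, lam):
    """Evaluate the exponentially discounted context vector by direct summation.

    word_indices: indices i_p of the words observed at times p = 1..q
    U:            (M, R) array whose rows are the left singular vectors u_i
    s:            (R,) array of singular values (the diagonal of S)
    entropy:      (M,) array of normalized entropies, each in [0, 1]
    lam:          discount factor lambda, with 0 < lam <= 1
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must satisfy 0 < lam <= 1")
    idx = np.asarray(word_indices)
    q = len(idx)
    # Assumption (not from the claim): one word per time step, so n_p = p.
    n_p = np.arange(1, q + 1)
    # lambda^(n_q - n_p) * (1 - eps_{i_p}); older words get smaller weights.
    weights = lam ** (q - n_p) * (1.0 - entropy[idx])
    # u_{i_p} S^{-1}: dividing a row by s scales each component by 1 / s_r.
    scaled = U[idx] / s
    return (weights[:, None] * scaled).sum(axis=0) / q

# Hypothetical toy data: 6 words, 4 documents, made-up entropies.
rng = np.random.default_rng(0)
W = rng.random((6, 4))
U, s, _ = np.linalg.svd(W, full_matrices=False)
eps = np.array([0.1, 0.5, 0.9, 0.3, 0.2, 0.7])

v_full = context_vector([0, 2, 5], U, s, eps, lam=1.0)   # no discounting
v_short = context_vector([0, 2, 5], U, s, eps, lam=0.5)  # narrower scope
```

With `lam=1.0` every observed word contributes with equal discount, while smaller values shrink the contribution of earlier words, which is how the exponential function narrows the effective scope of the context.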
