×

Compression of language model structures and word identifiers for automated speech recognition systems

  • US 20040138884A1
  • Filed: 01/13/2003
  • Published: 07/15/2004
  • Est. Priority Date: 01/13/2003
  • Status: Active Grant
First Claim
Patent Images

1. A computer implemented method for compressing a language model of a continuous speech recognition system, wherein the language model is an N-gram including a plurality of arrays A[.], each array A[.] including ordered positive integer values stored in a memory, the integers representing a vocabulary of the language mode, the compressing for each array A[.] comprising:

  • defining an inverse array I[.] of the ordered array A[.] as I[j]=inverse(A[.])=first(j+k, A[.]);

    k=arg minl(j+lε

    A[.]), l≧

    0, where a function first(j, A[.]) returns a location of a first instance of an jth entry in the array A[.], and a function minl(j+lε

    A[.]) returns a smallest value l such that j+l is an entry in the array A[.];

    defining a kth split inverse Ik[.]=splitinversek(A[.]) of the array A[.] as Ik[.]=inverse(A[.]>

    >

    k), where A[.]>

    >

    k is an array obtained by right-shifting each entry of the array A[.] by k bits;

    defining a kth split of the array A[.] as Ak[.]=A[.]&

    Mask[k], where Mask [k]=

    j=0
    k-1


    2k·

    A

    [·

    ]
    represents an array obtained by masking all but the last k bits of each of entry of the array A[.];

    defining a null array A0[.] having the same length as the array A[.], defining a function Size(A[.]) for returning a total number of bits required to store the array A[.] in a compressed form as Size(A[.])=length(A[.])×

    width(A[.]), where width(A[.])=ceil(log2(max(A[.]))) is the number of bits required to store a largest integer value in the array A[.];

    defining a function MinSize for determining a minimum number of bits required to store the array A[.] in terms of the split arrays and split inverse arrays by MinSize(A[.])=mink{Size(Ik[.])+Size(Ak[.])+Ck}, where Ck is overhead required to store pointers to the split arrays, length of the split arrays, and an indication that the array A[.] is stored in terms of the split arrays and the split inverse arrays;

    defining a function OptSize(.) for determining a size of the array A[.] in terms of the split arrays and the split inverse arrays as OptSize(A[.])=mink(OptSize(Ik[.])+Size(Ak[.])+Ck}, wherein {circumflex over (k)} is an optimal value of k;

    determining an optimal size for the array Aj[.] using the function OptSize;

    storing the array Aj[.] if kj=width(Aj[.]), and otherwise separating the array Aj[.] into the split inverse arrays Ikjj and the split arrays Akjj, and storing the split arrays Akjj, and setting the array Aj+l[.] equal to the array Ikjj, and repeating, beginning at the determining step, for j=j+1 to generate the arrays Ak00[.], Akll[.], Ak22[.], . . . , AkJ-lJ-l[.], and the array AJ[.], where J is the value of j upon completion.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×