Cluster and pruning-based language model compression
First Claim
1. A computer-implemented method for compressing a word language model comprising:
- predictive clustering the word language model, such that the model after clustering has a larger size than before clustering and the word language model provides parameter values that describe the probabilities in a function P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides; and
- pruning the word language model as clustered utilizing an entropy-based pruning technique.
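The pruning step in the claim can be illustrated with a minimal sketch. It is not the patented procedure: it assumes a simplified relative-entropy criterion that skips the recomputation of backoff weights a full entropy-based pruner would do. The function name `prune_by_entropy_cost`, its arguments, and the toy numbers are all hypothetical.

```python
import math

def prune_by_entropy_cost(probs, backoff, history_prob, threshold):
    """Keep only the n-grams whose removal would cost at least `threshold`.

    probs:        {(history, word): P(word | history)} explicit estimates
    backoff:      {(history, word): probability used if the entry is dropped}
    history_prob: {history: P(history)}
    Simplifying assumption: backoff weights are treated as fixed.
    """
    kept = {}
    for (history, word), p in probs.items():
        q = backoff[(history, word)]
        # Weighted log-ratio: the approximate contribution of this entry to
        # the relative entropy between the full model and the pruned model.
        cost = history_prob[history] * p * (math.log(p) - math.log(q))
        if cost >= threshold:
            kept[(history, word)] = p
    return kept

# Hypothetical toy values, chosen only to exercise the function.
PROBS = {(("on",), "monday"): 0.5, (("on",), "tuesday"): 0.1}
BACKOFF = {(("on",), "monday"): 0.1, (("on",), "tuesday"): 0.09}
HISTORY_PROB = {("on",): 0.2}

KEPT = prune_by_entropy_cost(PROBS, BACKOFF, HISTORY_PROB, threshold=0.01)
```

With these toy values, the entry whose backoff estimate is already close to its explicit probability falls below the threshold and is dropped, while the other entry is kept.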
2 Assignments
0 Petitions
Abstract
Cluster- and pruning-based language model compression is disclosed. In one embodiment, a language model is first clustered, such as by using predictive clustering. The language model after clustering has a larger size than it did before clustering. The language model is then pruned, such as by using entropy-based techniques, such as Stolcke pruning, or by using Rosenfeld pruning or count-cutoff techniques. In one particular embodiment, a word language model is first predictively clustered by a technique described as P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides.
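The factorization P(Z|xy)×P(z|xyZ) can be illustrated with a minimal sketch. It assumes hard clustering (each word belongs to exactly one cluster) and unsmoothed maximum-likelihood estimates, neither of which is stated in the abstract. The cluster map, the toy corpus, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical hard word-to-cluster map and toy corpus.
CLUSTER = {
    "monday": "DAY", "tuesday": "DAY",
    "party": "EVENT", "meeting": "EVENT",
    "on": "FUNC",
}
CORPUS = (
    "party on monday meeting on tuesday party on tuesday "
    "meeting on monday party on monday"
).split()

def collect_counts(tokens, cluster):
    """Count (x, y, z), (x, y, Z) and (x, y) events over all trigrams."""
    c_xyz, c_xyZ, c_xy = Counter(), Counter(), Counter()
    for x, y, z in zip(tokens, tokens[1:], tokens[2:]):
        c_xyz[(x, y, z)] += 1
        c_xyZ[(x, y, cluster[z])] += 1
        c_xy[(x, y)] += 1
    return c_xyz, c_xyZ, c_xy

def clustered_prob(x, y, z, counts, cluster):
    """Return P(Z|xy) * P(z|xyZ), where Z is the cluster of word z."""
    c_xyz, c_xyZ, c_xy = counts
    Z = cluster[z]
    if c_xyZ[(x, y, Z)] == 0:
        return 0.0  # unseen event; a real model would back off instead
    p_cluster = c_xyZ[(x, y, Z)] / c_xy[(x, y)]      # P(Z | x y)
    p_word = c_xyz[(x, y, z)] / c_xyZ[(x, y, Z)]     # P(z | x y Z)
    return p_cluster * p_word

def parameter_counts(counts):
    """Return (entries in a plain trigram table, entries in both clustered tables)."""
    c_xyz, c_xyZ, _ = counts
    return len(c_xyz), len(c_xyz) + len(c_xyZ)

COUNTS = collect_counts(CORPUS, CLUSTER)
P_MONDAY = clustered_prob("party", "on", "monday", COUNTS, CLUSTER)
UNCLUSTERED_SIZE, CLUSTERED_SIZE = parameter_counts(COUNTS)
```

Under these assumptions the product reduces to C(xyz)/C(xy), the same value as the plain trigram estimate, while the clustered form stores two tables rather than one. That is consistent with the abstract's statement that the model is larger after clustering and before pruning.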
39 Citations
38 Claims
1. A computer-implemented method for compressing a word language model comprising:
- predictive clustering the word language model, such that the model after clustering has a larger size than before clustering and the word language model provides parameter values that describe the probabilities in a function P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides; and
- pruning the word language model as clustered utilizing an entropy-based pruning technique.
View Dependent Claims (2)
3. A computer-implemented method for compressing a language model comprising:
- clustering the language model such that the language model provides parameter values that describe the probabilities in a function P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides, such that the language model after clustering has a larger size than before clustering; and
- pruning the language model.
View Dependent Claims (4, 5, 6, 7, 8, 9, 10, 11)
12. A machine-readable medium having stored thereon data representing a language model compressed by performance of a method comprising:
- clustering the language model utilizing a technique such that the language model provides parameter values that describe the probabilities in a function P(Z|XY)×P(z|XYZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides, such that the language model after clustering has a larger size than before clustering; and
- pruning the language model.
View Dependent Claims (13, 14, 15, 16, 17, 18, 19)
20. A machine-readable medium having data stored thereon representing a computer program designed to translate raw data into recognized data utilizing a language model, the language model compressed via performance of a method comprising:
- clustering the language model utilizing a technique such that the language model provides parameter values that describe the probabilities in a function P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides, such that the language model after clustering has a larger size than before clustering; and
- pruning the language model.
View Dependent Claims (21, 22, 23, 24, 25, 26, 27, 28, 29)
30. A computerized system comprising:
- an input device generating raw data;
- a language model compressed via being first clustered utilizing a technique such that the language model provides parameter values that describe the probabilities in a function P(Z|XY)×P(z|XYZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides, such that the model after being clustered has a larger size than before being clustered, and then pruned; and
- a computer program designed to translate the raw data into recognized data utilizing the language model.
View Dependent Claims (31, 32, 33, 34, 35, 36, 37, 38)
Specification