Apparatus for generating a statistical sequence model called a class bi-multigram model with bigram dependencies assumed between adjacent sequences
Abstract
An apparatus generates a statistical class sequence model called a class bi-multigram model from input training strings of discrete-valued units, where bigram dependencies are assumed between adjacent variable length sequences of maximum length N units, and where class labels are assigned to the sequences. The number of times all sequences of units occur is counted, as well as the number of times all pairs of sequences of units co-occur in the input training strings. An initial bigram probability distribution of all the pairs of sequences is computed as the number of times the two sequences co-occur, divided by the number of times the first sequence occurs in the input training string. Then, the input sequences are classified into a pre-specified desired number of classes. Further, an estimate of the bigram probability distribution of the sequences is calculated by using an EM algorithm to maximize the likelihood of the input training string computed with the input probability distributions. The above processes are then iteratively performed to generate a statistical class sequence model.
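As a concrete illustration of the initialization step described in the abstract (registering all sequences of 1 to N units, counting occurrences and adjacent co-occurrences, and forming the initial bigram distribution as a ratio of counts), the following Python sketch may help. It is an illustrative reconstruction under assumptions, not the patented implementation; the function and variable names are hypothetical.

```python
from collections import defaultdict

def initialize(units, N):
    """Illustrative sketch of the initialization means: register all
    sequences of 1..N units, count occurrences and adjacent
    co-occurrences, and form the initial bigram distribution."""
    count = defaultdict(int)   # c(s): number of times each sequence occurs
    cooc = defaultdict(int)    # c(s1, s2): adjacent co-occurrence counts
    T = len(units)
    for i in range(T):
        for n in range(1, N + 1):
            if i + n > T:
                break
            s1 = tuple(units[i:i + n])
            count[s1] += 1
            # pair s1 with every sequence of 1..N units starting right after it
            for m in range(1, N + 1):
                if i + n + m > T:
                    break
                s2 = tuple(units[i + n:i + n + m])
                cooc[(s1, s2)] += 1
    # initial estimate: p(s2 | s1) = c(s1, s2) / c(s1)
    bigram = {(s1, s2): c / count[s1] for (s1, s2), c in cooc.items()}
    return count, bigram
```

With `units = list("abab")` and `N = 2`, for example, the sequence `('a',)` occurs twice and is always followed by `('b',)`, so the initial estimate of p(('b',) | ('a',)) is 1.0.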
32 Claims
1. An apparatus for generating a statistical class sequence model called a class bi-multigram model from input strings of discrete-valued units, where bigram dependencies are assumed between adjacent multigrams, where each said multigram is a variable length sequence of maximum length N units, and where class labels are assigned to the sequences, said apparatus comprising:
initialization means for taking as an input a training string of units, registering in an inventory of sequences all the combinations of 1 to N units occurring in the input training string, counting the number of times all sequences of units occur and the number of times all pairs of sequences of units co-occur in the input training strings of units, computing an initial bigram probability distribution of all the pairs of sequences as the counted number of times the two sequences co-occur divided by the counted number of times the first sequence occurs in the input training string, and outputting the inventory of sequences and the initial bigram probability distribution of the sequences in the inventory;
classification means for taking as an input the inventory of sequences and the bigram probability distribution of the sequences in the inventory, classifying the input sequences into a pre-specified desired number of classes, by first assigning each sequence to its own class, and then repeatedly updating a class conditional probability distribution of the sequences and the bigram probability distribution of the classes and merging the pairs of classes for which a loss in mutual information computed with the current class probability distributions is minimal, until a desired number of classes is obtained, and outputting the inventory of sequences with the class label assigned to each sequence, the class conditional probability distribution of the sequences, and the bigram probability distribution of the classes;
reestimation means for taking as an input the training string of units, the inventory of sequences with the class label assigned to each sequence, the current class conditional probability distribution of the sequences, and the current bigram probability distribution of the classes which are outputted from said classification means, calculating an estimate of the bigram probability distribution of the sequences by using an EM algorithm to maximize the likelihood of the input training string computed with associated input probability distributions, and outputting the inventory of sequences with the bigram probability distribution of the sequences, the process of said reestimation means being performed with a forward-backward algorithm, by using an equation where a bigram probability between a sequence to be processed and a preceding sequence is calculated from a forward likelihood of the input training string which can be taken forward in a time series, the class conditional probability of the sequences to be processed, the probability of the class of the sequence to be processed knowing the class of the preceding sequence, and a backward likelihood of the input training string which can be taken backward in the time series; and
control means for controlling said classification means and said reestimation means to iteratively execute the process of said classification means and said reestimation means, the input of said classification means being, at a first iteration, the output of said initialization means, and, during subsequent iterations, the output of said reestimation means, and the input of said reestimation means being the output of said classification means, until a predetermined ending condition is satisfied, thereby generating a statistical class sequence model.
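The classification means of claim 1 starts with one class per sequence and repeatedly merges the pair of classes whose merge loses the least mutual information. The following sketch is a simplified greedy reconstruction of such a Brown-style procedure, not the patented implementation; it operates on raw class-pair counts, and the function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def mutual_info(pair_counts):
    """Mutual information of the class bigram distribution:
    I = sum over (c1, c2) of p(c1, c2) * log(p(c1, c2) / (p(c1) p(c2)))."""
    total = sum(pair_counts.values())
    left, right = defaultdict(int), defaultdict(int)
    for (c1, c2), n in pair_counts.items():
        left[c1] += n
        right[c2] += n
    return sum((n / total) * math.log(n * total / (left[c1] * right[c2]))
               for (c1, c2), n in pair_counts.items())

def cluster(pair_counts, n_classes):
    """Assign each sequence to its own class, then greedily merge the
    pair of classes whose merge keeps the mutual information highest,
    i.e. whose loss in mutual information is minimal."""
    classes = {c for pair in pair_counts for c in pair}
    label = {c: c for c in classes}   # class label assigned to each sequence
    while len(classes) > n_classes:
        best = None
        for a, b in combinations(sorted(classes), 2):
            # recompute the class-pair counts as if b were merged into a
            merged = defaultdict(int)
            for (c1, c2), n in pair_counts.items():
                merged[(a if c1 == b else c1, a if c2 == b else c2)] += n
            mi = mutual_info(merged)
            if best is None or mi > best[0]:
                best = (mi, a, b, dict(merged))
        _, a, b, pair_counts = best
        classes.discard(b)
        label = {s: (a if c == b else c) for s, c in label.items()}
    return label
```

For instance, if sequences 'x' and 'y' always precede 'z' with identical counts, merging them loses no predictive mutual information, so they end up in the same class first.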
2. The apparatus as claimed in claim 1,
wherein said initialization means withdraws from the inventory of registered sequences, the sequences occurring a number of times which is less than a pre-specified number of times in the input training string of units.
3. The apparatus as claimed in claim 2,
wherein the ending condition is that iterations, each including the process of said classification means and the process of said reestimation means, have been performed a pre-specified number of times.
4. The apparatus as claimed in claim 2,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
5. The apparatus as claimed in claim 4,
wherein said equation is an equation for calculating the bigram probability between two sequences of units including first and second sequences, where the first sequence of units is followed by the second sequence of units which is a sequence of units to be processed, for each sequence of units to be processed in the input training string of units; and
wherein the bigram probability between two sequences of units is obtained by dividing the sum of the likelihoods of all the segmentations containing the first and the second sequences of units, by the sum of the likelihoods of all the segmentations containing the first sequence of units.
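In conventional notation (assumed here, not taken from the patent itself), with S ranging over segmentations of the training string and L(S) denoting the likelihood of a segmentation, the ratio stated in this claim can be written as:

```latex
p(s_2 \mid s_1) \;=\;
\frac{\displaystyle\sum_{S \,:\, s_1 s_2 \in S} L(S)}
     {\displaystyle\sum_{S \,:\, s_1 \in S} L(S)}
```

where the condition $s_1 s_2 \in S$ means that segmentation $S$ contains the sequence $s_1$ immediately followed by the sequence $s_2$.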
6. The apparatus as claimed in claim 5,
wherein said equation has a denominator representing the average number of occurrences of each sequence of units in the input training string of units, and a numerator representing the average number of co-occurrences of each pair of sequences of units where the first sequence of units is followed by the second sequence of units in the input training strings of units, wherein said numerator is the sum of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units; and
wherein said denominator is the sum for all the sequences in the inventory of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence, the probability of the class of the sequence conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units.
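A sketch of the forward-backward form described in this claim, using assumed symbols ($\alpha_t$ and $\beta_t$ for the forward and backward likelihoods of the training string around position $t$, $C(s)$ for the class of sequence $s$; the patent's own equation and indexing may differ):

```latex
p(s_2 \mid s_1) \;=\;
\frac{\displaystyle\sum_{t} \alpha_t(s_1)\, P\big(s_2 \mid C(s_2)\big)\, P\big(C(s_2) \mid C(s_1)\big)\, \beta_t(s_2)}
     {\displaystyle\sum_{s}\sum_{t} \alpha_t(s_1)\, P\big(s \mid C(s)\big)\, P\big(C(s) \mid C(s_1)\big)\, \beta_t(s)}
```

The numerator accumulates the expected co-occurrences of $s_1$ followed by $s_2$; the denominator sums the same product over all sequences $s$ in the inventory, giving the expected number of occurrences of $s_1$ as a left context.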
7. The apparatus as claimed in claim 4,
wherein the ending condition is that iterations, each including the process of said classification means and the process of said reestimation means, have been performed a pre-specified number of times.
8. The apparatus as claimed in claim 2,
wherein said equation is an equation for calculating the bigram probability between two sequences of units including first and second sequences, where the first sequence of units is followed by the second sequence of units which is a sequence of units to be processed, for each sequence of units to be processed in the input training string of units; and
wherein the bigram probability between two sequences of units is obtained by dividing the sum of the likelihoods of all the segmentations containing the first and the second sequences of units, by the sum of the likelihoods of all the segmentations containing the first sequence of units.
9. The apparatus as claimed in claim 8,
wherein said equation has a denominator representing the average number of occurrences of each sequence of units in the input training string of units, and a numerator representing the average number of co-occurrences of each pair of sequences of units where the first sequence of units is followed by the second sequence of units in the input training strings of units, wherein said numerator is the sum of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units; and
wherein said denominator is the sum for all the sequences in the inventory of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence, the probability of the class of the sequence conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units.
10. The apparatus as claimed in claim 1,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
11. The apparatus as claimed in claim 10,
wherein said equation is an equation for calculating the bigram probability between two sequences of units including first and second sequences, where the first sequence of units is followed by the second sequence of units which is a sequence of units to be processed, for each sequence of units to be processed in the input training string of units; and
wherein the bigram probability between two sequences of units is obtained by dividing the sum of the likelihoods of all the segmentations containing the first and the second sequences of units, by the sum of the likelihoods of all the segmentations containing the first sequence of units.
12. The apparatus as claimed in claim 11,
wherein said equation has a denominator representing the average number of occurrences of each sequence of units in the input training string of units, and a numerator representing the average number of co-occurrences of each pair of sequences of units where the first sequence of units is followed by the second sequence of units in the input training strings of units, wherein said numerator is the sum of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units; and
wherein said denominator is the sum for all the sequences in the inventory of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence, the probability of the class of the sequence conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units.
13. The apparatus as claimed in claim 10,
wherein the ending condition is that iterations, each including the process of said classification means and the process of said reestimation means, have been performed a pre-specified number of times.
14. The apparatus as claimed in claim 1,
wherein said equation is an equation for calculating the bigram probability between two sequences of units including first and second sequences, where the first sequence of units is followed by the second sequence of units which is a sequence of units to be processed, for each sequence of units to be processed in the input training string of units; and
wherein the bigram probability between two sequences of units is obtained by dividing the sum of the likelihoods of all the segmentations containing the first and the second sequences of units, by the sum of the likelihoods of all the segmentations containing the first sequence of units.
15. The apparatus as claimed in claim 14,
wherein said equation has a denominator representing the average number of occurrences of each sequence of units in the input training string of units, and a numerator representing the average number of co-occurrences of each pair of sequences of units where the first sequence of units is followed by the second sequence of units in the input training strings of units, wherein said numerator is the sum of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units; and
wherein said denominator is the sum for all the sequences in the inventory of the products of the forward likelihood of the input training string of units, the class conditional probability of the sequence, the probability of the class of the sequence conditioned by the class of the sequence preceding the sequence of units to be processed, and the backward likelihood of the input training string of units.
16. The apparatus as claimed in claim 1,
wherein the ending condition is that iterations, each including the process of said classification means and the process of said reestimation means, have been performed a pre-specified number of times.
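The control structure shared by the independent claims (initialize once, then alternate the classification and reestimation means, feeding each step's output to the next, until an ending condition such as a fixed iteration count is met) can be sketched as follows. The callables `initialize`, `classify`, and `reestimate` are hypothetical stand-ins for the corresponding means described above, not functions from the patent.

```python
def train(units, N, n_classes, n_iters, initialize, classify, reestimate):
    """Control means (sketch): at the first iteration the classification
    means consumes the initialization output; at subsequent iterations it
    consumes the reestimation output; the reestimation means always
    consumes the classification output."""
    inventory, bigram = initialize(units, N)      # initialization means, run once
    labels = None
    for _ in range(n_iters):                      # ending condition: fixed iteration count
        labels, class_cond, class_bigram = classify(inventory, bigram, n_classes)
        bigram = reestimate(units, inventory, labels, class_cond, class_bigram)
    return inventory, bigram, labels
```

Passing the means in as callables keeps the control loop itself independent of any particular clustering or EM implementation.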
17. An apparatus for generating a statistical language model, said apparatus comprising:
an apparatus for generating a statistical class sequence model called a class bi-multigram model from input strings of discrete-valued units, where bigram dependencies are assumed between adjacent variable length sequences of maximum length N units, and where class labels are assigned to the sequences, wherein said apparatus for generating a statistical class sequence model comprises:
initialization means for taking as an input a training string of units, registering in an inventory of sequences all the combinations of 1 to N units occurring in the input training string, counting the number of times all sequences of units occur and the number of times all pairs of sequences of units co-occur in the input training string, computing an initial bigram probability distribution of all the pairs of sequences as the counted number of times the two sequences co-occur divided by the counted number of times the first sequence occurs in the input training string, and outputting the inventory of sequences and the initial bigram probability distribution of the sequences in the inventory;
classification means for taking as an input the inventory of sequences and the bigram probability distribution of the sequences in the inventory, classifying the input sequences into a pre-specified desired number of classes, by first assigning each sequence to its own class, and then repeatedly updating a class conditional probability distribution of the sequences and the bigram probability distribution of the classes and merging the pairs of classes for which a loss in mutual information computed with the current class probability distributions is minimal, until a desired number of classes is obtained, and outputting the inventory of sequences with the class label assigned to each sequence, the class conditional probability distribution of the sequences, and the bigram probability distribution of the classes;
reestimation means for taking as an input the training string, the inventory of sequences with the class label assigned to each sequence, the current class conditional probability distribution of the sequences, and a current bigram probability distribution of the classes which are outputted from said classification means, calculating an estimate of the bigram probability distribution of the sequences by using an EM algorithm to maximize the likelihood of the input training string computed with the input probability distributions, and outputting the inventory of sequences with the bigram probability distribution of the sequences, the process of said reestimation means being performed with a forward-backward algorithm, by using an equation where a bigram probability between a sequence to be processed and a preceding sequence is calculated from a forward likelihood of the input training string which can be taken forward in a time series, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed knowing the class of the preceding sequence, and the backward likelihood of the input training string which can be taken backward in the time series; and
control means for controlling said classification means and said reestimation means to iteratively execute the process of said classification means and said reestimation means, the input of said classification means being, at a first iteration, the output of said initialization means, and, during subsequent iterations, the output of said reestimation means, and the input of said reestimation means being the output of said classification means, until a predetermined ending condition is satisfied, thereby generating a statistical class sequence model, wherein each of said units in said input training string is a letter of an alphabet of a natural language, wherein each of said sequences is a morpheme or a word, wherein said classification means classifies sequences of letters into a pre-specified number of classes of sequences of letters, and wherein said statistical sequence model is a statistical language model.
18. The apparatus as claimed in claim 17,
wherein said initialization means withdraws from the inventory of registered sequences, the sequences occurring a number of times which is less than a pre-specified number of times in the input training string of units.
19. The apparatus as claimed in claim 18,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
20. The apparatus as claimed in claim 17,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
21. An apparatus for generating a statistical language model, said apparatus comprising:
an apparatus for generating a statistical class sequence model called a class bi-multigram model from input strings of discrete-valued units, where bigram dependencies are assumed between adjacent variable length sequences of maximum length N units, and where class labels are assigned to the sequences, wherein said apparatus for generating a statistical class sequence model comprises:
initialization means for taking as an input a training string of units, registering in an inventory of sequences all the combinations of 1 to N units occurring in the input training string, counting the number of times all sequences of units occur and the number of times all pairs of sequences of units co-occur in the input training strings, computing an initial bigram probability distribution of all the pairs of sequences as the counted number of times the two sequences co-occur divided by the counted number of times the first sequence occurs in the input training string, and outputting the inventory of sequences and the initial bigram probability distribution of the sequences in the inventory;
classification means for taking as an input the inventory of sequences and the bigram probability distribution of the sequences in the inventory, classifying the input sequences into a pre-specified desired number of classes, by first assigning each sequence to its own class, and then repeatedly updating the class conditional probability distribution of the sequences and the bigram probability distribution of the classes and merging the pairs of classes for which a loss in mutual information computed with the current class probability distributions is minimal, until a desired number of classes is obtained, and outputting the inventory of sequences with the class label assigned to each sequence, the class conditional probability distribution of the sequences, and the bigram probability distribution of the classes;
reestimation means for taking as an input the training string of units, the inventory of sequences with the class label assigned to each sequence, a current class conditional probability distribution of the sequences, and a current bigram probability distribution of the classes which are outputted from said classification means, calculating an estimate of the bigram probability distribution of the sequences by using an EM algorithm to maximize a likelihood of the input training string computed with the associated input probability distributions, and outputting the inventory of sequences with the bigram probability distribution of the sequences, the process of said reestimation means being performed with a forward-backward algorithm, by using an equation where a bigram probability between a sequence to be processed and a preceding sequence is calculated from a forward likelihood of the input training string which can be taken forward in a time series, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed knowing the class of the preceding sequence, and the backward likelihood of the input training string which can be taken backward in the time series; and
control means for controlling said classification means and said reestimation means to iteratively execute the process of said classification means and said reestimation means, the input of said classification means being, at a first iteration, the output of said initialization means, and, during subsequent iterations, the output of said reestimation means, and the input of said reestimation means being the output of said classification means, until a predetermined ending condition is satisfied, thereby generating a statistical class sequence model, wherein each of said units in said input training string is a word of a natural language, wherein each of said sequences is a phrase, wherein said classification means classifies sequences of words into a pre-specified number of classes of sequences of words, and wherein said statistical sequence model is a statistical language model.
22. The apparatus as claimed in claim 21,
wherein said initialization means withdraws from the inventory of registered sequences, the sequences occurring a number of times which is less than a pre-specified number of times in the input training string of units.
23. The apparatus as claimed in claim 22,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
24. The apparatus as claimed in claim 21,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
25. A speech recognition apparatus comprising speech recognition means for recognizing speech by using a predetermined statistical language model based on an input speech utterance, said apparatus comprising an apparatus for generating a statistical class sequence model called a class bi-multigram model from input strings of discrete-valued units, where bigram dependencies are assumed between adjacent variable length sequences of maximum length N units, and where class labels are assigned to the sequences,
wherein said apparatus for generating a statistical class sequence model comprises:
initialization means for taking as an input a training string of units, registering in an inventory of sequences all the combinations of 1 to N units occurring in the input training string, counting the number of times all sequences of units occur and the number of times all pairs of sequences of units co-occur in the input training strings, computing an initial bigram probability distribution of all the pairs of sequences as the counted number of times the two sequences co-occur divided by the counted number of times the first sequence occurs in the input training string, and outputting the inventory of sequences and the initial bigram probability distribution of the sequences in the inventory;
classification means for taking as an input the inventory of sequences and the bigram probability distribution of the sequences in the inventory, classifying the input sequences into a pre-specified desired number of classes, by first assigning each sequence to its own class, and then repeatedly updating a class conditional probability distribution of the sequences and the bigram probability distribution of the classes and merging the pairs of classes for which a loss in mutual information computed with the current class probability distributions is minimal, until a desired number of classes is obtained, and outputting the inventory of sequences with the class label assigned to each sequence, the class conditional probability distribution of the sequences, and the bigram probability distribution of the classes;
reestimation means for taking as an input the training string of units, the inventory of sequences with the class label assigned to each sequence, a current class conditional probability distribution of the sequences, and a current bigram probability distribution of the classes which are outputted from said classification means, calculating an estimate of the bigram probability distribution of the sequences by using an EM algorithm to maximize a likelihood of the input training string computed with the input probability distributions, and outputting the inventory of sequences with the bigram probability distribution of the sequences, the process of said reestimation means being performed with a forward-backward algorithm, by using an equation where a bigram probability between a sequence to be processed and a preceding sequence is calculated from a forward likelihood of the input training string which can be taken forward in a time series, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed knowing the class of the preceding sequence, and a backward likelihood of the input training string which can be taken backward in the time series; and
control means for controlling said classification means and said reestimation means to iteratively execute the process of said classification means and said reestimation means, the input of said classification means being, at the first iteration, the output of said initialization means, and, during subsequent iterations, the output of said reestimation means, and the input of said reestimation means being the output of said classification means, until a predetermined ending condition is satisfied, thereby generating a statistical class sequence model, wherein each of said units in said input training string is a letter of an alphabet of a natural language, wherein each of said sequences is a morpheme or a word, wherein said classification means classifies sequences of letters into a pre-specified number of classes of sequences of letters, wherein said statistical sequence model is a statistical language model, and wherein said speech recognition means recognizes speech with reference to the statistical language model generated by said apparatus for generating the statistical language model based on the input speech utterance, and outputting a speech recognition result.
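The counting and normalization performed by the initialization means can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the use of Python tuples as sequence keys, and the definition of co-occurrence as "any sequence of 1 to N units starting immediately after an occurrence of the first sequence" are all assumptions made for the sketch.

```python
from collections import defaultdict

def initialize(units, max_len):
    """Register every sequence of 1..max_len units occurring in the
    training string, count occurrences and adjacent co-occurrences,
    and compute the initial bigram distribution
    p(s2 | s1) = count(s1 followed by s2) / count(s1)."""
    T = len(units)
    # inventory: all combinations of 1..max_len consecutive units
    inventory = set()
    for i in range(T):
        for n in range(1, max_len + 1):
            if i + n <= T:
                inventory.add(tuple(units[i:i + n]))
    # occurrence and co-occurrence counts
    count = defaultdict(int)
    cooc = defaultdict(int)
    for s in inventory:
        n = len(s)
        for i in range(T - n + 1):
            if tuple(units[i:i + n]) == s:
                count[s] += 1
                # every sequence of 1..max_len units starting right after s
                for m in range(1, max_len + 1):
                    j = i + n
                    if j + m <= T:
                        cooc[(s, tuple(units[j:j + m]))] += 1
    # initial bigram distribution: co-occurrence count / occurrence count
    bigram = {(s1, s2): c / count[s1] for (s1, s2), c in cooc.items()}
    return inventory, bigram
```

On the string "abab" with N = 2, for instance, the sequence ('a',) occurs twice and is followed by ('b',) both times, so the initial estimate of p(('b',) | ('a',)) is 1.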
26. The apparatus as claimed in claim 25, wherein said initialization means withdraws from the inventory of registered sequences the sequences occurring fewer than a pre-specified number of times in the input training string of units.
27. The apparatus as claimed in claim 26,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
28. The apparatus as claimed in claim 25,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
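Claims 27 and 28 name the Brown algorithm, whose core criterion matches the classification means recited above: start with one class per item and repeatedly merge the pair of classes whose merge loses the least average mutual information. The sketch below is a brute-force toy version under stated assumptions; the actual Brown algorithm uses incremental count updates and restricted candidate sets for efficiency, and all identifiers here are illustrative.

```python
import math
from collections import defaultdict

def avg_mutual_information(pair_counts, cls):
    """Average mutual information of the class bigram distribution
    induced by mapping each item through the class assignment `cls`."""
    total = sum(pair_counts.values())
    joint = defaultdict(float)
    left = defaultdict(float)
    right = defaultdict(float)
    for (a, b), c in pair_counts.items():
        p = c / total
        joint[(cls[a], cls[b])] += p
        left[cls[a]] += p
        right[cls[b]] += p
    return sum(p * math.log(p / (left[c1] * right[c2]))
               for (c1, c2), p in joint.items())

def greedy_class_merge(pair_counts, items, num_classes):
    """Agglomerative clustering: each item starts in its own class;
    merge the pair of classes minimizing the mutual-information loss
    until num_classes classes remain."""
    cls = {w: i for i, w in enumerate(items)}
    while len(set(cls.values())) > num_classes:
        classes = sorted(set(cls.values()))
        best = None
        for i, c1 in enumerate(classes):
            for c2 in classes[i + 1:]:
                # tentatively merge class c2 into c1 and score the result
                trial = {w: (c1 if c == c2 else c) for w, c in cls.items()}
                mi = avg_mutual_information(pair_counts, trial)
                if best is None or mi > best[0]:
                    best = (mi, trial)
        cls = best[1]
    return cls
```

With co-occurrence counts in which 'a' and 'b' both precede 'x' (and 'x' precedes either), merging 'a' and 'b' preserves the most mutual information, so they end up in the same class.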
29. A speech recognition apparatus comprising speech recognition means for recognizing speech by using a predetermined statistical language model based on an input speech utterance, said apparatus comprising an apparatus for generating a statistical class sequence model called a class bi-multigram model from input strings of discrete-valued units, where bigram dependencies are assumed between adjacent variable length sequences of maximum length N units, and where class labels are assigned to the sequences,
wherein said apparatus for generating a statistical class sequence model comprises:
initialization means for taking as an input a training string of units, registering in an inventory of sequences all the combinations of 1 to N units occurring in the input training string, counting the number of times all sequences of units occur and the number of times all pairs of sequences of units co-occur in the input training strings, computing an initial bigram probability distribution of all the pairs of sequences as the counted number of times the two sequences co-occur divided by the counted number of times the first sequence occurs in the input training string, and outputting the inventory of sequences and the initial bigram probability distribution of the sequences in the inventory;
classification means for taking as an input the inventory of sequences and a bigram probability distribution of the sequences in the inventory, classifying the input sequences into a pre-specified desired number of classes, by first assigning each sequence to its own class, and then repeatedly updating a class conditional probability distribution of the sequences and a bigram probability distribution of the classes and merging the pairs of classes for which a loss in mutual information computed with current class probability distributions is minimal, until a desired number of classes is obtained, and outputting the inventory of sequences with the class label assigned to each sequence, the class conditional probability distribution of the sequences, and the bigram probability distribution of the classes;
reestimation means for taking as an input the training string, the inventory of sequences with the class label assigned to each sequence, the current class conditional probability distribution of the sequences, and the current bigram probability distribution of the classes which are outputted from said classification means, calculating an estimate of the bigram probability distribution of the sequences by using an EM algorithm to maximize a likelihood of the input training string computed with the input probability distributions, and outputting the inventory of sequences with the bigram probability distribution of the sequences, the process of said reestimation means being performed with a forward-backward algorithm, by using an equation where a bigram probability between a sequence to be processed and a preceding sequence is calculated from a forward likelihood of the input training string which can be taken forward in a time series, the class conditional probability of the sequence to be processed, the probability of the class of the sequence to be processed knowing the class of the preceding sequence, and a backward likelihood of the input training string which can be taken backward in the time series; and
control means for controlling said classification means and said reestimation means to iteratively execute the process of said classification means and said reestimation means, the input of said classification means being, at the first iteration, the output of said initialization means, and, during subsequent iterations, the output of said reestimation means, and the input of said reestimation means being the output of said classification means, until a predetermined ending condition is satisfied, thereby generating a statistical class sequence model, wherein each of said units in said input training string is a word of a natural language, wherein each of said sequences is a phrase, wherein said classification means classifies sequences of words into a pre-specified number of classes of phrases, wherein said statistical sequence model is a statistical language model, and wherein said speech recognition means recognizes speech with reference to the statistical language model generated by said apparatus for generating the statistical language model based on the input speech utterance, and outputting a speech recognition result.
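The reestimation equation recited in the claims computes a bigram probability between a sequence and its predecessor from a forward likelihood, the bigram probability of the pair, and a backward likelihood. The sketch below shows one EM step of this forward-backward scheme for the simpler non-class case, an assumption made to keep the example short: the class decomposition p(s | c) * p(c | c') is collapsed into a single sequence-bigram probability p(s | s'), and the sentence-start token '<s>' is an illustrative convention, not part of the claims.

```python
from collections import defaultdict

def em_step(units, bigram, max_len, bos='<s>'):
    """One forward-backward EM reestimation step for a bi-multigram
    model: alpha[t][s] is the likelihood of units[:t] with last
    sequence s; beta[t][s] the likelihood of units[t:] given
    preceding sequence s; expected pair counts combine the two
    with the current bigram probability, then are renormalized."""
    T = len(units)
    # forward pass over all segmentations into sequences of <= max_len units
    alpha = [defaultdict(float) for _ in range(T + 1)]
    alpha[0][bos] = 1.0
    for t in range(T):
        for s_prev, a in alpha[t].items():
            for m in range(1, max_len + 1):
                if t + m <= T:
                    s = tuple(units[t:t + m])
                    p = bigram.get((s_prev, s), 0.0)
                    if p > 0.0:
                        alpha[t + m][s] += a * p
    # backward pass (beta at position T is 1 for any conditioning sequence)
    beta = [dict() for _ in range(T + 1)]
    for t in range(T - 1, -1, -1):
        for s_prev in alpha[t]:
            total = 0.0
            for m in range(1, max_len + 1):
                if t + m <= T:
                    s = tuple(units[t:t + m])
                    p = bigram.get((s_prev, s), 0.0)
                    if p > 0.0:
                        total += p * beta[t + m].get(s, 1.0 if t + m == T else 0.0)
            beta[t][s_prev] = total
    like = beta[0].get(bos, 0.0)  # likelihood of the whole training string
    # E-step: expected co-occurrence counts = alpha * bigram * beta / likelihood
    counts = defaultdict(float)
    for t in range(T):
        for s_prev, a in alpha[t].items():
            for m in range(1, max_len + 1):
                if t + m <= T:
                    s = tuple(units[t:t + m])
                    p = bigram.get((s_prev, s), 0.0)
                    if p > 0.0:
                        b = beta[t + m].get(s, 1.0 if t + m == T else 0.0)
                        counts[(s_prev, s)] += a * p * b / like
    # M-step: renormalize counts into the reestimated bigram distribution
    left = defaultdict(float)
    for (s1, _), c in counts.items():
        left[s1] += c
    return {(s1, s2): c / left[s1] for (s1, s2), c in counts.items()}
```

Iterating this step maximizes the likelihood of the training string over both the segmentation into sequences and the bigram parameters, which is the role the claims assign to the reestimation means.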
30. The apparatus as claimed in claim 29, wherein said initialization means withdraws from the inventory of registered sequences the sequences occurring fewer than a pre-specified number of times in the input training string of units.
31. The apparatus as claimed in claim 30,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
32. The apparatus as claimed in claim 29,
wherein said classification means classifies the sequences into a pre-specified number of classes by applying the Brown algorithm to an input bigram probability distribution of the sequences computed by said initialization means at the first iteration, and by said reestimation means during the subsequent iterations.
Specification