Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity
First Claim
1. A word clustering apparatus for clustering a plurality of words and obtaining a total tree diagram representing a word clustering result, said total tree diagram including tree diagrams of an upper layer, a middle layer and a lower layer, said word clustering apparatus comprising:
first storage means for storing class words within one window;
second storage means for storing a plurality of c classes of the middle layer;
third storage means for storing the tree diagram of the upper layer;
fourth storage means for storing the tree diagram of the lower layer;
fifth storage means for storing the total tree diagram;
first control means for detecting an appearance frequency of a plurality of v words which are different from one another in text data including a plurality of words, for arranging the v words in a descending order of appearance frequency, and for assigning the v words to a plurality of v classes;
second control means for storing as class words within one window into said first storage means, words of (c+1) classes having a high appearance frequency, the number (c+1) smaller than v, among the v words of the plurality of v classes assigned by said first control means;
third control means, in response to the class words within one window stored in said first storage means, for clustering the class words within one window into a plurality of c classes in a predetermined binary tree form, so that a predetermined average mutual information is maximized, the average mutual information representing a relative frequency rate of a probability when words of a first class and words of a second class which are different from each other appear adjacent to each other, with respect to a product of the appearance frequency of the words of the first class and the appearance frequency of the words of the second class, and for storing the plurality of clustered c classes as a plurality of c classes of the middle layer, into said second storage means;
fourth control means, in response to the plurality of c classes of the middle layer stored in said second storage means, for clustering the words of the plurality of clustered c classes of the middle layer in a binary tree form until the words of the middle layer are clustered into one class so that the average mutual information is maximized, and for storing a result of the clustering as the tree diagram of the upper layer into said third storage means;
fifth control means, in response to the plurality of words in each of the plurality of c classes of the middle layer stored in said second storage means, for clustering the plurality of words of each class of the middle layer into a binary tree form until the words of each class of the middle layer are clustered into one class, for every class of the plurality of c classes of the middle layer stored in said second storage means so that the average mutual information is maximized, and for storing a result of the clustering for each class as the tree diagram of the lower layer into said fourth storage means; and
sixth control means for obtaining the total tree diagram including the tree diagrams of the upper layer, the middle layer and the lower layer by connecting the tree diagram of the lower layer stored in said fourth storage means to the plurality of c classes of the middle layer stored in said second storage means, and connecting the tree diagram of the upper layer stored in said third storage means to the plurality of c classes of the middle layer stored in said second storage means, and for storing a resulting total tree diagram as a word clustering result into said fifth storage means.
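The first control means and the average mutual information criterion can be illustrated with a short sketch. This is not the patented implementation. It assumes the text data is already tokenized, that "adjacent" means consecutive tokens, and that the average mutual information takes the usual class-bigram form, the sum over class pairs of p(a, b) * log2(p(a, b) / (p(a) * p(b))). The function names `initial_classes` and `average_mutual_information` are hypothetical.

```python
import math
from collections import Counter


def initial_classes(tokens):
    """Count word frequencies and give each distinct word its own class.

    Words are ranked in descending order of frequency (ties broken
    alphabetically, an assumption made only for determinism), and the
    rank is used as the class id, so v distinct words yield v classes.
    """
    freq = Counter(tokens)
    ordered = sorted(freq, key=lambda w: (-freq[w], w))
    return {word: rank for rank, word in enumerate(ordered)}


def average_mutual_information(tokens, word_to_class):
    """Average mutual information over adjacent class pairs.

    For each pair of classes (a, b) seen on consecutive tokens, compare the
    pair probability with the product of the two marginal probabilities.
    """
    classes = [word_to_class[w] for w in tokens]
    pairs = Counter(zip(classes, classes[1:]))
    total = sum(pairs.values())
    left, right = Counter(), Counter()
    for (a, b), n in pairs.items():
        left[a] += n   # how often class a appears as the first element
        right[b] += n  # how often class b appears as the second element
    ami = 0.0
    for (a, b), n in pairs.items():
        # n * total / (left * right) equals p(a, b) / (p(a) * p(b))
        ami += (n / total) * math.log2(n * total / (left[a] * right[b]))
    return ami
```

Merging two classes can only keep or lower this value, so a clustering step that "maximizes" it amounts to choosing the merge that loses the least information.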
Abstract
In a word clustering apparatus for clustering words, a plurality of words is clustered to obtain a total tree diagram of a word dictionary representing a word clustering result, where the total tree diagram includes tree diagrams of an upper layer, a middle layer and a lower layer. In a speech recognition apparatus, a microphone converts an input utterance speech composed of a plurality of words into a speech signal, and a feature extractor extracts predetermined acoustic feature parameters from the converted speech signal. Then, a speech recognition controller executes a speech recognition process on the extracted acoustic feature parameters with reference to a predetermined Hidden Markov Model and the obtained total tree diagram of the word dictionary, and outputs a result of the speech recognition.
290 Citations
6 Claims
3. A speech recognition apparatus comprising:
word clustering means for clustering a plurality of words and obtaining a total tree diagram of a word dictionary representing a word clustering result, said total tree diagram including tree diagrams of an upper layer, a middle layer and a lower layer;
microphone means for converting an input utterance speech composed of a plurality of words into a speech signal;
feature extracting means for extracting predetermined acoustic feature parameters from the speech signal converted by said microphone means; and
speech recognition means for executing a speech recognition process on the acoustic feature parameters extracted by said feature extracting means, with reference to a predetermined Hidden Markov Model and the total tree diagram of the word dictionary obtained by said word clustering means, and for outputting a result of the speech recognition,
wherein said word clustering means comprises:
first storage means for storing class words within one window;
second storage means for storing a plurality of c classes of the middle layer;
third storage means for storing the tree diagram of the upper layer;
fourth storage means for storing the tree diagram of the lower layer;
fifth storage means for storing the total tree diagram;
first control means for detecting an appearance frequency of a plurality of v words which are different from one another in text data including a plurality of words, for arranging the v words in a descending order of appearance frequency, and for assigning the v words to a plurality of v classes;
second control means for storing as class words within one window into said first storage means, words of (c+1) classes having a high appearance frequency, the number (c+1) smaller than v, among the v words of the plurality of v classes assigned by said first control means;
third control means, in response to the class words within one window stored in said first storage means, for clustering the class words into a plurality of c classes in a predetermined binary tree form, so that a predetermined average mutual information is maximized, the average mutual information representing a relative frequency rate of a probability when words of a first class and words of a second class which are different from each other appear adjacent to each other, with respect to a product of the appearance frequency of the words of the first class and the appearance frequency of the words of the second class, and for storing the plurality of clustered c classes as a plurality of c classes of the middle layer, into said second storage means;
fourth control means, in response to the plurality of c classes of the middle layer stored in said second storage means, for clustering the words of the plurality of clustered c classes of the middle layer in a binary tree form until the words of the middle layer are clustered into one class so that the average mutual information is maximized, and for storing a result of the clustering as the tree diagram of the upper layer into said third storage means;
fifth control means, in response to the plurality of words in each of the plurality of c classes of the middle layer stored in said second storage means, for clustering the plurality of words of each class of the middle layer into a binary tree form until the words of each class of the middle layer are clustered into one class, for every class of the plurality of c classes of the middle layer stored in said second storage means so that the average mutual information is maximized, and for storing a result of the clustering for each class as the tree diagram of the lower layer into said fourth storage means; and
sixth control means for obtaining the total tree diagram including the tree diagrams of the upper layer, the middle layer and the lower layer by connecting the tree diagram of the lower layer stored in said fourth storage means to the plurality of c classes of the middle layer stored in said second storage means, and connecting the tree diagram of the upper layer stored in said third storage means to the plurality of c classes of the middle layer stored in said second storage means, and for storing a resulting total tree diagram as a word clustering result into said fifth storage means.
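The connection step performed by the sixth control means can be pictured with a small data-structure sketch. It rests on assumptions not found in the claims: binary trees are modelled as nested 2-tuples, the leaves of the upper-layer tree are middle-layer class ids, and each class id maps to its lower-layer tree of words. The helpers `connect_trees` and `bit_paths` are hypothetical names, and the bit-string encoding of each word's path is added only to make the resulting hierarchy visible.

```python
def connect_trees(upper, lower_by_class):
    """Build the total tree from the upper tree and the lower trees.

    Each leaf of the upper tree is a middle-layer class id; it is replaced
    by the lower-layer tree stored for that class.
    """
    if isinstance(upper, tuple):
        left, right = upper
        return (connect_trees(left, lower_by_class),
                connect_trees(right, lower_by_class))
    return lower_by_class[upper]


def bit_paths(tree, prefix=""):
    """Map every word (leaf) to its root-to-leaf path: 0 = left, 1 = right."""
    if isinstance(tree, tuple):
        left, right = tree
        paths = bit_paths(left, prefix + "0")
        paths.update(bit_paths(right, prefix + "1"))
        return paths
    return {tree: prefix}


# Toy example with c = 3 middle-layer classes (ids 0, 1, 2).
upper_tree = ((0, 1), 2)
lower_trees = {
    0: ("the", "a"),
    1: ("cat", ("dog", "bird")),
    2: "runs",  # a class holding a single word is a bare leaf
}
total_tree = connect_trees(upper_tree, lower_trees)
paths = bit_paths(total_tree)
```

In this toy tree, words that share a long path prefix sit in the same middle-layer class, while the first bits of the path reflect the upper-layer grouping of the classes.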
5. A method for clustering a plurality of words and obtaining a total tree diagram representing a word clustering result, said total tree diagram including tree diagrams of an upper layer, a middle layer and a lower layer, said method including the following steps:
detecting an appearance frequency of a plurality of v words which are different from one another in text data including a plurality of words, arranging the v words in a descending order of appearance frequency, and assigning the v words to a plurality of v classes;
storing as class words within one window into first storage means, words of (c+1) classes having a high appearance frequency, the number (c+1) smaller than v, among the v words assigned into the plurality of v classes;
in response to the class words within one window stored in said first storage means, clustering the class words within one window into a plurality of c classes in a predetermined binary tree form, so that a predetermined average mutual information is maximized, the average mutual information representing a relative frequency rate of a probability when words of a first class and words of a second class which are different from each other appear adjacent to each other, with respect to a product of the appearance frequency of the words of the first class and the appearance frequency of the words of the second class, and storing the plurality of clustered c classes as a plurality of c classes of the middle layer, into second storage means;
in response to the plurality of c classes of the middle layer stored in said second storage means, clustering the words of the plurality of clustered c classes of the middle layer in a binary tree form until the words of the middle layer are clustered into one class so that the average mutual information is maximized, and storing a result of the clustering as the tree diagram of the upper layer into third storage means;
in response to the plurality of words in each of the plurality of c classes of the middle layer stored in said second storage means, clustering the plurality of words of each class of the middle layer into a binary tree form until the words of each class of the middle layer are clustered into one class, for every class of the plurality of c classes of the middle layer stored in said second storage means so that the average mutual information is maximized, and storing a result of the clustering for each class as the tree diagram of the lower layer into fourth storage means; and
obtaining the total tree diagram including the tree diagrams of the upper layer, the middle layer and the lower layer by connecting the tree diagram of the lower layer stored in said fourth storage means to the plurality of c classes of the middle layer stored in said second storage means, and connecting the tree diagram of the upper layer stored in said third storage means to the plurality of c classes of the middle layer stored in said second storage means, and storing a resulting total tree diagram as a word clustering result into fifth storage means.
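The middle-layer steps of the method (a window of c+1 frequent classes reduced to c classes under the average mutual information criterion) can be sketched as a greedy merge loop. Several points are assumptions rather than claim language: the claim does not say how the remaining v-(c+1) classes enter the window, so the sketch adds the next most frequent class after each merge, in the manner of Brown-style clustering. It also scores every candidate merge by recomputing the average mutual information over the whole text, which is simple but slow. The names `ami` and `middle_layer` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def ami(tokens, word_to_class):
    """Average mutual information over adjacent class pairs (class bigrams)."""
    classes = [word_to_class[w] for w in tokens]
    pairs = Counter(zip(classes, classes[1:]))
    total = sum(pairs.values())
    left, right = Counter(), Counter()
    for (a, b), n in pairs.items():
        left[a] += n
        right[b] += n
    return sum((n / total) * math.log2(n * total / (left[a] * right[b]))
               for (a, b), n in pairs.items())


def middle_layer(tokens, c):
    """Greedily merge classes inside a window of c+1 until c classes remain."""
    if c < 1:
        raise ValueError("c must be at least 1")
    freq = Counter(tokens)
    ordered = sorted(freq, key=lambda w: (-freq[w], w))
    word_to_class = {w: i for i, w in enumerate(ordered)}  # v singleton classes
    if len(ordered) <= c:
        return word_to_class  # nothing to merge
    window = list(range(c + 1))  # the c+1 most frequent classes
    next_class = c + 1
    while True:
        best = None
        for a, b in combinations(window, 2):
            # Candidate clustering with class b folded into class a.
            trial = {w: (a if k == b else k) for w, k in word_to_class.items()}
            score = ami(tokens, trial)
            if best is None or score > best[0]:
                best = (score, b, trial)
        _, merged_away, word_to_class = best
        window.remove(merged_away)  # window now holds c classes
        if next_class >= len(ordered):
            break  # every class has passed through the window
        window.append(next_class)  # assumed refill rule, see lead-in
        next_class += 1
    return word_to_class


tokens = "the cat runs the dog runs a cat sleeps a dog sleeps".split()
result = middle_layer(tokens, c=3)
```

The same merge loop, run without a window until a single class is left and with each merge recorded, would give the binary trees of the upper and lower layers; that extension is left out here to keep the sketch short.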
Specification