Method and system for building a topic specific language model for use in automatic speech recognition
First Claim
1. An automatic speech recognition method comprising:
at a computer having one or more processors and memory for storing one or more programs to be executed by the processors:
obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus;
obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category;
obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models;
constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and
decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech,
wherein obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus further comprises:
calculating an affiliation matrix between terms based on the raw corpus;
extracting term characteristics from the raw corpus using a term frequency-inverse document frequency (TF-IDF) method;
implementing a dimension reduction method on the extracted term characteristics based on the affiliation matrix; and
inputting the term characteristics after the dimension reduction into a classifier for training, and outputting the plurality of speech corpus categories;
wherein calculating an affiliation matrix between terms based on the raw corpus further comprises:
calculating co-occurrence rates between each term and any other term using equation
1 Assignment
0 Petitions
Abstract
An automatic speech recognition method is performed at a computer having one or more processors and memory for storing one or more programs to be executed by the processors. The method includes: obtaining a plurality of speech corpus categories by classifying and calculating a raw speech corpus; obtaining a plurality of classified language models, each corresponding to one of the speech corpus categories, by applying language model training to each speech corpus category; obtaining an interpolation language model by applying a weighted interpolation to each classified language model and merging the interpolated classified language models; constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and decoding input speech using the decoding resource and outputting the character string with the highest probability as the recognition result of the input speech.
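The interpolation and decoding steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes toy unigram models stored as dictionaries, per-category weights that are already known, and candidate strings that arrive with a hypothetical acoustic log-score. The names `interpolate`, `lm_log_prob` and `decode`, and all numbers, are illustrative only. A production system would typically use n-gram models and a compiled decoding graph instead.

```python
import math

def interpolate(models, weights):
    """Merge per-category unigram models: P(w) = sum_c weight_c * P_c(w).

    Weights are normalized here so the merged model still sums to 1.
    """
    total = sum(weights.values())
    vocab = set().union(*(m.keys() for m in models.values()))
    return {
        w: sum(weights[c] / total * models[c].get(w, 0.0) for c in models)
        for w in vocab
    }

def lm_log_prob(model, words, floor=1e-9):
    """Log-probability of a word sequence; unseen words get a small floor."""
    return sum(math.log(max(model.get(w, 0.0), floor)) for w in words)

def decode(candidates, model):
    """Pick the candidate with the highest acoustic + language model score.

    candidates: list of (word_list, acoustic_log_score) pairs (assumed input).
    """
    best = max(candidates, key=lambda c: c[1] + lm_log_prob(model, c[0]))
    return " ".join(best[0])

# Toy classified language models (made-up probabilities).
models = {
    "sports": {"goal": 0.5, "match": 0.4, "gene": 0.1},
    "biology": {"goal": 0.1, "match": 0.1, "gene": 0.8},
}
weights = {"sports": 1.0, "biology": 3.0}  # assumed category weights
merged = interpolate(models, weights)

# Two acoustically equal candidates; the merged model breaks the tie.
result = decode([(["goal"], -1.0), (["gene"], -1.0)], merged)
```

With these weights the biology model dominates the merged distribution, so the two acoustically tied candidates are separated by the language model score alone.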
32 Citations
8 Claims
-
1. An automatic speech recognition method comprising:
at a computer having one or more processors and memory for storing one or more programs to be executed by the processors:
obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus;
obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category;
obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models;
constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and
decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech,
wherein obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus further comprises:
calculating an affiliation matrix between terms based on the raw corpus;
extracting term characteristics from the raw corpus using a term frequency-inverse document frequency (TF-IDF) method;
implementing a dimension reduction method on the extracted term characteristics based on the affiliation matrix; and
inputting the term characteristics after the dimension reduction into a classifier for training, and outputting the plurality of speech corpus categories;
wherein calculating an affiliation matrix between terms based on the raw corpus further comprises:
calculating co-occurrence rates between each term and any other term using equation
-
-
2. The method according to claim 1, wherein the dimension reduction method is a principal components analysis (PCA) dimension reduction method.
-
3. The method according to claim 1, wherein the classifier is a support vector machine (SVM) classifier.
-
4. The method according to claim 1, wherein the weighted interpolation process is implemented on each classified language model based on an obscure degree of the respective speech corpus category, wherein the obscure degree of the speech corpus category is in a positive correlation with a weighted value.
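Claim 4 ties each interpolation weight to the obscure degree of its category but, as reproduced here, does not say how the weight is computed. The sketch below shows one possible mapping. It assumes each category already has a non-negative numeric obscure degree and makes the weight proportional to it, which satisfies the positive correlation. The function name `weights_from_obscure_degree`, the scale of the scores and the proportional rule are all assumptions, not taken from the patent.

```python
def weights_from_obscure_degree(obscure_degree):
    """Map per-category obscure degrees to normalized interpolation weights.

    Assumption: weight is proportional to obscure degree, so a more obscure
    category receives a larger weight. Other monotone mappings would also
    satisfy a positive correlation.
    """
    if any(d < 0 for d in obscure_degree.values()):
        raise ValueError("obscure degrees must be non-negative")
    total = sum(obscure_degree.values())
    if total == 0:
        raise ValueError("at least one obscure degree must be positive")
    return {category: d / total for category, d in obscure_degree.items()}

# Hypothetical scores: a medical corpus is treated as more obscure.
weights = weights_from_obscure_degree({"everyday": 1.0, "medical": 3.0})
```

Normalizing the weights to sum to 1 keeps the merged model a valid probability distribution when the weights are later used for linear interpolation.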
-
5. An automatic speech recognition system comprising:
one or more processors;
memory for storing one or more programs to be executed by the processors;
a classifying process module configured to obtain a plurality of speech corpus categories through classifying and calculating raw speech corpus;
a classifying language model training module configured to obtain a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category;
a weight merging module configured to obtain an interpolation language model through implementing a weighted interpolation on each classified language model and merge the interpolated plurality of classified language models;
a resource construction module configured to construct a decoding resource in accordance with an acoustic model and the interpolation language model; and
a decoder configured to decode input speech using the decoding resource, and output a character string with a highest probability as a recognition result of the input speech,
wherein the classifying process module further comprises:
an affiliation matrix module configured to calculate an affiliation matrix between terms based on the raw corpus;
a characteristic extracting module configured to extract term characteristics from the raw corpus using a term frequency-inverse document frequency (TF-IDF) method;
a dimension reduction module configured to implement a dimension reduction method on the extracted term characteristics based on the affiliation matrix; and
a classifier configured to train the term characteristics after dimension reduction, and output the plurality of speech corpus categories;
wherein the affiliation matrix module is further configured to:
calculate co-occurrence rates between each term and any other term using equation
-
-
6. The system according to claim 5, wherein the dimension reduction module is a principal components analysis (PCA) dimension reduction module.
-
7. The system according to claim 5, wherein the classifier is a support vector machine (SVM) classifier.
-
8. The system according to claim 5, wherein the weighted interpolation is implemented on each classified language model based on an obscure degree of the respective speech corpus category, wherein the obscure degree of the speech corpus category is in a positive correlation with a weighted value.
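The term-characteristic extraction recited in claims 1 and 5 relies on TF-IDF. The sketch below shows one common, unsmoothed variant: term frequency is the count divided by document length, and inverse document frequency is log(N / document frequency). The claims as reproduced here do not say which variant is used, so this choice, the function name `tf_idf` and the toy documents are assumptions. The later dimension reduction and classifier stages are not shown.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Compute TF-IDF weights for tokenized documents.

    documents: list of token lists.
    Returns one {term: weight} dictionary per document.
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    features = []
    for doc in documents:
        counts = Counter(doc)
        features.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return features

# Toy corpus of three short, already tokenized documents.
docs = [["goal", "match", "goal"], ["gene", "cell"], ["match", "gene"]]
feats = tf_idf(docs)

# A term that appears in every document carries no weight.
common = tf_idf([["a", "b"], ["a"]])
```

Terms concentrated in a few documents, such as "goal" above, receive high weights, while terms present in every document drop to zero and so contribute nothing to separating the corpus categories.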
Specification