Method and system for building a topic specific language model for use in automatic speech recognition

US 9,697,821 B2
Filed: 12/16/2013
Issued: 07/04/2017
Est. Priority Date: 01/29/2013
Status: Active Grant


Abstract

An automatic speech recognition method includes at a computer having one or more processors and memory for storing one or more programs to be executed by the processors, obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus; obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category; obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models; constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech.
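The final step of the abstract, outputting the character string with the highest probability, can be illustrated with a small Python sketch. This is only a toy rescoring of a fixed candidate list, not the decoder described in the patent: the function `pick_best`, the `lm_weight` parameter, the unigram table, and the acoustic scores are all hypothetical values chosen for illustration.

```python
import math

def pick_best(candidates, lm_log_prob, lm_weight=1.0):
    """Return the candidate text with the highest combined log score.

    candidates: list of (text, acoustic_log_prob) pairs (hypothetical input).
    lm_log_prob: function mapping a text to its language-model log probability.
    """
    def score(item):
        text, acoustic = item
        return acoustic + lm_weight * lm_log_prob(text)
    return max(candidates, key=score)[0]

# Toy unigram language model (assumed values, for illustration only).
unigram = {"recognize": 0.4, "speech": 0.4, "wreck": 0.05,
           "a": 0.05, "nice": 0.05, "beach": 0.05}

def toy_lm(text):
    # Sum of per-word log probabilities under the toy unigram model.
    return sum(math.log(unigram[word]) for word in text.split())

candidates = [("recognize speech", -12.0), ("wreck a nice beach", -11.5)]
best = pick_best(candidates, toy_lm)
```

In this toy setup the second candidate has the better acoustic score, but the language model score outweighs it, so the first candidate is selected.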

8 Claims

1. An automatic speech recognition method comprising:
- at a computer having one or more processors and memory for storing one or more programs to be executed by the processors:
  
  obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus;
  
  obtaining a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category;
  
  obtaining an interpolation language model through implementing a weighted interpolation on each classified language model and merging the interpolated plurality of classified language models;
  
  constructing a decoding resource in accordance with an acoustic model and the interpolation language model; and
  
  decoding input speech using the decoding resource, and outputting a character string with a highest probability as a recognition result of the input speech,
  
  wherein obtaining a plurality of speech corpus categories through classifying and calculating raw speech corpus further comprises:
  
  calculating an affiliation matrix between terms based on the raw corpus;
  
  extracting term characteristics from the raw corpus using a term frequency-inverse document frequency (TF-IDF) method;
  
  implementing a dimension reduction method on the extracted term characteristics based on the affiliation matrix; and
  
  inputting the term characteristics after the dimension reduction into a classifier for training, and outputting the plurality of speech corpus categories;
  
  wherein calculating an affiliation matrix between terms based on the raw corpus further comprises:
  
  calculating co-occurrence rates between each term and any other term using equation
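
The TF-IDF extraction step recited in claim 1 can be illustrated with a minimal Python sketch. The record does not state the exact weighting used, so the sketch assumes the common form tf = count / document length and idf = log(N / document frequency). The function name `tf_idf` and the toy corpus are hypothetical, and the co-occurrence equation of the claim is not reproduced because it is missing from this record.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            # Assumed weighting: relative term frequency times log(N / df).
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Toy corpus of three tokenized documents (illustrative only).
corpus = [
    ["weather", "rain", "forecast"],
    ["stock", "market", "forecast"],
    ["rain", "rain", "umbrella"],
]
vectors = tf_idf(corpus)
```

A term that appears in only one document, such as "weather", receives a higher weight than a term shared across documents, such as "forecast".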

2. The method according to claim 1, wherein the dimension reduction method is a principal components analysis (PCA) dimension reduction method.
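
As a rough illustration of the PCA dimension reduction named in claim 2, the following Python sketch finds the leading principal component of a small feature matrix by power iteration and projects each row onto it. It is plain PCA on generic feature vectors; how the affiliation matrix enters the reduction is not shown in this record and is not modeled here. The function name `first_principal_component` and the sample data are hypothetical, and the input is assumed to have at least two rows.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (component, projections) for the leading principal component."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Power iteration: repeated multiplication converges to the top eigenvector,
    # provided the start vector is not orthogonal to it.
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # One-dimensional representation of every row.
    projections = [sum(c[j] * vec[j] for j in range(dim)) for c in centered]
    return vec, projections

# Toy features lying on the line y = 2x (illustrative only).
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
component, reduced = first_principal_component(features)
```

For this data the component points along the direction (1, 2), so the two-dimensional rows collapse to one coordinate each without losing variance.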

3. The method according to claim 1, wherein the classifier is a support vector machine (SVM) classifier.
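
To make the SVM classifier of claim 3 more concrete, here is a minimal Python sketch of a linear SVM trained by subgradient descent on the hinge loss. It is a two-class simplification under assumed hyperparameters, whereas the claim concerns a plurality of categories; the names `train_linear_svm` and `predict`, the learning rate, the regularization strength, and the sample points are all hypothetical.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Train a linear SVM (labels must be +1 or -1) by subgradient descent."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 regularization term shrinks the weights on every step.
            w = [wi - lr * reg * wi for wi in w]
            if margin < 1:
                # Hinge loss is active: move the boundary toward a larger margin.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy reduced feature vectors for two categories (illustrative only).
samples = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]]
labels = [1, 1, -1, -1]
w, b = train_linear_svm(samples, labels)
```

A multi-category setup could be built from several such binary classifiers, for example one per category against the rest; that extension is an assumption and is not taken from the claims.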

4. The method according to claim 1, wherein the weighted interpolation process is implemented on each classified language model based on an obscure degree of the respective speech corpus category, wherein the obscure degree of the speech corpus category is in a positive correlation with a weighted value.
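
The weighted interpolation of claim 4 can be sketched as a linear mixture of the classified language models. The Python example below assumes that each model is a unigram distribution stored as a dictionary and that each weight is directly proportional to the obscure degree of its category; the claim only requires a positive correlation, so proportionality is one possible choice. The name `interpolate_models` and all numeric values are hypothetical.

```python
def interpolate_models(models, obscure_degrees):
    """Merge unigram models into one model using normalized weights.

    models: list of {word: probability} dicts, one per corpus category.
    obscure_degrees: one positive number per category (assumed input).
    """
    total = sum(obscure_degrees)
    # Assumption: weight proportional to obscure degree, normalized to sum to 1.
    weights = [degree / total for degree in obscure_degrees]
    vocabulary = set().union(*models)
    merged = {
        word: sum(lam * model.get(word, 0.0)
                  for lam, model in zip(weights, models))
        for word in vocabulary
    }
    return weights, merged

# Two toy category models (illustrative only).
weather_model = {"rain": 0.5, "sun": 0.5}
finance_model = {"stock": 0.75, "rain": 0.25}
weights, merged = interpolate_models([weather_model, finance_model], [1.0, 3.0])
```

Because the weights sum to one and each input is a probability distribution, the merged model is again a probability distribution over the combined vocabulary.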

5. An automatic speech recognition system comprising:
- one or more processors;
  
  memory for storing one or more programs to be executed by the processors;
  
  a classifying process module configured to obtain a plurality of speech corpus categories through classifying and calculating raw speech corpus;
  
  a classifying language model training module configured to obtain a plurality of classified language models that respectively correspond to the plurality of speech corpus categories through a language model training applied on each speech corpus category;
  
  a weight merging module configured to obtain an interpolation language model through implementing a weighted interpolation on each classified language model and merge the interpolated plurality of classified language models;
  
  a resource construction module configured to construct a decoding resource in accordance with an acoustic model and the interpolation language model; and
  
  a decoder configured to decode input speech using the decoding resource, and output a character string with a highest probability as a recognition result of the input speech,
  
  wherein the classifying process module further comprises:
  
  an affiliation matrix module configured to calculate an affiliation matrix between terms based on the raw corpus;
  
  a characteristic extracting module configured to extract term characteristics from the raw corpus using a term frequency-inverse document frequency (TF-IDF) method;
  
  a dimension reduction module configured to implement a dimension reduction method on the extracted term characteristics based on the affiliation matrix; and
  
  a classifier configured to train the term characteristics after dimension reduction, and output the plurality of speech corpus categories;
  
  wherein the affiliation matrix module is further configured to:
  
  calculate co-occurrence rates between each term and any other term using equation

6. The system according to claim 5, wherein the dimension reduction module is a principal components analysis (PCA) dimension reduction module.

7. The system according to claim 5, wherein the classifier is a support vector machine (SVM) classifier.

8. The system according to claim 5, wherein the weighted interpolation is implemented on each classified language model based on an obscure degree of the respective speech corpus category, wherein the obscure degree of the speech corpus category is in a positive correlation with a weighted value.

Current Assignee: Tencent Technology Company Limited (Tencent Holdings Limited)
Original Assignee: Tencent Technology Company Limited (Tencent Holdings Limited)
Inventors: Chen, Bo; Rao, Feng; Lu, Li; Wang, Eryu; Xie, Dadong; Li, Lou; Lu, Duling; Zhang, Xiang; Yue, Shuai
Primary Examiner(s): Wozniak, James
Application Number: US14/108,223
Publication Number: US 20140214419A1
Time in Patent Office: 1,296 Days
Field of Search: 704243, 704251, 704257

CPC Class Codes

G10L 15/063   Training

G10L 15/183   using context dependencies,...

G10L 15/197   Probabilistic grammars, e.g...

G10L 15/26   Speech to text systems G10L...
