Extended recognition dictionary learning device and speech recognition system

US 8,918,318 B2
Filed: 01/15/2008
Issued: 12/23/2014
Est. Priority Date: 01/16/2007
Status: Active Grant

First Claim

1. An extended recognition dictionary learning device comprising:

an utterance variation data calculating section configured to compare an acoustic model sequence obtained from a result of speech recognition for each of a plurality of speakers and a correct acoustic model sequence to calculate a correspondence between the models as utterance variation data;

an utterance variation data classifying section configured to classify the calculated utterance variation data into widely appearing utterance variations and unevenly appearing utterance variations, the widely appearing utterance variations appearing independently of speakers in the calculated utterance variation data, and the unevenly appearing utterance variations appearing dependently of speakers in the calculated utterance variation data; and

a recognition dictionary extending section configured to define a plurality of utterance variation sets by combining the classified utterance variations and to generate a plurality of extended recognition dictionaries corresponding to the plurality of utterance variation sets by extending a recognition dictionary for each utterance variation set according to the utterance variations included in each utterance variation set, wherein the plurality of utterance variation sets comprise:

a common utterance variation set that consists of only widely appearing utterance variations; and

utterance variation sets each of which is generated by combining widely appearing utterance variations and unevenly appearing utterance variations.
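
The comparison performed by the utterance variation data calculating section can be pictured as a sequence alignment. The sketch below is only an illustration under stated assumptions: acoustic models are represented as phoneme-like string labels, the alignment uses Python's `difflib.SequenceMatcher` as a stand-in (the claim does not name an alignment method), and the function name `utterance_variations` is hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def utterance_variations(correct, recognized):
    """Count (correct_span, recognized_span) mismatches between two label sequences.

    Assumption: each acoustic model is identified by a string label.
    """
    matcher = SequenceMatcher(a=correct, b=recognized, autojunk=False)
    variations = Counter()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record what the correct sequence had and what was recognized instead.
            variations[(tuple(correct[i1:i2]), tuple(recognized[j1:j2]))] += 1
    return variations


# Hypothetical example: one substituted model in a three-model sequence.
example = utterance_variations(["a", "r", "i"], ["a", "l", "i"])
```

Running the function once per speaker and keeping the per-speaker counters separate would give the kind of data that the classifying section then splits into widely and unevenly appearing variations.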

Abstract

Speech recognition of even a speaker who uses a speech recognition system is enabled by using an extended recognition dictionary suited to the speaker without requiring any previous learning using an utterance label corresponding to the speech of the speaker. An extended recognition dictionary learning device includes an utterance variation data calculating section for comparing an acoustic model sequence output from a speech recognition result and an input correct acoustic model sequence to calculate a correspondence between the models as utterance variation data; an utterance variation data classifying section for classifying the calculated utterance variation data into widely appearing utterance variations and unevenly appearing utterance variations; and a recognition dictionary extending section for defining a plurality of utterance variation sets by combining the classified utterance variations and thereby extending the recognition dictionary for each utterance variation set according to the utterance variations included in each utterance variation set. A speech recognition device uses the extended recognition dictionary for each utterance variation set to output a speech recognition result.
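
Because recognition is run once per extended recognition dictionary, the per-dictionary results have to be combined into one output. Claim 8 names a majority decision method for this. The following is a minimal sketch of such a vote, assuming each dictionary yields one hypothesis string; the function name `majority_vote` and the tie-breaking rule (earliest result wins) are assumptions, not taken from the patent.

```python
from collections import Counter


def majority_vote(results):
    """Return the hypothesis produced by the most extended dictionaries.

    Assumption: ties are broken in favor of the earliest result in `results`.
    """
    if not results:
        raise ValueError("at least one recognition result is required")
    counts = Counter(results)
    best = max(counts.values())
    for hypothesis in results:
        if counts[hypothesis] == best:
            return hypothesis


# Hypothetical results from three extended recognition dictionaries.
final_result = majority_vote(["hello", "hallo", "hello"])
```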

27 Citations

11 Claims

1. An extended recognition dictionary learning device comprising:
- an utterance variation data calculating section configured to compare an acoustic model sequence obtained from a result of speech recognition for each of a plurality of speakers and a correct acoustic model sequence to calculate a correspondence between the models as utterance variation data;
  
  an utterance variation data classifying section configured to classify the calculated utterance variation data into widely appearing utterance variations and unevenly appearing utterance variations, the widely appearing utterance variations appearing independently of speakers in the calculated utterance variation data, and the unevenly appearing utterance variations appearing dependently of speakers in the calculated utterance variation data; and
  
  a recognition dictionary extending section configured to define a plurality of utterance variation sets by combining the classified utterance variations and to generate a plurality of extended recognition dictionaries corresponding to the plurality of utterance variation sets by extending a recognition dictionary for each utterance variation set according to the utterance variations included in each utterance variation set, wherein the plurality of utterance variation sets comprise:
  
  a common utterance variation set that consists of only widely appearing utterance variations; and
  
  utterance variation sets each of which is generated by combining widely appearing utterance variations and unevenly appearing utterance variations.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
- - 2. The extended recognition dictionary learning device according to claim 1, wherein the utterance variation data classifying section includes:
    - a first calculating section for calculating utterance variations widely appearing in the utterance variation data as an idf utterance variation vector using an idf value of the utterance variation data; and
      
      a second calculating section for clustering the utterance variations unevenly appearing in the utterance variation data using a tfidf value calculated using a tf value of the utterance variation data and the idf value to calculate a cluster utterance variation vector, and
      
      the recognition dictionary extending section constructs a plurality of utterance variation sets by using only utterance variations having a value of the idf utterance variation vector smaller than a predetermined value or by combining the utterance variations having a value of the idf utterance variation vector smaller than a predetermined value and utterance variations having a value of the cluster utterance variation vector larger than a predetermined value.
  - 3. The extended recognition dictionary learning device according to claim 2, wherein the recognition dictionary extending section constructs the same number of utterance variation sets as the number of clusters by including in each utterance variation set both the utterance variations having a value of the idf utterance variation vector smaller than a predetermined value and utterance variations having a value of the cluster utterance variation vector larger than a predetermined value.
  - 4. The extended recognition dictionary learning device according to claim 3, wherein the recognition dictionary extending section constructs the number of utterance variation sets larger by one than the number of clusters by further constructing an utterance variation set including the utterance variations having a value of the idf utterance variation vector smaller than a predetermined value in addition to the same number of utterance variation sets as the number of clusters.
  - 5. The extended recognition dictionary learning device according to claim 1, wherein the recognition dictionary extending section extends the recognition dictionary to construct an extended recognition dictionary for each utterance variation set by adding, to the recognition dictionary, items in which standard utterances included in the recognition dictionary are replaced with utterance variations included in each of the utterance variation sets under a rule that has previously been set as a recognition dictionary extension rule so as to allow utterance variations to be established as speech of a language to be recognized.
  - 6. The extended recognition dictionary learning device according to claim 2, wherein the first calculating section calculates utterance variations widely appearing in the utterance variation data as the idf utterance variation vector by using the idf value of the utterance variation data represented by idf(X) calculated by the following equations:
  - 7. A speech recognition device comprising a speech recognition section for performing speech recognition for input speech using the recognition dictionary generated for each utterance variation set that has been learned by the extended recognition dictionary learning device as claimed in claim 1.
  - 8. The speech recognition device according to claim 7, wherein the speech recognition section selects as hypothesis a final recognition result from recognition results obtained for each extended recognition dictionary based on a majority decision method so as to output the final recognition result.
  - 9. A speech recognition system utilizing the extended recognition dictionary learning device as claimed in claim 1.
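
The idf and tfidf thresholding of claims 2 to 4 can be sketched as follows. This is an illustration under assumptions, not the patented computation: the equations referred to in claim 6 are not reproduced in this record, so the textbook form idf(X) = log(N / df(X)) is used, with N the number of speakers and df(X) the number of speakers whose data contains variation X. The cluster assignment is taken as given input rather than computed, the cluster vector is assumed to be the mean tfidf vector of the cluster's speakers, and all function names, thresholds, and data are hypothetical.

```python
import math
from collections import Counter


def idf_vector(speaker_counts):
    """idf per variation; low idf means the variation appears across most speakers."""
    n_speakers = len(speaker_counts)
    df = Counter()
    for counts in speaker_counts.values():
        df.update(v for v, c in counts.items() if c > 0)
    return {v: math.log(n_speakers / df[v]) for v in df}


def tfidf_vectors(speaker_counts, idf):
    """tf-idf per speaker, with tf as the variation's share of that speaker's counts."""
    vectors = {}
    for speaker, counts in speaker_counts.items():
        total = sum(counts.values())
        vectors[speaker] = {v: (c / total) * idf[v] for v, c in counts.items()}
    return vectors


def cluster_vectors(tfidf, clusters):
    """Mean tf-idf vector per cluster (assumed form of the cluster vector)."""
    vectors = {}
    for cluster_id, members in clusters.items():
        acc = Counter()
        for speaker in members:
            for v, w in tfidf[speaker].items():
                acc[v] += w
        vectors[cluster_id] = {v: w / len(members) for v, w in acc.items()}
    return vectors


def build_variation_sets(idf, cluster_vecs, idf_max, cluster_min):
    """One common set plus one set per cluster (clusters + 1 sets, as in claim 4)."""
    common = {v for v, x in idf.items() if x < idf_max}
    sets = {"common": set(common)}
    for cluster_id, vec in cluster_vecs.items():
        sets[cluster_id] = common | {v for v, w in vec.items() if w > cluster_min}
    return sets


# Hypothetical per-speaker counts of (standard, variant) pairs.
speaker_counts = {
    "s1": Counter({("a", "o"): 2, ("s", "sh"): 2}),
    "s2": Counter({("a", "o"): 1, ("s", "sh"): 3}),
    "s3": Counter({("a", "o"): 2, ("r", "l"): 2}),
    "s4": Counter({("a", "o"): 3, ("r", "l"): 1}),
}
clusters = {"A": ["s1", "s2"], "B": ["s3", "s4"]}  # assumed to come from any clustering step

idf = idf_vector(speaker_counts)
cluster_vecs = cluster_vectors(tfidf_vectors(speaker_counts, idf), clusters)
variation_sets = build_variation_sets(idf, cluster_vecs, idf_max=0.1, cluster_min=0.2)
```

In this toy data the variation shared by all four speakers lands in the common set, while each cluster's set adds the variation that is concentrated in that cluster's speakers.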

10. An extended recognition dictionary learning method, comprising:
- a step of comparing an acoustic model sequence obtained from a result of speech recognition for each of a plurality of speakers and a correct acoustic model sequence to calculate a correspondence between the models as utterance variation data;
  
  a step of classifying the calculated utterance variation data into widely appearing utterance variations and unevenly appearing utterance variations, the widely appearing utterance variations appearing independently of speakers in the calculated utterance variation data, and the unevenly appearing utterance variations appearing dependently of speakers in the calculated utterance variation data; and
  
  a step of defining a plurality of utterance variation sets by combining the classified utterance variations and generating a plurality of extended recognition dictionaries corresponding to the plurality of utterance variation sets by extending a recognition dictionary for each utterance variation set according to the utterance variations included in each utterance variation set, wherein the plurality of utterance variation sets comprise:
  
  a common utterance variation set that consists of only widely appearing utterance variations; and
  
  utterance variation sets each of which is generated by combining widely appearing utterance variations and unevenly appearing utterance variations.
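
The final step, extending a recognition dictionary once per utterance variation set, might look like the sketch below. It assumes a dictionary that maps each word to a list of pronunciations (tuples of labels) and variations given as (standard_span, variant_span) pairs. The extension rule is modeled as a caller-supplied predicate `is_allowed`; that predicate, the function names, and the choice to replace one occurrence at a time are all assumptions made for illustration.

```python
def apply_variation(pron, standard, variant):
    """Yield copies of `pron` with one occurrence of `standard` replaced by `variant`."""
    n = len(standard)
    if n == 0:
        return  # pure insertions are skipped in this sketch
    for i in range(len(pron) - n + 1):
        if pron[i:i + n] == standard:
            yield pron[:i] + variant + pron[i + n:]


def extend_dictionary(dictionary, variation_set, is_allowed=lambda p: len(p) > 0):
    """Add variant pronunciations that pass the (hypothetical) extension rule."""
    extended = {word: list(prons) for word, prons in dictionary.items()}
    for word, prons in dictionary.items():
        for pron in prons:
            for standard, variant in sorted(variation_set):
                for new_pron in apply_variation(pron, standard, variant):
                    if is_allowed(new_pron) and new_pron not in extended[word]:
                        extended[word].append(new_pron)
    return extended


def extend_all(dictionary, variation_sets):
    """Build one extended recognition dictionary per utterance variation set."""
    return {name: extend_dictionary(dictionary, vs) for name, vs in variation_sets.items()}


# Hypothetical base dictionary and variation sets.
base = {"ari": [("a", "r", "i")]}
sets = {"common": set(), "A": {(("r",), ("l",))}}
extended = extend_all(base, sets)
```

The standard pronunciations are kept in every extended dictionary, so each one is a superset of the base dictionary.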

11. A non-transitory storage medium having recorded thereon an extended recognition dictionary learning program which, when executed by a computer, causes the computer to execute:
- a step of comparing an acoustic model sequence obtained from a result of speech recognition for each of a plurality of speakers and a correct acoustic model sequence to calculate a correspondence between the models as utterance variation data;
  
  a step of classifying the calculated utterance variation data into widely appearing utterance variations and unevenly appearing utterance variations, the widely appearing utterance variations appearing independently of speakers in the calculated utterance variation data, and the unevenly appearing utterance variations appearing dependently of speakers in the calculated utterance variation data; and
  
  a step of defining a plurality of utterance variation sets by combining the classified utterance variations and generating a plurality of extended recognition dictionaries corresponding to the plurality of utterance variation sets by extending a recognition dictionary for each utterance variation set according to the utterance variations included in each utterance variation set, wherein the plurality of utterance variation sets comprise:
  
  a common utterance variation set that consists of only widely appearing utterance variations; and
  
  utterance variation sets each of which is generated by combining widely appearing utterance variations and unevenly appearing utterance variations.

Current Assignee: NEC Corporation
Original Assignee: NEC Corporation
Inventors: Onishi, Yoshifumi
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US12/523,302
Publication Number: US 20100023329A1
Time in Patent Office: 2,534 Days
Field of Search: 704/251, 704/270.1, 704/9, 704/4, 704/3, 704/270, 704/257, 704/243, 704/235, 704/231, 704/10, 707/740, 434/308, 379/88.14, 379/88.03, 379/265.02, 345/173, 341/106
US Class Current: 704/244
CPC Class Codes:
G10L 15/07 to the speaker
G10L 2015/0635 updating or merging of old ...
