Pronunciation variation rule extraction apparatus, pronunciation variation rule extraction method, and pronunciation variation rule extraction program

US 8,595,004 B2
Filed: 11/27/2008
Issued: 11/26/2013
Est. Priority Date: 12/18/2007
Status: Active Grant

Abstract

A problem to be solved is to robustly detect a pronunciation variation example and acquire a pronunciation variation rule having a high generalization property, with less effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model. The difference extraction unit extracts a difference between a recognition result outputted from the speech recognition unit and the base form pronunciation data by comparing the recognition result and the base form pronunciation data.
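
The abstract does not say what form the sub word language model takes. As a minimal sketch, assuming a bigram model with add-one smoothing over sub word symbols (both assumptions, not stated in the patent), it could be built from base form pronunciations as follows. The names `train_bigram_lm` and `log_prob` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_bigram_lm(base_forms):
    """Count sub word bigrams over base form pronunciations.

    base_forms: iterable of sequences of sub word symbols (e.g. phonemes).
    The bigram form is an assumption; the patent only names a
    "sub word language model".
    """
    bigrams = defaultdict(Counter)
    vocab = {"</s>"}
    for seq in base_forms:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(seq)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[prev][cur] += 1
    return bigrams, vocab

def log_prob(model, seq):
    """Log probability of a sub word sequence under the bigram model."""
    bigrams, vocab = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        counts = bigrams.get(prev, Counter())
        # Add-one smoothing (an assumption) keeps unseen bigrams non-zero.
        total += math.log((counts[cur] + 1) / (sum(counts.values()) + len(vocab)))
    return total

model = train_bigram_lm([["a", "b", "c"], ["a", "b", "d"]])
print(log_prob(model, ["a", "b", "c"]), log_prob(model, ["c", "b", "a"]))
```

A sequence that follows the base forms scores higher than a scrambled one, which is the constraint the recognizer would be steered by.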

16 Claims

1. A pronunciation variation rule extraction apparatus comprising:
- a speech data storage unit which stores speech data;
  
  a base form pronunciation storage unit which stores base form pronunciation data representing base form pronunciation of said speech data;
  
  a sub word language model generation unit which generates a sub word language model from said base form pronunciation data;
  
  a speech recognition unit which recognizes said speech data by using said sub word language model;
  
  a difference extraction unit which extracts a difference between a recognition result outputted from said speech recognition unit and said base form pronunciation data by comparing said recognition result and said base form pronunciation data; and
  
  a language model weight control unit which controls one weight value for said sub word language model, wherein said language model weight control unit outputs a plurality of said weight values, wherein said speech recognition unit recognizes said speech data for each of said plurality of weight values, and wherein said language model weight control unit determines whether or not said weight value should be updated, based on said difference when said difference is extracted.
- Dependent claims (2, 3, 4, 5, 6, 7)
- - 2. The pronunciation variation rule extraction apparatus according to claim 1, wherein, when said difference is smaller than a predetermined threshold, said language model weight control unit updates said weight value such that said weight value is decreased.
  - 3. The pronunciation variation rule extraction apparatus according to claim 1, wherein when said difference is larger than a predetermined threshold, said language model weight control unit updates said weight value such that said weight value is increased.
  - 4. The pronunciation variation rule extraction apparatus according to claim 1, wherein said difference extraction unit calculates said difference as an editing distance between said recognition result and said base form pronunciation data.
  - 5. The pronunciation variation rule extraction apparatus according to claim 1, wherein said difference extraction unit extracts as said difference, a pronunciation variation example including a letter string pair of different portions between said recognition result and said base form pronunciation data and the weight value of said sub word language model received from said language model weight control unit by said speech recognition unit at a time of acquisition of said recognition result.
  - 6. The pronunciation variation rule extraction apparatus according to claim 5, further comprising a pronunciation variation probability estimation unit which generates a probability rule of pronunciation variation from said pronunciation variation example.
  - 7. The pronunciation variation rule extraction apparatus according to claim 6, wherein said pronunciation variation probability estimation unit generates said probability rule of said pronunciation variation based on a magnitude of the weight value of said sub word language model upon observation of said pronunciation variation example such that said pronunciation variation example has a high appearance probability.
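
Claims 1 to 4 can be read as a feedback loop: recognize with the current weight, measure the editing distance to the base form, then lower the weight if the difference is too small or raise it if the difference is too large. The sketch below illustrates that reading. The two separate thresholds `low` and `high`, the fixed `step`, the iteration cap, and the stand-in recognizer `toy_recognize` are all assumptions for illustration; the claims specify none of them.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sub word sequences (claim 4)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]

def control_weight(recognize, speech, base_form, weight,
                   low, high, step=1.0, max_iters=10):
    """Re-run recognition until the difference lies within [low, high]."""
    history = []
    for _ in range(max_iters):
        result = recognize(speech, weight)
        diff = edit_distance(result, base_form)
        history.append((weight, result, diff))
        if diff < low:
            weight -= step   # claim 2: small difference, decrease the weight
        elif diff > high:
            weight += step   # claim 3: large difference, increase the weight
        else:
            break            # no update needed
    return history

def toy_recognize(speech, weight):
    """Hypothetical stand-in for a recognizer: a larger weight makes the
    output follow the base form for more of the trailing positions."""
    acoustic, lm_preferred = speech
    n_acoustic = max(0, len(acoustic) - int(weight))
    return acoustic[:n_acoustic] + lm_preferred[n_acoustic:]

base = list("arigatou")
heard = list("aregadoo")
for weight, result, diff in control_weight(toy_recognize, (heard, base), base,
                                           weight=8.0, low=1, high=2):
    print(weight, "".join(result), diff)
```

Each entry in the returned history pairs a recognition result with the weight in force when it was obtained, which is the pairing claim 5 relies on.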

8. A pronunciation variation rule extraction method comprising:
- storing base form pronunciation data representing base form pronunciation of speech data;
  
  generating a sub word language model from said base form pronunciation data;
  
  recognizing said speech data by using said sub word language model;
  
  extracting a difference between a recognition result of said recognizing and said base form pronunciation data by comparing said recognition result and said base form pronunciation data; and
  
  controlling one weight value for said sub word language model, wherein said controlling includes outputting a plurality of said weight values, wherein said recognizing includes recognizing said speech data for each of said plurality of weight values, and wherein said controlling further includes determining whether or not said weight value should be updated, based on said difference when said difference is extracted.
- Dependent claims (9, 10, 11, 12)
- - 9. The pronunciation variation rule extraction method according to claim 8, wherein said controlling further includes updating said weight value, when said difference is smaller than a predetermined threshold, such that said weight value is decreased.
  - 10. The pronunciation variation rule extraction method according to claim 8, wherein said controlling further includes updating said weight value, when said difference is larger than a predetermined threshold, such that said weight value is increased.
  - 11. The pronunciation variation rule extraction method according to claim 8, wherein said extracting includes:
    - calculating said difference as an editing distance between said recognition result and said base form pronunciation data; and
      
      extracting as said difference, a pronunciation variation example including a letter string pair of different portions between said recognition result and said base form pronunciation data and said weight value of said sub word language model received upon acquisition of said recognition result.
  - 12. The pronunciation variation rule extraction method according to claim 11, further comprising generating a probability rule of pronunciation variation from said pronunciation variation example, wherein said generating said probability rule includes generating said probability rule of said pronunciation variation based on a magnitude of said weight value of said sub word language model upon observation of said pronunciation variation example, such that said pronunciation variation example has a high appearance probability.
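
The extraction step of claim 11 yields pairs of differing letter strings together with the weight in force at recognition time. One way this might look, assuming Python's standard `difflib.SequenceMatcher` is an acceptable aligner (it is a swapped-in technique and is not guaranteed to produce a minimal edit script), is sketched below. The function name and the tuple layout are hypothetical.

```python
from difflib import SequenceMatcher

def extract_variation_examples(base_form, result, weight):
    """Return (base portion, recognized portion, weight) for each span
    where the recognition result departs from the base form."""
    examples = []
    matcher = SequenceMatcher(a=base_form, b=result, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical spans carry no variation
        # For insertions the base portion is "", for deletions the
        # recognized portion is "".
        examples.append(("".join(base_form[i1:i2]),
                         "".join(result[j1:j2]),
                         weight))
    return examples

for example in extract_variation_examples(list("arigatou"),
                                          list("aregatoo"), 3.0):
    print(example)
```

Keeping the weight next to each pair lets a later estimation step treat examples observed under different weights differently.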

13. A non-transitory computer-readable recording medium which records a pronunciation variation rule extraction program which causes a computer to function as:
- a speech data storage unit which stores speech data;
  
  a base form pronunciation storage unit which stores base form pronunciation data representing base form pronunciation of said speech data;
  
  a sub word language model generation unit which generates a sub word language model from said base form pronunciation data;
  
  a speech recognition unit which recognizes said speech data by using said sub word language model;
  
  a difference extraction unit which extracts a difference between a recognition result outputted from said speech recognition unit and said base form pronunciation data by comparing said recognition result and said base form pronunciation data; and
  
  a language model weight control unit which controls one weight value for said sub word language model, wherein said language model weight control unit outputs a plurality of said weight values, wherein said speech recognition unit recognizes said speech data for each of said plurality of weight values, and wherein said language model weight control unit determines whether or not said weight value should be updated, based on said difference when said difference is extracted.
- Dependent claims (14, 15, 16)
- - 14. The non-transitory computer-readable recording medium according to claim 13, wherein said language model weight control unit updates said weight value such that said weight value is decreased when said difference is smaller than a predetermined threshold.
  - 15. The non-transitory computer-readable recording medium according to claim 13, wherein said language model weight control unit updates said weight value such that said weight value is increased when said difference is larger than a predetermined threshold.
  - 16. The non-transitory computer-readable recording medium according to claim 13, wherein said difference extraction unit calculates said difference as an editing distance between said recognition result and said base form pronunciation data, and extracts as said difference, a pronunciation variation example including letter string pair of different portions between said recognition result and said base form pronunciation data and said weight value of said sub word language model received from said language model weight control unit by said speech recognition unit upon acquisition of said recognition result, wherein said program further causes said computer to function as a pronunciation variation probability estimation unit which generates a probability rule of pronunciation variation from said pronunciation variation example, wherein said pronunciation variation probability estimation unit generates said probability rule of said pronunciation variation based on a magnitude of said weight value of said sub word language model upon observation of said pronunciation variation example, such that said pronunciation variation example has a high appearance probability.
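
The claims say the probability rule depends on the magnitude of the weight at which an example was observed, but give no formula. A minimal sketch under one possible reading follows: each example contributes its weight as a soft count, so a variation that survives a stronger language model constraint counts for more. This weighting scheme, the normalization per base portion, and the function name are all assumptions; unchanged (non-varying) occurrences are ignored for brevity.

```python
from collections import defaultdict

def estimate_variation_rules(examples):
    """Map (base portion, variant) to a weight-scaled relative frequency.

    examples: iterable of (base portion, variant, weight) with positive
    weights. Using the weight itself as a soft count is an assumption.
    """
    per_pair = defaultdict(float)
    per_base = defaultdict(float)
    for base, variant, weight in examples:
        per_pair[(base, variant)] += weight
        per_base[base] += weight
    return {pair: total / per_base[pair[0]] for pair, total in per_pair.items()}

rules = estimate_variation_rules([
    ("i", "e", 3.0),  # observed even under a strong constraint
    ("i", "e", 1.0),
    ("i", "u", 1.0),
    ("u", "o", 2.0),
])
for (base, variant), p in sorted(rules.items()):
    print(f"{base} -> {variant}: {p:.2f}")
```

Under this reading, "i" to "e" ends up more probable than "i" to "u" both because it was seen more often and because one sighting occurred at a larger weight.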

Current Assignee: NEC Corporation
Original Assignee: NEC Corporation
Inventors: Koshinaka, Takafumi
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Sirjani, Fariba
Application Number: US12/747,961
Publication Number: US 20100268535A1
Time in Patent Office: 1,825 Days
Field of Search: 704231-257
US Class Current: 704/236
CPC Class Codes:
G10L 15/06 Creation of reference templ...
G10L 15/187 Phonemic context, e.g. pron...
