Pronunciation variation rule extraction apparatus, pronunciation variation rule extraction method, and pronunciation variation rule extraction program
Abstract
A problem to be solved is to robustly detect a pronunciation variation example and acquire a pronunciation variation rule having a high generalization property, with less effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model. The difference extraction unit extracts a difference between a recognition result outputted from the speech recognition unit and the base form pronunciation data by comparing the recognition result and the base form pronunciation data.
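The comparison performed by the difference extraction unit can be illustrated with a minimal sketch. It assumes that both the base form pronunciation and the recognition result are available as lists of sub-word symbols (phoneme-like strings), and it uses Python's standard difflib alignment as a stand-in for whatever alignment the apparatus actually uses. The function name and the sample sequences are hypothetical.

```python
from difflib import SequenceMatcher


def extract_differences(base_form, recognized):
    """Return (base_segment, recognized_segment) pairs where the two
    sub-word sequences disagree. Equal stretches are skipped."""
    # autojunk=False keeps difflib from discarding frequent symbols.
    matcher = SequenceMatcher(a=base_form, b=recognized, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (tuple(base_form[i1:i2]), tuple(recognized[j1:j2]))
            )
    return differences


# Hypothetical example: two sub-words are dropped in the spoken form.
base = ["k", "o", "N", "n", "i", "ch", "i", "w", "a"]
hyp = ["k", "o", "N", "ch", "i", "w", "a"]
differences = extract_differences(base, hyp)
print(differences)
```

Each returned pair is a candidate pronunciation variation example: the left side is what the base form expects, the right side is what was recognized in its place (empty for a deletion).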
16 Claims
1. A pronunciation variation rule extraction apparatus comprising:
a speech data storage unit which stores speech data;
a base form pronunciation storage unit which stores base form pronunciation data representing base form pronunciation of said speech data;
a sub word language model generation unit which generates a sub word language model from said base form pronunciation data;
a speech recognition unit which recognizes said speech data by using said sub word language model;
a difference extraction unit which extracts a difference between a recognition result outputted from said speech recognition unit and said base form pronunciation data by comparing said recognition result and said base form pronunciation data; and
a language model weight control unit which controls one weight value for said sub word language model,
wherein said language model weight control unit outputs a plurality of said weight values,
wherein said speech recognition unit recognizes said speech data for each of said plurality of weight values, and
wherein said language model weight control unit determines whether or not said weight value should be updated, based on said difference when said difference is extracted.
Dependent claims: 2, 3, 4, 5, 6, 7
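As an illustration of the sub word language model generation unit recited in claim 1, the following sketch builds a bigram model over sub-word symbols from base form pronunciation data. The bigram order, the add-one smoothing, and the function name are assumptions made for this example; the claim itself does not fix the model type.

```python
import math
from collections import Counter


def train_bigram_lm(pronunciations):
    """Build an add-one-smoothed bigram model over sub-word symbols.

    `pronunciations` is an iterable of sub-word sequences (base forms).
    Returns a function log_prob(prev, cur) giving log P(cur | prev).
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}  # symbols that can be predicted
    for phones in pronunciations:
        seq = ["<s>"] + list(phones) + ["</s>"]
        vocab.update(phones)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    vocab_size = len(vocab)

    def log_prob(prev, cur):
        numerator = bigram_counts[(prev, cur)] + 1
        denominator = context_counts[prev] + vocab_size
        return math.log(numerator / denominator)

    return log_prob


# Hypothetical base form data: two short sub-word sequences.
lm = train_bigram_lm([["a", "b"], ["a", "c"]])
print(math.exp(lm("<s>", "a")), math.exp(lm("a", "b")))
```

Because the model is trained only on base form pronunciations, it favors the base form sequence during recognition; how strongly it does so is what the language model weight controls.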
8. A pronunciation variation rule extraction method comprising:
storing base form pronunciation data representing base form pronunciation of speech data;
generating a sub word language model from said base form pronunciation data;
recognizing said speech data by using said sub word language model;
extracting a difference between a recognition result of said recognizing and said base form pronunciation data by comparing said recognition result and said base form pronunciation data; and
controlling one weight value for said sub word language model,
wherein said controlling includes outputting a plurality of said weight values,
wherein said recognizing includes recognizing said speech data for each of said plurality of weight values, and
wherein said controlling further includes determining whether or not said weight value should be updated, based on said difference when said difference is extracted.
Dependent claims: 9, 10, 11, 12
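The controlling step of the method can be sketched as a loop over candidate weight values. The sketch below is illustrative only: the recognizer is a hypothetical stub that returns canned hypotheses, and the stopping rule (stop updating the weight once the mismatch ratio exceeds a threshold) is an assumption, since the claim only states that the update decision is based on the extracted difference.

```python
from difflib import SequenceMatcher


def mismatch_count(base_form, hypothesis):
    """Count sub-words not covered by matching blocks of the alignment."""
    matcher = SequenceMatcher(a=base_form, b=hypothesis, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return max(len(base_form), len(hypothesis)) - matched


def sweep_weights(recognize, base_form, weights, max_mismatch_ratio=0.3):
    """Recognize once per weight value and decide after each run whether
    to move on to the next weight (assumed rule: stop when the result
    drifts too far from the base form)."""
    results = []
    for weight in weights:
        hypothesis = recognize(weight)
        ratio = mismatch_count(base_form, hypothesis) / max(len(base_form), 1)
        results.append((weight, hypothesis, ratio))
        if ratio > max_mismatch_ratio:
            break  # do not update the weight any further
    return results


BASE = ["a", "r", "i", "g", "a", "t", "o", "o"]


def toy_recognize(weight):
    """Hypothetical stand-in for a recognizer: a smaller language model
    weight lets the output deviate more from the base form."""
    if weight >= 8:
        return list(BASE)
    if weight >= 4:
        return ["a", "r", "i", "g", "a", "t", "o"]
    return ["a", "i", "a", "t"]


results = sweep_weights(toy_recognize, BASE, [10, 6, 2, 1])
for weight, hypothesis, ratio in results:
    print(weight, hypothesis, ratio)
```

In this toy run the last candidate weight is never tried, because the preceding result already differs from the base form by more than the assumed threshold.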
13. A non-transitory computer-readable recording medium which records a pronunciation variation rule extraction program which causes a computer to function as:
a speech data storage unit which stores speech data;
a base form pronunciation storage unit which stores base form pronunciation data representing base form pronunciation of said speech data;
a sub word language model generation unit which generates a sub word language model from said base form pronunciation data;
a speech recognition unit which recognizes said speech data by using said sub word language model;
a difference extraction unit which extracts a difference between a recognition result outputted from said speech recognition unit and said base form pronunciation data by comparing said recognition result and said base form pronunciation data; and
a language model weight control unit which controls one weight value for said sub word language model,
wherein said language model weight control unit outputs a plurality of said weight values,
wherein said speech recognition unit recognizes said speech data for each of said plurality of weight values, and
wherein said language model weight control unit determines whether or not said weight value should be updated, based on said difference when said difference is extracted.
Dependent claims: 14, 15, 16
Specification