Method and apparatus for improved speech recognition by modifying a pronunciation dictionary based on pattern definitions of alternate word pronunciations
Abstract
An approach for automatically modifying a pronunciation dictionary in a speech recognition system based on patterns of alternate pronunciations is described. A representation of the pronunciation dictionary, such as a plurality of dynamically linked phoneme values, is obtained. One or more pattern definitions are obtained. The pattern definitions specify zero or more phonemes to be substituted for zero or more phonemes of all words in the pronunciation dictionary. The linked phoneme values are modified by adding, for each path of each word, alternate paths that use each of the substitute phonemes according to the pattern definitions, thereby creating an expanded set of dynamically linked phoneme values. One or more example pronunciations of a particular word are then obtained. One or more best paths through the expanded set of phoneme values are determined for each of the example pronunciations and used to find the overall best path(s). For the overall best path(s), an alternate word pronunciation is constructed by converting each path into a pronunciation using the format of the pronunciation dictionary. The pronunciation dictionary is modified by adding each alternate word pronunciation. As a result, a modified pronunciation dictionary is created that accounts for alternate pronunciations as actually spoken by users of a particular speech recognition application.
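The expansion step described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the pattern format (a pair of phoneme tuples), the ARPAbet-style phoneme symbols, and the names `PATTERNS`, `alternatives_at`, and `expand` are all assumptions made for the example. It works on a flat phoneme sequence rather than on dynamically linked phoneme values, and it handles substitution and deletion (an empty substitute) but not insertion (an empty match).

```python
# Hypothetical pattern format: (phonemes to match, phonemes to substitute).
# An empty tuple on the right-hand side deletes the matched phonemes.
PATTERNS = [
    (("t",), ("d",)),        # "t" may be spoken as "d"
    (("ah", "n"), ("n",)),   # "ah n" may be reduced to "n"
]

def alternatives_at(phonemes, i, patterns):
    """Return (replacement, phonemes_consumed) options starting at position i."""
    options = [((phonemes[i],), 1)]  # keeping the original phoneme is always allowed
    for match, substitute in patterns:
        if match and tuple(phonemes[i:i + len(match)]) == match:
            options.append((substitute, len(match)))
    return options

def expand(phonemes, patterns):
    """Enumerate every pronunciation reachable by optionally applying each pattern."""
    phonemes = tuple(phonemes)
    results = set()

    def walk(i, prefix):
        if i == len(phonemes):
            results.add(prefix)
            return
        for replacement, consumed in alternatives_at(phonemes, i, patterns):
            walk(i + consumed, prefix + replacement)

    walk(0, ())
    return results

# Example: an illustrative pronunciation of "button".
expanded = expand(("b", "ah", "t", "ah", "n"), PATTERNS)
```

Each element of `expanded` corresponds to one path through the expanded set of phoneme values; the original pronunciation is always among them, so the expansion only adds candidates and never removes the existing entry.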
20 Claims
1. A method for modifying a pronunciation dictionary in a speech recognition system to include one or more alternate pronunciations, wherein each pronunciation in the pronunciation dictionary is represented as one or more dynamically linked phoneme values, the method comprising the computer-implemented steps of:
receiving one or more phoneme substitution patterns that define one or more substitute phoneme values for one or more of the dynamically linked phoneme values;
adding the substitute phoneme values and new links therefor to the dynamically linked phoneme values;
determining one or more best paths among the dynamically linked phoneme values for a particular word under consideration; and
modifying the pronunciation dictionary by adding a pronunciation that represents each of the best paths.
7. A method for modifying a pronunciation dictionary in a speech recognition system to include one or more alternate pronunciations, wherein each pronunciation in the pronunciation dictionary is represented during use of the speech recognition system as one or more paths among nodes in a dynamically linked phoneme network, wherein phonemes in each pronunciation are represented by arcs between nodes in the network, the method comprising the computer-implemented steps of:
receiving one or more phoneme substitution patterns that define one or more substitute values for one or more of the phonemes in the network;
adding the substitute values and new arcs therefor to the network;
determining one or more best paths through the network for a particular word under consideration; and
modifying the pronunciation dictionary by adding a pronunciation that represents each of the best paths.
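One way to picture the network recited in claim 7 is as a directed graph whose arcs carry phonemes. The sketch below is an assumption-laden illustration rather than the claimed apparatus: the class `PhonemeNetwork`, its methods, the integer node numbering, and the one-for-one substitution table are all hypothetical, and the phoneme symbols are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class PhonemeNetwork:
    """Directed acyclic graph: nodes are integers, each arc carries one phoneme."""
    start: int
    end: int
    arcs: dict = field(default_factory=dict)  # node -> list of (phoneme, next_node)

    def add_arc(self, src, phoneme, dst):
        self.arcs.setdefault(src, []).append((phoneme, dst))

    @classmethod
    def from_pronunciation(cls, phonemes):
        """Build a single-path network: one arc per phoneme, in order."""
        net = cls(start=0, end=len(phonemes))
        for i, phoneme in enumerate(phonemes):
            net.add_arc(i, phoneme, i + 1)
        return net

    def add_substitutes(self, substitutions):
        """For every existing arc labeled p, add a parallel arc per substitute of p."""
        for outgoing in self.arcs.values():
            for phoneme, dst in list(outgoing):  # snapshot: new arcs are not re-expanded
                for sub in substitutions.get(phoneme, ()):
                    if (sub, dst) not in outgoing:
                        outgoing.append((sub, dst))

    def paths(self):
        """Yield every phoneme sequence that leads from start to end."""
        def walk(node, prefix):
            if node == self.end:
                yield prefix
                return
            for phoneme, dst in self.arcs.get(node, []):
                yield from walk(dst, prefix + (phoneme,))
        yield from walk(self.start, ())

# Example: an illustrative pronunciation of "tomato" with two substitution rules.
net = PhonemeNetwork.from_pronunciation(["t", "ah", "m", "ey", "t", "ow"])
net.add_substitutes({"ey": ["aa"], "t": ["d"]})
all_paths = set(net.paths())
```

Adding parallel arcs instead of copying whole pronunciations keeps the network compact: the two rules here add three arcs, yet the network now encodes eight distinct paths.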
13. A method for modifying a pronunciation dictionary in a speech recognition system to include one or more alternate pronunciations, wherein each pronunciation in the pronunciation dictionary is represented during use of the speech recognition system as one or more paths among nodes in a dynamically linked phoneme network, wherein phonemes in each pronunciation are represented by arcs between nodes in the network, the method comprising the computer-implemented steps of:
receiving one or more phoneme substitution patterns that define one or more substitute values for one or more of the phonemes in the network;
adding the substitute values and new arcs therefor to the network;
receiving a plurality of example pronunciations for a particular word under consideration;
determining one or more best paths through the network for the word under consideration by carrying out recognition of the example pronunciations using the network; and
modifying the pronunciation dictionary by adding a pronunciation that represents each of the best paths.
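The selection over several example pronunciations in claim 13 can be sketched as a vote. Several assumptions apply: a real system would run acoustic recognition of each spoken example against the network, whereas this sketch assumes each example is already available as a phoneme sequence and uses edit distance as a stand-in for the recognizer's score. The function names and the space-separated dictionary format are hypothetical.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # match or substitute
        prev = curr
    return prev[-1]

def best_paths(candidates, examples):
    """Find the closest candidate per example, then return the most-voted candidate(s)."""
    ordered = sorted(candidates)  # fixed order makes tie-breaking deterministic
    votes = Counter()
    for example in examples:
        closest = min(ordered, key=lambda c: edit_distance(c, example))
        votes[closest] += 1
    if not votes:
        return []
    top = max(votes.values())
    return [path for path, count in votes.items() if count == top]

def add_pronunciations(dictionary, word, paths):
    """Append each path to the word's entry, in an assumed space-separated format."""
    entries = dictionary.setdefault(word, [])
    for path in paths:
        pronunciation = " ".join(path)
        if pronunciation not in entries:
            entries.append(pronunciation)

# Example: two candidate paths and three example pronunciations of "tomato".
candidates = [("t", "ah", "m", "ey", "t", "ow"), ("t", "ah", "m", "aa", "t", "ow")]
examples = [
    ("t", "ah", "m", "aa", "t", "ow"),
    ("t", "ah", "m", "aa", "d", "ow"),
    ("t", "ah", "m", "ey", "t", "ow"),
]
dictionary = {"tomato": ["t ah m ey t ow"]}
winners = best_paths(candidates, examples)
add_pronunciations(dictionary, "tomato", winners)
```

Voting across examples means a single unusual utterance does not by itself introduce a new dictionary entry; only the path preferred by the most examples is added.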
14. A computer-readable medium carrying one or more sequences of one or more instructions for modifying a pronunciation dictionary in a speech recognition system to include one or more alternate pronunciations, wherein each pronunciation in the pronunciation dictionary is represented as one or more dynamically linked phoneme values, the one or more sequences of one or more instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of:
receiving one or more phoneme substitution patterns that define one or more substitute phoneme values for one or more of the dynamically linked phoneme values;
adding the substitute phoneme values and new links therefor to the dynamically linked phoneme values;
determining one or more best paths among the dynamically linked phoneme values for a particular word under consideration; and
modifying the pronunciation dictionary by adding a pronunciation that represents each of the best paths.
20. A speech recognition apparatus comprising:
a storage medium having a pronunciation dictionary stored thereon, wherein each pronunciation in the pronunciation dictionary is represented as one or more dynamically linked phoneme values; and
an expansion mechanism and a best path mechanism communicatively coupled to the storage medium, wherein the expansion mechanism and the best path mechanism are configured to modify a pronunciation dictionary in a speech recognition system to include one or more alternate pronunciations by:
receiving one or more phoneme substitution patterns that define one or more substitute phoneme values for one or more of the dynamically linked phoneme values;
adding the substitute phoneme values and new links therefor to the dynamically linked phoneme values;
determining one or more best paths among the dynamically linked phoneme values for a particular word under consideration; and
modifying the pronunciation dictionary by adding a pronunciation that represents each of the best paths.
Specification