IDENTIFYING SUBSTITUTE PRONUNCIATIONS
First Claim
1. A computer-implemented method comprising:
selecting one or more terms;
obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting terms; obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the terms; receiving audio data corresponding to a particular user speaking the terms in the natural language; obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the terms in the natural language; aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user; identifying, based on the aligning, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
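The align, identify, and designate steps can be illustrated with a minimal sketch. It assumes each phonetic transcription is already available as a list of phoneme symbols, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner; the source does not state which alignment technique is used. The function name `find_substitutes` and the phoneme symbols in the example are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def find_substitutes(expected, actual):
    """Align two phoneme sequences and return the portions that differ.

    Each result is a pair (actual_portion, expected_portion): the expected
    portion is the candidate substitute pronunciation for the actual portion.
    SequenceMatcher is an assumed stand-in for the alignment step.
    """
    matcher = SequenceMatcher(None, actual, expected, autojunk=False)
    substitutes = []
    for tag, a1, a2, e1, e2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # aligned portions that match need no substitute
        substitutes.append((tuple(actual[a1:a2]), tuple(expected[e1:e2])))
    return substitutes


# Hypothetical phoneme symbols, for illustration only.
expected = ["th", "ih", "ng", "k"]  # idealized native speaker
actual = ["s", "ih", "ng", "k"]     # particular user

print(find_substitutes(expected, actual))
# prints [(('s',), ('th',))]
```

An empty tuple on either side of a pair marks a phoneme the user dropped or added, so the same structure covers insertions and deletions as well as substitutions.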
23 Claims
1. A computer-implemented method comprising:
selecting one or more terms;
obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
Dependent claims: 2 through 21.
22. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
selecting one or more terms;
obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
23. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
selecting one or more terms;
obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the one or more terms;
receiving audio data corresponding to a particular user speaking the one or more terms in the natural language;
obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the one or more terms in the natural language;
aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user;
identifying, based on aligning the expected phonetic transcription of the idealized native speaker with the actual phonetic transcription of the particular user, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and
based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
Specification