System and method for user-specified pronunciation of words for speech synthesis and recognition
First Claim
1. A method for learning word pronunciations, comprising:
- at an electronic device with one or more processors and memory storing one or more programs for execution by the one or more processors:
receiving a first speech input including at least one word;
determining a first phonetic representation of the at least one word, the first phonetic representation comprising a first set of phonemes selected from a speech recognition phonetic alphabet;
mapping the first set of phonemes to a second set of phonemes to generate a second phonetic representation, the second set of phonemes selected from a speech synthesis phonetic alphabet that is different from the speech recognition phonetic alphabet, wherein the speech recognition phonetic alphabet and the speech synthesis phonetic alphabet are phonetic alphabets of a same language; and
storing the second phonetic representation in association with a text string corresponding to the at least one word.
Abstract
The method is performed at an electronic device with one or more processors and memory storing one or more programs for execution by the one or more processors. A first speech input including at least one word is received. A first phonetic representation of the at least one word is determined, the first phonetic representation comprising a first set of phonemes selected from a speech recognition phonetic alphabet. The first set of phonemes is mapped to a second set of phonemes to generate a second phonetic representation, where the second set of phonemes is selected from a speech synthesis phonetic alphabet. The second phonetic representation is stored in association with a text string corresponding to the at least one word.
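The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the phoneme symbols, the `RECOGNITION_TO_SYNTHESIS` table, the `PronunciationLexicon` class, and the function names are all hypothetical, and the speech recognizer itself is left out, so the first phonetic representation is passed in as a ready-made list.

```python
from typing import Dict, List, Optional

# Hypothetical symbol table (an assumption for illustration): each phoneme of
# the recognition alphabet is paired with a phoneme of a different synthesis
# alphabet for the same language.
RECOGNITION_TO_SYNTHESIS: Dict[str, str] = {
    "f": "F",
    "ih": "IH",
    "l": "L",
    "iy": "IY",
    "p": "P",
}


def map_phonemes(recognition_phonemes: List[str]) -> List[str]:
    """Map a first set of phonemes (recognition alphabet) to a second set
    (synthesis alphabet), failing loudly on symbols with no known mapping."""
    mapped = []
    for phoneme in recognition_phonemes:
        if phoneme not in RECOGNITION_TO_SYNTHESIS:
            raise KeyError(f"no synthesis mapping for {phoneme!r}")
        mapped.append(RECOGNITION_TO_SYNTHESIS[phoneme])
    return mapped


class PronunciationLexicon:
    """Stores synthesis-alphabet pronunciations keyed by a text string."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[str]] = {}

    def store(self, text: str, synthesis_phonemes: List[str]) -> None:
        # Case-insensitive keys are a design choice of this sketch only.
        self._entries[text.lower()] = list(synthesis_phonemes)

    def lookup(self, text: str) -> Optional[List[str]]:
        return self._entries.get(text.lower())


def learn_pronunciation(
    text: str, recognition_phonemes: List[str], lexicon: PronunciationLexicon
) -> List[str]:
    """Map the recognized phonemes and store the result with the text string."""
    synthesis_phonemes = map_phonemes(recognition_phonemes)
    lexicon.store(text, synthesis_phonemes)
    return synthesis_phonemes


lexicon = PronunciationLexicon()
learn_pronunciation("Philippe", ["f", "ih", "l", "iy", "p"], lexicon)
```

A speech synthesizer could later call `lexicon.lookup("Philippe")` to retrieve the stored synthesis-alphabet phonemes instead of deriving a pronunciation from the spelling.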
3826 Citations
23 Claims
1. A method for learning word pronunciations, comprising:
- at an electronic device with one or more processors and memory storing one or more programs for execution by the one or more processors:
receiving a first speech input including at least one word;
determining a first phonetic representation of the at least one word, the first phonetic representation comprising a first set of phonemes selected from a speech recognition phonetic alphabet;
mapping the first set of phonemes to a second set of phonemes to generate a second phonetic representation, the second set of phonemes selected from a speech synthesis phonetic alphabet that is different from the speech recognition phonetic alphabet, wherein the speech recognition phonetic alphabet and the speech synthesis phonetic alphabet are phonetic alphabets of a same language; and
storing the second phonetic representation in association with a text string corresponding to the at least one word.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by an electronic device with a display, cause the device to perform:
receiving a first speech input including at least one word;
determining a first phonetic representation of the at least one word, the first phonetic representation comprising a first set of phonemes selected from a speech recognition phonetic alphabet;
mapping the first set of phonemes to a second set of phonemes to generate a second phonetic representation, the second set of phonemes selected from a speech synthesis phonetic alphabet that is different from the speech recognition phonetic alphabet, wherein the speech recognition phonetic alphabet and the speech synthesis phonetic alphabet are phonetic alphabets of a same language; and
storing the second phonetic representation in association with a text string corresponding to the at least one word.
Dependent claims: 11, 12, 13, 14, 15, 16
17. An electronic device, comprising:
one or more processors;
memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing:
receiving a first speech input including at least one word;
determining a first phonetic representation of the at least one word, the first phonetic representation comprising a first set of phonemes selected from a speech recognition phonetic alphabet;
mapping the first set of phonemes to a second set of phonemes to generate a second phonetic representation, the second set of phonemes selected from a speech synthesis phonetic alphabet that is different from the speech recognition phonetic alphabet, wherein the speech recognition phonetic alphabet and the speech synthesis phonetic alphabet are phonetic alphabets of a same language; and
storing the second phonetic representation in association with a text string corresponding to the at least one word.
Dependent claims: 18, 19, 20, 21, 22, 23
Specification