PREDICTING PRONUNCIATIONS WITH WORD STRESS
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating word pronunciations. One of the methods includes determining, by one or more computers, spelling data that indicates the spelling of a word, providing the spelling data as input to a trained recurrent neural network, the trained recurrent neural network being trained to indicate characteristics of word pronunciations based at least on data indicating the spelling of words, receiving output indicating a stress pattern for pronunciation of the word generated by the trained recurrent neural network in response to providing the spelling data as input, using the output of the trained recurrent neural network to generate pronunciation data indicating the stress pattern for a pronunciation of the word, and providing, by the one or more computers, the pronunciation data to a text-to-speech system or an automatic speech recognition system.
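The flow of data through the method summarized above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the letter encoding, the label set (`-` for "no syllable nucleus at this letter", `0`/`1`/`2` for unstressed, primary, and secondary stress), the network size, and the dictionary used as pronunciation data. In particular, the weights are random, so the network is only a stand-in for a trained recurrent neural network and its output carries no linguistic meaning.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
LABELS = ["-", "0", "1", "2"]  # hypothetical label set, not taken from the patent
HIDDEN = 8                     # hypothetical hidden-state size


def spelling_data(word):
    """Encode the spelling as a sequence of letter indices (assumed encoding)."""
    return [ALPHABET.index(ch) for ch in word.lower() if ch in ALPHABET]


def make_untrained_rnn(seed=0):
    """Stand-in for the trained network: random weights, NOT a real model."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": rand(HIDDEN, len(ALPHABET)),
        "W_rec": rand(HIDDEN, HIDDEN),
        "W_out": rand(len(LABELS), HIDDEN),
    }


def run_rnn(rnn, letters):
    """Feed letters one at a time; emit the highest-scoring label at each step."""
    h = [0.0] * HIDDEN
    labels = []
    for idx in letters:
        # A one-hot input times W_in reduces to selecting column idx of W_in.
        h = [
            math.tanh(
                rnn["W_in"][i][idx]
                + sum(rnn["W_rec"][i][j] * h[j] for j in range(HIDDEN))
            )
            for i in range(HIDDEN)
        ]
        scores = [
            sum(rnn["W_out"][k][j] * h[j] for j in range(HIDDEN))
            for k in range(len(LABELS))
        ]
        labels.append(LABELS[scores.index(max(scores))])
    return labels


def pronunciation_data(word, labels):
    """Package the stress pattern for a downstream TTS or ASR system."""
    return {"word": word, "stress_pattern": "".join(l for l in labels if l != "-")}


rnn = make_untrained_rnn()
word = "hello"
data = pronunciation_data(word, run_rnn(rnn, spelling_data(word)))
print(data)  # print stands in for handing the data to a TTS or ASR system
```

With a network actually trained on spelling and stress data, the same four stages (encode the spelling, run the network, read the stress output, package it as pronunciation data) would yield a usable stress pattern; here they only show how the pieces connect.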
20 Claims
1. A method performed by one or more computers, the method comprising:
determining, by the one or more computers, spelling data that indicates the spelling of a word;
providing, by the one or more computers, the spelling data as input to a trained recurrent neural network, the trained recurrent neural network being trained to indicate characteristics of word pronunciations based at least on data indicating the spelling of words;
receiving, by the one or more computers, output indicating a stress pattern for pronunciation of the word generated by the trained recurrent neural network in response to providing the spelling data as input;
using, by the one or more computers, the output of the trained recurrent neural network to generate pronunciation data indicating the stress pattern for a pronunciation of the word; and
providing, by the one or more computers, the pronunciation data to a text-to-speech system or an automatic speech recognition system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
12. A system comprising:
a data processing apparatus; and
a non-transitory computer readable storage medium in data communication with the data processing apparatus and storing instructions executable by the data processing apparatus and upon such execution cause the data processing apparatus to perform operations comprising:
determining spelling data that indicates the spelling of a word;
providing the spelling data as input to a trained recurrent neural network, the trained recurrent neural network being trained to indicate characteristics of word pronunciations based at least on data indicating the spelling of words;
receiving output indicating a stress pattern for pronunciation of the word generated by the trained recurrent neural network in response to providing the spelling data as input;
using the output of the trained recurrent neural network to generate pronunciation data indicating the stress pattern for a pronunciation of the word; and
providing the pronunciation data to a text-to-speech system or an automatic speech recognition system.
Dependent claims: 13, 14, 15, 16, 17, 18, 19.
20. A non-transitory computer readable storage medium storing instructions executable by a data processing apparatus and upon such execution cause the data processing apparatus to perform operations comprising:
determining spelling data that indicates the spelling of a word;
providing the spelling data as input to a trained recurrent neural network, the trained recurrent neural network being trained to indicate characteristics of word pronunciations based at least on data indicating the spelling of words;
receiving output indicating a stress pattern for pronunciation of the word generated by the trained recurrent neural network in response to providing the spelling data as input;
using the output of the trained recurrent neural network to generate pronunciation data indicating the stress pattern for a pronunciation of the word; and
providing the pronunciation data to a text-to-speech system or an automatic speech recognition system.
Specification