ADAPTIVE TEXT-TO-SPEECH OUTPUTS
Abstract
In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
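The four steps named in the abstract can be read as a simple pipeline. The following is a minimal sketch under stated assumptions, not the patented implementation: `respond`, `TtsResponse`, and the three `fake_*` stand-ins are hypothetical names introduced here for illustration, and the "synthesizer" merely encodes the text as bytes rather than producing real audio.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TtsResponse:
    text: str     # text segment chosen for this user
    audio: bytes  # synthesized utterance of that text segment


def respond(
    user_id: str,
    estimate_proficiency: Callable[[str], float],
    choose_text: Callable[[float], str],
    synthesize: Callable[[str], bytes],
) -> TtsResponse:
    """Run the abstract's steps in order; each step is injected as a callable."""
    proficiency = estimate_proficiency(user_id)  # 1. determine language proficiency
    text = choose_text(proficiency)              # 2. determine the text segment
    audio = synthesize(text)                     # 3. generate audio data
    return TtsResponse(text=text, audio=audio)   # 4. result to send to the client device


# Hypothetical stand-ins so the sketch runs without a real TTS engine.
def fake_proficiency(user_id: str) -> float:
    return 0.2 if user_id == "beginner" else 0.9


def fake_choose(proficiency: float) -> str:
    if proficiency < 0.5:
        return "Rain today."
    return "Expect intermittent showers this afternoon."


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")  # placeholder for real synthesized audio


response = respond("beginner", fake_proficiency, fake_choose, fake_synthesize)
```

Passing the steps in as callables keeps the ordering explicit while leaving open how proficiency is estimated and how speech is synthesized, which the abstract does not specify.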
27 Claims
1. A method performed by one or more computers, the method comprising:
determining, by the one or more computers, a language proficiency of a user of a client device;
determining, by the one or more computers, a text segment for output by a text-to-speech module based on the determined language proficiency of the user, wherein determining the text segment comprises:
selecting, from among multiple text segments that each have a language complexity score that indicates a different level of language complexity, the text segment having the language complexity score that best matches a reference score that describes the determined language proficiency of the user of the client device; or
modifying a particular text segment for the text-to-speech output to the user based at least on (i) the determined language proficiency of the user and (ii) a complexity score of the particular text segment;
generating, by the one or more computers, audio data comprising a synthesized utterance of the text segment; and
providing, by the one or more computers and to the client device, the audio data comprising the synthesized utterance of the text segment.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
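The "selecting" branch of claim 1 picks, from several candidate text segments, the one whose complexity score best matches a reference score for the user. One way to read "best matches" is smallest absolute difference; that reading, the 0.0 to 1.0 score scale, and the names `TextSegment` and `select_segment` are assumptions made for this sketch and are not taken from the claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSegment:
    text: str
    complexity: float  # hypothetical scale: 0.0 (simplest) to 1.0 (most complex)


def select_segment(candidates: list[TextSegment], reference_score: float) -> TextSegment:
    """Return the candidate whose complexity is closest to the reference score.

    Ties go to the earlier candidate, because min() keeps the first minimum.
    """
    if not candidates:
        raise ValueError("at least one candidate text segment is required")
    return min(candidates, key=lambda seg: abs(seg.complexity - reference_score))


# Three phrasings of the same message at different complexity levels.
candidates = [
    TextSegment("Rain today.", 0.1),
    TextSegment("It will rain this afternoon.", 0.4),
    TextSegment("Expect intermittent showers throughout the afternoon.", 0.8),
]

chosen = select_segment(candidates, reference_score=0.35)
```

With a reference score of 0.35, the middle phrasing (score 0.4) is the nearest candidate and is the one selected.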
9. A system comprising:
one or more computers; and
a non-transitory computer-readable medium coupled to the one or more computers having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
determining, by the one or more computers, a language proficiency of a user of a client device;
determining, by the one or more computers, a text segment for output by a text-to-speech module based on the determined language proficiency of the user, wherein determining the text segment comprises:
selecting, from among multiple text segments that each have a language complexity score that indicates a different level of language complexity, the text segment having the language complexity score that best matches a reference score that describes the determined language proficiency of the user of the client device; or
modifying a particular text segment for the text-to-speech output to the user based at least on (i) the determined language proficiency of the user and (ii) a complexity score of the particular text segment;
generating, by the one or more computers, audio data comprising a synthesized utterance of the text segment; and
providing, by the one or more computers and to the client device, the audio data comprising the synthesized utterance of the text segment.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A method performed by one or more computers, the method comprising:
receiving data indicating a context associated with the user;
determining an overall complexity score for the context associated with the user;
identifying a text segment for a text-to-speech output to the user;
determining that a complexity score of the text segment exceeds the overall complexity score for the context associated with the user; and
modifying the text segment to reduce the complexity score below the overall complexity score for the context associated with the user.
Dependent claims: 17, 18, 19, 20, 21, 23
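The last two steps of claim 16 (detect that a segment's complexity exceeds the context's overall score, then modify the segment until it falls below that score) can be sketched as follows. Everything concrete here is an assumption for illustration: the claim does not define how complexity is measured, so mean word length stands in for it, and the `SIMPLER` substitution table and the function names are hypothetical. How the overall score is derived from context data is not shown; it is passed in as `context_score`.

```python
PUNCT = ".,;:!?"

# Hypothetical substitution table; a real system would need a far richer resource.
SIMPLER = {
    "intermittent": "some",
    "precipitation": "rain",
    "anticipated": "likely",
}


def complexity_score(text: str) -> float:
    """Toy complexity measure (an assumption): mean word length in characters."""
    words = [w.strip(PUNCT) for w in text.split()]
    words = [w for w in words if w]
    return sum(len(w) for w in words) / len(words) if words else 0.0


def reduce_complexity(text: str, context_score: float) -> str:
    """Swap in simpler words, longest first, until the score drops below the limit.

    Best effort only: if the table runs out of substitutions, the result
    may still score at or above context_score.
    """
    if complexity_score(text) <= context_score:
        return text  # does not exceed the context's score; leave unchanged
    words = text.split()
    longest_first = sorted(range(len(words)), key=lambda i: len(words[i]), reverse=True)
    for i in longest_first:
        core = words[i].strip(PUNCT)
        simpler = SIMPLER.get(core.lower())
        if simpler is None:
            continue
        if core[0].isupper():
            simpler = simpler.capitalize()  # preserve sentence-initial capitals
        words[i] = words[i].replace(core, simpler)
        if complexity_score(" ".join(words)) < context_score:
            break
    return " ".join(words)


original = "Intermittent precipitation is anticipated this afternoon."
simplified = reduce_complexity(original, context_score=5.0)
```

In this example the original sentence scores 8.5 on the toy measure, and three substitutions bring it to roughly 4.83, below the assumed context score of 5.0.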
22. A system comprising:
one or more computers; and
a non-transitory computer-readable medium coupled to the one or more computers having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
receiving data indicating a context associated with the user;
determining an overall complexity score for the context associated with the user;
identifying a text segment for a text-to-speech output to the user;
determining that a complexity score of the text segment exceeds the overall complexity score for the context associated with the user; and
modifying the text segment to reduce the complexity score below the overall complexity score for the context associated with the user.
Dependent claims: 24, 25, 26, 27
Specification