ADAPTIVE TEXT-TO-SPEECH OUTPUTS

US 20170221471A1
Filed: 01/28/2016
Published: 08/03/2017
Est. Priority Date: 01/28/2016
Status: Active Grant

Abstract

In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.

27 Claims

1. A method performed by one or more computers, the method comprising:
- determining, by the one or more computers, a language proficiency of a user of a client device;
  
  determining, by the one or more computers, a text segment for output by a text-to-speech module based on the determined language proficiency of the user, wherein determining the text segment comprises:
  
  selecting, from among multiple text segments that each have a language complexity score that indicates a different level of language complexity, the text segment having the language complexity score that best matches a reference score that describes the determined language proficiency of the user of the client device;
  
  or modifying a particular text segment for the text-to-speech output to the user based at least on (i) the determined language proficiency of the user and (ii) a complexity score of the particular text segment;
  
  generating, by the one or more computers, audio data comprising a synthesized utterance of the text segment; and
  
  providing, by the one or more computers and to the client device, the audio data comprising the synthesized utterance of the text segment.
- - 2. The method of claim 1, wherein the client device displays a mobile application that uses a text-to-speech interface.
  - 3. The method of claim 1, wherein determining the language proficiency of the user comprises inferring a language proficiency of the user based at least on previous queries submitted by the user.
  - 4. The method of claim 1, wherein determining the text segment for output by the text-to-speech module comprises:
    - identifying multiple text segments as candidates for a text-to-speech output of the user, the multiple text segments having different levels of language complexity; and
      
      selecting from among the multiple text segments based at least on the determined language proficiency of the user of the client device.
  - 5. The method of claim 4, wherein selecting from among the multiple text segments comprises:
    - determining a language complexity score for each of the multiple text segments; and
      
      selecting the text segment having the language complexity score that best matches a reference score that describes the language proficiency of the user of the client device.
  - 6. The method of claim 1, wherein determining the text segment for output by the text-to-speech module comprises:
    - identifying a text segment for a text-to-speech output to the user;
      
      computing a complexity score of the text segment for the text-to-speech output; and
      
      modifying the text segment for the text-to-speech output to the user based at least on the determined language proficiency of the user and the complexity score of the text segment for the text-to-speech output.
  - 7. The method of claim 6, wherein modifying the text segment for the text-to-speech output to the user comprises:
    - determining an overall complexity score for the user based at least on the determined language proficiency of the user;
      
      determining a complexity score for individual portions within the text segment for the text-to-speech output to the user;
      
      identifying one or more individual portions within the text segment with complexity scores greater than the overall complexity score for the user; and
      
      modifying the one or more individual portions within the text segment to reduce complexity scores below the overall complexity score.
  - 8. The method of claim 6, wherein modifying the text segment for the text-to-speech output to the user comprises:
    - receiving data indicating a context associated with the user;
      
      determining an overall complexity score for the context associated with the user;
      
      determining that the complexity score of the text segment exceeds the overall complexity score for the context associated with the user; and
      
      modifying the text segment to reduce the complexity score below the overall complexity score for the context associated with the user.
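
The per-portion modification recited in claim 7 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the claims: the portions are sentences, the complexity score is mean word length, and `SIMPLER` is a hypothetical substitution table. A single substitution pass also does not guarantee that a portion's score drops below the user's score; a fuller version would re-score and repeat.

```python
import re

# Hypothetical substitution table: complex word -> simpler word.
SIMPLER = {
    "precipitation": "rain",
    "anticipated": "expected",
    "approximately": "about",
}

WORD = r"[A-Za-z']+"


def complexity(portion: str) -> float:
    """Toy complexity score: mean word length (an assumption, not from the claims)."""
    words = re.findall(WORD, portion)
    return sum(len(w) for w in words) / len(words) if words else 0.0


def _swap(match: re.Match) -> str:
    """Return the simpler substitute for a word, keeping initial capitalization."""
    word = match.group(0)
    simple = SIMPLER.get(word.lower())
    if simple is None:
        return word
    return simple.capitalize() if word[0].isupper() else simple


def simplify(portion: str) -> str:
    """Apply the substitution table to every word in the portion."""
    return re.sub(WORD, _swap, portion)


def adapt_segment(segment: str, user_score: float) -> str:
    """Simplify only the portions whose score exceeds the user's overall score."""
    portions = re.split(r"(?<=[.!?])\s+", segment.strip())
    adapted = [simplify(p) if complexity(p) > user_score else p for p in portions]
    return " ".join(adapted)


text = "Precipitation is anticipated tonight. It will be cold."
print(adapt_segment(text, user_score=6.0))
```

With a user score of 6.0, only the first sentence (score 8.25) is rewritten; the second (score 3.0) is left untouched.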

9. A system comprising:
- one or more computers; and
  
  a non-transitory computer-readable medium coupled to the one or more computers having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
  
  determining, by the one or more computers, a language proficiency of a user of a client device;
  
  determining, by the one or more computers, a text segment for output by a text-to-speech module based on the determined language proficiency of the user, wherein determining the text segment comprises:
  
  selecting, from among multiple text segments that each have a language complexity score that indicates a different level of language complexity, the text segment having the language complexity score that best matches a reference score that describes the determined language proficiency of the user of the client device;
  
  or modifying a particular text segment for the text-to-speech output to the user based at least on (i) the determined language proficiency of the user and (ii) a complexity score of the particular text segment;
  
  generating, by the one or more computers, audio data comprising a synthesized utterance of the text segment; and
  
  providing, by the one or more computers and to the client device, the audio data comprising the synthesized utterance of the text segment.
- - 10. The system of claim 9, wherein the client device displays a mobile application that uses a text-to-speech interface.
  - 11. The system of claim 9, wherein determining the language proficiency of the user comprises inferring a language proficiency of the user based at least on previous queries submitted by the user.
  - 12. The system of claim 9, wherein determining the text segment for output by the text-to-speech module comprises:
    - identifying multiple text segments as candidates for a text-to-speech output of the user, the multiple text segments having different levels of language complexity; and
      
      selecting from among the multiple text segments based at least on the determined language proficiency of the user of the client device.
  - 13. The system of claim 12, wherein selecting from among the multiple text segments comprises:
    - determining a language complexity score for each of the multiple text segments; and
      
      selecting the text segment having the language complexity score that best matches a reference score that describes the language proficiency of the user of the client device.
  - 14. The system of claim 9, wherein determining the text segment for output by the text-to-speech module comprises:
    - identifying a text segment for a text-to-speech output to the user;
      
      computing a complexity score of the text segment for the text-to-speech output; and
      
      modifying the text segment for the text-to-speech output to the user based at least on the determined language proficiency of the user and the complexity score of the text segment for the text-to-speech output.
  - 15. The system of claim 14, wherein modifying the text segment for the text-to-speech output to the user comprises:
    - determining an overall complexity score for the user based at least on the determined language proficiency of the user;
      
      determining a complexity score for individual portions within the text segment for the text-to-speech output to the user;
      
      identifying one or more individual portions within the text segment with complexity scores greater than the overall complexity score for the user; and
      
      modifying the one or more individual portions within the text segment to reduce complexity scores below the overall complexity score.
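
The selection branch (claims 12 and 13, mirroring claims 4 and 5) amounts to a nearest-score lookup over candidate segments. The sketch below is one possible reading, not the patented implementation: the `Candidate` type, the 0 to 10 score scale, and the use of absolute difference as the "best matches" criterion are all assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    text: str
    complexity: float  # language complexity score on a hypothetical 0-10 scale


def select_segment(candidates: Sequence[Candidate], reference_score: float) -> Candidate:
    """Return the candidate whose complexity score is closest to the reference score.

    "Closest" is taken to mean smallest absolute difference (an assumption).
    On a tie, the earliest candidate in the sequence wins.
    """
    if not candidates:
        raise ValueError("at least one candidate text segment is required")
    return min(candidates, key=lambda c: abs(c.complexity - reference_score))


candidates = [
    Candidate("Rain today.", 2.0),
    Candidate("Expect rain this afternoon.", 5.0),
    Candidate("Intermittent precipitation is anticipated this afternoon.", 9.0),
]
print(select_segment(candidates, reference_score=4.5).text)
# prints "Expect rain this afternoon."
```

The reference score stands in for the user's determined language proficiency; how that score is derived is left open here.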

16. A method performed by one or more computers, the method comprising:
- receiving data indicating a context associated with the user;
  
  determining an overall complexity score for the context associated with the user;
  
  identifying a text segment for a text-to-speech output to the user;
  
  determining that a complexity score of the text segment exceeds the overall complexity score for the context associated with the user; and
  
  modifying the text segment to reduce the complexity score below the overall complexity score for the context associated with the user.
- - 17. The method of claim 16, wherein determining the overall complexity score for the context associated with the user comprises:
    - identifying terms included within previously submitted queries by the user when the user was determined to be in the context; and
      
      determining an overall complexity score for the context associated with the user based at least on the identified terms.
  - 18. The method of claim 16, wherein the data indicating the context associated with the user includes queries that were previously submitted by the user.
  - 19. The method of claim 16, wherein the data indicating the context associated with the user includes a GPS signal indicating a current location associated with the user.
  - 20. The method of claim 16, wherein data indicating the context associated with the user includes sensor data from a mobile device of the user.
  - 21. The method of claim 16, further comprising providing, for output to the user, audio data comprising a synthesized utterance of the modified text segment.
  - 23. The system of claim 21, wherein determining the overall complexity score for the context associated with the user comprises:
    - identifying terms included within previously submitted queries by the user when the user was determined to be in the context; and
      
      determining an overall complexity score for the context associated with the user based at least on the identified terms.
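
Claims 16 and 17 describe deriving a per-context complexity score from terms in the user's earlier queries, then checking whether a text segment exceeds it. A minimal sketch follows, assuming a hypothetical query log of `(context, query)` pairs and using word length as a stand-in for term complexity. Neither the log format nor the scoring function comes from the claims.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def _terms(text: str) -> List[str]:
    """Split text into lowercase word terms."""
    return re.findall(r"[a-z']+", text.lower())


def _mean_term_length(terms: List[str]) -> float:
    """Toy complexity score: mean term length (an assumption)."""
    return sum(len(t) for t in terms) / len(terms) if terms else 0.0


def context_scores(query_log: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Score each context from the terms of queries submitted in that context."""
    terms_by_context: Dict[str, List[str]] = defaultdict(list)
    for context, query in query_log:
        terms_by_context[context].extend(_terms(query))
    return {ctx: _mean_term_length(terms) for ctx, terms in terms_by_context.items()}


def needs_simplifying(segment: str, context: str, scores: Dict[str, float]) -> bool:
    """True if the segment's score exceeds the overall score for the context.

    A context with no query history yields False, since there is nothing
    to compare against (a design choice of this sketch).
    """
    limit = scores.get(context)
    if limit is None:
        return False
    return _mean_term_length(_terms(segment)) > limit


log = [
    ("driving", "gas near me"),
    ("driving", "eta home"),
    ("at_desk", "quarterly revenue projections"),
]
scores = context_scores(log)
print(needs_simplifying("Traffic is congested ahead", "driving", scores))
print(needs_simplifying("Traffic is congested ahead", "at_desk", scores))
```

In this toy data the same sentence exceeds the score for the "driving" context but not for "at_desk", so only the former would be passed on for modification.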

22. A system comprising:
- one or more computers; and
  
  a non-transitory computer-readable medium coupled to the one or more computers having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
  
  receiving data indicating a context associated with the user;
  
  determining an overall complexity score for the context associated with the user;
  
  identifying a text segment for a text-to-speech output to the user;
  
  determining that a complexity score of the text segment exceeds the overall complexity score for the context associated with the user; and
  
  modifying the text segment to reduce the complexity score below the overall complexity score for the context associated with the user.
- - 24. The system of claim 22, wherein the data indicating the context associated with the user includes queries that were previously submitted by the user.
  - 25. The system of claim 22, wherein the data indicating the context associated with the user includes a GPS signal indicating a current location associated with the user.
  - 26. The system of claim 22, wherein data indicating the context associated with the user includes sensor data from a mobile device of the user.
  - 27. The method of claim 22, wherein the operations further comprise providing, for output to the user, audio data comprising a synthesized utterance of the modified text segment.

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google Inc. (Alphabet Inc.)
Inventors
Foerster, Jakob Nicolaus; Sharifi, Matthew

Granted Patent

US 9,799,324 B2
CPC Class Codes

G06F 40/253   Grammatical analysis; Style...

G06F 40/289   Phrasal analysis, e.g. fini...

G10L 13/00   Speech synthesis; Text to s...

G10L 13/08   Text analysis or generation...
