System and method for synthetically generated speech describing media content

US 9,324,317 B2
Filed: 09/09/2014
Issued: 04/26/2016
Est. Priority Date: 06/06/2008
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and computer readable-media for providing an automatic synthetically generated voice describing media content, the method comprising receiving one or more pieces of metadata for a primary media content, selecting at least one piece of metadata for output, and outputting the at least one piece of metadata as synthetically generated speech with the primary media content. Other aspects of the invention involve alternative output, output speech simultaneously with the primary media content, output speech during gaps in the primary media content, translate metadata in foreign language, tailor voice, accent, and language to match the metadata and/or primary media content. A user may control output via a user interface or output may be customized based on preferences in a user profile.
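One of the output modes named in the abstract is speaking during gaps in the primary media content. The sketch below shows one way such a gap could be located. It assumes the player already knows which time intervals of the content are occupied (for example by dialogue or vocals); the function name `find_gap` and the interval representation are illustrative assumptions, not taken from the patent.

```python
from typing import List, Optional, Tuple

def find_gap(busy: List[Tuple[float, float]],
             needed: float,
             total: float) -> Optional[float]:
    """Return the start time of the first gap of at least `needed` seconds.

    `busy` is a list of sorted, non-overlapping (start, end) intervals, in
    seconds, during which the primary content should not be talked over.
    `total` is the overall duration of the content. Returns None when no
    gap is long enough. (Hypothetical helper, for illustration only.)
    """
    cursor = 0.0  # earliest time at which speech could begin
    for start, end in busy:
        if start - cursor >= needed:
            return cursor
        cursor = max(cursor, end)
    # Check the tail of the content after the last busy interval.
    if total - cursor >= needed:
        return cursor
    return None

# A 4-second announcement fits in the gap between 30 s and 35 s.
slot = find_gap([(0, 10), (12, 30), (35, 60)], needed=4, total=60)
```

When no gap is found, a system could fall back to one of the other modes the abstract lists, such as simultaneous output or a separate output channel.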

20 Claims

1. A method comprising:
- receiving a gesture from a user during a presentation of media content, wherein the gesture comprises a metadata request associated with the media content;
- selecting a piece of metadata for output, to yield selected metadata, the selected metadata being responsive to the metadata request regarding the primary media content; and
- outputting the selected metadata as synthetically generated speech, the synthetically generated speech having an accent selected from a plurality of accents based on the selected metadata.

2. The method of claim 1, wherein the synthetically generated speech is output during the presentation of the media content.

3. The method of claim 1, further comprising analyzing the media content to determine tone and prosody which indicate the accent.

4. The method of claim 1, wherein the synthetically generated speech is output during gaps in the presentation of the media content.

5. The method of claim 1, further comprising:
- determining the metadata is in a foreign language, where the accent corresponds to the foreign language; and
- translating the metadata to another language from the foreign language before output.

6. The method of claim 1, wherein the gesture is accompanied by an oral command.

7. The method of claim 1, wherein the metadata is output via a distinct output from that of the presentation of media content.
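The three steps of claim 1 (receive a gesture carrying a metadata request, select responsive metadata, speak it in an accent chosen from the metadata) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Gesture` and `Metadata` types, the `ACCENTS` table keyed by a language tag, and the `fake_tts` stand-in for a speech synthesizer are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical table: language tag on the metadata -> accent label.
ACCENTS = {"en": "American English", "es": "Spanish", "fr": "French"}
DEFAULT_ACCENT = "American English"

@dataclass
class Metadata:
    field: str     # e.g. "artist" or "title"
    text: str      # the value to be spoken
    language: str  # assumed language tag supplied with the metadata

@dataclass
class Gesture:
    requested_field: str  # the metadata request carried by the gesture

def select_metadata(gesture: Gesture,
                    available: List[Metadata]) -> Optional[Metadata]:
    """Pick the first metadata item responsive to the request, if any."""
    for item in available:
        if item.field == gesture.requested_field:
            return item
    return None

def choose_accent(item: Metadata) -> str:
    """Select one accent from several, based on the selected metadata."""
    return ACCENTS.get(item.language, DEFAULT_ACCENT)

def handle_gesture(gesture: Gesture,
                   available: List[Metadata],
                   synthesize: Callable[[str, str], str]) -> Optional[str]:
    """Run the receive -> select -> output steps for one gesture."""
    item = select_metadata(gesture, available)
    if item is None:
        return None
    return synthesize(item.text, choose_accent(item))

def fake_tts(text: str, accent: str) -> str:
    """Stand-in for a real synthesizer; returns a label instead of audio."""
    return f"[{accent}] {text}"

track = [Metadata("title", "La Vie en rose", "fr"),
         Metadata("artist", "Edith Piaf", "fr")]
spoken = handle_gesture(Gesture("artist"), track, fake_tts)
```

Passing the synthesizer in as a callable keeps the selection logic independent of any particular text-to-speech engine, so a real engine could be swapped in for `fake_tts` without changing the other functions.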

8. A system comprising:
- a processor; and
- a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising;
  - receiving a gesture from a user during a presentation of media content, wherein the gesture comprises a metadata request associated with the media content;
  - selecting a piece of metadata for output, to yield selected metadata, the selected metadata being responsive to the metadata request regarding the primary media content; and
  - outputting the selected metadata as synthetically generated speech, the synthetically generated speech having an accent selected from a plurality of accents based on the selected metadata.

9. The system of claim 8, wherein the synthetically generated speech is output during the presentation of the media content.

10. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising analyzing the media content to determine tone and prosody which indicate the accent.

11. The system of claim 8, wherein the synthetically generated speech is output during gaps in the presentation of the media content.

12. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising:
- determining the metadata is in a foreign language, where the accent corresponds to the foreign language; and
- translating the metadata to another language from the foreign language before output.

13. The system of claim 8, wherein the gesture is accompanied by an oral command.

14. The system of claim 8, wherein the metadata is output via a distinct output from that of the presentation of media content.

15. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- receiving a gesture from a user during a presentation of media content, wherein the gesture comprises a metadata request associated with the media content;
- selecting a piece of metadata for output, to yield selected metadata, the selected metadata being responsive to the metadata request regarding the primary media content; and
- outputting the selected metadata as synthetically generated speech, the synthetically generated speech having an accent selected from a plurality of accents based on the selected metadata.

16. The computer-readable storage device of claim 15, wherein the synthetically generated speech is output during the presentation of the media content.

17. The computer-readable storage device of claim 15, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising analyzing the media content to determine tone and prosody which indicate the accent.

18. The computer-readable storage device of claim 15, wherein the synthetically generated speech is output during gaps in the presentation of the media content.

19. The computer-readable storage device of claim 15, having additional instructions stored which, when executed by the computing device, result in operations comprising:
- determining the metadata is in a foreign language, where the accent corresponds to the foreign language; and
- translating the metadata to another language from the foreign language before output.

20. The computer-readable storage device of claim 15, wherein the gesture is accompanied by an oral command.

Current Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Roberts, Linda; Nguyen, Hong Thi; Schroeter, Horst J.
Primary Examiner(s): PULLIAS, JESSE SCOTT
Application Number: US14/481,326
Publication Number: US 20140379350A1
Time in Patent Office: 595 Days
Field of Search: 704/231-257, 704/270-275
US Class Current: 1/1

CPC Class Codes
- G06F 3/017   Gesture based interaction, ...
- G06F 3/04842   Selection of displayed obje...
- G06F 3/167   Audio in a user interface, ...
- G06F 40/58   Use of machine translation,...
- G10L 13/00   Speech synthesis; Text to s...
- G10L 13/033   Voice editing, e.g. manipul...
- G10L 13/04   Details of speech synthesis...
- G10L 13/086   Detection of language
- G10L 13/10   Prosody rules derived from ...
- G10L 15/22   Procedures used during a sp...
- G10L 2013/083   Special characters, e.g. pu...
