System and method for synthetically generated speech describing media content
Abstract
Disclosed herein are systems, methods, and computer-readable media for providing an automatic synthetically generated voice describing media content, the method comprising receiving one or more pieces of metadata for a primary media content, selecting at least one piece of metadata for output, and outputting the at least one piece of metadata as synthetically generated speech with the primary media content. Other aspects of the invention involve alternative output, outputting speech simultaneously with the primary media content, outputting speech during gaps in the primary media content, translating metadata into a foreign language, and tailoring the voice, accent, and language to match the metadata and/or the primary media content. A user may control output via a user interface, or output may be customized based on preferences in a user profile.
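As an illustration of outputting speech during gaps in the primary media content, the following is a minimal Python sketch, not the patented implementation. It assumes the audio track has already been reduced to one loudness level per fixed-length frame; the names `Gap`, `find_audio_gaps`, and `first_gap_for`, and the threshold values, are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Gap:
    """A quiet stretch of the audio track, in seconds (hypothetical type)."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_audio_gaps(levels: Sequence[float], frame_s: float,
                    threshold: float, min_gap_s: float) -> List[Gap]:
    """Return runs of consecutive frames whose level is below `threshold`
    and whose total length is at least `min_gap_s` seconds."""
    gaps: List[Gap] = []
    run_start: Optional[int] = None
    for i, level in enumerate(levels):
        if level < threshold:
            if run_start is None:
                run_start = i  # a quiet run begins here
        elif run_start is not None:
            gap = Gap(run_start * frame_s, i * frame_s)
            if gap.duration >= min_gap_s:
                gaps.append(gap)
            run_start = None
    if run_start is not None:  # track ended while still quiet
        gap = Gap(run_start * frame_s, len(levels) * frame_s)
        if gap.duration >= min_gap_s:
            gaps.append(gap)
    return gaps


def first_gap_for(gaps: Sequence[Gap], speech_s: float,
                  after_s: float = 0.0) -> Optional[Gap]:
    """Pick the earliest gap that can hold `speech_s` seconds of speech,
    considering only the part of each gap at or after `after_s`."""
    for gap in gaps:
        usable_start = max(gap.start, after_s)
        if gap.end - usable_start >= speech_s:
            return Gap(usable_start, gap.end)
    return None


# Assumed example data: one loudness value per 0.5 s frame.
levels = [0.9, 0.8, 0.01, 0.02, 0.01, 0.7, 0.0, 0.0, 0.0, 0.0, 0.6]
gaps = find_audio_gaps(levels, frame_s=0.5, threshold=0.05, min_gap_s=1.0)
slot = first_gap_for(gaps, speech_s=1.8)
```

In this sketch the speech is simply deferred to the first gap long enough to hold it; if no gap qualifies, `first_gap_for` returns `None` and a caller could fall back to one of the other output modes the abstract mentions, such as simultaneous output.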
20 Claims
1. A method comprising:
accessing, by a system including a processor, user input that was captured during a presentation of media content, wherein the user input comprises a metadata request associated with the media content;
obtaining, by the system according to the metadata request, metadata for output; and
outputting, by the system, synthetically generated speech associated with the media content during an audio gap in the presentation of the media content, the synthetically generated speech having an accent selected from a plurality of accents based on the metadata.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
receiving a gesture from a user during a presentation of media content, wherein the gesture comprises a metadata request associated with the media content;
accessing a group of metadata from a source that is different from the source of the media content;
selecting, according to the metadata request, metadata from the group of metadata; and
outputting synthetically generated speech according to the metadata during an audio gap in the presentation of the media content, the synthetically generated speech having an accent, a language, a lexicon, or a combination thereof selected from a plurality of accents, languages, lexicons or combinations thereof.
Dependent claims: 13, 14, 15, 16
17. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving a gesture from a user during a presentation of media content, wherein the gesture comprises a metadata request associated with the media content;
selecting, according to the metadata request, metadata from a group of metadata;
accessing a user profile for the user; and
outputting synthetically generated speech according to the metadata and the user profile during an audio gap in the presentation of the media content, the synthetically generated speech having an accent, a language, a lexicon, or a combination thereof selected from a plurality of accents, languages, lexicons or combinations thereof.
Dependent claims: 18, 19, 20
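The selection of an accent and language from a plurality of options, according to the metadata and the user profile, can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the `Voice` type, the `VOICES` catalog, the `locale` and `preferred_voice` keys, and the rule that a profile preference overrides the metadata are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Voice:
    """Hypothetical description of a synthetic voice."""
    accent: str
    language: str


# Hypothetical catalog: the "plurality" of voices to choose from.
VOICES = {
    "en-US": Voice("American", "en"),
    "en-GB": Voice("British", "en"),
    "es-MX": Voice("Mexican", "es"),
}
DEFAULT_VOICE_KEY = "en-US"


def select_voice(metadata: Mapping[str, str],
                 profile: Optional[Mapping[str, str]] = None) -> Voice:
    """Choose a voice from the metadata, letting a user-profile preference
    take precedence (an assumed policy, not one stated in the source)."""
    profile = profile or {}
    key = (profile.get("preferred_voice")
           or metadata.get("locale")
           or DEFAULT_VOICE_KEY)
    # Unknown keys fall back to the default voice instead of failing.
    return VOICES.get(key, VOICES[DEFAULT_VOICE_KEY])


# Assumed example: metadata for a British production, with and without a profile.
from_metadata = select_voice({"locale": "en-GB"})
from_profile = select_voice({"locale": "en-GB"}, {"preferred_voice": "es-MX"})
```

Giving the profile precedence is only one possible policy; an implementation could equally weight the metadata first and use the profile as a tie-breaker.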
Specification