Automated Generation of Audiobook with Multiple Voices and Sounds from Text
Abstract
A method, system and computer-usable medium are disclosed for the transcoding of annotated text to speech and audio. Source text is parsed into spoken text passages and sound description passages. A speaker identity is determined for each spoken text passage and a sound element for each sound description passage. The speaker identities and sound elements are automatically referenced to a voice and sound effects schema. A voice effect is associated with each speaker identity and a sound effect with each sound element. Each spoken text passage is then annotated with the voice effect associated with its speaker identity and each sound description passage is annotated with the sound effect associated with its sound element. The resulting annotated spoken text and sound description passages are processed to generate output text operable to be transcoded to speech and audio.
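The flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It assumes simple regular-expression heuristics in place of a natural language processor, small in-memory dictionaries in place of the repository of voice and sound effects, and an invented `<voice>`/`<sound>` tag format for the annotated output. All names (`VOICE_EFFECTS`, `SOUND_EFFECTS`, `annotate_text`, `render`) are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-ins for the repository of voice and sound effects.
VOICE_EFFECTS = {"Anna": "voice_female_adult", "Tom": "voice_male_child"}
SOUND_EFFECTS = {"thunder": "sfx_thunder", "door": "sfx_door_slam"}
DEFAULT_VOICE = "voice_narrator"

# Assumed heuristic: a quotation followed by a speech verb and a name.
SPOKEN = re.compile(r'"([^"]+)"\s*(?:said|asked|shouted)\s+(\w+)')


@dataclass
class Passage:
    kind: str    # "spoken", "sound", or "narration"
    text: str    # the passage text
    key: str     # speaker identity or sound element ("" for narration)
    effect: str  # voice or sound effect chosen for the passage


def annotate_text(text: str) -> list[Passage]:
    """Classify each sentence and attach a voice or sound effect."""
    passages = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = SPOKEN.search(sentence)
        if match:
            # Spoken text passage: the named speaker selects the voice.
            quote, speaker = match.group(1), match.group(2)
            effect = VOICE_EFFECTS.get(speaker, DEFAULT_VOICE)
            passages.append(Passage("spoken", quote, speaker, effect))
            continue
        lowered = sentence.lower()
        element = next((w for w in SOUND_EFFECTS if w in lowered), None)
        if element:
            # Sound description passage: the keyword selects the effect.
            passages.append(
                Passage("sound", sentence, element, SOUND_EFFECTS[element])
            )
        else:
            passages.append(Passage("narration", sentence, "", DEFAULT_VOICE))
    return passages


def render(passages: list[Passage]) -> str:
    """Emit the annotated output text using the invented tag format."""
    lines = []
    for p in passages:
        tag = "sound" if p.kind == "sound" else "voice"
        lines.append(f'<{tag} effect="{p.effect}">{p.text}</{tag}>')
    return "\n".join(lines)


SAMPLE = (
    '"Close the window," said Anna. '
    "Thunder rolled over the hills. "
    '"I am scared," said Tom.'
)
```

Calling `render(annotate_text(SAMPLE))` yields one tagged line per passage, which a downstream text-to-speech and audio mixing stage could then consume. A production system would replace the regular expressions with genuine speaker attribution and sound-event detection.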
20 Claims
1. A computer-implementable method for transcoding text to speech and audio, comprising:
- parsing input text with a natural language processor to automatically:
  - identify spoken text passages and sound description passages;
  - determine a speaker identity for each spoken text passage and a sound element for each sound description passage;
  - determine speaker attributes of each speaker identity and sound attributes of each sound element, each speaker identity and sound element automatically referenced to a voice and sound effects schema;
  - associate a voice effect with each speaker identity and a sound effect with each sound element, the voice effect and sound effects automatically selected from a repository of voice and sound effects; and
- annotating each spoken text passage with the voice effect associated with its speaker identity and each sound description passage with the sound effect associated with its sound element.
Dependent claims: 2, 3, 4, 5, 6
7. A system comprising:
- a processor;
- a data bus coupled to the processor; and
- a computer-usable medium embodying computer program code, the computer-usable medium being coupled to the data bus, the computer program code transcoding text to speech and audio and comprising instructions executable by the processor and configured for:
  - parsing input text with a natural language processor to automatically:
    - identify spoken text passages and sound description passages;
    - determine a speaker identity for each spoken text passage and a sound element for each sound description passage;
    - determine speaker attributes of each speaker identity and sound attributes of each sound element, each speaker identity and sound element automatically referenced to a voice and sound effects schema;
    - associate a voice effect with each speaker identity and a sound effect with each sound element, the voice and sound effects automatically selected from a repository of voice and sound effects; and
  - annotating each spoken text passage with the voice effect associated with its speaker identity and each sound description passage with the sound effect associated with its sound element.
Dependent claims: 8, 9, 10, 11, 12
13. A computer-usable medium embodying computer program code, the computer program code comprising computer executable instructions configured for:
- parsing input text with a natural language processor to automatically:
  - identify spoken text passages and sound description passages;
  - determine a speaker identity for each spoken text passage and a sound element for each sound description passage;
  - determine speaker attributes of each speaker identity and sound attributes of each sound element, each speaker identity and sound element automatically referenced to a voice and sound effects schema;
  - associate a voice effect with each speaker identity and a sound effect with each sound element, the voice and sound effects automatically selected from a repository of voice and sound effects; and
- annotating each spoken text passage with the voice effect associated with its speaker identity and each sound description passage with the sound effect associated with its sound element.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
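
The independent claims recite determining speaker attributes, referencing them to a schema, and selecting a voice effect from a repository. One way this selection might work is sketched below in Python. This is an illustration under stated assumptions, not the claimed method: the schema fields (`gender`, `age`, `accent`), the `VoiceEffect` type, the `select_voice` function, and the count-of-matching-attributes scoring rule are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical schema: the attribute names a speaker and a voice may share.
SCHEMA_ATTRIBUTES = ("gender", "age", "accent")


@dataclass
class VoiceEffect:
    name: str
    attributes: dict = field(default_factory=dict)


def match_score(speaker_attrs: dict, effect: VoiceEffect) -> int:
    """Count schema attributes on which the speaker and the voice agree."""
    return sum(
        1
        for attr in SCHEMA_ATTRIBUTES
        if attr in speaker_attrs
        and effect.attributes.get(attr) == speaker_attrs[attr]
    )


def select_voice(speaker_attrs: dict, repository: list) -> VoiceEffect:
    """Pick the repository entry with the most matching attributes.

    On a tie, max() returns the earliest entry, so a generic fallback
    voice can be listed first in the repository.
    """
    return max(repository, key=lambda effect: match_score(speaker_attrs, effect))


# Hypothetical repository contents, with the fallback voice listed first.
REPOSITORY = [
    VoiceEffect("narrator"),
    VoiceEffect("adult_female", {"gender": "female", "age": "adult"}),
    VoiceEffect("child_male", {"gender": "male", "age": "child"}),
]
```

With this repository, a speaker described as `{"gender": "female", "age": "adult"}` maps to `adult_female`, while a speaker with no known attributes falls back to `narrator`. The same scoring approach could be applied to sound elements and their sound attributes.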
Specification