Voice to Text to Voice Processing
Abstract
Technologies are generally described for voice to text to voice processing. An audio signal can be preprocessed and translated into text prior to being processed in the textual domain. The text domain processing or subsequent text to voice regeneration can seek to improve clarity, correct grammar, adjust vocabulary level, remove profanity, correct slang, alter dialect, alter accent, or provide other modifications of various oral communication characteristics. The processed text may be translated back into the audio domain for delivery to a listener. The processing at each stage may be driven by a set of objectives and constraints set by the speaker, the listener, a third party, or any combination of explicit or implicit participants. The voice processing may translate the voice content from a specific human language to the same human language with various improvements. The processing may also involve translation into one or more other languages.
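The stages described in the abstract can be sketched as a small pipeline. This is an illustrative sketch only, not the patented implementation: the class `VoiceTextVoicePipeline`, the helpers `correct_slang` and `remove_profanity`, and their tiny word lists are all hypothetical names introduced here. The speech recognition and speech synthesis stages are passed in as plain callables, and the usage example substitutes UTF-8 decoding and encoding for them so the sketch runs without any audio library.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Assumption: audio is modeled as raw bytes and text as str.
Audio = bytes
TextTransform = Callable[[str], str]


def correct_slang(text: str) -> str:
    """Expand a few slang words (illustrative word list, not from the source)."""
    slang = {"gonna": "going to", "wanna": "want to"}

    def expand(match: re.Match) -> str:
        word = match.group(0)
        return slang.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", expand, text)


def remove_profanity(text: str) -> str:
    """Mask words found in a tiny illustrative blocklist."""
    blocklist = {"darn", "heck"}

    def mask(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in blocklist else word

    return re.sub(r"[A-Za-z']+", mask, text)


@dataclass
class VoiceTextVoicePipeline:
    """Hypothetical container for the four stages named in the abstract."""
    preprocess: Callable[[Audio], Audio]      # audio-domain cleanup
    speech_to_text: Callable[[Audio], str]    # voice to text
    objectives: List[TextTransform]           # text-domain processing
    text_to_speech: Callable[[str], Audio]    # text back to voice

    def process_text(self, text: str) -> str:
        # Apply each language objective in order.
        for objective in self.objectives:
            text = objective(text)
        return text

    def run(self, audio: Audio) -> Audio:
        cleaned = self.preprocess(audio)
        text = self.speech_to_text(cleaned)
        return self.text_to_speech(self.process_text(text))


# Stand-in stages: a real system would plug in a recognizer and a synthesizer.
pipeline = VoiceTextVoicePipeline(
    preprocess=lambda audio: audio,
    speech_to_text=lambda audio: audio.decode("utf-8"),
    objectives=[correct_slang, remove_profanity],
    text_to_speech=lambda text: text.encode("utf-8"),
)
result = pipeline.run(b"I'm gonna fix this darn bug")
```

Keeping the objectives as an ordered list of text-to-text functions reflects the abstract's point that the processing is driven by a configurable set of objectives; the order chosen here is arbitrary.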
20 Claims
1. A computer storage medium having computer-executable instructions stored thereon that configure the computer to:
- receive a first voice audio signal;
- preprocess the first voice audio signal into a second voice audio signal;
- extract a first text language representation from the second voice audio signal;
- transform the first text language representation into a second text language representation according to a set of language objectives;
- transform the second text language representation into a third voice audio signal; and
- provide the third voice audio signal as an update to the first voice audio signal.
Dependent claims: 2, 3, 4, 5, 6
7. A method for voice processing, the method comprising:
- receiving a first voice audio signal;
- transforming the first voice audio signal into a first text language representation of the first voice audio signal;
- transforming the first text language representation into a second text language representation according to a set of language objectives and a set of constraints; and
- sending the second text language representation to one or more components configured to regenerate a voice signal from the second text language representation.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A voice processing system comprising:
- a processing unit;
- a memory for storing an audio signal; and
- a processing module configured to receive a first voice audio signal, extract a first text language representation from the first voice audio signal, transform the first text language representation into a second text language representation according to a set of language objectives and a set of constraints, and transform the second text language representation into a second voice audio signal.
Dependent claims: 15, 16, 17, 18, 19, 20
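Claims 7 and 14 recite a text transformation governed by both a set of language objectives and a set of constraints, but the claim text shown here does not say how the two interact. One possible reading, offered only as an assumption, is that an objective proposes a rewrite and a constraint can veto it. The names `transform_text`, `simplify_vocabulary`, and `preserve_terms` below are hypothetical and do not come from the patent.

```python
from typing import Callable, Iterable, List

Objective = Callable[[str], str]
# Assumption: a constraint compares the original text with a candidate
# rewrite and returns True when the candidate is acceptable.
Constraint = Callable[[str, str], bool]


def simplify_vocabulary(text: str) -> str:
    """Objective: swap a few formal words for simpler ones (illustrative list)."""
    replacements = {"utilize": "use", "commence": "start"}
    return " ".join(replacements.get(word.lower(), word) for word in text.split())


def preserve_terms(terms: Iterable[str]) -> Constraint:
    """Constraint factory: protected terms in the original must survive."""
    protected = list(terms)

    def check(original: str, candidate: str) -> bool:
        return all(term in candidate for term in protected if term in original)

    return check


def transform_text(
    text: str, objectives: List[Objective], constraints: List[Constraint]
) -> str:
    """Apply each objective in turn, skipping any rewrite a constraint rejects."""
    original = text
    for objective in objectives:
        candidate = objective(text)
        if all(constraint(original, candidate) for constraint in constraints):
            text = candidate
    return text


second_text = transform_text(
    "Please utilize Alice's notes",
    objectives=[simplify_vocabulary, str.lower],
    constraints=[preserve_terms(["Alice"])],
)
```

In this run the vocabulary objective is accepted, while the lowercasing step is rejected because it would alter the protected name, so `second_text` keeps "Alice" capitalized. Other readings are equally plausible, such as constraints that limit latency or output length.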
Specification