SYSTEM AND METHOD FOR DISTRIBUTED TEXT-TO-SPEECH SYNTHESIS AND INTELLIGIBILITY
Abstract
A method and system for distributed text-to-speech synthesis and intelligibility, and more particularly to distributed text-to-speech synthesis on handheld portable computing devices that can be used for example to generate intelligible audio prompts that help a user interact with a user interface of the handheld portable computing device. The text-to-speech distributed system 70 receives a text string from the guest devices and comprises a text analyzer 72, a prosody analyzer 74, a database 14 that the text analyzer and prosody analyzer refer to, and a speech synthesizer 80. Elements of the speech synthesizer 80 are resident on the host device and the guest device and an audio index representation of the audio file associated with the text string is produced at the host device and transmitted to the guest device for producing the audio file at the guest device.
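The host/guest split described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patent's implementation: the inventories, the unit ids, the byte strings standing in for audio, and the function names `text_to_audio_index` and `concatenate` are hypothetical, and a greedy longest-match over text fragments stands in for real unit selection, which would normally rely on the text and prosody analysis the abstract mentions.

```python
from typing import Dict, List, Tuple

# Hypothetical host inventory: unit id -> (text fragment covered, audio bytes).
HOST_INVENTORY: Dict[int, Tuple[str, bytes]] = {
    0: ("hel", b"\x01\x02"),
    1: ("lo", b"\x03"),
    2: (" ", b"\x00"),
    3: ("world", b"\x04\x05\x06"),
}

# Hypothetical guest inventory: the same unit ids, audio only.
GUEST_INVENTORY: Dict[int, bytes] = {
    uid: audio for uid, (_, audio) in HOST_INVENTORY.items()
}


def text_to_audio_index(text: str) -> List[int]:
    """Host side: map a text string to a list of unit ids.

    Greedy longest-match is a stand-in for real unit selection.
    """
    index: List[int] = []
    pos = 0
    while pos < len(text):
        # Collect every unit whose fragment matches at the current position.
        candidates = [
            (len(fragment), uid)
            for uid, (fragment, _) in HOST_INVENTORY.items()
            if text.startswith(fragment, pos)
        ]
        if not candidates:
            raise ValueError(f"no unit covers text at position {pos}")
        length, uid = max(candidates)  # prefer the longest fragment
        index.append(uid)
        pos += length
    return index


def concatenate(index: List[int], inventory: Dict[int, bytes]) -> bytes:
    """Guest side: rebuild the audio by joining the units named in the index."""
    return b"".join(inventory[uid] for uid in index)


# Only the short list of ids crosses from host to guest, not the audio itself.
audio_index = text_to_audio_index("hello world")
audio = concatenate(audio_index, GUEST_INVENTORY)
```

In this sketch the host would transmit only `audio_index` (here `[0, 1, 2, 3]`), and the guest rebuilds the audio locally from its own copy of the units.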
20 Claims
1. A method for creating an audio index representation of an audio file from text input in a form of a text string and producing the audio file from the audio index representation, the method comprising:
receiving the text string;
converting the text string to an audio index representation of an audio file associated with the text string at a text-to-speech synthesizer, the converting including selecting at least one audio unit from a first audio unit synthesis inventory having a plurality of audio units, the selected at least one audio unit forming the audio file;
representing the selected at least one audio unit with the audio index representation; and
reproducing the audio file by concatenating the audio units identified in the audio index representation from the first audio unit inventory or a second audio unit synthesis inventory having the audio units identified in the audio index representation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for distributed text-to-speech synthesis comprising:
receiving text input in a form of a text string at a host device from a separate source;
creating an audio index representation of an audio file from the text string on the host device and producing the audio file on a guest device from the audio index representation, the creating of the audio index representation including converting the text string to an audio index representation of an audio file associated with the text string at a text-to-speech synthesizer, the converting including selecting at least one audio unit from a first audio unit synthesis inventory having a plurality of audio units, the selected at least one audio unit forming the audio file;
representing the selected at least one audio unit with the audio index representation; and
producing the audio file from the audio index representation including reproducing the audio file by concatenating the audio units identified in the audio index representation from the first audio unit synthesis inventory or a second audio unit synthesis inventory having the audio units identified in the audio index representation.
Dependent claims: 10, 11, 12, 13, 14
15. A system for distributed text-to-speech synthesis comprising:
a guest device configured for sending text input in the form of a text string to a host device for converting the text string to an audio index representation of an audio file associated with the text string, the converting at the host system including selecting at least one audio unit from an audio unit synthesis inventory having a plurality of audio units and wherein the guest device further comprises;
a unit-concatenative module and a second inventory of synthesis units, the unit-concatenative module configured for producing the audio file from the audio index representation by concatenating the audio units identified in the audio index representation from the first audio unit synthesis inventory or a second audio unit synthesis inventory having the audio units identified in the audio index representation.
Dependent claims: 16, 17, 18, 19
20. A host system for creating an audio index representation of an audio file from a text input in a form of text string and producing the audio file from the audio index representation, the method comprising:
a text-to-speech synthesizer for receiving a text string and converting the text string to an audio index representation of an audio file associated with the text string at a text-to-speech synthesizer, the text-to-speech synthesizer comprising a unit-selection unit and an audio unit inventory having a plurality of audio units, the unit-selection unit for selecting at least one audio unit from the audio unit inventory, the selected at least one audio unit forming the audio file, and representing the selected at least one audio unit with the audio index representation, for reproduction of the audio file by concatenating the audio units identified in the audio index representation from the audio unit inventory or another audio unit synthesis inventory having the audio units identified in the audio index representation.
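The independent claims each allow reproduction from the first inventory or from another inventory "having the audio units identified in the audio index representation". One way to read that condition is as a coverage check: an inventory qualifies only if it holds every unit the index names. The sketch below assumes that reading. The `Inventory` type and the functions `pick_inventory` and `reproduce` are hypothetical names introduced for illustration and do not come from the patent.

```python
from typing import Dict, Optional, Sequence

# Hypothetical inventory type: unit id -> audio bytes.
Inventory = Dict[int, bytes]


def pick_inventory(
    index: Sequence[int], inventories: Sequence[Inventory]
) -> Optional[Inventory]:
    """Return the first inventory holding every unit id in the index, else None."""
    needed = set(index)
    for inventory in inventories:
        # Iterating a dict yields its keys, so this checks id coverage.
        if needed.issubset(inventory):
            return inventory
    return None


def reproduce(index: Sequence[int], inventories: Sequence[Inventory]) -> bytes:
    """Concatenate the indexed units from whichever inventory covers them all."""
    inventory = pick_inventory(index, inventories)
    if inventory is None:
        raise LookupError("no inventory holds all units named in the index")
    return b"".join(inventory[uid] for uid in index)


# Example data: the first inventory lacks unit 1, the second has both units.
first: Inventory = {0: b"a"}
second: Inventory = {0: b"A", 1: b"B"}
result = reproduce([0, 1, 0], [first, second])
```

Here the index `[0, 1, 0]` cannot be served by `first`, so the audio is assembled from `second`; an index naming only unit 0 would be served by `first`.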
Specification