Domain adaptation for TTS systems
First Claim
1. A method adapting a text-to-speech system, the method comprising:
supplying a domain-specific text corpus that corresponds to a target domain;
supplying a plurality of scripts that correspond to an inventory of speech units utilized by the text-to-speech system to synthesize speech;
generating a list of candidate domain-specific strings using text from the domain-specific text corpus, wherein each candidate domain-specific string occurs at least a predetermined number of times within the domain-specific text corpus, wherein the predetermined number is more than once;
generating a domain-specific script using said domain-specific string so as to include at least one domain-specific string included in the list of candidate domain-specific strings; and
adapting the text-to-speech system based on the domain-specific script so as to improve the perceived naturalness of synthesized speech.
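The candidate-generation step of the claim can be illustrated with a short sketch. It is not the patented implementation: it assumes that a "string" is a word n-gram, that text is tokenized by lowercasing and splitting on whitespace, and that n-grams are capped at a hypothetical `max_words` length. The function name `candidate_strings` and its parameters are illustrative only. The one constraint taken directly from the claim is that the frequency threshold must be greater than one.

```python
from collections import Counter


def candidate_strings(domain_sentences, min_count=2, max_words=4):
    """Return word n-grams that occur at least min_count times in the corpus.

    Assumptions (not from the claim): whitespace tokenization, lowercasing,
    and an upper bound of max_words words per candidate string.
    """
    if min_count < 2:
        # The claim requires the predetermined number to be more than once.
        raise ValueError("min_count must be greater than one")

    counts = Counter()
    for sentence in domain_sentences:
        words = sentence.lower().split()
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1

    # Keep only strings that meet the frequency threshold.
    return {s: c for s, c in counts.items() if c >= min_count}


corpus = ["partly cloudy today", "partly cloudy tomorrow", "sunny today"]
print(candidate_strings(corpus, min_count=2, max_words=2))
```

In this toy corpus only "partly", "cloudy", "today", and "partly cloudy" appear twice, so they are the only candidates returned.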
Abstract
Embodiments of the present invention pertain to adaptation of a corpus-driven general-purpose TTS system to at least one specific domain. The domain adaptation is realized by adding a limited amount of domain-specific speech chosen to have the greatest impact on perceived naturalness. An approach for generating an optimized script for adaptation is proposed, the core of which is a dynamic-programming-based algorithm that segments the domain-specific corpus into a minimum number of segments that appear in the unit inventory. Increases in perceived naturalness of speech after adaptation are estimated from the generated script without recording speech from it.
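The minimum-segment idea in the abstract can be sketched as a standard dynamic program over word positions. This is a minimal illustration under stated assumptions, not the algorithm as specified in the patent: segments are word sequences, a segment "appears in the unit inventory" if it occurs contiguously in one of the inventory scripts, and any single word is allowed as a fallback segment even if it is absent from the scripts. The names `inventory_ngrams`, `min_segments`, and `max_words` are hypothetical.

```python
def inventory_ngrams(scripts, max_words=6):
    """Collect every contiguous word sequence (up to max_words) in the scripts."""
    grams = set()
    for script in scripts:
        words = script.lower().split()
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                grams.add(tuple(words[i:i + n]))
    return grams


def min_segments(sentence, grams, max_words=6):
    """Split a sentence into the fewest segments found in the inventory."""
    words = sentence.lower().split()
    n = len(words)
    # best[j] is the minimum number of segments covering words[:j].
    best = [0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_words), j):
            piece = tuple(words[i:j])
            # Assumption: a single word is always usable as a fallback segment.
            usable = piece in grams or j - i == 1
            if usable and best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                back[j] = i
    # Walk the back-pointers to recover the segmentation.
    segments = []
    j = n
    while j > 0:
        segments.append(" ".join(words[back[j]:j]))
        j = back[j]
    return segments[::-1]


scripts = ["the weather is partly cloudy", "rain is expected today"]
grams = inventory_ngrams(scripts)
print(min_segments("the weather is expected today", grams))
```

Fewer segments mean fewer concatenation points, which is one plausible reading of why the segment count can stand in for perceived naturalness without recording any speech.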
32 Citations
22 Claims
1. A method adapting a text-to-speech system, the method comprising:
supplying a domain-specific text corpus that corresponds to a target domain;
supplying a plurality of scripts that correspond to an inventory of speech units utilized by the text-to-speech system to synthesize speech;
generating a list of candidate domain-specific strings using text from the domain-specific text corpus, wherein each candidate domain-specific string occurs at least a predetermined number of times within the domain-specific text corpus, wherein the predetermined number is more than once;
generating a domain-specific script using said domain-specific string so as to include at least one domain-specific string included in the list of candidate domain-specific strings; and
adapting the text-to-speech system based on the domain-specific script so as to improve the perceived naturalness of synthesized speech.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
2. The method of claim 1, wherein each candidate domain-specific string does not occur within the plurality of scripts that correspond to the inventory of speech units.
19. A method for generating a domain-specific script for domain adaptation of a text-to-speech system, the method comprising:
supplying a domain-specific text corpus that corresponds to a target domain;
supplying a plurality of scripts that correspond to an inventory of speech units utilized by the text-to-speech system to synthesize speech;
generating a list of candidate domain-specific strings using text from the domain-specific text corpus, wherein each candidate domain-specific string occurs a predetermined number of times within the domain-specific text corpus, wherein the predetermined number of times is more than once;
selecting from the list, based on an objective criteria, a limited number of candidate domain-specific strings that, if added to the plurality of scripts that correspond to the inventory of speech units, will improve the naturalness of synthetic speech produced by the text-to-speech system on the target domain;
generating a domain-specific script so as to include the limited number of candidate domain-specific strings; and
adapting the text-to-speech system based on the domain-specific script so as to improve the perceived naturalness of synthesized speech.
Dependent claims: 20, 21, 22
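The selection step of claim 19 can be sketched as a simple ranking. The claim does not say what the objective criterion is, so the score used here is purely an assumption for illustration: each occurrence of an n-word string that could be played back as one recorded segment is taken to save n - 1 concatenation joins. The sketch also drops strings that already occur in the inventory scripts, in the spirit of claim 2. The names `select_strings` and `joins_saved` are hypothetical, and candidate strings are assumed to be lowercase and single-space separated.

```python
def select_strings(candidates, inventory_scripts, limit):
    """Pick up to `limit` candidate strings using a hypothetical objective.

    candidates: dict mapping a lowercase string to its corpus frequency.
    inventory_scripts: scripts already covered by the speech unit inventory.
    """
    # Pad with spaces so matches respect word boundaries.
    padded = [" " + " ".join(s.lower().split()) + " " for s in inventory_scripts]

    def covered(string):
        return any(" " + string + " " in text for text in padded)

    def joins_saved(item):
        # Assumed objective: frequency times the joins avoided per occurrence.
        string, count = item
        return count * (len(string.split()) - 1)

    fresh = [(s, c) for s, c in candidates.items() if not covered(s)]
    # Highest score first; ties broken alphabetically for determinism.
    fresh.sort(key=lambda item: (-joins_saved(item), item[0]))
    return [s for s, _ in fresh[:limit]]


candidates = {"partly cloudy": 5, "chance of rain": 3, "the weather": 10}
print(select_strings(candidates, ["The weather is nice"], limit=2))
```

This ranking scores each string independently and ignores overlap between selected strings; a closer approximation would re-run a segmentation of the domain corpus after each pick and measure the actual drop in segment count.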
Specification