PRE-SAVED DATA COMPRESSION FOR TTS CONCATENATION COST
Abstract
Pre-saved concatenation cost data is compressed through speech segment grouping. Speech segments are assigned to a predefined number of groups based on their concatenation cost values with other speech segments. A representative segment is selected for each group. The concatenation cost between two segments in different groups may then be approximated by that between the representative segments of their respective groups, thereby reducing an amount of concatenation cost data to be pre-saved.
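The approximation described in the abstract can be illustrated with a minimal Python sketch. The names `group_of`, `rep_cost`, `approx_concat_cost`, and `stored_values`, and all numeric values, are hypothetical and chosen only for illustration; none of them come from the patent. The abstract addresses segments in different groups, so using the diagonal entry for two segments in the same group is an assumption of this sketch.

```python
# Hypothetical toy data: names and values are illustrative only.
# group_of[i] is the group index assigned to speech segment i.
group_of = [0, 0, 1, 1, 2, 2]

# rep_cost[g][h] is the pre-saved concatenation cost between the
# representative segment of group g and the representative of group h.
rep_cost = [
    [0.0, 2.5, 4.0],
    [2.0, 0.0, 1.5],
    [3.5, 1.0, 0.0],
]


def approx_concat_cost(i, j):
    """Approximate the cost of joining segment i to segment j.

    The cost is read from the representatives of the two groups.
    Same-group pairs fall back to the diagonal entry (an assumption).
    """
    return rep_cost[group_of[i]][group_of[j]]


def stored_values(num_segments, num_groups):
    """Count stored numbers: full matrix versus compressed form.

    The compressed form keeps one group index per segment plus a
    num_groups x num_groups matrix of representative costs.
    """
    full = num_segments * num_segments
    compressed = num_groups * num_groups + num_segments
    return full, compressed


print(approx_concat_cost(0, 3))   # segment 0 (group 0) to segment 3 (group 1)
print(stored_values(1000, 50))    # full versus compressed entry counts
```

Under these assumptions, 1000 segments in 50 groups need 3,500 stored values instead of 1,000,000, at the price of an approximation error that depends on how tight the groups are.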
20 Claims
1. A method to be executed at least in part in a computing device for performing concatenative speech synthesis, the method comprising:
determining feature vectors for speech segments based on a matrix of concatenation costs;
applying distance weighting to each speech segment pair based on the feature vectors;
clustering the speech segments into a predefined number of groups such that an average distance between speech segments within each group is minimized;
selecting a representative speech segment for each group; and
generating a compressed concatenation cost matrix based on the representative speech segments.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
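The steps recited in claim 1 can be sketched in Python as follows. This is one possible reading under stated assumptions, not the patented implementation: the claim does not name a clustering algorithm, so a simple k-medoids style loop is swapped in; the feature vector of a segment is taken to be its row of the cost matrix; "distance weighting" is interpreted as a weighted Euclidean distance; and the initialisation with the first K segments is deliberately naive. All function names are hypothetical.

```python
import math


def weighted_distance(u, v, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))


def medoid(members, dist):
    """Member with the smallest total (hence average) distance to its group."""
    return min(members, key=lambda m: sum(dist[m][o] for o in members))


def assign(n, reps, dist):
    """Put every segment into the group of its nearest representative."""
    groups = [[] for _ in reps]
    for i in range(n):
        nearest = min(range(len(reps)), key=lambda g: dist[i][reps[g]])
        groups[nearest].append(i)
    return groups


def compress(cost, num_groups, weights=None, max_iters=20):
    n = len(cost)
    weights = weights or [1.0] * n
    # Assumption: the feature vector of segment i is row i of the cost matrix.
    dist = [[weighted_distance(cost[i], cost[j], weights) for j in range(n)]
            for i in range(n)]
    reps = list(range(num_groups))  # naive start: the first K segments
    for _ in range(max_iters):
        groups = assign(n, reps, dist)
        # An empty group keeps its previous representative.
        new_reps = [medoid(g, dist) if g else reps[k]
                    for k, g in enumerate(groups)]
        if new_reps == reps:
            break
        reps = new_reps
    groups = assign(n, reps, dist)
    group_of = [0] * n
    for g, members in enumerate(groups):
        for i in members:
            group_of[i] = g
    # Compressed matrix: costs between the representative segments only.
    compressed = [[cost[a][b] for b in reps] for a in reps]
    return group_of, reps, compressed


# Toy 4 x 4 cost matrix with two obvious groups: {0, 1} and {2, 3}.
cost = [
    [0, 1, 9, 9],
    [1, 0, 9, 9],
    [9, 9, 0, 1],
    [9, 9, 1, 0],
]
group_of, reps, compressed = compress(cost, num_groups=2)
print(group_of, reps, compressed)
```

A medoid is used as the representative because it is an actual segment whose concatenation costs already exist in the original matrix, so the compressed matrix can be filled by lookup rather than by averaging.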
10. A text to speech (TTS) synthesis system for generating speech employing compressed concatenation cost data, the system comprising:
a speech segment data store;
an analysis engine; and
a speech synthesis engine configured to:
determine a feature vector for each speech segment that comprises concatenation cost values of each speech segment with other speech segments;
apply distance weighting to each speech segment pair based on their respective feature vectors;
cluster the speech segments into a predefined number of groups such that an average distance between speech segments within each group is minimized;
select a representative speech segment for each group such that an average distance between the representative speech segment and other speech segments within the same group is minimized;
generate a compressed concatenation cost matrix based on the representative speech segments; and
pre-save the compressed concatenation cost matrix for real time computations in synthesizing speech.
Dependent claims: 11, 12, 13, 14, 15
16. A computer-readable storage medium with instructions stored thereon for generating speech employing compressed concatenation cost data, the instructions comprising:
determining feature vectors for speech segments based on a matrix of concatenation costs constructed along a preceding speech segments axis and a following speech segments axis;
applying distance weighting to each speech segment pair based on their respective feature vectors;
clustering the speech segments into M preceding segment and N following segment groups such that an average distance between speech segments within each group is minimized;
selecting a representative speech segment for each group;
generating a compressed concatenation cost matrix such that a concatenation cost between two speech segments is approximated by a concatenation cost between representative segments of respective preceding speech segment and following speech segment groups; and
pre-saving the compressed concatenation cost matrix for real time computations in synthesizing speech.
Dependent claims: 17, 18, 19, 20
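The two-axis arrangement of claim 16 can be illustrated with a short Python sketch. It assumes that the grouping and the representatives have already been chosen, and it reads the feature vector of a preceding segment as its row and that of a following segment as its column; the claim itself does not spell out that mapping. The function names, group assignments, and cost values are hypothetical.

```python
def row_features(cost):
    """Feature vector of each preceding segment: its row of costs."""
    return [list(row) for row in cost]


def col_features(cost):
    """Feature vector of each following segment: its column of costs."""
    return [list(col) for col in zip(*cost)]


def build_compressed(cost, prev_reps, next_reps):
    """M x N matrix of costs between representative segments."""
    return [[cost[p][f] for f in next_reps] for p in prev_reps]


def approx_cost(compressed, prev_group, next_group, i, j):
    """Approximate cost of segment i followed by segment j."""
    return compressed[prev_group[i]][next_group[j]]


def mean_abs_error(cost, compressed, prev_group, next_group):
    """Average absolute gap between true and approximated costs."""
    n_rows, n_cols = len(cost), len(cost[0])
    total = sum(
        abs(cost[i][j] - approx_cost(compressed, prev_group, next_group, i, j))
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


# Toy data: rows are preceding segments, columns are following segments.
cost = [
    [1.0, 1.2, 5.0, 5.1],
    [1.1, 1.0, 5.2, 5.0],
    [4.0, 4.1, 2.0, 2.2],
    [4.2, 4.0, 2.1, 2.0],
]
prev_group, prev_reps = [0, 0, 1, 1], [0, 2]     # M = 2 preceding groups
next_group, next_reps = [0, 0, 1, 2], [0, 2, 3]  # N = 3 following groups

compressed = build_compressed(cost, prev_reps, next_reps)
error = mean_abs_error(cost, compressed, prev_group, next_group)
print(compressed, error)
```

Grouping the two axes separately lets M and N differ, so the compressed matrix does not have to be square, and a segment may behave differently as a preceding unit than as a following unit.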
Specification