Apparatus and Method for Creating Singing Synthesizing Database, and Pitch Curve Generation Apparatus and Method
Abstract
Variation over time in fundamental frequency in singing voices is separated into a melody-dependent component and a phoneme-dependent component, and each component is modeled and stored in a singing synthesizing database. In execution of singing synthesis, a pitch curve indicative of variation over time in fundamental frequency of the melody is synthesized in accordance with an arrangement of notes represented by a singing synthesizing score and the melody-dependent component, and the pitch curve is corrected, for each of pitch curve sections corresponding to phonemes constituting lyrics, using a phoneme-dependent component model corresponding to the phoneme. Such arrangements can accurately model a singing expression, unique to a singing person and appearing in a melody singing style of the person, while taking into account phoneme-dependent pitch variation, and thereby permit synthesis of singing voices that sound more natural.
43 Citations
14 Claims
1. A singing synthesizing database creation apparatus comprising:
an input section to which are input learning waveform data representative of sound waveforms of singing voices of a singing music piece and learning score data representative of a musical score of the singing music piece, the learning score data including note data representative of a melody and lyrics data representative of lyrics associated with individual ones of the notes;
a pitch extraction section which analyzes the learning waveform data to generate pitch data indicative of variation over time in fundamental frequency in the singing voices;
a separation section which analyzes the pitch data, for each of pitch data sections corresponding to phonemes constituting the lyrics of the singing music piece, by use of the learning score data and separates the pitch data into melody component data representative of a variation component of the fundamental frequency dependent on the melody of the singing music piece and phoneme-dependent component data representative of a variation component of the fundamental frequency dependent on the phoneme constituting the lyrics;
a first learning section which generates, in association with a combination of notes constituting the melody of the singing music piece, melody component parameters by performing predetermined machine learning using the learning score data and the melody component data, said melody component parameters defining a melody component model that represents a variation component presumed to be representative of the melody among the variation over time in fundamental frequency between notes in the singing voices, and which stores, into a singing synthesizing database, the generated melody component parameters and an identifier, indicative of the combination of notes to be associated with the melody component parameters, in association with each other; and
a second learning section which generates, for each of the phonemes, phoneme-dependent component parameters by performing predetermined machine learning using the learning score data and the phoneme-dependent component data, said phoneme-dependent component parameters defining a phoneme-dependent component model that represents a variation component of the fundamental frequency dependent on the phoneme in the singing voices, and which stores, into the singing synthesizing database, the generated phoneme-dependent component parameters and a phoneme identifier, indicative of the phoneme to be associated with the phoneme-dependent component parameters, in association with each other.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8)
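The database creation flow recited in claim 1 (separate the pitch data, learn per-note-combination and per-phoneme parameters, store each under an identifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: pitch is given per frame in semitones, the melody component is estimated by drawing a straight line across consonant frames, the "predetermined machine learning" is replaced by a plain average, and the note-combination identifier is the interval between adjacent notes. The names `separate` and `learn` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def separate(pitch, segments):
    """Split frame-wise pitch into (melody, phoneme_dependent) components.

    Assumption: inside each consonant segment the melody component is a
    straight line between the neighbouring frames; the remainder is treated
    as the phoneme-dependent component.
    segments: list of (phoneme, start_frame, end_frame, is_consonant).
    """
    melody = list(pitch)
    for _phoneme, start, end, is_consonant in segments:
        has_left, has_right = start > 0, end < len(pitch)
        if not is_consonant or not (has_left or has_right):
            continue
        left = pitch[start - 1] if has_left else pitch[end]
        right = pitch[end] if has_right else pitch[start - 1]
        n = end - start
        for i in range(n):
            melody[start + i] = left + (right - left) * (i + 1) / (n + 1)
    residual = [p - m for p, m in zip(pitch, melody)]
    return melody, residual


def learn(pitch, notes, segments):
    """Build a toy database keyed by note-combination and phoneme identifiers.

    Stand-in for machine learning: each parameter is a mean offset.
    notes: list of (midi_note, start_frame, end_frame).
    """
    melody, residual = separate(pitch, segments)

    melody_obs = defaultdict(list)
    prev = None
    for note, start, end in notes:
        if prev is not None:
            key = f"interval:{note - prev:+d}"  # hypothetical identifier
            melody_obs[key].extend(m - note for m in melody[start:end])
        prev = note

    phoneme_obs = defaultdict(list)
    for phoneme, start, end, _is_consonant in segments:
        phoneme_obs[phoneme].extend(residual[start:end])

    return {
        "melody": {k: mean(v) for k, v in melody_obs.items()},
        "phoneme": {k: mean(v) for k, v in phoneme_obs.items()},
    }
```

A real system would fit a statistical model per identifier rather than a single mean; the sketch only shows how the two components end up stored under separate keys.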
9. A singing synthesizing database creation method comprising:
a step of inputting learning waveform data representative of sound waveforms of singing voices of a singing music piece and learning score data representative of a musical score of the singing music piece, the learning score data including note data representative of a melody and lyrics data representative of lyrics associated with individual ones of the notes;
a step of analyzing the learning waveform data to generate pitch data indicative of variation over time in fundamental frequency in the singing voices;
a step of analyzing the pitch data, for each of pitch data sections corresponding to phonemes constituting the lyrics of the singing music piece, by use of the learning score data and separating the pitch data into melody component data representative of a variation component of the fundamental frequency dependent on the melody of the singing music piece and phoneme-dependent component data representative of a variation component of the fundamental frequency dependent on the phoneme constituting the lyrics;
a first learning step of generating, in association with a combination of notes constituting the melody of the singing music piece, melody component parameters by performing predetermined machine learning using the learning score data and the melody component data, said melody component parameters defining a melody component model that represents a variation component presumed to be representative of the melody among the variation over time in fundamental frequency between notes in the singing voices, said first learning step storing, into a singing synthesizing database, the generated melody component parameters and an identifier, indicative of the combination of notes to be associated with the melody component parameters, in association with each other; and
a second learning step of generating, for each of the phonemes, phoneme-dependent component parameters by performing predetermined machine learning using the learning score data and the phoneme-dependent component data, said phoneme-dependent component parameters defining a phoneme-dependent component model that represents a variation component of the fundamental frequency dependent on the phoneme in the singing voices, said second learning step storing, into the singing synthesizing database, the generated phoneme-dependent component parameters and a phoneme identifier, indicative of the phoneme to be associated with the phoneme-dependent component parameters, in association with each other.
10. A computer-readable storage medium containing a program for causing a computer to perform a singing synthesizing database creation method, said singing synthesizing database creation method comprising:
a step of inputting learning waveform data representative of sound waveforms of singing voices of a singing music piece and learning score data representative of a musical score of the singing music piece, the learning score data including note data representative of a melody and lyrics data representative of lyrics associated with individual ones of the notes;
a step of analyzing the learning waveform data to generate pitch data indicative of variation over time in fundamental frequency in the singing voices;
a step of analyzing the pitch data, for each of pitch data sections corresponding to phonemes constituting the lyrics of the singing music piece, by use of the learning score data and separating the pitch data into melody component data representative of a variation component of the fundamental frequency dependent on the melody of the singing music piece and phoneme-dependent component data representative of a variation component of the fundamental frequency dependent on the phoneme constituting the lyrics;
a first learning step of generating, in association with a combination of notes constituting the melody of the singing music piece, melody component parameters by performing predetermined machine learning using the learning score data and the melody component data, said melody component parameters defining a melody component model that represents a variation component presumed to be representative of the melody among the variation over time in fundamental frequency between notes in the singing voices, said first learning step storing, into a singing synthesizing database, the generated melody component parameters and an identifier, indicative of the combination of notes to be associated with the melody component parameters, in association with each other; and
a second learning step of generating, for each of the phonemes, phoneme-dependent component parameters by performing predetermined machine learning using the learning score data and the phoneme-dependent component data, said phoneme-dependent component parameters defining a phoneme-dependent component model that represents a variation component of the fundamental frequency dependent on the phoneme in the singing voices, said second learning step storing, into the singing synthesizing database, the generated phoneme-dependent component parameters and a phoneme identifier, indicative of the phoneme to be associated with the phoneme-dependent component parameters, in association with each other.
11. A pitch curve generation apparatus comprising:
a singing synthesizing database storing therein, separately for each individual one of a plurality of singing persons,
1) melody component parameters defining a melody component model that represents a variation component presumed to be representative of a melody among variation over time in fundamental frequency between notes in singing voices of the singing person, and
2) an identifier indicative of a combination of one or more notes of which fundamental frequency component variation over time is represented by the melody component model,
said singing synthesizing database storing therein sets of the melody component parameters and the identifiers in a form classified according to the singing persons, said singing synthesizing database also storing therein, in association with phoneme-dependent component parameters defining a phoneme-dependent component model that represents a variation component dependent on a phoneme among variation over time in the fundamental frequency, an identifier indicative of the phoneme for which the variation component is represented by the phoneme-dependent component model;
an input section to which are input singing synthesizing score data representative of a musical score of a singing music piece and information designating any one of the singing persons for which the melody component parameters are prestored in said singing synthesizing database;
a pitch curve generation section which synthesizes a pitch curve of a melody of a singing music piece, represented by the singing synthesizing score data, on the basis of a melody component model defined by the melody component parameters, stored in said singing synthesizing database for the singing person designated by the information inputted via said input section, and a time series of notes represented by the singing synthesizing score data; and
a phoneme-dependent component correction section which, for each of pitch curve sections corresponding to phonemes constituting lyrics represented by the singing synthesizing score data, corrects the pitch curve, in accordance with the phoneme-dependent component model defined by the phoneme-dependent component parameters stored for the phoneme in said singing synthesizing database, and outputs the corrected pitch curve.
(Dependent claims: 14)
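The two-stage generation recited in claim 11 (synthesize a melody pitch curve for a designated singer, then correct it section by section per phoneme) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: each stored parameter is a single semitone offset, the note-combination identifier is the interval from the previous note, notes are contiguous starting at frame 0, and the correction is additive. The database layout and the name `generate_pitch_curve` are hypothetical.

```python
def generate_pitch_curve(db, singer, notes, segments):
    """Return a frame-wise pitch curve (semitones) for the given score.

    db: {"melody": {singer: {identifier: offset}}, "phoneme": {phoneme: offset}}
    notes: list of (midi_note, start_frame, end_frame), contiguous from frame 0.
    segments: list of (phoneme, start_frame, end_frame).
    """
    # Stage 1: melody pitch curve from the designated singer's parameters.
    melody_params = db["melody"][singer]
    curve = []
    prev = None
    for note, start, end in notes:
        key = None if prev is None else f"interval:{note - prev:+d}"
        offset = melody_params.get(key, 0.0)  # unknown combination: no offset
        curve.extend([note + offset] * (end - start))
        prev = note

    # Stage 2: correct each phoneme's section with its stored parameter.
    for phoneme, start, end in segments:
        correction = db["phoneme"].get(phoneme, 0.0)
        for i in range(start, end):
            curve[i] += correction
    return curve
```

Keeping the melody parameters under a per-singer key while the phoneme parameters sit in a shared table mirrors the claim wording, which classifies only the melody component parameters by singing person.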
12. A method for generating a pitch curve by use of a singing synthesizing database storing therein, separately for each individual one of a plurality of singing persons, 1) melody component parameters defining a melody component model that represents a variation component presumed to be representative of a melody among variation over time in fundamental frequency between notes in singing voices of the singing person, and 2) an identifier indicative of a combination of one or more notes of which fundamental frequency component variation over time is represented by the melody component model, said singing synthesizing database storing therein sets of the melody component parameters and the identifiers in a form classified according to the singing persons, said singing synthesizing database also storing therein, in association with phoneme-dependent component parameters defining a phoneme-dependent component model that represents a variation component dependent on a phoneme among variation over time in the fundamental frequency, an identifier indicative of the phoneme for which the variation component is represented by the phoneme-dependent component model, said method comprising:
a step of inputting singing synthesizing score data representative of a musical score of a singing music piece and information designating any one of the singing persons for which the melody component parameters are prestored in said singing synthesizing database;
a step of synthesizing a pitch curve of a melody of a singing music piece, represented by the singing synthesizing score data, on the basis of a melody component model defined by the melody component parameters, stored in said singing synthesizing database for the singing person designated by the information inputted via said input section, and a time series of notes represented by the singing synthesizing score data; and
a step of, for each of pitch curve sections corresponding to phonemes constituting lyrics represented by the singing synthesizing score data, correcting the pitch curve, in accordance with the phoneme-dependent component model defined by the phoneme-dependent component parameters stored for the phoneme in said singing synthesizing database, and outputting the corrected pitch curve.
13. A computer-readable storage medium containing a program for causing a computer to perform a method for generating a pitch curve by use of a singing synthesizing database storing therein, separately for each individual one of a plurality of singing persons, 1) melody component parameters defining a melody component model that represents a variation component presumed to be representative of a melody among variation over time in fundamental frequency between notes in singing voices of the singing person, and 2) an identifier indicative of a combination of one or more notes of which fundamental frequency component variation over time is represented by the melody component model, said singing synthesizing database storing therein sets of the melody component parameters and the identifiers in a form classified according to the singing persons, said singing synthesizing database also storing therein, in association with phoneme-dependent component parameters defining a phoneme-dependent component model that represents a variation component dependent on a phoneme among variation over time in the fundamental frequency, an identifier indicative of the phoneme for which the variation component is represented by the phoneme-dependent component model, said method comprising:
a step of inputting singing synthesizing score data representative of a musical score of a singing music piece and information designating any one of the singing persons for which the melody component parameters are prestored in said singing synthesizing database;
a step of synthesizing a pitch curve of a melody of a singing music piece, represented by the singing synthesizing score data, on the basis of a melody component model defined by the melody component parameters, stored in said singing synthesizing database for the singing person designated by the information inputted via said input section, and a time series of notes represented by the singing synthesizing score data; and
a step of, for each of pitch curve sections corresponding to phonemes constituting lyrics represented by the singing synthesizing score data, correcting the pitch curve, in accordance with the phoneme-dependent component model defined by the phoneme-dependent component parameters stored for the phoneme in said singing synthesizing database, and outputting the corrected pitch curve.
Specification