Speech information processing apparatus and method
First Claim
1. An apparatus for processing speech information, comprising:
a first generation unit configured to generate a temporary child set to which at least one fundamental frequency pattern belongs by classifying a plurality of fundamental frequency patterns of inputted speech data based on a classification item of a context of the inputted speech data;
a first decision unit configured to decide a length of a temporary typical pattern of the temporary child set;
a model pattern setting unit configured to set a model pattern having an elastic section along a temporal direction;
a calculation unit configured to calculate an elastic ratio of the elastic section so that the length of the temporary typical pattern coincides with a length of the model pattern;
an elastic unit configured to expand or contract the elastic section of the model pattern based on the elastic ratio;
a second generation unit configured to generate the temporary typical pattern of the temporary child set by combining the fundamental frequency pattern belonging to the temporary child set with the model pattern having an elastic pattern expanded or contracted;
a second decision unit configured to calculate a distortion between the temporary typical pattern of the temporary child set and the fundamental frequency pattern belonging to the temporary child set, and to decide a child set as the temporary child set when the distortion is below a threshold;
a pattern storage unit configured to store a typical pattern as the temporary typical pattern of the child set; and
a rule storage unit configured to store a classification rule of the typical pattern as the classification item of the context of the fundamental frequency pattern belonging to the child set.
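The calculation unit and the elastic unit recited above can be illustrated with a minimal sketch. It is not taken from the patent: it assumes that pattern lengths are counted in samples, that only the elastic section changes length while the rest of the model pattern stays fixed, and that the elastic section is resampled by linear interpolation. The names `elastic_ratio`, `resample`, and `stretch_model` are hypothetical.

```python
def elastic_ratio(target_len, fixed_len, elastic_len):
    """Ratio r such that fixed_len + r * elastic_len == target_len (assumed definition)."""
    if elastic_len <= 0:
        raise ValueError("elastic section must contain at least one sample")
    if target_len < fixed_len:
        raise ValueError("target length is shorter than the non-elastic part")
    return (target_len - fixed_len) / elastic_len


def resample(values, new_len):
    """Linearly interpolate `values` onto `new_len` evenly spaced points."""
    if new_len <= 0:
        return []
    if new_len == 1 or len(values) == 1:
        return [values[0]] * new_len
    step = (len(values) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * step
        lo = min(int(pos), len(values) - 2)  # keep lo + 1 inside the list
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


def stretch_model(model, start, end, target_len):
    """Expand or contract model[start:end] so the whole pattern has target_len samples."""
    head, elastic, tail = model[:start], model[start:end], model[end:]
    ratio = elastic_ratio(target_len, len(head) + len(tail), len(elastic))
    new_elastic_len = round(ratio * len(elastic))
    return head + resample(elastic, new_elastic_len) + tail, ratio


# Hypothetical model pattern whose elastic section is samples 2..3.
model = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
stretched, ratio = stretch_model(model, 2, 4, 7)
print(ratio, stretched)  # 1.5 [0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0]
```

In this sketch a ratio above 1 expands the elastic section and a ratio below 1 contracts it; the samples outside the elastic section are copied unchanged.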
Abstract
A temporary child set is generated by classifying fundamental frequency patterns of inputted speech data based on a classification item of their context. An elastic ratio of an elastic section of a model pattern is calculated. A temporary typical pattern of the temporary child set is generated by combining the fundamental frequency pattern belonging to that set with the model pattern whose elastic section has been expanded or contracted. A distortion between the temporary typical pattern and the fundamental frequency pattern belonging to the temporary child set is calculated, and the temporary child set is decided as a child set when the distortion is below a threshold. The temporary typical pattern of the child set is stored as a typical pattern, together with a classification rule given by the classification item of the context of the fundamental frequency pattern belonging to the child set.
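The distortion test in the abstract can be sketched as follows. The text does not say how distortion is measured, so this example assumes a mean squared difference and assumes that every pattern has already been time-normalized to the length of the temporary typical pattern. The function names are hypothetical.

```python
def distortion(typical, members):
    """Mean squared difference between the typical pattern and its member patterns."""
    if not members:
        raise ValueError("a temporary child set holds at least one pattern")
    total, count = 0.0, 0
    for pattern in members:
        if len(pattern) != len(typical):
            raise ValueError("patterns must share the typical pattern's length")
        total += sum((p - t) ** 2 for p, t in zip(pattern, typical))
        count += len(pattern)
    return total / count


def accept_child_set(typical, members, threshold):
    """True when the temporary child set is decided as a child set."""
    return distortion(typical, members) < threshold


# Hypothetical F0 values; the distortion here is (0 + 0 + 1 + 1) / 4 = 0.5.
typical = [1.0, 2.0]
members = [[1.0, 2.0], [2.0, 3.0]]
print(accept_child_set(typical, members, threshold=1.0))  # True
```

The comparison is strict, matching "below a threshold": a distortion equal to the threshold does not qualify.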
14 Claims
1. An apparatus for processing speech information, comprising:
a first generation unit configured to generate a temporary child set to which at least one fundamental frequency pattern belongs by classifying a plurality of fundamental frequency patterns of inputted speech data based on a classification item of a context of the inputted speech data;
a first decision unit configured to decide a length of a temporary typical pattern of the temporary child set;
a model pattern setting unit configured to set a model pattern having an elastic section along a temporal direction;
a calculation unit configured to calculate an elastic ratio of the elastic section so that the length of the temporary typical pattern coincides with a length of the model pattern;
an elastic unit configured to expand or contract the elastic section of the model pattern based on the elastic ratio;
a second generation unit configured to generate the temporary typical pattern of the temporary child set by combining the fundamental frequency pattern belonging to the temporary child set with the model pattern having an elastic pattern expanded or contracted;
a second decision unit configured to calculate a distortion between the temporary typical pattern of the temporary child set and the fundamental frequency pattern belonging to the temporary child set, and to decide a child set as the temporary child set when the distortion is below a threshold;
a pattern storage unit configured to store a typical pattern as the temporary typical pattern of the child set; and
a rule storage unit configured to store a classification rule of the typical pattern as the classification item of the context of the fundamental frequency pattern belonging to the child set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method for processing speech information, comprising:
generating a temporary child set to which at least one fundamental frequency pattern belongs by classifying a plurality of fundamental frequency patterns of inputted speech data based on a classification item of a context of the inputted speech data;
deciding a length of a temporary typical pattern of the temporary child set;
setting a model pattern having an elastic section along a temporal direction;
calculating an elastic ratio of the elastic section so that the length of the temporary typical pattern coincides with a length of the model pattern;
expanding or contracting the elastic section of the model pattern based on the elastic ratio;
generating the temporary typical pattern of the temporary child set by combining the fundamental frequency pattern belonging to the temporary child set with the model pattern having an elastic pattern expanded or contracted;
calculating a distortion between the temporary typical pattern of the temporary child set and the fundamental frequency pattern belonging to the temporary child set;
deciding a child set as the temporary child set when the distortion is below a threshold;
storing a typical pattern as the temporary typical pattern of the child set; and
storing a classification rule of the typical pattern as the classification item of the context of the fundamental frequency pattern belonging to the child set.
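The first and last steps of the method (generating temporary child sets by a context classification item, and storing the classification rule for an accepted child set) might look like the sketch below. The context keys `accent_type` and `mora_count`, the dictionary-based rule store, and both function names are illustrative assumptions; the claim names no particular context items or storage layout.

```python
from collections import defaultdict


def generate_temporary_child_sets(samples, item):
    """Group F0 patterns by the value of one context classification item.

    `samples` is a list of (context, f0_pattern) pairs, where `context`
    is a dict of context attributes (hypothetical representation).
    """
    child_sets = defaultdict(list)
    for context, f0_pattern in samples:
        child_sets[context[item]].append(f0_pattern)
    return dict(child_sets)


def store_rule(rule_store, child_id, item, value):
    """Record which classification item and value select this child set."""
    rule_store[child_id] = {"item": item, "value": value}


# Hypothetical speech data: context attributes plus an F0 contour in Hz.
samples = [
    ({"accent_type": 0, "mora_count": 3}, [120.0, 130.0]),
    ({"accent_type": 1, "mora_count": 3}, [110.0, 140.0]),
    ({"accent_type": 0, "mora_count": 4}, [125.0, 128.0]),
]
sets_by_accent = generate_temporary_child_sets(samples, "accent_type")

rules = {}
for child_id, value in enumerate(sorted(sets_by_accent)):
    store_rule(rules, child_id, "accent_type", value)
```

Each group produced here always holds at least one pattern, which matches the claim's requirement that at least one fundamental frequency pattern belongs to a temporary child set.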
14. A non-transitory computer readable medium that stores computer executable instructions for causing a computer to perform a method for processing speech information, the method comprising:
generating a temporary child set to which at least one fundamental frequency pattern belongs by classifying a plurality of fundamental frequency patterns of inputted speech data based on a classification item of a context of the inputted speech data;
deciding a length of a temporary typical pattern of the temporary child set;
setting a model pattern having an elastic section along a temporal direction;
calculating an elastic ratio of the elastic section so that the length of the temporary typical pattern coincides with a length of the model pattern;
expanding or contracting the elastic section of the model pattern based on the elastic ratio;
generating the temporary typical pattern of the temporary child set by combining the fundamental frequency pattern belonging to the temporary child set with the model pattern having an elastic pattern expanded or contracted;
calculating a distortion between the temporary typical pattern of the temporary child set and the fundamental frequency pattern belonging to the temporary child set;
deciding a child set as the temporary child set when the distortion is below a threshold;
storing a typical pattern as the temporary typical pattern of the child set; and
storing a classification rule of the typical pattern as the classification item of the context of the fundamental frequency pattern belonging to the child set.
Specification