SPEAKER SELECTING DEVICE, SPEAKER ADAPTIVE MODEL CREATING DEVICE, SPEAKER SELECTING METHOD, SPEAKER SELECTING PROGRAM, AND SPEAKER ADAPTIVE MODEL MAKING PROGRAM
Abstract
To enable accurate and stable selection of a speaker whose acoustic feature value is similar to that of the uttering speaker, while following changes even when the speaker's acoustic feature value changes from moment to moment. A speaker score calculating means (22) calculates a long-time speaker score (the log likelihood of each of a plurality of speaker models stored in a speaker model storage section (31) with respect to the acoustic feature value) based on, for example, an arbitrary number of utterances, and calculates a short-time speaker score based on, for example, a short-time utterance. A long-time speaker selecting means (23) selects speakers corresponding to a predetermined number of speaker models having high long-time speaker scores. A short-time speaker selecting means (24) selects, from among the speakers selected by the long-time speaker selecting means (23), speakers corresponding to speaker models whose short-time speaker scores are high, the number of which is smaller than the predetermined number.
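The two-stage narrowing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes each speaker model is a single diagonal-covariance Gaussian given as a `(mean, variance)` pair, whereas the abstract only says "speaker models" and "log likelihood". The names `log_likelihood`, `select_speakers`, `n_long`, and `n_short` are hypothetical.

```python
import math


def log_likelihood(frames, model):
    """Sum of per-frame log densities under a diagonal Gaussian.

    `model` is an assumed (mean, variance) pair of equal-length lists;
    the source does not specify the form of a speaker model.
    """
    mean, var = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def select_speakers(models, long_frames, short_frames, n_long, n_short):
    """Pick n_long speakers by long-time score, then n_short of those
    by short-time score (n_short must be smaller than n_long)."""
    if not 0 < n_short < n_long:
        raise ValueError("n_short must be positive and smaller than n_long")

    # Stage 1: long-time score over many utterances, keep the top n_long.
    long_scores = {name: log_likelihood(long_frames, m)
                   for name, m in models.items()}
    candidates = sorted(long_scores, key=long_scores.get, reverse=True)[:n_long]

    # Stage 2: short-time score over the latest utterance, computed only
    # for the stage-1 candidates, keep the top n_short.
    short_scores = {name: log_likelihood(short_frames, models[name])
                    for name in candidates}
    return sorted(candidates, key=short_scores.get, reverse=True)[:n_short]


# Toy one-dimensional models (illustrative values only).
models = {
    "A": ([0.0], [1.0]),
    "B": ([1.0], [1.0]),
    "C": ([5.0], [1.0]),
    "D": ([10.0], [1.0]),
}
long_frames = [[0.5], [0.6], [0.4], [1.2], [0.9]]
print(select_speakers(models, long_frames, [[0.1]], n_long=3, n_short=1))
```

In this sketch the long-time stage acts as a stable pre-filter, so a speaker it rejects (here "D") cannot be chosen by the short-time stage even if a single short utterance happens to match that model well.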
42 Claims
1-21. (canceled)

22. A speaker selecting device comprising:
a speaker model storage means that stores a plurality of speaker models;
an acoustic feature value calculating means that calculates a feature value from received voice signals; and
a speaker score calculating means that calculates a likelihood of each of the plurality of speaker models stored in the speaker model storage means with respect to the feature value calculated by the acoustic feature value calculating means,
wherein the speaker score calculating means calculates a first likelihood and a second likelihood based on the voice signals of two relatively different time lengths,
the speaker score calculating means comprises:
a first selection means that selects speakers corresponding to a predetermined number of speaker models the first likelihood of which is high; and
a second selection means that narrows the speakers selected by the first selection means down to speakers the number of which is smaller than the predetermined number and the second likelihood of which is high, and
the speaker score calculating means sequentially outputs information corresponding to the speakers selected by the second selection means.
Dependent claims: 23, 24, 25, 26, 27, 28, 29

30. A speaker selecting method comprising:
storing a plurality of speaker models in advance;
calculating a feature value from received voice signals;
calculating a first likelihood and a second likelihood based on the voice signals of two relatively different time lengths, for each of the plurality of speaker models stored with respect to the calculated feature value and selecting speakers using the calculated likelihood, the method comprising:
selecting speakers corresponding to a predetermined number of speaker models the first likelihood of which is high;
narrowing speakers selected as the speakers corresponding to the predetermined number of speaker models the first likelihood of which is high, down to speaker models the number of which is smaller than the predetermined number and the second likelihood of which is high; and
sequentially outputting information corresponding to speakers narrowed down to the speaker models the number of which is smaller than the predetermined number and the second likelihood of which is high.
Dependent claims: 31, 32, 33, 34

35. A storage medium for recording a speaker selecting program for causing a computer which performs a speaker selection processing for selecting speakers using speaker models stored in a speaker model storage means that stores a plurality of speaker models, to execute:
a speaker score calculation processing for calculating a first likelihood and a second likelihood based on the voice signals of two relatively different time lengths;
a first selection processing for selecting speakers corresponding to a predetermined number of speakers the first likelihood of which is high;
a second selection processing for narrowing the speakers selected in the first selection processing down to speaker models the number of which is smaller than the predetermined number and the second likelihood of which is high; and
a processing for sequentially outputting information corresponding to the speakers selected in the second selection processing.
Dependent claims: 36, 37, 38, 39, 40, 41, 42

Specification