Speech collating apparatus and speech collating method
Abstract
A speech of a registered speaker input from an input unit is converted by a converting unit to a sound spectrogram “A” and stored. When a speech of a speaker to be identified is input from the input unit and converted to a sound spectrogram “B” by the converting unit, a detecting unit detects a partial image including a plurality of templates placed on the registered speech image A by a placing unit, and detects the areas on the unknown speech image B at which the maximum correlation coefficients are calculated. A determining unit then compares the mutual positional relationship of the plurality of templates with the mutual positional relationship of the respective areas at which the maximum correlation coefficients are detected, and determines from the degree of difference between them whether the registered speech and the unknown speech are identical.
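As a rough illustration of the matching flow summarized in the abstract, the following Python sketch cuts rectangular templates out of one gradation image, finds for each template the position of maximum correlation on the other image, and then compares the relative template positions with the relative match positions. The function names (`correlation`, `patch`, `best_match`, `positions_consistent`), the exhaustive search, the choice of the first template as positional reference, and the `tolerance` parameter are assumptions made for illustration; the source does not specify them.

```python
import math


def patch(img, top, left, h, w):
    """Cut an h-by-w rectangle out of a 2-D list of gradation levels."""
    return [row[left:left + w] for row in img[top:top + h]]


def correlation(a, b):
    """Correlation coefficient between two equally sized rectangles."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                     sum((y - my) ** 2 for y in ys))
    # A flat rectangle has no defined correlation; treat it as 0 (assumption).
    return cov / norm if norm else 0.0


def best_match(img_b, template):
    """Return (top, left) of the rectangle on img_b with maximum correlation."""
    h, w = len(template), len(template[0])
    best_score, best_pos = -2.0, (0, 0)
    for top in range(len(img_b) - h + 1):
        for left in range(len(img_b[0]) - w + 1):
            score = correlation(template, patch(img_b, top, left, h, w))
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos


def positions_consistent(origins, matches, tolerance):
    """Compare mutual positions of templates with those of the matched areas.

    Offsets are measured relative to the first template and its match; the
    two signals are judged identical when every offset differs by at most
    `tolerance` cells along each axis (hypothetical decision rule).
    """
    (ra, ca), (rb, cb) = origins[0], matches[0]
    for (ta, la), (tb, lb) in zip(origins[1:], matches[1:]):
        if abs((ta - ra) - (tb - rb)) > tolerance:
            return False
        if abs((la - ca) - (lb - cb)) > tolerance:
            return False
    return True
```

A caller would pick template origins on image A, collect `best_match(img_b, patch(img_a, top, left, h, w))` for each origin, and pass both position lists to `positions_consistent`. Comparing offsets rather than absolute positions keeps the decision insensitive to a uniform shift of the utterance in time.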
28 Citations
18 Claims
1. A speech data collating apparatus comprising:
data converting means for converting two speech signals to two items of multi-level gradation data indicative of two-dimensional speech characteristics of said two speech signals;
template setting means for setting rectangular templates on the two-dimensional speech characteristics of one of said two items of multi-level gradation data;
correlated area detecting means for detecting rectangular areas on the two-dimensional speech characteristics of the other of said two items of multi-level gradation data that have a maximum correlation with regard to the rectangular templates; and
collation determining means for comparing a mutual positional relationship of the templates with a mutual positional relationship of the rectangular areas which are detected by said correlated area detecting means to determine identity between the two speech signals.
2. The speech collating apparatus according to claim 1, [...]
wherein said template setting means includes means for setting rectangular templates on the two-dimensional speech characteristics of the multi-gradation level data of the registered speaker read-out from said registered speaker information storing means; and
wherein said correlated area detecting means includes means for detecting rectangular areas on the two-dimensional speech characteristics of the multi-gradation data of an unknown speaker that have a maximum correlation with regard to the rectangular templates.
3. The speech collating apparatus according to claim 2,
wherein said registered speaker information storing means includes means for storing data on the rectangular templates set on the two-dimensional speech characteristics of the multi-level gradation data of the registered speaker; and
wherein said correlated area detecting means includes means for detecting rectangular areas on the two-dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
4. The speech collating apparatus according to claim 1, further comprising registered speaker information storing means for storing multi-level gradation data indicative of two-dimensional speech characteristics of a speech signal of a registered speaker,
wherein said template setting means includes means for setting rectangular templates on the two-dimensional speech characteristics of the multi-level gradation data of an unknown speaker; and
wherein said correlated area detecting means includes means for detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
5. The speech collating apparatus according to claim 1, further comprising speech section detecting means for detecting a speech section in the multi-level gradation data, and
wherein said template setting means includes means for setting templates on the speech characteristics of the multi-level gradation data in the speech section detected by said speech section detecting means.
6. The speech collating apparatus according to claim 1, wherein the multi-level gradation data comprises a sound spectrogram.
7. A speech data collating method comprising the following steps of:
converting two speech signals to two items of multi-level gradation data indicative of two-dimensional speech characteristics of said two speech signals;
setting rectangular templates on the two-dimensional speech characteristics of one of said two items of multi-level gradation data;
detecting rectangular areas on the two-dimensional speech characteristics of the other of said two items of multi-level gradation data that have a maximum correlation with regard to the rectangular templates; and
comparing a mutual positional relationship of the templates with a mutual positional relationship of the rectangular areas detected by said correlated area detecting means to determine identity between the two speech signals.
8. The speech collating method according to claim 7, further comprising:
storing the multi-level gradation data corresponding to a speech signal of a registered speaker;
wherein said template setting step comprises setting rectangular templates on the two dimensional speech characteristics of the multi-level gradation data of the registered speaker; and
wherein said correlated area detecting step comprises detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of an unknown speaker that have a maximum correlation with regard to the rectangular templates.
9. The speech collating method according to claim 8,
wherein said registered speaker information storing step includes storing data on the rectangular templates set on the two-dimensional speech characteristics of the multi-level gradation data of the registered speaker; and
wherein said correlated area detecting step includes detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
10. The speech collating method according to claim 7, further comprising storing multi-level gradation data indicative of two-dimensional speech characteristics of a speech signal of a registered speaker,
wherein said template setting step includes setting rectangular templates on two-dimensional data of an unknown speaker; and
wherein said correlated area detecting step includes detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
11. The speech collating step according to claim 7, further comprising detecting a speech section in the multi-level gradation data,
wherein said template setting step includes setting templates on speech characteristics of the multi-level gradation data in the speech section detected by said speech section detecting step.
12. The speech collating method according to claim 7, wherein the multi-level gradation data comprises a sound spectrogram.
13. A program recording medium having computer readable program codes, comprising:
first program code means for converting two speech signals to two items of multi-level gradation data indicative of two-dimensional speech characteristics of said two speech signals;
second program code means for setting rectangular templates on the two-dimensional speech characteristics of one of said two items of multi-level gradation data;
third program code means for detecting rectangular areas on the two-dimensional speech characteristics of the other of said two items of multi-level gradation data that have a maximum correlation with regard to the rectangular templates; and
fourth program code means for comparing a mutual positional relationship of the templates with a mutual positional relationship of the rectangular areas which are detected by said third program code means to determine identity between the two speech signals.
14. The program recording medium according to claim 13, [...]
wherein said second program code means includes program code means for setting rectangular templates on the two-dimensional speech characteristics of the multi-gradation level data of the registered speaker; and
wherein said third program code means includes program code means for detecting rectangular areas on the two-dimensional speech characteristics of the multi-gradation data of an unknown speaker that have a maximum correlation with regard to the rectangular templates.
15. The program recording medium according to claim 14,
wherein said registered speaker information storing program code means includes program code means for storing data on the rectangular templates set on the two-dimensional speech characteristics of the multi-level gradation data of the registered speaker; and
wherein said third program code means includes program code means for detecting rectangular areas on the two-dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
16. The program recording medium according to claim 13, further comprising program code means for storing multi-level gradation data indicative of two-dimensional speech characteristics of a speech signal of a registered speaker,
wherein said second program code means includes program code means for setting rectangular templates on the two-dimensional speech characteristics of the multi-level gradation data of an unknown speaker; and
wherein said third program code means includes program code means for detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
17. The program recording medium according to claim 13, further comprising speech section detecting program code means for detecting a speech section in the multi-level gradation data,
wherein said second program code means includes program code means for setting templates on the speech characteristics of the multi-level gradation data in the speech section detected by said speech section detecting means.
18. The program recording medium according to claim 13, wherein the multi-level gradation data comprises a sound spectrogram.
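The conversion and speech-section steps recited in the claims (a sound spectrogram as multi-level gradation data, and detection of a speech section within that data) might be sketched as follows. This is only an illustrative sketch under stated assumptions: the naive DFT, the linear scaling of magnitudes to integer levels, the frame sizes, and the threshold rule on the per-frame sum of levels are all hypothetical choices, as are the names `spectrogram` and `speech_section`; the claims do not prescribe any of them.

```python
import cmath
import math


def spectrogram(signal, frame_len=16, hop=8, levels=256):
    """Convert a 1-D signal to gradation data indexed as [freq_bin][frame].

    Each cell is an integer level in 0..levels-1, obtained by scaling the
    DFT magnitude linearly against the largest magnitude (assumption).
    """
    starts = range(0, len(signal) - frame_len + 1, hop)
    n_bins = frame_len // 2 + 1
    mags = []
    for start in starts:
        frame = signal[start:start + frame_len]
        column = []
        for k in range(n_bins):
            # Naive DFT of one frame at frequency bin k.
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            column.append(abs(s))
        mags.append(column)
    peak = max((m for column in mags for m in column), default=0.0)
    scale = (levels - 1) / peak if peak else 0.0
    # Transpose so rows are frequency bins and columns are time frames.
    return [[round(mags[t][k] * scale) for t in range(len(mags))]
            for k in range(n_bins)]


def speech_section(gradation, threshold):
    """Return (first_frame, last_frame) of the detected speech section.

    A frame counts as speech when the sum of its gradation levels over all
    frequency bins exceeds `threshold`; returns None if no frame qualifies.
    """
    n_frames = len(gradation[0]) if gradation else 0
    active = [t for t in range(n_frames)
              if sum(row[t] for row in gradation) > threshold]
    return (active[0], active[-1]) if active else None
```

Templates would then be set only on columns between the returned frame indices, so that silent margins before and after the utterance do not contribute to the correlation search.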
Specification