Speech collating apparatus and speech collating method

US 6,718,306 B1
Filed: 10/17/2000
Issued: 04/06/2004
Est. Priority Date: 10/21/1999
Status: Expired due to Fees

Abstract

Speech of a registered speaker, input from an input unit, is converted by a converting unit to a sound spectrogram “A” and stored. When speech of a speaker to be identified is input from the input unit and converted to a sound spectrogram “B” by the converting unit, a detecting unit detects a partial image including a plurality of templates placed in the registered speech image A by a placing unit, and detects each of the areas on the unknown speech image B in which maximum correlation coefficients are calculated. A determining unit then compares the mutual positional relationship of the plurality of templates with the mutual positional relationship of the respective areas in which the maximum correlation coefficients are detected, and determines the identity between the registered speech and the unknown speech from the degree of difference between them.
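
The matching and comparison steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names (`ncc`, `crop`, `best_match`, `positional_difference`), the exhaustive search, the use of a normalized correlation coefficient, and the summed Euclidean offset difference as the "degree of difference" are all assumptions. Images are plain 2-D lists of gray levels (rows by columns).

```python
import math

def ncc(patch, template):
    """Normalized correlation coefficient of two equally sized 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat patches carry no information

def crop(image, top, left, height, width):
    """Cut a rectangular area out of a 2-D list."""
    return [row[left:left + width] for row in image[top:top + height]]

def best_match(image, template):
    """Return (top, left) of the area in `image` with maximum correlation."""
    h, w = len(template), len(template[0])
    best_score, best_pos = -2.0, (0, 0)
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            score = ncc(crop(image, top, left, h, w), template)
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos

def positional_difference(image_a, image_b, template_positions, size):
    """Compare the layout of templates in A with the layout of matches in B.

    Offsets are taken relative to the first template, so a uniform shift
    of B against A contributes nothing to the result (an assumed choice).
    """
    h, w = size
    matched = [best_match(image_b, crop(image_a, top, left, h, w))
               for top, left in template_positions]
    ref_a, ref_b = template_positions[0], matched[0]
    total = 0.0
    for (ta, la), (tb, lb) in zip(template_positions, matched):
        d_top = (ta - ref_a[0]) - (tb - ref_b[0])
        d_left = (la - ref_a[1]) - (lb - ref_b[1])
        total += math.hypot(d_top, d_left)
    return total
```

Under these assumptions, a small `positional_difference` would suggest that the two spectrograms share the same layout of features; the decision threshold is left open here because the source text does not state one.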

28 Citations

18 Claims

1. A speech data collating apparatus comprising:

data converting means for converting two speech signals to two items of multi-level gradation data indicative of two-dimensional speech characteristics of said two speech signals;

template setting means for setting rectangular templates on the two-dimensional speech characteristics of one of said two items of multi-level gradation data;

correlated area detecting means for detecting rectangular areas on the two-dimensional speech characteristics of the other of said two items of multi-level gradation data that have a maximum correlation with regard to the rectangular templates; and

collation determining means for comparing a mutual positional relationship of the templates with a mutual positional relationship of the rectangular areas which are detected by said correlated area detecting means to determine identity between the two speech signals.
2. The speech collating apparatus according to claim 1, further comprising registered speaker information storing means for storing multi-level gradation data corresponding to a speech signal of a registered speaker, and
3. The speech collating apparatus according to claim 2, wherein said registered speaker information storing means includes means for storing data on the rectangular templates set on the two-dimensional speech characteristics of the multi-level gradation data of the registered speaker; and wherein said correlated area detecting means includes means for detecting rectangular areas on the two-dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
4. The speech collating apparatus according to claim 1, further comprising registered speaker information storing means for storing multi-level gradation data indicative of two-dimensional speech characteristics of a speech signal of a registered speaker, wherein said template setting means includes means for setting rectangular templates on the two-dimensional speech characteristics of the multi-level gradation data of an unknown speaker; and wherein said correlated area detecting means includes means for detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
5. The speech collating apparatus according to claim 1, further comprising speech section detecting means for detecting a speech section in the multi-level gradation data, and wherein said template setting means includes means for setting templates on the speech characteristics of the multi-level gradation data in the speech section detected by said speech section detecting means.
6. The speech collating apparatus according to claim 1, wherein the multi-level gradation data comprises a sound spectrogram.
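
Claim 6 names a sound spectrogram as one form of the multi-level gradation data. A minimal sketch of such a conversion is shown below. Everything specific in it is an assumption for illustration: the frame length, hop size, Hann window, logarithmic scaling, the 256 gray levels, and the function names `spectrogram` and `to_gradation` do not come from the claims. A naive DFT is used so that the example needs only the standard library.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: rows are frequency bins, columns are frames."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    bins = frame_len // 2
    spec = [[0.0] * len(starts) for _ in range(bins)]
    for t, start in enumerate(starts):
        frame = signal[start:start + frame_len]
        # Hann window to reduce spectral leakage between bins
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)))
                    for n, x in enumerate(frame)]
        for k in range(bins):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(windowed))
            spec[k][t] = abs(acc)
    return spec

def to_gradation(spec, levels=256):
    """Quantize log magnitudes to integer gray levels 0 .. levels - 1."""
    logs = [[math.log10(v + 1e-9) for v in row] for row in spec]
    lo = min(min(row) for row in logs)
    hi = max(max(row) for row in logs)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat input
    return [[round((v - lo) / span * (levels - 1)) for v in row]
            for row in logs]

# Usage: a pure tone centered on bin 8 of a 64-sample frame
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
gray = to_gradation(spectrogram(tone))
```

The result is a 2-D grid of integer gray levels, which is the kind of data on which rectangular templates could then be placed.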

7. A speech data collating method comprising the following steps of:

converting two speech signals to two items of multi-level gradation data indicative of two-dimensional speech characteristics of said two speech signals;

setting rectangular templates on the two-dimensional speech characteristics of one of said two items of multi-level gradation data;

detecting rectangular areas on the two-dimensional speech characteristics of the other of said two items of multi-level gradation data that have a maximum correlation with regard to the rectangular templates; and

comparing a mutual positional relationship of the templates with a mutual positional relationship of the rectangular areas detected by said correlated area detecting means to determine identity between the two speech signals.
8. The speech collating method according to claim 7, further comprising:
9. The speech collating method according to claim 8, wherein said registered speaker information storing step includes storing data on the rectangular templates set on the two-dimensional speech characteristics of the multi-level gradation data of the registered speaker; and wherein said correlated area detecting step includes detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
10. The speech collating method according to claim 7, further comprising storing multi-level gradation data indicative of two-dimensional speech characteristics of a speech signal of a registered speaker, wherein said template setting step includes setting rectangular templates on two-dimensional data of an unknown speaker; and wherein said correlated area detecting step includes detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
11. The speech collating step according to claim 7, further comprising detecting a speech section in the multi-level gradation data, wherein said template setting step includes setting templates on speech characteristics of the multi-level gradation data in the speech section detected by said speech section detecting step.
12. The speech collating method according to claim 7, wherein the multi-level gradation data comprises a sound spectrogram.
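
Claim 11 restricts template placement to a detected speech section of the gradation data. The claims do not say how the section is found or how templates are distributed inside it, so the sketch below is only one plausible reading: it treats the sum of gray levels in each time column as an energy measure, keeps the span between the first and last column above a threshold, and spreads templates evenly across that span. The names `detect_speech_section` and `place_templates`, the threshold ratio, and the even spacing are all hypothetical.

```python
def detect_speech_section(gray, threshold_ratio=0.5):
    """Return (start, end) columns, end exclusive, of the active span.

    `gray` is a 2-D list (rows are frequency bins, columns are frames).
    Returns None when no column rises above the threshold.
    """
    n_cols = len(gray[0])
    energy = [sum(row[t] for row in gray) for t in range(n_cols)]
    threshold = threshold_ratio * max(energy)
    active = [t for t, e in enumerate(energy) if e > threshold]
    if not active:
        return None
    return active[0], active[-1] + 1

def place_templates(section, n_rows, size, count):
    """Spread `count` templates of `size` (height, width) across the section.

    Returns a list of (top, left) positions, vertically centered.
    """
    start, end = section
    height, width = size
    top = (n_rows - height) // 2
    usable = end - start - width  # room left for sliding the template
    if usable < 0 or top < 0:
        return []
    if count == 1:
        return [(top, start + usable // 2)]
    return [(top, start + (usable * i) // (count - 1)) for i in range(count)]

# Usage: 4 frequency rows, 10 frames, with activity in frames 3 to 7
gray = [[200 if 3 <= t <= 7 else 0 for t in range(10)] for _ in range(4)]
section = detect_speech_section(gray)
positions = place_templates(section, n_rows=4, size=(2, 2), count=3)
```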

13. A program recording medium having computer readable program codes, comprising:

first program code means for converting two speech signals to two items of multi-level gradation data indicative of two-dimensional speech characteristics of said two speech signals;

second program code means for setting rectangular templates on the two-dimensional speech characteristics of one of said two items of multi-level gradation data;

third program code means for detecting rectangular areas on the two-dimensional speech characteristics of the other of said two items of multi-level gradation data that have a maximum correlation with regard to the rectangular templates; and

fourth program code means for comparing a mutual positional relationship of the templates with a mutual positional relationship of the rectangular areas which are detected by said third program code means to determine identity between the two speech signals.
14. The program recording medium according to claim 13, further comprising registered speaker information storing program code means for storing multi-level gradation data corresponding to a speech signal of a registered speaker, and
15. The program recording medium according to claim 14, wherein said registered speaker information storing program code means includes program code means for storing data on the rectangular templates set on the two-dimensional speech characteristics of the multi-level gradation data of the registered speaker; wherein said third program code means includes program code means for detecting rectangular areas on the two-dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
16. The program recording medium according to claim 13, further comprising program code means for storing multi-level gradation data indicative of two-dimensional speech characteristics of a speech signal of a registered speaker, wherein said second program code means includes program code means for setting rectangular templates on the two-dimensional speech characteristics of the multi-level gradation data of an unknown speaker; and wherein said third program code means includes program code means for detecting rectangular areas on the two dimensional speech characteristics of the multi-level gradation data of the unknown speaker that have a maximum correlation with regard to the rectangular templates.
17. The program recording medium according to claim 13, further comprising speech section detecting program code means for detecting a speech section in the multi-level gradation data, wherein said second program code means includes program code means for setting templates on the speech characteristics of the multi-level gradation data in the speech section detected by said speech section detecting means.
18. The program recording medium according to claim 13, wherein the multi-level gradation data comprises a sound spectrogram.

Current Assignee: Casio Computer Company Limited
Original Assignee: Casio Computer Company Limited
Inventors: Takeda, Tsuneharu; Satoh, Katsuhiko
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Han, Qi
Application Number: US09/690,669
Time in Patent Office: 1,267 Days
Field of Search: 704/243, 704/251, 704/246, 382/190
US Class Current: 704/246
CPC Class Codes:
G10L 17/00   Speaker identification or v...
G10L 17/08   Use of distortion metrics o...
G10L 21/06   Transformation of speech in...
