Voice recognition apparatus

US 6,604,073 B2
Filed: 09/12/2001
Issued: 08/05/2003
Est. Priority Date: 09/12/2000
Status: Expired due to Fees

Abstract

Disclosed is a voice recognition apparatus which can prevent an erroneous manipulation due to erroneous voice recognition from being carried out even in a noisy environment. As long as a duration of utterance acquired based on the level of a voice signal uttered by an operator (user) approximately coincides with a duration of utterance acquired based on mouth image data acquired by capturing the mouth of the operator, the voice recognition apparatus outputs vocal-manipulation phrase data as the result of voice recognition.
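
The gating idea in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the interval representation, the function names `durations_coincide` and `gate_recognition`, and the 0.3 second tolerance are all assumptions chosen for illustration, since the abstract only says the two durations must "approximately coincide".

```python
from typing import Optional, Tuple

# Hypothetical representation: an utterance duration as (start_s, end_s) in seconds.
Interval = Tuple[float, float]


def durations_coincide(audio: Interval, mouth: Interval,
                       tolerance_s: float = 0.3) -> bool:
    """Return True when both start and end times differ by at most tolerance_s.

    The tolerance is an assumed parameter; the source does not give a value.
    """
    return (abs(audio[0] - mouth[0]) <= tolerance_s
            and abs(audio[1] - mouth[1]) <= tolerance_s)


def gate_recognition(phrase: str, audio: Interval, mouth: Interval,
                     tolerance_s: float = 0.3) -> Optional[str]:
    """Pass the recognized phrase through only if the two durations agree."""
    if durations_coincide(audio, mouth, tolerance_s):
        return phrase
    return None  # treated as a likely misrecognition, e.g. caused by noise


if __name__ == "__main__":
    print(gate_recognition("volume up", (1.0, 2.0), (1.1, 2.1)))
    print(gate_recognition("volume up", (1.0, 4.0), (1.1, 2.1)))
```

In the second call the audio-based duration runs well past the point where the mouth stopped moving, as might happen with background noise, so the phrase is withheld.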

13 Claims

1. A speech recognition apparatus for recognizing speech uttered by an operator, comprising:
  a portion for performing a speech recognition process on a voice signal corresponding to said speech to thereby acquire vocal phrase data indicating the uttered phrase;
  
  a portion for detecting a point of time when said operator has started uttering said speech and a point of time when said operator has ended uttering said speech on the basis of a signal level of said voice signal to thereby output first utterance duration information;
  
  a portion for capturing a mouth of said operator to acquire mouth image data;
  
  a portion for detecting a point of time when said operator has started uttering said speech and a point of time when said operator has ended uttering said speech on the basis of said mouth image data to thereby output second utterance duration information; and
  
  a controller for outputting said vocal phrase data as long as said first utterance duration information is approximate to said second utterance duration information.
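
Claim 1 derives the first utterance duration from the signal level of the voice signal. One simple way to do that, sketched here under assumptions, is to threshold a per-frame level sequence and take the first and last frames at or above the threshold. The function name `detect_utterance`, the framing of the signal, and the threshold value are hypothetical; the claim does not specify how the level is measured or compared.

```python
from typing import Optional, Sequence, Tuple


def detect_utterance(levels: Sequence[float], threshold: float,
                     frame_s: float) -> Optional[Tuple[float, float]]:
    """Return (start_s, end_s) spanning the frames whose level meets the threshold.

    levels    -- one level value per fixed-length frame (assumed input format)
    threshold -- assumed level above which a frame counts as speech
    frame_s   -- frame length in seconds
    """
    active = [i for i, level in enumerate(levels) if level >= threshold]
    if not active:
        return None  # no frame reached the threshold, so no utterance found
    start_s = active[0] * frame_s
    end_s = (active[-1] + 1) * frame_s  # end of the last active frame
    return (start_s, end_s)


if __name__ == "__main__":
    frame_levels = [0.0, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0]
    print(detect_utterance(frame_levels, threshold=0.5, frame_s=0.5))
```

The same thresholding could in principle be applied to a per-frame mouth-opening measure to obtain the second utterance duration, though the claim leaves that method open as well.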

2. A speech recognition apparatus for recognizing speech uttered by an operator to thereby acquire vocal phrase data representing a phrase indicated by said speech, comprising:
  a portion for performing a speech recognition process on a voice signal corresponding to said speech to thereby acquire a plurality of vocal phrase data candidates;
  
  a portion for detecting a point of time when said operator has started uttering said speech and a point of time when said operator has ended uttering said speech on the basis of a signal level of said voice signal to thereby generate first utterance duration information;
  
  a portion for capturing a mouth of said operator to acquire mouth image data;
  
  a portion for detecting a point of time when said operator has started uttering said speech and a point of time when said operator has ended uttering said speech on the basis of said mouth image data to thereby generate second utterance duration information;
  
  a portion for counting the number of changes in a shape of said mouth in a duration of utterance indicated by said second utterance duration information on the basis of said mouth image data to thereby generate number-of-mouth-shape-change information; and
  
  a portion for selecting that one of said vocal phrase data candidates which has a count of changes in said mouth equal to the count indicated by said number-of-mouth-shape-changes information and outputting said selected vocal phrase data candidate as said vocal phrase data, as long as said first utterance duration information is approximate to said second utterance duration information.
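
The candidate selection in claim 2 can be sketched as follows. This is an illustration under assumptions, not the claimed apparatus: mouth shapes are modeled as discrete labels per video frame, each candidate is assumed to carry an expected change count, and the names `count_shape_changes` and `select_candidate` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def count_shape_changes(shapes: Sequence[str]) -> int:
    """Count transitions between consecutive, differing mouth-shape labels."""
    return sum(1 for prev, cur in zip(shapes, shapes[1:]) if prev != cur)


def select_candidate(candidates: List[Tuple[str, int]],
                     observed_changes: int) -> Optional[str]:
    """Return the first candidate whose expected change count equals the observed one.

    candidates -- (phrase, expected_mouth_shape_changes) pairs; the expected
                  counts are assumed to be stored alongside each phrase.
    """
    for phrase, expected_changes in candidates:
        if expected_changes == observed_changes:
            return phrase
    return None  # no candidate matches the observed count


if __name__ == "__main__":
    frames = ["closed", "open", "open", "round", "closed"]
    observed = count_shape_changes(frames)
    print(observed)
    print(select_candidate([("zoom in", 5), ("volume up", 3)], observed))
```

Per the claim, this selection would only run when the two utterance durations are approximately equal; that check is omitted here to keep the sketch focused on the counting step.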

3. A speech recognition apparatus for recognizing words uttered by a speaker, comprising:
  a first detection circuit which detects a talk start time and a talk end time of the speaker on the basis of a speech signal, and thereafter outputs first utterance duration information;
  
  a second detection circuit which detects a talk start time and a talk end time of the speaker on the basis of mouth image data, and thereafter outputs second utterance duration information; and
  
  a controller which receives the outputted first and second utterance duration information and compares at least a portion of the first utterance duration information to at least a portion of the second utterance duration information.

4. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 3, further comprising:
5. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 4, wherein, when the controller determines that the first utterance duration information and second utterance duration information have a certain relationship, the controller compares the first mouth shape change information to the second mouth shape change information.
6. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 5, wherein, when the controller determines that the first mouth-shape change information and the second mouth shape change information do not have a certain relationship, the controller outputs a signal requesting the speaker to reutter the words.
7. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 5, wherein, when the controller determines that the first utterance duration information and the second utterance duration information do not have a certain relationship, the controller outputs a signal requesting the speaker to reutter the words.
8. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 5, further comprising a circuit which acquires vocal phrase data corresponding to the words uttered by the speaker.
9. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 8, wherein, when the controller determines that the first mouth shape change information and the second utterance duration information have a certain relationship, the controller outputs said vocal phrase data.
10. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 9, wherein, when the controller determines that the first mouth shape change information and the second utterance duration information do not have a certain relationship, the controller outputs a signal requesting the speaker to reutter the words.
11. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 3, wherein, when the controller determines that the first utterance duration information and the second utterance duration information do not have a certain relationship, the controller outputs a signal requesting the speaker to reutter the words.
12. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 3, further comprising a circuit which acquires vocal phrase data corresponding to the words uttered by the speaker.
13. A speech recognition apparatus for recognizing words uttered by a speaker according to claim 12, wherein, when the controller determines that the first utterance duration information and the second utterance duration information have a certain relationship, the controller outputs said vocal phrase data.
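
The controller behavior described across claims 5 to 7 and 11 to 13 can be read as a two-stage check that either outputs the phrase or asks the speaker to repeat. The sketch below is one possible reading under stated assumptions: the "certain relationship" between durations is modeled as agreement within a tolerance, the mouth shape change information is modeled as two integer counts compared for equality (the listing above does not define them, since claim 4 is cut off), and `ControllerResult` and `run_controller` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Interval = Tuple[float, float]  # assumed (start_s, end_s) representation


@dataclass
class ControllerResult:
    phrase: Optional[str]        # recognized phrase, if it is released
    request_reutterance: bool    # True when the speaker should repeat


def run_controller(phrase: str,
                   first_duration: Interval, second_duration: Interval,
                   first_changes: int, second_changes: int,
                   tolerance_s: float = 0.3) -> ControllerResult:
    """Two-stage check: durations first, then mouth shape change information."""
    durations_related = (
        abs(first_duration[0] - second_duration[0]) <= tolerance_s
        and abs(first_duration[1] - second_duration[1]) <= tolerance_s
    )
    if not durations_related:
        # Durations disagree: ask for a re-utterance.
        return ControllerResult(phrase=None, request_reutterance=True)
    if first_changes != second_changes:
        # Durations agree but the change information does not: ask again.
        return ControllerResult(phrase=None, request_reutterance=True)
    return ControllerResult(phrase=phrase, request_reutterance=False)


if __name__ == "__main__":
    print(run_controller("volume up", (1.0, 2.0), (1.1, 2.1), 3, 3))
    print(run_controller("volume up", (1.0, 4.0), (1.1, 2.1), 3, 3))
```

Returning a small result object rather than raising an error keeps the re-utterance request an ordinary outcome, which matches the claims' wording of the controller outputting a signal.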

Current Assignee: Pioneer Corporation
Original Assignee: Pioneer Corporation
Inventors: Yoda, Shoutarou
Primary Examiner(s): To, Doris H.
Assistant Examiner(s): Nolan, Daniel A.
Application Number: US09/949,858
Publication Number: US 20020035475A1
Time in Patent Office: 692 Days
Field of Search: 704/231, 704/251, 704/248, 704/236, 704/252, 704/255, 704/256, 704/235, 704/260, 704/266
US Class Current: 704/231
CPC Class Codes: G10L 15/24 Speech recognition using no...
