Apparatus, method, and computer program product for speech recognition allowing for recognition of character string in speech input

US 20070073540A1
Filed: 03/15/2006
Published: 03/29/2007
Est. Priority Date: 09/27/2005
Status: Active Grant

Abstract

A speech recognition apparatus includes a generation unit configured to receive a speech utterance and to generate at least one recognition candidate associating to the speech utterance and a likelihood of the recognition candidate; a storing unit configured to store at least the one recognition candidate and the likelihood; a selecting unit configured to select one of at least the one recognition candidate as a recognition result of a first speech utterance based on the likelihood; an utterance relation determining unit configured to determine, when a first speech utterance and a second speech utterance are sequentially input, at least whether the second speech utterance which is input after the input of the first speech utterance is a speech re-utterance of a whole of the first speech utterance or a speech re-utterance of a part of the first speech utterance; a whole correcting unit configured to correct the recognition candidate of the whole of the first speech utterance based on the second speech utterance and to display the corrected recognition result when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance; and a part correcting unit configured to correct the recognition candidate for the part of the first speech utterance, the part corresponding to the second speech utterance, based on the second speech utterance and to display the corrected recognition result when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the part of the first speech utterance.
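
The storing and selecting units described in the abstract can be pictured with a minimal Python sketch. The names `Candidate` and `CandidateStore`, the example sentences, and the scores are all hypothetical; the patent record does not specify data structures or a scoring scale.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    text: str          # hypothesised transcription of the utterance
    likelihood: float  # recogniser score; higher means more plausible


@dataclass
class CandidateStore:
    """Stands in for the storing unit: keeps each candidate with its likelihood."""
    candidates: list = field(default_factory=list)

    def add(self, text: str, likelihood: float) -> None:
        self.candidates.append(Candidate(text, likelihood))

    def select(self) -> Candidate:
        """Stands in for the selecting unit: return the most likely candidate."""
        return max(self.candidates, key=lambda c: c.likelihood)


# Illustrative values only: a misrecognition narrowly outscores the intended text.
store = CandidateStore()
store.add("I want to go to the bark", 0.6)
store.add("I want to go to the park", 0.5)
result = store.select()
```

Keeping the lower-ranked candidates in the store, rather than only the winner, is what lets a later re-utterance change the ranking.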

246 Citations

15 Claims

1. A speech recognition apparatus comprising:
- a generation unit configured to receive a speech utterance and to generate at least one recognition candidate associating to the speech utterance and a likelihood of the recognition candidate;
  
  a storing unit configured to store at least the one recognition candidate and the likelihood;
  
  a selecting unit configured to select one of at least the one recognition candidate as a recognition result of a first speech utterance based on the likelihood;
  
  an utterance relation determining unit configured to determine, when a first speech utterance and a second speech utterance are sequentially input, at least whether the second speech utterance which is input after the input of the first speech utterance is a speech re-utterance of a whole of the first speech utterance or a speech re-utterance of a part of the first speech utterance;
  
  a whole correcting unit configured to correct the recognition candidate of the whole of the first speech utterance based on the second speech utterance and to display the corrected recognition result when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance; and
  
  a part correcting unit configured to correct the recognition candidate for the part of the first speech utterance, the part corresponding to the second speech utterance, based on the second speech utterance and to display the corrected recognition result when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the part of the first speech utterance.
  - 2. The speech recognition apparatus according to claim 1, wherein the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance when the utterance relation determining unit detects a similar portion in the first speech utterance and the similar portion matches with the whole of the first speech utterance, the similar portion being a portion in which a degree of similarity between speech information of the first speech utterance and speech information of the second speech utterance is higher than a predetermined threshold value, and the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the part of the first speech utterance when the similar portion matches with the part of the first speech utterance and the similar portion matches with a whole of the second speech utterance.
  - 3. The speech recognition apparatus according to claim 1, wherein the whole correcting unit integrates the recognition candidate for the first speech utterance and the recognition candidate for the second speech utterance with each other, when the recognition candidate for the first speech utterance and the recognition candidate for the second speech utterance are common, calculates a new likelihood based on the likelihood of the common recognition candidate for the first speech utterance and the likelihood of the common recognition candidate for the second speech utterance, and outputs the new likelihood to the storing unit.
  - 4. The speech recognition apparatus according to claim 1, wherein the part correcting unit, when the recognition candidate for a part of the first speech utterance and the recognition candidate for the second speech utterance are common, calculates a new likelihood based on the likelihood of the common recognition candidate for the first speech utterance and the likelihood of the common recognition candidate for the second speech utterance, the part of the first speech utterance corresponding to the speech re-utterance by the second speech utterance, and outputs the likelihood to the storing unit.
  - 5. The speech recognition apparatus according to claim 1, wherein the part correcting unit outputs the recognition candidate to the storing unit, the recognition candidate being obtained by replacing a portion in the first speech utterance with the recognition candidate for the second speech utterance, the portion corresponding to the speech re-utterance by the second speech utterance.
  - 6. The speech recognition apparatus according to claim 3, wherein the whole correcting unit reduces the likelihood of the recognition result corresponding to a portion in the first speech utterance, the portion being a portion at which a speech utterance immediately prior to the first speech utterance is corrected by the first speech utterance.
  - 7. The speech recognition apparatus according to claim 1, wherein the part correcting unit increases the likelihood of the recognition result corresponding to a portion in the first speech utterance, when the portion is not re-uttered in the second speech utterance, the portion being a portion at which a speech utterance immediately prior to the first speech utterance is corrected by the first speech utterance.
  - 8. The speech recognition apparatus according to claim 1, wherein the part correcting unit reduces the likelihood of the recognition result corresponding to a portion in the first speech utterance, when the portion is re-uttered in the second speech utterance, the portion being a portion at which a speech utterance immediately prior to the first speech utterance is corrected by the first speech utterance.
  - 9. The speech recognition apparatus according to claim 1, wherein the utterance relation determining unit determines whether the recognition candidate for the first speech utterance and the recognition candidate for the second speech utterance have a predetermined relation in unmatched portions of the first speech utterance and the second speech utterance or not, and determines that the second speech utterance is a speech re-utterance of the whole of the first speech utterance, and that in the speech re-utterance, a part of the first speech utterance is replaced with a different speech utterance, when the recognition candidates of the first and the second speech utterances have the predetermined relation, and the whole correcting unit outputs the recognition candidates having the predetermined relation, when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance, and that in the speech re-utterance, the part of the first speech utterance is replaced with the different speech utterance.
  - 10. The speech recognition apparatus according to claim 9, wherein the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance when the utterance relation determining unit detects a similar portion in the first speech utterance and the similar portion matches with the whole of the first speech utterance, the similar portion being a portion in which a degree of similarity between speech information of the first speech utterance and speech information of the second speech utterance is higher than a predetermined threshold value, and the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the part of the first speech utterance when the similar portion matches with the part of the first speech utterance and the similar portion matches with a whole of the second speech utterance, and the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance in which a part of the first speech utterance is replaced with a different speech utterance, when the recognition candidate for the first speech utterance and the recognition candidate for the second speech utterance have the predetermined relation in an unmatched portion, which is a portion other than the similar portion in the first speech utterance.
  - 11. The speech recognition apparatus according to claim 9, wherein the utterance relation determining unit determines whether a relation of synonyms is present as the predetermined relation or not.
  - 12. The speech recognition apparatus according to claim 9, wherein the utterance relation determining unit determines whether a relation of the same translation is present as the predetermined relation or not.
  - 13. The speech recognition apparatus according to claim 9, wherein the utterance relation determining unit determines whether a relation of hierarchical concept is present as the predetermined relation or not.
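
The whole-versus-part decision of claim 2 can be approximated with a short Python sketch. It is an illustration under loud assumptions, not the claimed implementation: utterances are represented as lists of symbols (words here, where a real system would compare acoustic or phonetic information), the "similar portion" is taken to be the longest exactly matching run found by `difflib.SequenceMatcher`, and the coverage ratio with a hypothetical `threshold` of 0.8 stands in for the claim's degree of similarity. The names `Relation` and `determine_relation` are invented for this example.

```python
from difflib import SequenceMatcher
from enum import Enum


class Relation(Enum):
    WHOLE = "whole"  # second utterance repeats all of the first
    PART = "part"    # second utterance repeats only a portion of the first
    NEW = "new"      # no sufficiently similar portion was found


def determine_relation(first, second, threshold=0.8):
    """Classify `second` against `first`; both are sequences of hashable symbols.

    Returns (relation, span), where span is the (start, end) index range of the
    similar portion inside `first`, or None when the relation is NEW.
    """
    if not first or not second:
        return Relation.NEW, None
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    m = matcher.find_longest_match(0, len(first), 0, len(second))
    if m.size == 0:
        return Relation.NEW, None
    span = (m.a, m.a + m.size)
    # Assumption: the fraction of each utterance covered by the matching run
    # is used as a crude proxy for the similarity measure.
    if m.size / len(first) >= threshold:
        return Relation.WHOLE, span   # similar portion covers the whole first utterance
    if m.size / len(second) >= threshold:
        return Relation.PART, span    # similar portion covers the whole second utterance
    return Relation.NEW, None


first = "book a flight to boston".split()
whole = determine_relation(first, "book a flight to boston".split())
part = determine_relation(first, "to boston".split())
other = determine_relation(first, ["hello"])
```

Returning the span alongside the relation gives a part correcting step the position in the first utterance that the re-utterance refers to.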

14. A method of speech recognition comprising:
- receiving a speech utterance;
  
  generating at least one recognition candidate associating to the speech utterance and a likelihood of the recognition candidate;
  
  selecting one of at least the one recognition candidate as a recognition result of a first speech utterance based on the likelihood;
  
  determining, when a first speech utterance and a second speech utterance are sequentially input, at least whether the second speech utterance which is input after the input of the first speech utterance is a speech re-utterance of a whole of the first speech utterance or a speech re-utterance of a part of the first speech utterance;
  
  correcting the recognition candidate of the whole of the first speech utterance based on the second speech utterance to display the corrected recognition result, when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance; and
  
  correcting the recognition candidate for the part of the first speech utterance, the part corresponding to the second speech utterance, based on the second speech utterance to display the corrected recognition result, when the second speech utterance is determined to be the speech re-utterance of the part of the first speech utterance.
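
The two correcting steps can be sketched in Python as follows. This is a hedged illustration: the claims do not say how a new likelihood is computed for a common candidate, so simple addition is assumed here, and the function names `correct_whole`, `correct_part`, and `best`, together with all scores, are hypothetical. The part correction follows the replacement idea of claim 5, where the re-uttered portion of the first result is swapped for the recognition candidate of the second utterance.

```python
def correct_whole(first_cands, second_cands):
    """Integrate two candidate sets (text -> likelihood).

    Candidates common to both utterances receive a new likelihood;
    summing the two scores is an assumption made for this sketch.
    """
    merged = dict(first_cands)
    for text, score in second_cands.items():
        merged[text] = merged.get(text, 0.0) + score
    return merged


def correct_part(first_words, span, replacement_words):
    """Replace first_words[start:end] with the candidate for the re-uttered part."""
    start, end = span
    return first_words[:start] + replacement_words + first_words[end:]


def best(cands):
    """Select the candidate text with the highest likelihood."""
    return max(cands, key=cands.get)


# Whole re-utterance: the same sentence is spoken twice and recognised twice.
first = {"go to the bark": 0.6, "go to the park": 0.5}
second = {"go to the park": 0.7, "go to the dark": 0.4}
whole_result = best(correct_whole(first, second))

# Partial re-utterance: only the misrecognised word is spoken again.
part_result = correct_part("go to the bark".split(), (3, 4), ["park"])
```

In this example the candidate that was ranked second after the first utterance wins once the evidence from both utterances is combined, which is the effect the whole correcting step aims for.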

15. A computer program product having a computer readable medium including programmed instructions for performing a speech recognition process, wherein the instructions, when executed by a computer, cause the computer to perform:
- receiving a speech utterance;
  
  generating at least one recognition candidate associating to the speech utterance and a likelihood of the recognition candidate;
  
  selecting one of at least the one recognition candidate as a recognition result of a first speech utterance based on the likelihood;
  
  determining, when a first speech utterance and a second speech utterance are sequentially input, at least whether the second speech utterance which is input after the input of the first speech utterance is a speech re-utterance of a whole of the first speech utterance or a speech re-utterance of a part of the first speech utterance;
  
  correcting the recognition candidate of the whole of the first speech utterance based on the second speech utterance to display the corrected recognition result, when the utterance relation determining unit determines that the second speech utterance is the speech re-utterance of the whole of the first speech utterance; and
  
  correcting the recognition candidate for the part of the first speech utterance, the part corresponding to the second speech utterance, based on the second speech utterance to display the corrected recognition result, when the second speech utterance is determined to be the speech re-utterance of the part of the first speech utterance.

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation), Toshiba Digital Solutions Corporation (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Hirakawa, Hideki; Chino, Tetsuro
Granted Patent: US 7,983,912 B2
US Class Current: 704/252
CPC Class Codes: G10L 15/22 Procedures used during a sp...
