Speech recognition method and speech recognition apparatus

US 20030216912A1
Filed: 04/23/2003
Published: 11/20/2003
Est. Priority Date: 04/24/2002
Status: Abandoned Application

Abstract

A speech recognition method comprises analyzing an input speech input a plurality of times to recognize the input speech and generate a plurality of recognized speech information items, detecting a rephrased speech information item corresponding to a rephrased speech from the recognition speech information items, detecting a recognition error in units of a character string from an original speech information item corresponding to the rephrased speech information item, removing an error character string corresponding to the recognition error from the original speech information item, and generating a speech recognition result by using the rephrased speech information item and the original speech information item from which the error character string is removed.

20 Claims

1. A speech recognition method comprising:
- analyzing an input speech input a plurality of times to recognize the input speech and generate a plurality of recognized speech information items;
  
  detecting a rephrased speech information item corresponding to a rephrased speech from the recognition speech information items;
  
  detecting a recognition error in units of a character string from an original speech information item corresponding to the rephrased speech information item;
  
  removing an error character string corresponding to the recognition error from the original speech information item; and
  
  generating a speech recognition result by using the rephrased speech information item and the original speech information item from which the error character string is removed.
- Dependent claims (2, 3):
  - 2. A speech recognition method according to claim 1, wherein the rephrased speech includes an emphasis speech.
  - 3. A speech recognition method according to claim 1, wherein generating the speech recognition result includes combining the original speech information item from which the error character string is removed with a rephrased character string of the rephrased speech information item, the rephrased character string corresponding to the error character string.
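
The steps of claims 1 and 3 can be illustrated with a minimal sketch. It is not the applicant's implementation: it assumes that both utterances are already recognized as whitespace-separated text, that the rephrase repeats the whole original utterance, and that a token-level diff (Python's standard `difflib`) is an acceptable stand-in for detecting the recognition error "in units of a character string". The function name `correct_with_rephrase` is hypothetical.

```python
from difflib import SequenceMatcher


def correct_with_rephrase(original, rephrased):
    """Return (error_strings, result) for two recognized utterances.

    Assumptions (not taken from the application): both inputs are
    whitespace-tokenized recognition results, and the rephrased
    utterance repeats the whole original utterance.
    """
    orig_tokens = original.split()
    reph_tokens = rephrased.split()
    matcher = SequenceMatcher(a=orig_tokens, b=reph_tokens, autojunk=False)

    error_strings = []   # character strings removed from the original
    result_tokens = []   # combined recognition result
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Both utterances agree here: keep the original tokens.
            result_tokens.extend(orig_tokens[i1:i2])
        else:
            # The utterances disagree: treat the original span as the
            # error character string and drop it ...
            if i2 > i1:
                error_strings.append(" ".join(orig_tokens[i1:i2]))
            # ... then fill the gap with the rephrased character string.
            result_tokens.extend(reph_tokens[j1:j2])
    return error_strings, " ".join(result_tokens)


errors, result = correct_with_rephrase(
    "send mail to Jon tomorrow",
    "send mail to John tomorrow",
)
```

Here the span where the two results disagree is treated as the error character string, and the agreeing spans are taken from the original item, which mirrors the "remove, then combine" order of the claim.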

4. A speech recognition method comprising:
- receiving an input speech a plurality of times to generate a plurality of input speech signals corresponding to an original speech and a rephrased speech;
  
  analyzing the input speech signals to output feature information expressing a feature of the input speech;
  
  collating the feature information with a dictionary storage to extract at least one recognition candidate information similar to the feature information;
  
  storing the feature information corresponding to the input speech and the extracted candidate information in a history storage;
  
  outputting interval information based on the feature information corresponding to at least two of the input speech signals and the extracted candidate information, referring to the history storage, the interval information representing at least one of one of a coincident interval and a similar speech interval and one of a non-similar interval and a non-coincident interval with respect to the rephrased speech and the original speech; and
  
  reconstructing the input speech using the candidate information of the rephrased speech and the original speech based on the interval information.
- Dependent claims (5, 6, 7, 8, 9):
  - 5. The speech recognition method according to claim 4, wherein outputting the interval information includes analyzing at least one of prosodic features including a speech speed of the input speech, an utterance strength, a pitch representing a frequency variation, an appearance of a pause corresponding to an unvoiced interval, a quality of voice, and an utterance way.
  - 6. The speech recognition method according to claim 4, wherein outputting the interval information includes analyzing at least one of waveform information, feature information and candidate information that concern the rephrased speech, to detect a specific expression for error correction and to output the interval information.
  - 7. The speech recognition method according to claim 4, wherein outputting the interval information includes extracting emphasis interval information representing an interval during which emphasis utterance is performed, by analyzing at least one of waveform information, feature information and candidate information that correspond to the rephrased speech, and reconstructing the input speech includes reconstructing the input speech from the candidate information on the rephrased speech and the original speech, based on at least one of the interval information and the emphasis interval information.
  - 8. The speech recognition method according to claim 7, wherein outputting the interval information includes analyzing at least one of prosodic features including a speech speed of the speech, an utterance strength, a pitch representing a frequency variation, an appearance of a pause corresponding to an unvoiced interval, a quality of voice, and an utterance way, to extract the emphasis interval information.
  - 9. The speech recognition method according to claim 7, wherein extracting the emphasis interval information includes detecting a specific expression for correction to extract the emphasis interval information.
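
The interval information and reconstruction steps of claim 4 can be sketched as follows. This is an illustration under stated assumptions, not the method of the specification: each utterance is assumed to be pre-segmented into the same number of segments, aligned by index, with each segment holding a best-first list of recognition candidates. The labels, the selection policy, and the function names `label_intervals` and `reconstruct` are all hypothetical.

```python
def label_intervals(original, rephrased):
    """Label each aligned segment pair of two utterances.

    Each utterance is a list of segments; each segment is a list of
    recognition candidates, best first. Assumption: segments of the
    two utterances are aligned by index.
    """
    labels = []
    for orig_cands, reph_cands in zip(original, rephrased):
        if orig_cands[0] == reph_cands[0]:
            labels.append("coincident")      # same top candidate
        elif set(orig_cands) & set(reph_cands):
            labels.append("similar")         # candidate lists overlap
        else:
            labels.append("non-similar")     # nothing in common
    return labels


def reconstruct(original, rephrased, labels):
    """Pick one candidate per segment based on the interval labels."""
    result = []
    for orig_cands, reph_cands, label in zip(original, rephrased, labels):
        if label == "coincident":
            result.append(orig_cands[0])
        elif label == "similar":
            # Hypothetical policy: the speaker repeated this segment, so
            # the original top candidate is presumed wrong; take the best
            # rephrased candidate that differs from it.
            rejected = orig_cands[0]
            choice = next((c for c in reph_cands if c != rejected),
                          reph_cands[0])
            result.append(choice)
        else:
            result.append(reph_cands[0])
    return result


original = [["send"], ["mail", "male"], ["Jon", "John", "gone"]]
rephrased = [["send"], ["mail"], ["John", "Jon"]]
labels = label_intervals(original, rephrased)
words = reconstruct(original, rephrased, labels)
```

In a real system the alignment would come from comparing feature information stored in the history storage rather than from matching indices; the index alignment here only keeps the sketch short.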

10. A speech recognition apparatus comprising:
- an input speech analyzer to analyze an input speech input a plurality of times to recognize the input speech and generate a plurality of recognized speech information items;
  
  a rephrased speech detector to detect a rephrased speech information item corresponding to a rephrased speech from the recognition speech information items;
  
  a recognition error detector to detect a recognition error in units of a character string from an original speech information item corresponding to the rephrased speech information item;
  
  an error remover to remove an error character string corresponding to the recognition error from the original speech information item; and
  
  a reconstruction unit to reconstruct the input speech by using the rephrased speech information item and the original speech information item from which the error character string is removed.
- Dependent claims (11, 12):
  - 11. A speech recognition apparatus according to claim 10, wherein the rephrased speech includes an emphasis speech.
  - 12. A speech recognition apparatus according to claim 10, wherein the reconstruction unit includes a combination unit to combine the original speech information item from which the error character string is removed with a rephrased character string of the rephrased speech information item, the rephrased character string corresponding to the error character string.

13. A speech recognition apparatus comprising:
- a speech input unit to receive an input speech a plurality of times to generate a plurality of input speech signals corresponding to an original speech and a rephrased speech;
  
  a speech analysis unit to analyze the input speech signal to output feature information expressing a feature of the input speech;
  
  a dictionary storage which stores recognition candidate information;
  
  a collation unit configured to collate the feature information with the dictionary storage to extract at least one recognition candidate information similar to the feature information;
  
  a history storage to store the feature information corresponding to the input speech and the extracted candidate information;
  
  an interval information output unit to output interval information based on the feature information corresponding to at least two of the input speech signals and the extracted candidate information, referring to the history storage, the interval information representing at least one of one of a coincident interval and a similar speech interval and one of a non-similar interval and a non-coincident interval with respect to the rephrased speech and the original speech; and
  
  a reconstruction unit to reconstruct the input speech using the candidate information of the rephrased speech and the original speech based on the interval information.
- Dependent claims (14, 15, 16, 17, 18):
  - 14. The speech recognition apparatus according to claim 13, wherein the interval information output unit includes an analyzer to analyze at least one of prosodic features including a speech speed of the input speech, an utterance strength, a pitch representing a frequency variation, an appearance of a pause corresponding to an unvoiced interval, a quality of voice, and an utterance way.
  - 15. The speech recognition apparatus according to claim 13, wherein the interval information output unit includes an analyzer to analyze at least one of waveform information, feature information and candidate information that concern the rephrased speech, to detect a specific expression for error correction and to output the interval information.
  - 16. The speech recognition apparatus according to claim 13, wherein the interval information output unit includes an emphasis interval extractor to extract emphasis interval information representing an interval during which emphasis utterance is performed, by analyzing at least one of waveform information, feature information and candidate information that correspond to the rephrased speech, and the reconstruction unit includes a reconstruction unit to reconstruct the input speech from the candidate information on the rephrased speech and the original speech, based on at least one of the interval information and the emphasis interval information.
  - 17. The speech recognition apparatus according to claim 16, wherein the interval information output unit includes an analyzer to analyze at least one of prosodic features including a speech speed of the speech, an utterance strength, a pitch representing a frequency variation, an appearance of a pause corresponding to an unvoiced interval, a quality of voice, and an utterance way, to extract the emphasis interval information.
  - 18. The speech recognition apparatus according to claim 16, wherein the analyzer includes a detector to detect a specific expression for correction to extract the emphasis interval information.
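
One way an emphasis interval extractor such as the one in claims 16 and 17 might use a single prosodic feature, utterance strength, is sketched below. It is only an assumption-laden illustration: strength is approximated by per-frame RMS energy, a frame counts as emphasized when its energy exceeds a multiple of the median frame energy, and the names `frame_rms` and `emphasis_intervals` and the parameters `frame_len` and `ratio` are hypothetical. The other listed features (speech speed, pitch, pauses, voice quality) are not modeled.

```python
import math
from statistics import median


def frame_rms(samples, frame_len):
    """Root-mean-square energy of each non-overlapping full frame."""
    return [
        math.sqrt(sum(x * x for x in samples[s:s + frame_len]) / frame_len)
        for s in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def emphasis_intervals(samples, frame_len=4, ratio=1.5):
    """Return half-open (start_frame, end_frame) emphasis intervals.

    Assumption: a frame is "emphasized" when its RMS energy exceeds
    `ratio` times the median frame energy of the utterance.
    """
    energies = frame_rms(samples, frame_len)
    if not energies:
        return []
    threshold = ratio * median(energies)

    intervals = []
    start = None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx                      # an emphasized run begins
        elif energy <= threshold and start is not None:
            intervals.append((start, idx))   # the run ended before idx
            start = None
    if start is not None:
        intervals.append((start, len(energies)))
    return intervals


# Quiet speech, a louder stretch, then quiet speech again.
samples = [0.1] * 8 + [0.5] * 8 + [0.1] * 8
loud = emphasis_intervals(samples)
```

The returned frame intervals could then be mapped back to segments of the rephrased speech so that the reconstruction step gives those segments priority.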

19. A speech recognition program stored on a computer readable medium comprising:
- means for instructing a computer to analyze an input speech input a plurality of times to recognize the input speech and generate a plurality of recognized speech information items;
  
  means for instructing the computer to detect a rephrased speech information item corresponding to a rephrased speech from the recognition speech information items;
  
  means for instructing the computer to detect a recognition error in units of a character string from an original speech information item corresponding to the rephrased speech information item;
  
  means for instructing the computer to remove an error character string corresponding to the recognition error from the original speech information item; and
  
  means for instructing the computer to generate a speech recognition result by using the rephrased speech information item and the original speech information item from which the error character string is removed.

20. A speech recognition program stored on a computer readable medium comprising:
- means for instructing the computer to take in an input speech a plurality of times to generate a plurality of input speech signals corresponding to an original speech and a rephrased speech;
  
  means for instructing the computer to analyze the input speech signal to output feature information expressing a feature of the input speech;
  
  means for instructing the computer to collate the feature information with a dictionary storage to extract at least one recognition candidate information similar to the feature information;
  
  means for instructing the computer to store the feature information corresponding to the input speech and the extracted candidate information in a history storage;
  
  means for instructing the computer to output interval information based on the feature information corresponding to at least two of the input speech signals and the extracted candidate information, referring to the history storage, the interval information representing at least one of one of a coincident interval and a similar speech interval and one of a non-similar interval and a non-coincident interval with respect to the rephrased speech and the original speech; and
  
  means for instructing the computer to reconstruct the input speech using the candidate information of the rephrased speech and the original speech based on the interval information.

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Chino, Tetsuro
Application Number: US10/420,851
Publication Number: US 20030216912A1
US Class Current: 704/231
CPC Class Codes:
G10L 15/22 Procedures used during a sp...
G10L 2015/227 of the speaker; Human-fact...
