Apparatus, method and computer program product for recognizing speech
First Claim
1. A speech recognition apparatus comprising:
a semantic-relation storage unit that stores semantic relation among words and relevance ratio indicating degree of the semantic relation in association with each other;
a first input accepting unit that accepts an input of a first speech;
a first candidate producing unit that recognizes the first speech and produces first recognition candidates and first likelihood of the first recognition candidates, the first recognition candidates containing a phoneme-string candidate and a word candidate;
a first-candidate selecting unit that selects one of the first recognition candidates as a recognition result of the first speech based on the first likelihood of the first recognition candidates;
a second input accepting unit that accepts an input of a second speech including an object word and a clue word, wherein the first speech includes the object word, the first speech does not include the clue word, and the recognition result of the first speech does not include the object word, and wherein the clue word provides the clue for recognizing the object word and for correcting a portion of the recognition result of the first speech which corresponds to the object word;
a second candidate producing unit that recognizes the second speech and produces second recognition candidates and second likelihood of the second recognition candidates;
a word extracting unit that extracts recognition candidates of the object word and recognition candidates of the clue word from the second recognition candidates;
a second-candidate selecting unit that acquires the relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word, from the semantic-relation storage unit, and selects one of the second recognition candidates as a recognition result of the second speech based on the acquired relevance ratio;
a correction-portion identifying unit that compares a phoneme-string contained in the recognition result of the first speech with a phoneme-string contained in the recognition candidates of the object word extracted by the word extracting unit, and identifies a portion corresponding to the object word; and
a correcting unit that corrects the identified portion corresponding to the object word with a portion that contains the object word and that is contained in the recognition result of the second speech.
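The second-candidate selecting unit described above can be illustrated with a minimal sketch. It is not the patented implementation: the dictionary `SEMANTIC_RELATIONS`, the function `select_second_result`, the example words, and the choice to multiply the relevance ratio by the two likelihoods are all assumptions made for illustration. The claim itself only requires that the selection be based on the acquired relevance ratio.

```python
from itertools import product

# Hypothetical semantic-relation store: (clue word, object word) -> relevance ratio.
# The entries and values are invented for illustration only.
SEMANTIC_RELATIONS = {
    ("fruit", "pear"): 0.9,
    ("fruit", "bear"): 0.1,
    ("animal", "bear"): 0.9,
}


def select_second_result(object_candidates, clue_candidates, relations):
    """Pick the (object word, clue word) pair with the highest score.

    Each candidate is a (word, likelihood) tuple. The score used here,
    relevance ratio times both likelihoods, is an assumed combination.
    Returns (object_word, clue_word, score), or None if either list is empty.
    """
    best = None
    for (obj, obj_lik), (clue, clue_lik) in product(object_candidates, clue_candidates):
        # Pairs with no stored semantic relation get a relevance ratio of 0.
        relevance = relations.get((clue, obj), 0.0)
        score = relevance * obj_lik * clue_lik
        if best is None or score > best[2]:
            best = (obj, clue, score)
    return best


if __name__ == "__main__":
    # Acoustically, "bear" is the more likely object word, but the clue word
    # "fruit" has a much stronger stored relation to "pear".
    objects = [("bear", 0.6), ("pear", 0.4)]
    clues = [("fruit", 0.8), ("flute", 0.2)]
    print(select_second_result(objects, clues, SEMANTIC_RELATIONS)[:2])
```

In this sketch the clue word lets a candidate with lower acoustic likelihood win, which is the role the claim assigns to the clue word.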
Abstract
A speech recognition apparatus includes a first-candidate selecting unit that selects a recognition result of a first speech from first recognition candidates based on likelihood of the first recognition candidates; a second-candidate selecting unit that extracts recognition candidates of an object word contained in the first speech and recognition candidates of a clue word from second recognition candidates, acquires the relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word, and selects a recognition result of the second speech based on the acquired relevance ratio; a correction-portion identifying unit that identifies a portion corresponding to the object word in the first speech; and a correcting unit that corrects the word in the identified portion.
448 Citations
19 Claims
18. A speech recognition method executed by a processor, the method comprising:
accepting a first speech;
recognizing, by the processor, the accepted first speech to produce first recognition candidates and first likelihood of the first recognition candidates, the first recognition candidates containing a phoneme-string candidate and a word candidate;
selecting, by the processor, one of the first recognition candidates produced for a first speech as the recognition result of the first speech based on the first likelihood of the first recognition candidates;
accepting, by the processor, a second speech that includes an object word and a clue word, wherein the first speech includes the object word, the first speech does not include the clue word, and the recognition result of the first speech does not include the object word, and wherein the clue word provides the clue for recognizing the object word and for correcting a portion of the recognition result of the first speech which corresponds to the object word;
recognizing, by the processor, the accepted second speech to produce second recognition candidates and second likelihood of the second recognition candidates;
extracting, by the processor, recognition candidates of the object word and recognition candidates of the clue word from the produced second recognition candidates;
acquiring, by the processor, a relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word from a semantic-relation storage unit that stores therein semantic relation among words and relevance ratio indicating degree of the semantic relation in association with each other;
selecting, by the processor, one of the second recognition candidates as the recognition result of the second speech based on the acquired relevance ratio;
comparing, by the processor, a phoneme-string contained in the recognition result of the first speech with a phoneme-string contained in the recognition candidates of the object word extracted by the word extracting unit;
identifying, by the processor, a portion corresponding to the object word in the first speech; and
correcting, by the processor, the identified portion corresponding to the object word with a portion that contains the object word and that is contained in the recognition result of the second speech.
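The comparing, identifying, and correcting steps of the method can be sketched as follows. This is a hedged illustration, not the method as actually implemented: the function names `find_correction_index` and `correct_first_result`, the phoneme symbols, and the use of Python's `difflib.SequenceMatcher` as the phoneme-string similarity measure are all assumptions. The claim does not specify how the phoneme-strings are compared.

```python
from difflib import SequenceMatcher


def find_correction_index(first_result, object_phonemes):
    """Return the index of the word in the first recognition result whose
    phoneme sequence is most similar to the object word's phoneme sequence.

    first_result is a list of (word, phoneme_list) pairs. Returns -1 when
    no word shares any phoneme with the object word.
    """
    best_index, best_ratio = -1, 0.0
    for i, (_, phonemes) in enumerate(first_result):
        # SequenceMatcher is a stand-in similarity measure (an assumption).
        ratio = SequenceMatcher(None, phonemes, object_phonemes).ratio()
        if ratio > best_ratio:
            best_index, best_ratio = i, ratio
    return best_index


def correct_first_result(first_result, object_word, object_phonemes):
    """Replace the identified portion of the first result with the object word."""
    words = [word for word, _ in first_result]
    index = find_correction_index(first_result, object_phonemes)
    if index >= 0:
        words[index] = object_word
    return words


if __name__ == "__main__":
    # Hypothetical misrecognized first speech: "pear" was recognized as "bear".
    first = [
        ("I", ["ay"]),
        ("ate", ["ey", "t"]),
        ("a", ["ax"]),
        ("bear", ["b", "eh", "r"]),
    ]
    print(correct_first_result(first, "pear", ["p", "eh", "r"]))
```

Comparing at the phoneme level rather than the word level is what allows the misrecognized portion to be located even though the recognition result of the first speech does not contain the object word.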
19. A computer program product having a non-transitory computer readable medium storing therein programmed instructions for recognizing speech, wherein the instructions, when executed by a computer, cause the computer to perform:
accepting a first speech;
recognizing the accepted first speech to produce first recognition candidates and first likelihood of the first recognition candidates, the first recognition candidates containing a phoneme-string candidate and a word candidate;
selecting one of the first recognition candidates produced for a first speech as the recognition result of the first speech based on the first likelihood of the first recognition candidates;
accepting a second speech that includes an object word and a clue word, wherein the first speech includes the object word, the first speech does not include the clue word, and the recognition result of the first speech does not include the object word, and wherein the clue word provides the clue for recognizing the object word and for correcting a portion of the recognition result of the first speech which corresponds to the object word;
recognizing the accepted second speech to produce second recognition candidates and second likelihood of the second recognition candidates;
extracting recognition candidates of the object word and recognition candidates of the clue word from the produced second recognition candidates;
acquiring a relevance ratio associated with the semantic relation between the extracted recognition candidates of the object word and the extracted recognition candidates of the clue word from a semantic-relation storage unit that stores therein semantic relation among words and relevance ratio indicating degree of the semantic relation in association with each other;
selecting one of the second recognition candidates as the recognition result of the second speech based on the acquired relevance ratio;
comparing a phoneme-string contained in the recognition result of the first speech with a phoneme-string contained in the recognition candidates of the object word extracted by the word extracting unit;
identifying a portion corresponding to the object word in the first speech; and
correcting the identified portion corresponding to the object word with a portion that contains the object word and that is contained in the recognition result of the second speech.
Specification