METHODS AND APPARATUS FOR NATURAL SPOKEN LANGUAGE SPEECH RECOGNITION
Abstract
A word prediction method and apparatus that improve prediction accuracy, and a speech recognition method and apparatus therefor, are provided. For the prediction of a sixth word “?”, a partial analysis tree having a modification relationship with the sixth word is predicted. The preceding word sequence “sara-ni sho-senkyoku no” has two partial analysis trees, “sara-ni” and “sho-senkyoku no”. It is predicted that “sara-ni” does not have a modification relationship with the sixth word, and that “sho-senkyoku no” does. Then the sixth word, “donyu”, is predicted from “sho-senkyoku no”. In this example, since “sara-ni” carries no useful information for the prediction of “donyu”, it is preferable that “donyu” be predicted from “sho-senkyoku no” alone.
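The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the frequency table `FREQ`, its counts, the competing candidate words “seido” and “kento”, and the function names are all hypothetical, and the modification flags are supplied by hand here, whereas the abstract describes them as being predicted.

```python
# Hypothetical appearance-frequency table keyed by the tuple of partial
# analysis trees that are assumed to modify the next word.
# All counts are invented for illustration only.
FREQ = {
    ("sho-senkyoku no",): {"donyu": 8, "seido": 2},
    ("sara-ni", "sho-senkyoku no"): {"donyu": 1, "kento": 1},
}


def select_context(partial_trees):
    """Keep only the partial analysis trees flagged as modifying the next word."""
    return tuple(head for head, modifies_next in partial_trees if modifies_next)


def predict_distribution(partial_trees, freq=FREQ):
    """Turn the counts for the selected context into relative frequencies."""
    context = select_context(partial_trees)
    row = freq.get(context, {})
    total = sum(row.values())
    return {word: count / total for word, count in row.items()} if total else {}


# "sara-ni" is assumed not to modify the sixth word; "sho-senkyoku no" is.
trees = [("sara-ni", False), ("sho-senkyoku no", True)]
dist = predict_distribution(trees)
best = max(dist, key=dist.get)
```

With these invented counts, the context reduces to “sho-senkyoku no” alone and “donyu” receives the highest relative frequency. Dropping “sara-ni” from the context is the point of the example: a fixed-length history would have kept it even though it has no bearing on the next word.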
19 Claims
1. A speech recognition apparatus comprising:
an acoustic processor which converts an input analog speech signal into a digital signal;
a first storer which stores an acoustic model that has learned a feature of speech;
a second storer which stores a dictionary wherein an appearance frequency of a predetermined word relative to another predetermined word and/or word sequence is written; and
a recognizer which uses said acoustic model and said dictionary to calculate a probability value for said digital signal, and which recognizes a word having the maximum probability value as input speech, wherein said recognizer predicts a word to be predicted based on a structure of a sentence including said word, and employs said appearance frequency to calculate said probability value for said sentence, including said word that is predicted.
10. A speech recognition method comprising:
converting an input analog speech signal into a digital signal;
storing an acoustic model that has learned a feature of speech;
storing a dictionary wherein an appearance frequency of a predetermined word relative to another predetermined word and/or word sequence is written; and
recognizing, using said acoustic model and said dictionary to calculate a probability value for said digital signal, a word having the maximum probability value as input speech, wherein said recognizing further comprises:
predicting a word to be predicted based on a structure of a sentence including said word; and
employing said appearance frequency to calculate said probability value for said sentence, including said word that is predicted.
19. A program storage device readable by computer, tangibly embodying a program of instructions executable by the computer to perform method steps for speech recognition, said method comprising the steps of:
converting an input analog speech signal into a digital signal;
storing an acoustic model that has learned a feature of speech;
storing a dictionary wherein an appearance frequency of a predetermined word relative to another predetermined word and/or word sequence is written; and
recognizing, using said acoustic model and said dictionary to calculate a probability value for said digital signal, a word having the maximum probability value as input speech, wherein said recognizing further comprises:
predicting a word to be predicted based on a structure of a sentence including said word; and
employing said appearance frequency to calculate said probability value for said sentence, including said word that is predicted.
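The recognizing step recited in claims 1, 10, and 19 selects the word with the maximum probability value computed from the acoustic model and the dictionary. One common way to combine the two sources, assumed here and not stated in the claims, is to add their log-probabilities and take the argmax. In the sketch below the function name `recognize`, the candidate word “dono”, and all probability values are hypothetical.

```python
import math


def recognize(acoustic_scores, language_scores):
    """Return the candidate word with the highest combined log-probability.

    acoustic_scores: word -> probability of the signal given the word
                     (assumed to come from the acoustic model).
    language_scores: word -> probability of the word in its sentence context
                     (assumed to be derived from appearance frequencies).
    Candidates with a zero or missing score in either source are skipped.
    """
    combined = {
        word: math.log(p_acoustic) + math.log(language_scores[word])
        for word, p_acoustic in acoustic_scores.items()
        if p_acoustic > 0.0 and language_scores.get(word, 0.0) > 0.0
    }
    return max(combined, key=combined.get) if combined else None


# Invented scores: "dono" fits the audio slightly better, but the
# frequency-based score strongly favors "donyu".
acoustic = {"donyu": 0.30, "dono": 0.40}
language = {"donyu": 0.80, "dono": 0.05}
result = recognize(acoustic, language)
```

In this invented case the combined score favors “donyu” even though “dono” has the higher acoustic score alone, which is the situation in which a better word prediction changes the recognition result.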
Specification