Speech recognition system having word-based and phoneme-based recognition means
First Claim
1. A speech recognition system comprising:
parameter extracting means for analyzing input speech and extracting a speech parameter from the input speech;
first storage means for storing a word reference pattern;
first word recognizing means for segmenting the speech parameter extracted by said parameter extracting means into units of words and outputting a word speech pattern corresponding to one of the words, and for performing word recognition by matching the word speech pattern with the word reference pattern stored in said first storage means and outputting a first word-recognition result;
second storage means for storing at least one word constituent element reference pattern;
second word recognizing means for segmenting the speech parameter into units of word constituent elements and outputting a word constituent element speech pattern corresponding to one of the word constituent elements, for performing recognition of each of the word constituent elements by matching the word constituent element speech pattern with the word constituent element reference pattern stored in said second storage means and outputting a series of recognized word constituent elements, and for performing word recognition on the basis of the series of recognized word constituent elements and outputting a second word recognition result;
recognition result output means connected to said first and second word recognizing means, for obtaining a final recognition result from the first and second word recognition results from said first and second word recognizing means, said recognition result output means including means for increasing a value representing a contribution of the first word recognition result to the final recognition result to be larger than a value representing a contribution of the second word recognition result to the final recognition result in accordance with an increase in the number of word speech patterns used in learning for the first recognition and means for determining the final recognition result in accordance with the increasing value representing the contribution of the first word recognition result thereto; and
learning means for executing learning for forming a new word reference pattern from the recognition result obtained by said recognition result output means and the word speech pattern, said learning means transferring the new word reference pattern to said first storage means, to store it therein and increase the number of word speech patterns being stored in said first storage means.
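The interplay between the recognition result output means and the learning means can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `WordStore` class, the linear weighting schedule, the `saturation` parameter, and the assumption that both recognizers return similarity scores where higher is better. The claim only requires that the contribution of the first (word-based) result grow relative to the second as more word speech patterns are used in learning.

```python
from dataclasses import dataclass, field


@dataclass
class WordStore:
    """Hypothetical stand-in for the 'first storage means'."""
    patterns: dict = field(default_factory=dict)  # word -> list of reference patterns
    learned_count: int = 0                        # number of patterns added by learning

    def add_pattern(self, word, pattern):
        """Store a newly learned word reference pattern and count it."""
        self.patterns.setdefault(word, []).append(pattern)
        self.learned_count += 1


def word_weight(learned_count, saturation=50):
    """Assumed schedule: weight rises linearly from 0.0 and is capped at 1.0."""
    return min(1.0, learned_count / saturation)


def fuse(word_scores, phoneme_scores, learned_count):
    """Pick the word with the best weighted sum of both similarity scores."""
    w = word_weight(learned_count)
    candidates = set(word_scores) | set(phoneme_scores)
    combined = {
        word: w * word_scores.get(word, 0.0)
        + (1.0 - w) * phoneme_scores.get(word, 0.0)
        for word in candidates
    }
    return max(combined, key=combined.get)


if __name__ == "__main__":
    store = WordStore()
    word_scores = {"yes": 0.4, "no": 0.6}      # illustrative values only
    phoneme_scores = {"yes": 0.9, "no": 0.2}   # illustrative values only
    print(fuse(word_scores, phoneme_scores, store.learned_count))
    store.add_pattern("yes", [0.1, 0.2, 0.3])
    print(store.learned_count)
```

With no learned patterns the sketch relies entirely on the word-constituent-element result; once `learned_count` reaches `saturation`, the word-based result decides alone. A real system could use any monotonically increasing schedule.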
Abstract
A speech recognition system includes a parameter extracting section for extracting a speech parameter of input speech, a first recognizing section for performing recognition processing by word-based matching, and a second recognizing section for performing word recognition by matching in units of word constituent elements. The first word recognizing section segments the speech parameter in units of words to extract a word speech pattern and performs word recognition by matching the word speech pattern with a predetermined word reference pattern. The second word recognizing section performs recognition in units of word constituent elements by using the extracted speech parameter and performs word recognition on the basis of candidates of an obtained word constituent element series. The speech recognition system further includes a recognition result output section for obtaining a recognition result on the basis of the word recognition results obtained by the first and second recognizing sections and outputting the obtained recognition result. The speech recognition system further includes a word reference pattern learning section for performing learning of a word reference pattern on the basis of the recognition result obtained by the recognition result output section and the word speech pattern.
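The second recognizing section's final step, turning a recognized series of word constituent elements into word candidates, can be sketched as follows. This is an illustration under stated assumptions, not the method described in the patent: it assumes the constituent elements are phoneme labels, that a hypothetical `lexicon` maps each word to its phoneme sequence, and that candidates are ranked by Levenshtein edit distance.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recognize_from_phonemes(recognized, lexicon):
    """Rank lexicon words by distance to the recognized phoneme series."""
    scored = sorted(
        (edit_distance(recognized, phonemes), word)
        for word, phonemes in lexicon.items()
    )
    return [(word, dist) for dist, word in scored]


if __name__ == "__main__":
    # Hypothetical two-word lexicon; labels are illustrative only.
    lexicon = {"hai": ["h", "a", "i"], "iie": ["i", "i", "e"]}
    print(recognize_from_phonemes(["h", "a", "e"], lexicon))
```

Because this path needs only a lexicon and constituent-element reference patterns, it can produce word candidates before any word reference patterns have been learned, which is consistent with the word-based path gaining influence only as learning proceeds.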
20 Claims
1. A speech recognition system comprising:
parameter extracting means for analyzing input speech and extracting a speech parameter from the input speech;
first storage means for storing a word reference pattern;
first word recognizing means for segmenting the speech parameter extracted by said parameter extracting means into units of words and outputting a word speech pattern corresponding to one of the words, and for performing word recognition by matching the word speech pattern with the word reference pattern stored in said first storage means and outputting a first word-recognition result;
second storage means for storing at least one word constituent element reference pattern;
second word recognizing means for segmenting the speech parameter into units of word constituent elements and outputting a word constituent element speech pattern corresponding to one of the word constituent elements, for performing recognition of each of the word constituent elements by matching the word constituent element speech pattern with the word constituent element reference pattern stored in said second storage means and outputting a series of recognized word constituent elements, and for performing word recognition on the basis of the series of recognized word constituent elements and outputting a second word recognition result;
recognition result output means connected to said first and second word recognizing means, for obtaining a final recognition result from the first and second word recognition results from said first and second word recognizing means, said recognition result output means including means for increasing a value representing a contribution of the first word recognition result to the final recognition result to be larger than a value representing a contribution of the second word recognition result to the final recognition result in accordance with an increase in the number of word speech patterns used in learning for the first recognition and means for determining the final recognition result in accordance with the increasing value representing the contribution of the first word recognition result thereto; and
learning means for executing learning for forming a new word reference pattern from the recognition result obtained by said recognition result output means and the word speech pattern, said learning means transferring the new word reference pattern to said first storage means, to store it therein and increase the number of word speech patterns being stored in said first storage means.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A speech recognition system comprising:
parameter extracting means for analyzing input speech and extracting a speech parameter from the input speech;
first storage means for storing a word reference speech pattern;
first word recognizing means for segmenting the speech parameter extracted by said parameter extracting means into units of words and outputting a word speech pattern corresponding to one of the words, and for performing word recognition by matching the word speech pattern with the word reference pattern stored in said first storage means and outputting a first word-recognition result;
second storage means for storing at least one word constituent element reference pattern;
second word recognizing means for segmenting the speech parameter into units of word constituent elements and outputting a word constituent element speech pattern corresponding to one of the word constituent elements, for performing recognition of each of the word constituent elements by matching the word constituent element speech pattern with the word constituent element reference pattern stored in said second storage means and outputting a series of recognized word constituent elements, and for performing word recognition on the basis of the series of recognized word constituent elements and outputting a second word recognition result;
recognition result output means connected to said first and second word recognizing means, for obtaining a final recognition result from the first and second word recognition results of said first and second word recognizing means; and
learning means for executing learning of the word reference pattern on the basis of the recognition result obtained by said recognition result output means and a word speech pattern extracted in the course of the recognition processing, wherein said recognition result output means includes means for increasing a value representing a contribution of the first word recognition result to the final recognition result to be larger than a value representing a contribution of the second word recognition result to the final recognition result as the amount of learning executed by said learning means increases and means for determining the final recognition result in accordance with the increasing value representing the contribution of the first word recognition result thereto.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
20. A speech recognition system comprising:
parameter extracting means for analyzing input speech and extracting a speech parameter from the input speech;
first reference pattern output means for outputting a word reference pattern;
first word recognizing means for segmenting the speech parameter extracted by said parameter extracting means into units of words and outputting a word speech pattern corresponding to one of the words, and for performing word recognition by matching the word speech pattern with the word reference pattern output from said first reference pattern output means and outputting a first word-recognition result;
second reference pattern output means for outputting a word constituent element reference pattern;
second word recognizing means for segmenting the speech parameter into units of word constituent elements, for outputting a word constituent element speech pattern corresponding to one of the word constituent elements, for performing recognition of each of the word constituent elements by matching the word constituent element speech pattern with the word constituent element reference pattern output from said second reference pattern output means to obtain a series of recognized word constituent elements, and for performing word recognition on the basis of a series of recognized word constituent elements and outputting a second word recognition result;
recognition result output means connected to said first and second word recognizing means, for obtaining a final recognition result from the first and second word recognition results from said first and second word recognizing means, said recognition result output means including means for increasing a value representing a contribution of the first word recognition result to the final recognition result to be larger than a value representing a contribution of the second word recognition result to the final recognition result as the number of word speech patterns used in learning for the first word recognition increases and means for determining the final recognition result in accordance with the increasing value representing the contribution of the first word recognition result thereto; and
learning means for executing learning for forming a new word reference pattern from the final recognition result obtained by said recognition result output means and the word speech pattern, thereby increasing the number of word reference patterns.
Specification