Voice recognition system having word frequency and intermediate result display features
Abstract
A voice recognition system includes a microphone for converting a voice to an electrical voice signal including voice and non-voice sound portions. An acoustic processing unit detects the power and spectrum of the electrical voice signal, and outputs power time-series data and spectrum time-series data to produce feature time-series data. A voice section detection unit uses the power time-series data to detect a start point and an end point of the voice sound portion, and outputs an end decision signal indicating that end point. A word dictionary stores word labels ordered in accordance with frequency of use, as well as word numbers and word templates. A recognition unit receives the feature time-series data and calculates a degree of similarity between the voice and the word templates. A sorting unit sorts the data calculated in the recognition unit in accordance with the degree of similarity. A selection unit selects one or more words having a higher degree of similarity from the words sorted in the sorting unit, and outputs these words to a display unit. A word frequency dictionary stores word labels, word numbers, word templates, and frequency data attached to each word label. Finally, a word dictionary sorting unit, coupled between the word dictionary and the word frequency dictionary, sorts the word labels of the word frequency dictionary in order of higher frequency, and outputs the sorted words to the word dictionary.
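The acoustic processing step in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the patent: the function name `frame_features`, the use of non-overlapping frames as the "predetermined sampling time interval", mean squared amplitude as the power measure, and a naive DFT magnitude as the spectrum.

```python
import cmath
import math


def frame_features(samples, frame_len):
    """Split samples into non-overlapping frames and return
    (power_series, spectrum_series), one entry per frame.

    Hypothetical helper: the patent text does not specify how power
    or spectrum are computed.
    """
    power_series = []
    spectrum_series = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Power: mean squared amplitude of the frame (assumed measure).
        power_series.append(sum(x * x for x in frame) / frame_len)
        # Spectrum: magnitudes of a naive DFT for bins 0 .. frame_len // 2.
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            spectrum.append(abs(acc))
        spectrum_series.append(spectrum)
    return power_series, spectrum_series


# Toy input: one frame of silence followed by one frame of a tone at DFT bin 2.
frame_len = 8
silence = [0.0] * frame_len
tone = [math.sin(2 * math.pi * 2 * n / frame_len) for n in range(frame_len)]
power, spectrum = frame_features(silence + tone, frame_len)
```

In this toy input the silent frame has zero power, and the tone frame has its largest spectral magnitude at bin 2. A real front end would more likely use overlapping windows and a filter bank or FFT.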
10 Claims
1. A voice recognition system, comprising:
a microphone for converting a voice to an electrical voice signal having a voice sound portion and a non-voice portion;
acoustic processing means for detecting a power and spectrum of the electrical voice signal in accordance with a predetermined sampling time interval, and outputting power time-series data and spectrum time-series data at the predetermined sampling time interval to produce feature time-series data;
voice section detection means for receiving the power time-series data from the acoustic processing means, detecting a start point and an end point of the voice sound portion, and outputting an end decision signal when an end of the voice sound portion is determined;
a word dictionary for storing, corresponding to words, word labels, word numbers corresponding to the word labels, and word templates comprising the feature time-series data corresponding to the word labels, the word labels being ordered in accordance with a frequency of use of the words;
verification means for receiving the feature time-series data of the voice to be verified, verifying the feature time-series data with the word template stored in the word dictionary, and calculating a degree of similarity between the voice and the word template;
sorting means for sorting the words in accordance with the degree of similarity, the data being sorted in order of the higher degree of similarity;
selection means for selecting one or more words having a higher degree of similarity from the words sorted in the sorting means;
display means for displaying the words;
a word frequency dictionary for storing the word labels, the word numbers corresponding to each word label, the word templates comprising the feature time-series data corresponding to each word label, and frequency data attached to each word label; and
word dictionary sorting means, provided between the word dictionary and the word frequency dictionary, for sorting the word labels of the word frequency dictionary in order of higher degree of frequency to obtain sorted words, and outputting the sorted words to the word dictionary.
(Dependent claims: 2, 3, 4, 5, 6)
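The voice section detection element of claim 1 can be sketched as a threshold test on the power time-series data. This is only one plausible reading: the fixed `threshold`, the `end_silence_frames` hangover count, and the function name `detect_voice_section` are assumptions introduced here, since the claim does not say how the start point, end point, or end decision are determined.

```python
def detect_voice_section(power_series, threshold, end_silence_frames):
    """Return (start, end, end_decided) for a power time series.

    start:       index of the first frame at or above the threshold
    end:         index of the last frame at or above the threshold
    end_decided: True once `end_silence_frames` consecutive quiet frames
                 follow the voice portion (assumed end-decision rule)
    """
    start = None
    end = None
    quiet_run = 0
    for i, p in enumerate(power_series):
        if p >= threshold:
            if start is None:
                start = i
            end = i
            quiet_run = 0  # a short pause inside the word is bridged
        elif start is not None:
            quiet_run += 1
            if quiet_run >= end_silence_frames:
                return start, end, True
    return start, end, False


power = [0.0, 0.1, 2.0, 3.0, 0.2, 2.5, 0.1, 0.0, 0.0, 0.0]
result = detect_voice_section(power, threshold=1.0, end_silence_frames=3)
# result == (2, 5, True): the single quiet frame at index 4 does not end the section.
```

Requiring several quiet frames before the end decision keeps a brief dip in power inside a word from being mistaken for the end of the voice sound portion.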
7. A word recognition system comprising:
a word dictionary storing words;
a word frequency dictionary storing a frequency of use of each word in the word dictionary;
voice detection means for detecting a voice word in an input signal, comparing the voice word to the words in said word dictionary, and calculating a degree of similarity between the voice word and each word in the word dictionary;
word similarity sorting means for sorting the words from said word dictionary in order of the degree of similarity until the voice word has been detected by said voice detection means;
word frequency sorting means for sorting the words in the word dictionary on the basis of each word frequency stored in the word frequency dictionary, said word similarity sorting means obtaining words to be sorted in order of the frequency of use; and
selection means for displaying the words for selection in an order that has been obtained by said word similarity sorting means when the voice word has been detected.
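One consequence of feeding frequency-ordered words into the similarity sort in claim 7 can be shown with a short sketch. It assumes the similarity sort is stable, so that words with equal similarity stay in frequency order; the claim does not state this, and the function names, words, counts, and scores below are all made up for illustration.

```python
def sort_by_frequency(words, frequency):
    """Order the dictionary words by descending frequency of use."""
    return sorted(words, key=lambda w: frequency[w], reverse=True)


def sort_by_similarity(words_in_frequency_order, similarity):
    """Order words by descending similarity.

    Python's sort is stable, also with reverse=True, so words with equal
    similarity keep the frequency order they arrived in (assumed behavior).
    """
    return sorted(
        words_in_frequency_order, key=lambda w: similarity[w], reverse=True
    )


# Hypothetical data: frequency counts and similarity scores are invented.
frequency = {"stop": 5, "start": 12, "store": 9, "star": 2}
similarity = {"start": 0.80, "store": 0.80, "stop": 0.91, "star": 0.42}

by_frequency = sort_by_frequency(["stop", "start", "store", "star"], frequency)
candidates = sort_by_similarity(by_frequency, similarity)
top_three = candidates[:3]
```

Here "start" and "store" tie on similarity, and "start" is listed first because it is used more often.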
8. A word frequency selection system in a voice recognition system in which a voice word is detected in an input signal and words from a word dictionary are sorted in order of a degree of similarity to the voice word and displayed in the order for selection, said word frequency selection system comprising:
a word frequency dictionary storing a frequency of use of each word in the word dictionary; and
word frequency sorting means for sorting the words to be sorted by similarity in the word dictionary on the basis of each word frequency stored in the word frequency dictionary.
9. A voice recognition system comprising:
a microphone converting an input signal to an electrical signal having a voice portion and a non-voice portion;
an acoustic processing unit, operatively connected to said microphone, and converting the electrical signal to power time-series data and spectrum time-series data;
a voice section detection unit, operatively connected to said acoustic processing unit, and detecting the voice portion and non-voice portion of the electrical signal;
a parameter buffer, operatively connected to said acoustic processing unit, temporarily storing the spectrum time-series data;
a word dictionary storing words and corresponding time-series data;
a verification unit, operatively connected to said voice section detection unit, said parameter buffer and said word dictionary, and calculating a degree of similarity between the spectrum time-series data corresponding to the voice portion and spectrum time-series data corresponding to one of the words in the word dictionary;
a sorting unit operatively connected to said voice section detection unit and said verification unit, and sorting words in accordance with the degree of similarity;
a temporary memory operatively connected to said sorting unit and storing sorted data from the sorting unit;
a selection unit, operatively connected to said temporary memory, said voice section detection unit and said word dictionary, and receiving one of the words corresponding to the data stored in the temporary memory and outputting the word upon detection by the voice section detection unit of completion of the voice portion of the electrical signal;
a candidate selection switch, operatively connected to said selection unit and receiving a request for the selection unit to select another word;
a display unit operatively connected to said selection unit and displaying the word selected by the selection unit when said candidate selection switch is pressed;
a word frequency dictionary storing a frequency of use of the words in said word dictionary; and
a word dictionary sorting unit, operatively connected between said word dictionary and said word frequency dictionary, and sorting the words in said word dictionary based on the frequency of use corresponding to those words as stored in said word frequency dictionary.
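The interaction between the temporary memory, the selection unit, and the candidate selection switch in claim 9 might look like the following sketch. The class name `SelectionUnit`, its method names, and the wrap-around behavior after the last candidate are assumptions; the claim only says the switch requests that another word be selected.

```python
class SelectionUnit:
    """Hypothetical selection unit that steps through sorted candidates."""

    def __init__(self, sorted_candidates):
        # Stands in for the sorted data held in the temporary memory.
        self._candidates = list(sorted_candidates)
        if not self._candidates:
            raise ValueError("at least one candidate is required")
        self._index = 0

    def current(self):
        """Word currently offered for display."""
        return self._candidates[self._index]

    def press_candidate_switch(self):
        """Select the next candidate; wrapping to the first is an assumption."""
        self._index = (self._index + 1) % len(self._candidates)
        return self.current()


unit = SelectionUnit(["stop", "start", "store"])
first = unit.current()                  # best match, shown first
second = unit.press_candidate_switch()  # user rejects it, next word shown
third = unit.press_candidate_switch()
wrapped = unit.press_candidate_switch()
```

Because the candidates arrive already sorted by similarity, each press of the switch moves to the next most similar word.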
10. A method of recognizing a word from an input speech signal, comprising the steps of:
a) storing words in a word dictionary;
b) storing a frequency of use of each of the words;
c) sorting the words in the word dictionary based on the frequency of use of each word;
d) detecting a voice word in an input signal and calculating a degree of similarity between the voice word and each of the words in the word dictionary; and
e) sorting the words in order of the degree of similarity until a voice word has been detected.
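Steps d) and e) of claim 10 describe sorting that continues until the voice word has been detected, which suggests an intermediate ranking that is refreshed while speech is still arriving. The sketch below is a simplified illustration under stated assumptions: each frame is a single number, similarity is the negated accumulated frame-by-frame absolute difference against a template (no time alignment), and `rank_until_end`, the templates, and the input frames are all hypothetical.

```python
def rank_until_end(frames, end_decisions, dictionary):
    """Re-sort the words after every frame until the end decision arrives.

    dictionary: list of (word, template) pairs, already in frequency order.
    Returns the list of intermediate rankings, one per processed frame.
    """
    distance = {word: 0.0 for word, _ in dictionary}
    history = []
    for t, (frame, ended) in enumerate(zip(frames, end_decisions)):
        if ended:
            break  # voice word detected: stop sorting
        for word, template in dictionary:
            # Frames past the template's end are compared with 0.0 (assumption).
            ref = template[t] if t < len(template) else 0.0
            distance[word] += abs(frame - ref)
        # Smaller accumulated distance means higher similarity.
        # The stable sort keeps frequency order among ties.
        ranking = sorted((w for w, _ in dictionary), key=lambda w: distance[w])
        history.append(ranking)
    return history


dictionary = [
    ("start", [1.0, 3.0, 2.0]),
    ("stop", [1.0, 2.0, 4.0]),
    ("star", [2.0, 3.0, 1.0]),
]
frames = [1.0, 2.0, 4.0, 0.0]
end_decisions = [False, False, False, True]
history = rank_until_end(frames, end_decisions, dictionary)
final_ranking = history[-1]
```

After the first frame "start" and "stop" are tied and appear in frequency order; from the second frame on "stop" leads, and the last ranking before the end decision is the one offered for selection.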
Specification