Interactive voice recognition method and apparatus using affirmative/negative content discrimination
First Claim
1. An interactive voice recognition apparatus, comprising:
a voice input unit to receive voice and translate the received voice into digital form;
a voice analysis unit in communication with said voice input unit to generate characteristic voice data for the received digitized voice;
a word detection unit in communication with said voice analysis unit to determine whether the characteristic voice data substantially matches standard characteristic voice information corresponding to pre-registered expressions and generates detected expression data in response thereto;
an affirmative/negative discrimination unit in communication with said voice analysis unit to characterize whether the characteristic voice data can be characterized as an affirmative or negative response and generates an affirmative/negative signal in response thereto;
a voice comprehension and conversation control unit in communication with said word detection unit and said affirmative/negative discrimination unit to:
interrogate a recognition mode boolean;
receive the detected data generated by said word detection unit, determine a contextual meaning based on the received detected data, and formulate an appropriate response if the recognition mode boolean is clear;
receive the affirmative/negative signal generated by said affirmative/negative discrimination unit and formulate the appropriate response based on the received affirmative/negative signal and prior responses if the recognition mode boolean is set; and
reset the recognition mode boolean based on the formulated appropriate response; and
a voice synthesizer in communication with said voice comprehension and conversation control unit to generate synthesized audio corresponding to the appropriate response formulated by said voice comprehension and conversation control unit.
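The mode switching recited for the voice comprehension and conversation control unit can be illustrated with a minimal sketch. Everything named here (`ConversationController`, `respond`, the keyword "weather", and the canned replies) is hypothetical and chosen only for illustration; the claim does not prescribe any particular data structure or dialogue content. The sketch assumes the word detection unit and the affirmative/negative discrimination unit have already produced their outputs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConversationController:
    """Toy model of the claimed control flow; all names are hypothetical."""

    # The "recognition mode boolean": clear = word recognition,
    # set = affirmative/negative discrimination.
    affirmative_negative_mode: bool = False
    prior_responses: List[str] = field(default_factory=list)

    def respond(self, detected_words: List[str], yes_no: Optional[str]) -> str:
        # Interrogate the boolean and pick which unit's output to use.
        if not self.affirmative_negative_mode:
            response, expects_yes_no = self._from_words(detected_words)
        else:
            response, expects_yes_no = self._from_yes_no(yes_no)
        # Reset the boolean based on the response just formulated:
        # it is set only when the response is itself a yes/no question.
        self.affirmative_negative_mode = expects_yes_no
        self.prior_responses.append(response)
        return response

    def _from_words(self, words: List[str]) -> Tuple[str, bool]:
        # Made-up dialogue rule standing in for "contextual meaning".
        if "weather" in words:
            return "Do you want tomorrow's forecast?", True
        return "Sorry, I did not catch that.", False

    def _from_yes_no(self, yes_no: Optional[str]) -> Tuple[str, bool]:
        # The reply depends on the signal and on the prior response.
        last = self.prior_responses[-1] if self.prior_responses else ""
        if "forecast" in last and yes_no == "affirmative":
            return "Fetching tomorrow's forecast.", False
        return "All right.", False


controller = ConversationController()
first = controller.respond(["weather"], None)
second = controller.respond([], "affirmative")
```

In this sketch the boolean is set only after the device asks a binary question, so the next utterance is interpreted through the affirmative/negative signal rather than through word detection.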
Abstract
A technique for improving voice recognition in low-cost, speech interactive devices. This technique calls for implementing an affirmative/negative discrimination unit in parallel with a word detection unit to permit comprehension of spoken commands or messages issued by binary questions when no recognizable words are found. Preferably, affirmative/negative discrimination will include either spoken vowel analysis or negative language descriptor detection of the perceived message or command. Other facets include keyword identification within the perceived message or command, confidence match level comparison or correlation table compilation in order to increase recognition accuracy of word-based recognition, volume analysis, and inclusion of ambient environment information in generating responses to perceived messages or queries.
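As a rough illustration of negative language descriptor detection, the following sketch classifies a reply as negative when it contains any negative descriptor and as affirmative otherwise. This is a simplification under stated assumptions: it operates on already-transcribed tokens, whereas the described unit works on characteristic voice data, and the descriptor list and the function name `discriminate` are hypothetical. Spoken vowel analysis, the other option named in the abstract, is not shown.

```python
from typing import Iterable

# Hypothetical descriptor list; the source does not enumerate one.
NEGATIVE_DESCRIPTORS = {"no", "not", "never", "don't", "nope"}


def discriminate(tokens: Iterable[str]) -> str:
    """Return "negative" if any negative descriptor is present,
    otherwise "affirmative" (a binary decision, as for a yes/no question)."""
    lowered = [token.lower() for token in tokens]
    if any(token in NEGATIVE_DESCRIPTORS for token in lowered):
        return "negative"
    return "affirmative"


reply_a = discriminate(["I", "do", "not", "think", "so"])
reply_b = discriminate(["sounds", "good"])
```

The point of the design is that the reply need not match any pre-registered expression: a single descriptor is enough to settle a binary question.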
384 Citations
23 Claims
1. An interactive voice recognition apparatus (recited in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An interactive voice recognition method, comprising the steps of:
perceiving voice;
translating the perceived voice into corresponding digital form;
generating characteristic voice data for the perceived digitized voice;
determining whether the characteristic voice data generated in said characteristic voice data generating step substantially matches standard characteristic voice information corresponding to pre-registered expressions;
generating detected expression data if it is determined in said determining step that the characteristic voice data generated in said characteristic voice data generating step substantially matches standard characteristic voice information corresponding to at least one of the pre-registered expressions;
characterizing whether the characteristic voice data generated in said characteristic voice data generating step constitutes either an affirmative or negative statement and generating a content characterization responsive thereto;
assimilating a contextual meaning based on the detected expression data generated in said detected expression data generating step;
based on a recognition mode, performing one of:
formulating an appropriate response based on said assimilated contextual meaning assimilated in said assimilating step if the recognition mode is set for word recognition; and
formulating the appropriate response based on the content characterization generated by said characterizing step if the recognition mode is set for affirmative/negative discrimination;
resetting the recognition mode based on the formulated appropriate response; and
synthesizing audio corresponding to the appropriate formulated response.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
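The determining and generating steps of the method claim (checking whether characteristic voice data "substantially matches" stored information for pre-registered expressions) can be sketched as a thresholded template match. The claim does not specify a similarity measure, so cosine similarity is used here purely as a stand-in. The feature vectors, the 0.9 threshold, and the names `cosine_similarity` and `detect_expression` are all assumptions made for illustration.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_expression(
    features: Sequence[float],
    templates: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # hypothetical "substantial match" level
) -> Optional[Tuple[str, float]]:
    """Return (expression, score) for the best template at or above the
    threshold, or None when no pre-registered expression matches."""
    best: Optional[Tuple[str, float]] = None
    for expression, template in templates.items():
        score = cosine_similarity(features, template)
        if best is None or score > best[1]:
            best = (expression, score)
    if best is not None and best[1] >= threshold:
        return best
    return None


# Toy two-dimensional "characteristic voice data"; real features would be richer.
templates = {"weather": [1.0, 0.0], "time": [0.0, 1.0]}
hit = detect_expression([0.9, 0.1], templates)
miss = detect_expression([1.0, 1.0], templates)
```

When `detect_expression` returns None, no detected expression data is generated, which is the situation in which the content characterization from the affirmative/negative step becomes useful.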
Specification