Interactive voice response system
Abstract
This invention relates to an interactive voice recognition system and in particular to speech recognition processing within an interactive voice response (IVR) system. One problem with speech recognition in an IVR is that two time-intensive tasks, the speech recognition and forming a response based on the result of the recognition, are performed one after the other. Each process can take time on the order of seconds, and the total time of the combined processes can be noticeable to the user. There is disclosed a method for processing in an interactive voice processing system comprising: receiving a voice signal from user interaction; extracting a plurality of formant values from the voice signal; calculating an average of the formants; and locating look-ahead text associated with a closest reference characteristic as an estimate of the full text of the voice signal. Thus the invention requires only acoustic analysis of a first portion of a voice signal to determine a response. Since it does not require full linguistic analysis of the signal to convert it into text, followed by a natural language analysis to extract the useful meaning from the text, considerable processing time is saved.
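The steps summarised in the abstract (average the extracted formants, find the closest reference characteristic, return its look-ahead text) can be illustrated with a minimal Python sketch. It is not the patented implementation: the reference table, the example phrases, the use of only the first two formants, and the Euclidean distance measure are all assumptions made for illustration, and formant extraction itself is taken as already done.

```python
from math import hypot

# Hypothetical reference table: average (F1, F2) in Hz -> look-ahead text.
# The frequencies and phrases are invented for illustration only.
REFERENCES = {
    (300.0, 2300.0): "check balance",
    (700.0, 1200.0): "transfer funds",
    (500.0, 1500.0): "speak to an agent",
}


def average_formants(frames):
    """Average per-frame (F1, F2) pairs over the analysed time period."""
    if not frames:
        raise ValueError("no formant frames supplied")
    n = len(frames)
    return (sum(f[0] for f in frames) / n, sum(f[1] for f in frames) / n)


def estimate_text(frames, references=REFERENCES):
    """Return the text stored with the reference closest to the average."""
    avg = average_formants(frames)
    # Euclidean distance in the (F1, F2) plane is an assumed similarity measure.
    closest = min(
        references,
        key=lambda ref: hypot(ref[0] - avg[0], ref[1] - avg[1]),
    )
    return references[closest]


# Formant frames as they might arrive from an upstream extractor (made up).
frames = [(680.0, 1250.0), (720.0, 1180.0), (710.0, 1210.0)]
print(estimate_text(frames))  # prints "transfer funds"
```

Because only an average and a table lookup are involved, the estimate is available without running full recognition, which is the time saving the abstract describes.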
23 Claims
1. A method for processing in an interactive voice processing system comprising:
receiving a voice signal from user interaction;
extracting a plurality of formant values from the voice signal over a time period;
calculating an average of said extracted formant values in the frequency domain over said time period;
locating a reference characteristic matching said average; and
using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.
Dependent claims: 2, 3, 4, 5, 6, 7
determining a response to the user based on the estimated text of the voice signal.
7. A method as claimed in claim 6 wherein the response is based on performing a search using keywords of the estimated text.
8. A method for processing in an interactive voice processing system comprising:
receiving a voice signal from user interaction;
extracting a plurality of formant values from the voice signal;
calculating an average of said formant values in the frequency domain, wherein a first and second formant (formant centroid) and average excursion from the centroid are extracted and averaged;
locating a reference characteristic matching said average;
using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal, wherein the text associated with the closest reference represents the keywords of the voice signal;
determining a response to the user based on the estimated text of the voice signal, wherein the response is based on performing a search using keywords of the estimated text;
performing speech recognition on the voice signal;
comparing the text of the speech recognition with the estimated text; and
using the determined response if the text are comparable.
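Claim 8 adds two ideas: a formant centroid with an average excursion from it, and a check of the early estimate against full speech recognition. The Python sketch below shows one possible reading. The excursion is assumed to be the mean Euclidean distance of each frame from the centroid, "comparable" is assumed to mean a keyword-overlap ratio above a threshold, and all function names are hypothetical.

```python
from math import hypot


def centroid_and_excursion(frames):
    """Return (F1 centroid, F2 centroid, mean distance from the centroid)."""
    if not frames:
        raise ValueError("no formant frames supplied")
    n = len(frames)
    c1 = sum(f1 for f1, _ in frames) / n
    c2 = sum(f2 for _, f2 in frames) / n
    # Assumed definition: average Euclidean excursion in the (F1, F2) plane.
    excursion = sum(hypot(f1 - c1, f2 - c2) for f1, f2 in frames) / n
    return c1, c2, excursion


def comparable(estimated, recognised, threshold=0.5):
    """Assumed test: enough estimated keywords appear in the recognised text."""
    est = set(estimated.lower().split())
    rec = set(recognised.lower().split())
    if not est:
        return False
    return len(est & rec) / len(est) >= threshold


def select_response(estimated, early_response, recognised, fallback):
    """Use the early response only if recognition agrees with the estimate."""
    if comparable(estimated, recognised):
        return early_response
    return fallback(recognised)


print(centroid_and_excursion([(400.0, 1000.0), (600.0, 1000.0)]))
print(select_response(
    "transfer funds",
    "Opening the transfer menu.",
    "i want to transfer funds today",
    fallback=lambda text: "Sorry, let me look that up: " + text,
))
```

In this sketch the early response is prepared while recognition is still running, and the recognised text only decides whether that response is kept or replaced.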
9. A method for processing in an interactive voice processing system comprising:
receiving a voice signal from user interaction;
extracting a plurality of characteristics from a first portion of the voice signal over a time period, wherein the extracted characteristics from the first portion of the voice signal include average formant information;
locating a reference characteristic matching said plurality of characteristics over said time period; and
using text associated with the closest reference characteristic as an estimate of the text of the whole voice signal.
Dependent claims: 10, 11
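Claim 9 restricts the analysis to a first portion of the voice signal and uses the match to estimate the text of the whole signal. A minimal Python sketch of that idea follows, assuming formant frames arrive as a stream at a fixed frame interval. The frame length, the portion length, the reference entries, and the squared-distance match are all illustrative assumptions, not details from the claims.

```python
from itertools import islice

# Hypothetical references: ((average F1, average F2) in Hz, full text).
REFERENCES = [
    ((300.0, 2300.0), "check my account balance please"),
    ((700.0, 1200.0), "transfer funds to my savings account"),
]


def estimate_from_first_portion(frame_stream, references,
                                frame_ms=10, portion_ms=300):
    """Match only the first portion_ms of the stream against the references."""
    n_frames = portion_ms // frame_ms
    # islice stops after n_frames, leaving the rest of the stream unread.
    head = list(islice(frame_stream, n_frames))
    if not head:
        raise ValueError("no frames in the first portion")
    avg1 = sum(f[0] for f in head) / len(head)
    avg2 = sum(f[1] for f in head) / len(head)
    best = min(
        references,
        key=lambda ref: (ref[0][0] - avg1) ** 2 + (ref[0][1] - avg2) ** 2,
    )
    return best[1]


# Five made-up frames; only the first three (30 ms) are analysed.
stream = iter([(310.0, 2280.0), (295.0, 2310.0), (305.0, 2300.0),
               (650.0, 1300.0), (720.0, 1150.0)])
text = estimate_from_first_portion(stream, REFERENCES, frame_ms=10, portion_ms=30)
remaining = list(stream)
print(text)            # prints "check my account balance please"
print(len(remaining))  # prints 2
```

The two unread frames show that the estimate was produced before the end of the utterance was processed.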
12. An interactive voice response (IVR) system comprising:
means for receiving a voice signal from user interaction;
means for extracting a plurality of formant values from the voice signal over a time period;
means for calculating an average of said extracted formant values in the frequency domain over said time period;
means for locating a reference characteristic matching said average; and
means for using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.
Dependent claims: 13, 14, 15, 16, 17, 18
19. An interactive voice response (IVR) system comprising:
means for receiving a voice signal from user interaction;
means for extracting a plurality of formant values from the voice signal;
means for calculating an average of said formant values in the frequency domain, wherein a first and second formant (formant centroid) and average excursion from the centroid are extracted and averaged over said time period;
means for locating a reference characteristic matching said average; and
means for using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal, wherein the text associated with the closest reference represents the keywords of the voice signal;
means for determining a response to the user based on the estimated text of the voice signal;
means for performing speech recognition on the voice signal;
means for comparing the text of the speech recognition with the estimated text; and
means for using the determined response if the text are comparable.
20. An interactive voice response (IVR) system comprising:
means for receiving a voice signal from a user interaction;
means for extracting a plurality of characteristics from a first portion of the voice signal over a time period, wherein the characteristics from the first portion of the voice signal include average formant information;
means for locating a reference characteristic matching said plurality of characteristics over said time period; and
means for using text associated with the closest reference characteristic as an estimate of the text of the whole voice signal.
Dependent claims: 21, 22
23. A computer program product, stored on a computer-readable storage medium, for executing computer program instructions to carry out the steps of:
receiving a voice signal from user interaction;
extracting a plurality of formant values from the voice signal over a time period;
calculating an average of said extracted formant values in the frequency domain over said time period;
locating a reference characteristic matching said average; and
using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.
Specification