Interactive voice response system

US 6,704,708 B1
Filed: 04/20/2000
Issued: 03/09/2004
Est. Priority Date: 12/02/1999
Status: Expired due to Term

First Claim

1. A method for processing in an interactive voice processing system comprising:

receiving a voice signal from user interaction;

extracting a plurality of formant values from the voice signal over a time period;

calculating an average of said extracted formant values in the frequency domain over said time period;

locating a reference characteristic matching said average; and

using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.

Abstract

This invention relates to an interactive voice recognition system and in particular to speech recognition processing within an interactive voice response (IVR) system. One problem with speech recognition in an IVR is that two time-intensive tasks, the speech recognition and forming a response based on the result of the recognition, are performed one after the other. Each process can take on the order of seconds, and the total time of the combined processes can be noticeable to the user. There is disclosed a method for processing in an interactive voice processing system comprising: receiving a voice signal from user interaction; extracting a plurality of formant values from the voice signal; calculating an average of the formants; and locating look-ahead text associated with a closest reference characteristic as an estimate of the full text of the voice signal. Thus the invention requires only acoustic analysis of a first portion of a voice signal to determine a response. Since it does not require full linguistic analysis of the signal to convert it into text, followed by a natural language analysis to extract the useful meaning from the text, considerable processing time is saved.
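
The steps named in the abstract (average the formants, find the closest reference, use its text) can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the formant values have already been extracted as per-frame (F1, F2) pairs in Hz, uses Euclidean distance as the matching measure, and the `REFERENCES` table, its values, and the function names are all hypothetical.

```python
from math import dist
from statistics import fmean

# Hypothetical reference table: average (F1, F2) in Hz -> look-ahead text.
# The numbers are invented for illustration and do not come from the patent.
REFERENCES = {
    (300.0, 2300.0): "I want to check my balance",
    (700.0, 1200.0): "I want to speak to an operator",
    (500.0, 1500.0): "I want to pay a bill",
}

def average_formants(frames):
    """Average per-frame (F1, F2) values over the analysed time period."""
    f1 = fmean(f[0] for f in frames)
    f2 = fmean(f[1] for f in frames)
    return (f1, f2)

def estimate_text(frames, references=REFERENCES):
    """Return the text attached to the reference closest to the average.

    Euclidean distance is an assumption; the source does not fix a measure.
    """
    avg = average_formants(frames)
    closest = min(references, key=lambda ref: dist(ref, avg))
    return references[closest]

# Assumed input: formant values already extracted for three frames.
frames = [(310.0, 2250.0), (290.0, 2320.0), (305.0, 2280.0)]
print(estimate_text(frames))
```

Because only an average and a table lookup are involved, the estimate is available without running full recognition first, which is the time saving the abstract describes.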

23 Claims

1. A method for processing in an interactive voice processing system comprising:
- receiving a voice signal from user interaction;
  
  extracting a plurality of formant values from the voice signal over a time period;
  
  calculating an average of said extracted formant values in the frequency domain over said time period;
  
  locating a reference characteristic matching said average; and
  
  using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. A method as claimed in claim 1 wherein only a first proportion of the voice signal is used to extract the plurality of measurements.
  - 3. A method as claimed in claim 1 wherein a first and second formant (formant centroid) and average excursion from the centroid are extracted and averaged over said time period.
  - 4. A method as claimed in claim 3 wherein the text associated with the closest reference represents the full text of the voice signal.
  - 5. A method as claimed in claim 3 wherein the text associated with the closest reference represents the keywords of the voice signal.
  - 6. A method as claimed in claim 5 further comprising the steps of:
  - 7. A method as claimed in claim 6 wherein the response is based on performing a search using keywords of the estimated text.
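
One possible reading of the "formant centroid" and "average excursion from the centroid" in claim 3 is sketched below in Python. The interpretation is an assumption: the centroid is taken as the mean (F1, F2) point over the time period, and the excursion as the mean Euclidean distance of each frame from that point. The function name and the sample frames are hypothetical.

```python
from math import dist
from statistics import fmean

def centroid_and_excursion(frames):
    """Return ((mean F1, mean F2), mean distance of the frames from that point).

    Assumes `frames` is a non-empty list of (F1, F2) pairs in Hz.
    """
    centroid = (fmean(f1 for f1, _ in frames), fmean(f2 for _, f2 in frames))
    excursion = fmean(dist(frame, centroid) for frame in frames)
    return centroid, excursion

# Four invented frames placed symmetrically around (500, 1500) Hz.
frames = [(400.0, 1500.0), (600.0, 1500.0), (500.0, 1400.0), (500.0, 1600.0)]
centroid, excursion = centroid_and_excursion(frames)
print(centroid, excursion)
```

Under this reading the pair (centroid, excursion) gives a compact summary of where the formants sit and how far they move, which could then be matched against stored references.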

8. A method for processing in an interactive voice processing system comprising:
- receiving a voice signal from user interaction;
  
  extracting a plurality of formant values from the voice signal;
  
  calculating an average of said formant values in the frequency domain, wherein a first and second formant (formant centroid) and average excursion from the centroid are extracted and averaged;
  
  locating a reference characteristic matching said average;
  
  using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal, wherein the text associated with the closest reference represents the keywords of the voice signal;
  
  determining a response to the user based on the estimated text of the voice signal, wherein the response is based on performing a search using keywords of the estimated text;
  
  performing speech recognition on the voice signal;
  
  comparing the text of the speech recognition with the estimated text; and
  
  using the determined response if the text are comparable.
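
The verification steps at the end of claim 8 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: "comparable" is modelled as a keyword-overlap (Jaccard) threshold, the stopword list is invented, the fallback to the recognised text is an addition the claim does not mention, and `recognise` and `search` are hypothetical callables supplied by the caller.

```python
# Invented stopword list, used only to reduce text to keywords.
STOPWORDS = frozenset({"i", "to", "my", "a", "the", "want"})

def keywords(text):
    """Lower-case the text and drop stopwords, returning a set of keywords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}

def comparable(estimated, recognised, threshold=0.5):
    """Hypothetical test: Jaccard overlap of the keyword sets meets a threshold."""
    a, b = keywords(estimated), keywords(recognised)
    if not a or not b:
        return False
    return len(a & b) / len(a | b) >= threshold

def respond(estimated_text, recognise, search):
    # Determine a response from the quick estimate first.
    speculative = search(keywords(estimated_text))
    # Then run the slower full recognition and compare the two texts.
    recognised_text = recognise()
    if comparable(estimated_text, recognised_text):
        return speculative
    # Assumed fallback (not in the claim): respond from the recognised text.
    return search(keywords(recognised_text))

def fake_search(kws):
    """Stand-in for a real keyword search."""
    return "results for " + ", ".join(sorted(kws))

print(respond("check my balance",
              lambda: "I want to check my account balance",
              fake_search))
```

In a real system the recognition call would presumably run while the speculative response is being prepared; the sketch runs them in sequence only to keep the control flow visible.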

9. A method for processing in an interactive voice processing system comprising:
- receiving a voice signal from user interaction;
  
  extracting a plurality of characteristics from a first portion of the voice signal over a time period, wherein the extracted characteristics from the first portion of the voice signal include average formant information;
  
  locating a reference characteristic matching said plurality of characteristics over said time period; and
  
  using text associated with the closest reference characteristic as an estimate of the text of the whole voice signal.
- Dependent claims (10, 11):
  - 10. A method as claimed in claim 9 wherein the extracted characteristics from the first portion of the voice signal include a first phoneme or first group of phonemes.
  - 11. A method as claimed in claim 10 wherein formant information and the phoneme information is used together to estimate the text of the whole voice signal.
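
Claims 9 to 11 combine average formant information from a first portion of the signal with the first phoneme. A minimal Python sketch of one way to do that is shown below. Everything specific in it is an assumption: the 25% leading fraction, the use of the phoneme as a filter before the formant match, the Euclidean distance, and the `REFERENCES` entries are all hypothetical, and the first phoneme is taken as already detected.

```python
from math import dist
from statistics import fmean

# Hypothetical references: (first phoneme, average (F1, F2) in Hz, full text).
REFERENCES = [
    ("b", (450.0, 1400.0), "balance enquiry please"),
    ("b", (650.0, 1100.0), "bill payment please"),
    ("o", (500.0, 900.0), "operator please"),
]

def first_portion(frames, fraction=0.25):
    """Keep only the leading fraction of the frames (at least one frame)."""
    n = max(1, int(len(frames) * fraction))
    return frames[:n]

def estimate_whole_text(frames, first_phoneme, references=REFERENCES):
    """Estimate the whole utterance from its first portion only."""
    head = first_portion(frames)
    avg = (fmean(f[0] for f in head), fmean(f[1] for f in head))
    # Prefer references sharing the first phoneme; use all if none match.
    candidates = [r for r in references if r[0] == first_phoneme] or references
    return min(candidates, key=lambda r: dist(r[1], avg))[2]

# Eight invented frames; only the first two fall inside the leading 25%.
frames = [(640.0, 1120.0), (660.0, 1090.0)] + [(300.0, 2200.0)] * 6
print(estimate_whole_text(frames, "b"))
```

Filtering by phoneme first narrows the candidates, so two references with similar average formants but different opening sounds are less likely to be confused.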

12. An interactive voice response (IVR) system comprising:
- means for receiving a voice signal from user interaction;
  
  means for extracting a plurality of formant values from the voice signal over a time period;
  
  means for calculating an average of said extracted formant values in the frequency domain over said time period;
  
  means for locating a reference characteristic matching said average; and
  
  means for using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.
- Dependent claims (13, 14, 15, 16, 17, 18):
  - 13. An IVR as claimed in claim 12 wherein only a first proportion of the voice signal is used to extract the plurality of measurements.
  - 14. An IVR as claimed in claim 12 wherein a first and second formant (formant centroid) and average excursion from the centroid are extracted and averaged over said time period.
  - 15. An IVR as claimed in claim 14 wherein the text associated with the closest reference represents the full text of the voice signal.
  - 16. An IVR as claimed in claim 14 wherein the text associated with the closest reference represents the keywords of the voice signal.
  - 17. An IVR as claimed in claim 16 further comprising means for determining a response to the user based on the estimated text of the voice signal.
  - 18. An IVR as claimed in claim 17 wherein the response is based on performing a search using keywords of the estimated text.

19. An interactive voice response (IVR) system comprising:
- means for receiving a voice signal from user interaction;
  
  means for extracting a plurality of formant values from the voice signal;
  
  means for calculating an average of said formant values in the frequency domain, wherein a first and second formant (formant centroid) and average excursion from the centroid are extracted and averaged over said time period;
  
  means for locating a reference characteristic matching said average; and
  
  means for using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal, wherein the text associated with the closest reference represents the keywords of the voice signal;
  
  means for determining a response to the user based on the estimated text of the voice signal;
  
  means for performing speech recognition on the voice signal;
  
  means for comparing the text of the speech recognition with the estimated text; and
  
  means for using the determined response if the text are comparable.

20. An interactive voice response (IVR) system comprising:
- means for receiving a voice signal from a user interaction;
  
  means for extracting a plurality of characteristics from a first portion of the voice signal over a time period, wherein the characteristics from the first portion of the voice signal include average formant information;
  
  means for locating a reference characteristic matching said plurality of characteristics over said time period; and
  
  means for using text associated with the closest reference characteristic as an estimate of the text of the whole voice signal.
- Dependent claims (21, 22):
  - 21. An IVR as claimed in claim 20 wherein the characteristics from the first portion of the voice signal include a first phoneme or first group of phonemes.
  - 22. An IVR as claimed in claim 21 wherein formant information and the phoneme information is used together to estimate the text of the whole voice signal.

23. A computer program product, stored on a computer-readable storage medium, for executing computer program instructions to carry out the steps of:
- receiving a voice signal from user interaction;
  
  extracting a plurality of formant values from the voice signal over a time period;
  
  calculating an average of said extracted formant values in the frequency domain over said time period;
  
  locating a reference characteristic matching said average; and
  
  using a word or words associated with the closest reference characteristic as an estimate of the text of the voice signal.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Pickering, John Brian
Primary Examiner(s): ABEBE, DANIEL DEMELASH
Application Number: US09/552,907
Time in Patent Office: 1,419 Days
Field of Search: 704/235, 704/243, 704/246, 704/249, 704/251, 704/252, 704/209, 704/207, 704/256, 704/270, 379/88.01
US Class Current: 704/235
CPC Class Codes:
G10L 15/10   using distance or distortio...
G10L 25/15   the extracted parameters be...
H04M 3/493   Interactive information ser...
