Closed-loop command and response system for automatic communications between interacting computer systems over an audio communications channel

US 20030187655A1
Filed: 03/27/2003
Published: 10/02/2003
Est. Priority Date: 03/28/2002
Status: Active Grant

Abstract

A system and method for enabling two computer systems to communicate over an audio communications channel, such as a voice telephony connection. Such a system includes a software application that enables a user's computer to call, interrogate, download, and manage a voicemail account stored on a telephone company's computer, without human intervention. A voicemail retrieved from the telephone company's computer can be stored in a digital format on the user's computer. In such a format, the voicemail can be readily archived, or even distributed throughout a network, such as the Internet, in a digital form, such as an email attachment. Preferably a computationally efficient audio recognition algorithm is employed by the user's computer to respond to and navigate the automated audio menu of the telephone company's computer.

233 Citations

43 Claims

1. A method for enabling a user to employ a voice response (VR) management application accessible from a remote location by the user, to interact with a VR system through the VR management application, wherein the VR system provides audio command prompts to which appropriate responses must be made in order for the user to successfully interact with the VR system, to achieve a desired interaction with the VR system, the method comprising the steps of:
- (a) establishing a logical connection between the user and the remote location;
  
  (b) teaching the VR management application how to recognize and respond to the audio command prompts issued by the VR system, such that the recognition is based on a comparison of an audio command prompt received by the VR management application with stored signatures, each stored signature being based on at least a unique portion of each command prompt likely to be issued by the VR system; and
  
  (c) using the VR management application, automatically performing the following steps to enable the user to achieve the desired interaction;
  
  (i) establishing a logical connection between the VR management application and the VR system;
  
  (ii) receiving an audio signal from the VR system over the logical connection;
  
  (iii) comparing the audio signal that was received with each stored signature, to identify the command prompt being issued by the VR system;
  
  (iv) providing any required response corresponding to the command prompt issued by the VR system; and
  
  (v) repeating the steps of subparagraphs (ii)-(iv) until the desired interaction between the user and the VR system through the VR management application has been achieved.
- Dependent claims (2-12):
  - 2. The method of claim 1, wherein the step of establishing a logical connection between the VR management application and the VR system comprises the step of establishing the logical connection according to a predefined schedule, so that the desired interaction occurs on a scheduled basis.
  - 3. The method of claim 1, wherein the desired interaction comprises retrieving voicemail of the user from the VR system.
  - 4. The method of claim 3, wherein after the VR management application has retrieved the voicemail, further comprising the step of emailing the voicemail to an email account of the user.
  - 5. The method of claim 3, wherein after the VR management application has retrieved the voicemail, further comprising the step of forwarding the voicemail to the user through a telephone call.
  - 6. The method of claim 3, wherein after the VR management application has retrieved the voicemail, further comprising the step of forwarding the voicemail to an answering machine accessed by the user.
  - 7. The method of claim 3, wherein once the VR management application has retrieved the voicemail, further comprising the steps of storing the voicemail that has been retrieved so that the voicemail can subsequently be accessed by the user.
  - 8. The method of claim 7, further comprising the step of enabling the user to access the voicemail that has been retrieved and stored through a telephone call.
  - 9. The method of claim 7, further comprising the step of enabling the user to access the voicemail that has been retrieved and stored, using a Web browser.
  - 10. The method of claim 1, wherein the desired interaction comprises forwarding an audio message from the user to the VR system.
  - 11. The method of claim 1, wherein the step of teaching the VR management application how to recognize and respond to the audio command prompts issued by the VR system comprises the step of generating a corresponding discrete Fourier transform (DFT) for each audio command prompt likely to be issued by the VR system, each stored signature including the corresponding DFT.
  - 12. The method of claim 11, wherein the step of comparing the audio signal received with each stored signature comprises the steps of generating at least one DFT from the audio signal received, and comparing each DFT generated from the audio signal received with DFTs included in the stored signatures.
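
Steps (ii)-(v) of claim 1 describe a receive, compare, respond loop. The following is a minimal sketch of that control flow only, not the patented implementation. It assumes prompt recognition is delegated to a caller-supplied `identify` function and that responses are plain strings (for example DTMF digits); `run_interaction`, `identify`, `responses`, `send`, and `done_prompt` are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical illustration of steps (ii)-(v) of claim 1; none of these
# names come from the patent text.


def run_interaction(
    audio_chunks: Iterable[List[float]],
    identify: Callable[[List[float]], Optional[str]],
    responses: Dict[str, str],
    send: Callable[[str], None],
    done_prompt: str,
) -> bool:
    """Receive audio, identify the prompt, respond, and repeat until done."""
    for chunk in audio_chunks:        # (ii) receive an audio signal
        prompt = identify(chunk)      # (iii) compare with stored signatures
        if prompt is None:
            continue                  # nothing recognized; keep listening
        if prompt == done_prompt:
            return True               # desired interaction achieved
        reply = responses.get(prompt)
        if reply is not None:
            send(reply)               # (iv) provide the required response
    return False                      # audio ended before completion


# Toy stand-ins: each "chunk" is a one-element list whose value names a prompt.
labels = {1.0: "enter_mailbox", 2.0: "enter_pin", 3.0: "main_menu"}
sent: List[str] = []
ok = run_interaction(
    audio_chunks=[[0.0], [1.0], [2.0], [3.0]],
    identify=lambda chunk: labels.get(chunk[0]),
    responses={"enter_mailbox": "5551234#", "enter_pin": "0000#"},
    send=sent.append,
    done_prompt="main_menu",
)
```

In this toy run the unrecognized first chunk is skipped, two responses are sent, and the loop ends when the hypothetical `main_menu` prompt is identified.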

13. A method for enabling a user to use a Web site to retrieve audio messages left for the user through an audio message facility, wherein the audio message facility provides audio command prompts to which appropriate responses must be made in order to retrieve any audio messages left for the user at the audio message facility, the method comprising the steps of:
- (a) establishing a logical connection between the user and the Web site;
  
  (b) teaching a voice response (VR) management application associated with the Web site how to recognize and provide a correct response to the audio command prompts issued by the audio message facility, where recognition of each audio command prompt is based on a comparison of an audio command prompt with stored signatures, each stored signature being based on at least a portion of each command prompt likely to be issued by the audio message facility; and
  
  (c) performing the following steps to enable the user to access any messages left for the user at the audio message facility;
  
  (i) establishing a logical connection between the VR management application and the audio message facility;
  
  (ii) receiving an audio signal from the audio message facility with the VR management application;
  
  (iii) comparing the audio signal received with each stored signature, to identify the command prompt being issued by the audio message facility;
  
  (iv) using the VR management application for providing any required response to the command prompt that was identified;
  
  (v) repeating the steps identified in subparagraphs (ii)-(iv) until access to messages stored by the audio message facility is obtained;
  
  (vi) retrieving the messages for the user from the audio message facility, so that the messages are available at the Web site;
  
  (vii) converting the messages into a digital format; and
  
  (viii) enabling the user to access the messages in digital format through the Web site.

14. A computationally efficient method for recognizing audio signals, comprising the steps of:
- (a) generating a plurality of known discrete Fourier transforms (DFTs), each known DFT being generated from a different specific audio signal;
  
  (b) receiving an audio signal;
  
  (c) generating at least one unknown DFT from the audio signal received; and
  
  (d) comparing the at least one unknown DFT generated from the audio signal received, with the known DFTs, to identify the audio signal that was received based on a best match of the unknown DFT to one of the known DFTs.
- Dependent claims (15-20):
  - 15. The method of claim 14, further comprising the steps of:
    - (a) storing the audio signal received in an audio buffer; and
      
      (b) separating contents of the audio buffer into a plurality of equally sized samples, before generating the at least one unknown DFT, wherein the step of generating the at least one unknown DFT comprises the step of generating an unknown DFT for each sample, thereby generating a plurality of unknown DFTs that are then compared to the known DFTs to determine the best match.
  - 16. The method of claim 14, further comprising the step of executing a predefined action associated with said one of the known DFTs that was identified in the comparison of the at least one unknown DFT with the known DFTs.
  - 17. The method of claim 16, wherein a different predefined action is associated with each known DFT.
  - 18. The method of claim 14, wherein the step of generating a plurality of known DFTs comprises the steps of determining the audio signals that are likely to be received, and generating a known DFT for each such audio signal.
  - 19. The method of claim 18, wherein the step of generating a known DFT for each such audio signal comprises the steps of:
    - (a) selecting a specific portion of each such audio signal as a reference portion;
      
      (b) separating the reference portion and the portion of the audio signal preceding the reference portion into a plurality of samples;
      
      (c) generating a DFT for each of the plurality of samples; and
      
      (d) selecting one DFT for one of the samples corresponding to the reference portion as the known DFT, based upon a comparison of the samples for the reference portion with the samples preceding the reference portion.
  - 20. A memory medium storing machine instructions for carrying out the steps of claim 14.
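
Claim 14 identifies a received signal by the best match between its DFT and a set of known DFTs. The sketch below illustrates that idea under stated assumptions: the claim does not say how a "best match" is scored, so cosine similarity between magnitude spectra is an assumption here, and the DFT is evaluated directly in O(N^2) for clarity where a real system would likely use an FFT. The names `dft_magnitudes`, `cosine_similarity`, `best_match`, and `tone` are hypothetical.

```python
import cmath
import math
from typing import Dict, List, Tuple


def dft_magnitudes(samples: List[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2, computed directly from the definition."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(acc))
    return mags


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Assumed match score: 1.0 for identical spectral shape, 0.0 for disjoint."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(
    unknown: List[float], known: Dict[str, List[float]]
) -> Tuple[str, float]:
    """Return the name and score of the known DFT closest to the unknown signal."""
    spectrum = dft_magnitudes(unknown)
    scored = (
        (name, cosine_similarity(spectrum, ref)) for name, ref in known.items()
    )
    return max(scored, key=lambda pair: pair[1])


def tone(freq_bin: int, n: int = 64) -> List[float]:
    """Synthetic test signal: a sine wave centred on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


known = {
    "prompt_a": dft_magnitudes(tone(4)),
    "prompt_b": dft_magnitudes(tone(9)),
}
# An attenuated copy of the second tone should still match prompt_b.
name, score = best_match([0.8 * x for x in tone(9)], known)
```

Because cosine similarity ignores overall amplitude, the attenuated signal still scores close to 1.0 against its reference, which is one reason it is a plausible choice for signals received over a telephone line.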

21. A system for recognizing an audio signal, comprising:
- (a) a memory in which are stored;
  
  (i) a plurality of machine instructions defining an audio recognition program; and
  
  (ii) a plurality of known discrete Fourier transforms (DFTs), each known DFT corresponding to a different specific audio signal; and
  
  (b) a processor that is coupled to the memory, to access the machine instructions and the known DFTs, said processor executing said machine instructions and thereby implementing a plurality of functions, including;
  
  (i) receiving an audio signal;
  
  (ii) generating at least one unknown DFT based on the audio signal that was received; and
  
  (iii) comparing the at least one unknown DFT with the known DFTs, to identify the audio signal received based on selecting a best match with one of the known DFTs, to determine a specific audio signal corresponding to said one of the known DFTs.

22. A method for interacting with a voice response (VR) system accessible via at least one of a telephonic connection and a network connection, where the VR system provides audio command prompts to which appropriate responses must be made in order to successfully interact with the VR system, the method comprising the steps of:
- (a) connecting a computing device including an interaction management application to the VR system using said at least one of the telephonic connection and the network connection;
  
  (b) receiving an audio communication from the VR system;
  
  (c) generating at least one discrete Fourier transform (DFT) for the audio communication that was received;
  
  (d) comparing the at least one DFT with known DFTs, each known DFT corresponding to a command prompt likely to be received from the VR system;
  
  (e) providing the VR system any required response, if an acceptable level of correlation exists between said at least one DFT for the audio communication that was received and a known DFT; and
  
  (f) repeating the steps defined in subparagraphs (b)-(f) until the desired interaction has been achieved between the computing device and the VR system.
- Dependent claims (23-32):
  - 23. The method of claim 22, further comprising the step of teaching the computing device how to recognize and respond to each command prompt likely to be received from the VR system.
  - 24. The method of claim 23, wherein the step of teaching the computing device how to recognize and respond to each command prompt comprises the steps of:
    - (a) establishing a logical connection between the computing device and the VR system;
      
      (b) receiving an audio communication comprising a command prompt from the VR system;
      
      (c) generating at least one DFT based on the command prompt that was received;
      
      (d) enabling a user to indicate the correct response to the command prompt;
      
      (e) storing the DFT corresponding to the command prompt and a program script enabling the computing device to duplicate the correct response; and
      
      (f) repeating the steps defined in claim 24, subparagraphs (b)-(e), until a DFT and program script have been stored for all command prompts likely to be received from the VR system.
  - 25. The method of claim 22, wherein:
    - (a) the step of receiving the audio communication from the VR system comprises the steps of;
      
      (i) storing the audio communication in at least one audio buffer; and
      
      (ii) separating each audio buffer into a plurality of window buffers;
      
      (b) wherein the step of generating at least one DFT based on the audio communication comprises the step of generating a DFT for each window buffer; and
      
      (c) wherein the step of comparing the at least one DFT with at least one known DFT comprises the step of comparing each window buffer DFT with the at least one known DFT.
  - 26. The method of claim 25, wherein the step of storing the communication in at least one audio buffer comprises the steps of;
    - (a) providing two identically sized audio buffers, each sized to accommodate N samples, N being selected to achieve a desired time resolution; and
      
      (b) sequentially filling each audio buffer with N samples of the audio communication, such that a first audio buffer is filled with relatively older samples, and a second audio buffer is filled with relatively newer samples, in time.
  - 27. The method of claim 25, wherein the step of separating each audio buffer into a plurality of window buffers comprises the steps of;
    - (a) dividing each audio buffer into X identically sized sample windows, where X is equal to N divided by W, each sample window being of size W, such that each sample window includes a whole number of samples, and X is a positive whole number; and
      
      (b) iteratively generating X window buffers using the sample windows, each window buffer being of size N, such that each window buffer comprises X sample windows, and each sequential window buffer includes one sample window not present in a preceding window buffer.
  - 28. The method of claim 22, wherein the VR system is an audio message service, and wherein the desired interaction comprises retrieving audio messages for a user from the audio message service.
  - 29. The method of claim 28, further comprising the step of generating a key for each message received from the message service, said key being stored in association with the message.
  - 30. The method of claim 29, wherein the step of generating a key for each message comprises the steps of:
    - (a) generating a DFT of the message; and
      
      (b) as a function of the DFT, generating a unique key.
  - 31. The method of claim 30, further comprising the steps of checking the key for each message received against each key that was stored, and ignoring each message whose key matches a stored key, because such a match indicates that the message has previously been retrieved.
  - 32. A memory medium having machine instructions for carrying out the steps of claim 22.
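
The buffering scheme of claims 26 and 27 can be read as a sliding window over two consecutive N-sample audio buffers, advancing by one W-sample window at a time. The sketch below shows one such reading, assuming the X window buffers start at offsets 0, W, 2W, and so on within the concatenated older and newer buffers. The function name `window_buffers` and the choice of starting offset are assumptions, not details taken from the claims.

```python
from typing import List


def window_buffers(
    older: List[float], newer: List[float], w: int
) -> List[List[float]]:
    """Build X = N / W overlapping window buffers, each N samples long.

    Each successive window buffer drops the oldest W-sample window and
    takes in one W-sample window that the previous buffer did not contain.
    """
    n = len(older)
    if len(newer) != n:
        raise ValueError("both audio buffers must hold N samples")
    if w <= 0 or n % w != 0:
        raise ValueError("W must divide N so that X = N / W is a whole number")
    x = n // w
    combined = older + newer  # older samples first, newer samples after
    return [combined[i * w : i * w + n] for i in range(x)]


older = [0, 1, 2, 3, 4, 5, 6, 7]          # N = 8 relatively older samples
newer = [8, 9, 10, 11, 12, 13, 14, 15]    # N = 8 relatively newer samples
buffers = window_buffers(older, newer, w=2)  # X = 8 / 2 = 4 window buffers
```

With N = 8 and W = 2 this yields four window buffers starting at samples 0, 2, 4, and 6. A DFT would then be generated for each window buffer, as in claim 25, so that a prompt straddling the boundary between the two audio buffers can still be matched.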

33. A system for automatically interacting with a voice response (VR) system, to achieve a desired interaction with the VR system, comprising:
- (a) a memory in which a plurality of machine instructions defining a retrieval application are stored, said memory also storing a plurality of known discrete Fourier transforms (DFTs), each DFT corresponding to a command prompt likely to be received from the VR system; and
  
  (b) a processor that is coupled to the memory to access the machine instructions, said processor executing said machine instructions and thereby implementing a plurality of functions, including;
  
  (i) establishing a logical connection with the VR system;
  
  (ii) receiving an audio communication from the VR system;
  
  (iii) generating at least one DFT for the audio communication;
  
  (iv) comparing the at least one DFT with at least one known DFT, each known DFT corresponding to a different command prompt from a plurality of command prompts likely to be received from the VR system;
  
  (v) if an acceptable level of correlation exists between at least one DFT and a known DFT, then providing the VR system with any required response, said machine instructions comprising a program script required to generate any required response associated with each known DFT; and
  
  (vi) repeating the steps defined in subparagraphs (ii)-(v) until the desired interaction is achieved.

34. A method of training a computing device to interact with a voice response (VR) system, where successful interaction requires providing appropriate audio responses to audio prompts provided by the VR system, the method comprising the steps of:
- (a) executing an interaction management application on the computing device;
  
  (b) establishing a logical connection between the computing device and the VR system using one of a telephonic connection and a network connection;
  
  (c) receiving a first audio command prompt from the VR system in an audio buffer, a correct audio response to the first audio command prompt being required to navigate a menu associated with the VR system;
  
  (d) enabling a user to indicate the correct audio response, such that the correct audio response is stored by the computing device and associated with the first audio command prompt;
  
  (e) generating a discrete Fourier transform (DFT) for at least a portion of first audio command prompt, the DFT enabling the computing device to automatically recognize the first audio command prompt during a subsequent automated interaction with the VR system through a comparison with a subsequent audio signal received from the VR system;
  
  (f) generating a program script that when executed by the computing device produces the correct audio response, and associating said program script with said DFT; and
  
  (g) storing said DFT and the program script, such that said DFT and program script enable the computing device to automatically recognize the first audio command prompt and generate the correct audio response to the first audio command prompt during a subsequent automated interaction with the VR system.
- Dependent claims (35-38):
  - 35. The method of claim 34, further comprising the step of generating a DFT and a program script for each different command prompt likely to be received from the VR system, when navigating a menu associated with the VR system, thereby enabling the computing device to automatically recognize all command prompts likely to be issued by the VR system, and to generate a correct audio response for each such audio command prompt during a subsequent automated interaction with the VR system.
  - 36. The method of claim 35, further comprising the step of creating a plurality of equally sized sample buffers from the first audio command prompt before generating the DFT, wherein the step of generating the DFT comprises the step of generating a DFT for each sample buffer, thereby generating a plurality of sample buffer DFTs.
  - 37. The method of claim 36, further comprising the step of selecting a best DFT from the plurality of sample buffer DFTs, wherein the step of storing the DFT comprises the step of storing only the best DFT as the DFT associated with the first audio command prompt, so that any subsequent identification of the first audio command prompt subsequently received from the VR system is based on the best DFT that was stored.
  - 38. The method of claim 37, further comprising the step wherein the best DFT is readily distinguishable from the DFT corresponding to each different command prompt.
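
Claims 34 to 38 describe a training phase in which one "best" DFT per prompt is stored together with a program script for the response. The claims do not define how "best" is chosen beyond being readily distinguishable from other prompts, so the sketch below assumes one possible criterion: pick the candidate whose highest cosine similarity to any already stored signature is lowest. `PromptEntry`, `similarity`, `pick_most_distinct`, and `teach` are hypothetical names, and the script is represented as a plain string for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PromptEntry:
    signature: List[float]  # the stored "best" DFT magnitudes for a prompt
    script: str             # hypothetical response script, e.g. DTMF digits


def similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two magnitude spectra (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_most_distinct(
    candidates: List[List[float]], existing: List[List[float]]
) -> List[float]:
    """Choose the candidate least similar to every stored signature."""
    def worst_case(candidate: List[float]) -> float:
        return max((similarity(candidate, e) for e in existing), default=0.0)
    return min(candidates, key=worst_case)


def teach(
    store: Dict[str, PromptEntry],
    name: str,
    candidates: List[List[float]],
    script: str,
) -> None:
    """Store one signature and its associated script for a new prompt."""
    existing = [entry.signature for entry in store.values()]
    store[name] = PromptEntry(pick_most_distinct(candidates, existing), script)


store: Dict[str, PromptEntry] = {}
teach(store, "enter_pin", [[1.0, 0.0, 0.0]], "0000#")
# Two candidate spectra from the next prompt: the first nearly duplicates
# the stored "enter_pin" signature, so the second should be selected.
teach(store, "main_menu", [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]], "1")
```

Storing only one signature per prompt keeps the later comparison step cheap, at the cost of depending on how well that single window characterizes the prompt.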

39. A system that learns to automatically interact with a voice response (VR) system, to navigate a menu associated with the VR system by generating audio responses to audio command prompts provided by the VR system, the system comprising:
- (a) a memory in which a plurality of machine instructions defining a VR management application are stored; and
  
  (b) a processor that is coupled to the memory to access the machine instructions and to the display, said processor executing said machine instructions and thereby implementing a plurality of functions, including;
  
  (i) establishing a logical connection with the VR system;
  
  (ii) receiving an audio command prompt from the VR system, a correct audio response to the audio command prompt being required to navigate a menu associated with the VR system;
  
  (iii) enabling a user to provide the correct audio response, such that the correct audio response is associated with the audio command prompt;
  
  (iv) generating a discrete Fourier transform (DFT) for at least a portion of the audio command prompt, the DFT being associated with the audio command prompt, thereby enabling the computing device to automatically recognize the audio command prompt during a subsequent automated interaction with the VR system;
  
  (v) generating a program script that when executed by the processor causes the processor to produce the correct audio response, and associating that program script with the DFT; and
  
  (vi) storing the DFT and the program script in the memory, so that the DFT and program script enable the processor to automatically recognize the audio command prompt and produce the correct audio response to the audio command prompt during a subsequent automated interaction with the VR system.

40. A method for enabling two computing devices to communicate using audio signals, comprising the steps of:
- (a) providing each computing device with;
  
  (i) a plurality of known audio segments, each audio segment corresponding to a specific data token; and
  
  (ii) a plurality of known discrete Fourier transforms (DFTs), each known DFT being produced from a different one of the plurality of known audio segments;
  
  (b) providing a first one of the computing devices with an input corresponding to a known data token;
  
  (c) accessing the known audio segment that corresponds to the known data token that was input at the first one of the computing devices, such that the first one of the computing devices is prepared to transmit the corresponding known audio segment to the other of the computing devices via an audio communications link;
  
  (d) transmitting the audio segment that corresponds to the known data token that was input at the first one of the computing devices to the other of the computing devices, over the audio communications link;
  
  (e) receiving the audio segment with the other of the computing devices;
  
  (f) with the other of the computing devices, generating a DFT for the audio segment received from the first one of the computing devices;
  
  (g) with the other of the computing devices, comparing the DFT generated from the audio segment received with each known DFT, thereby identifying the specific audio segment sent to the other of the computing devices by the first one of the computing devices; and
  
  (h) at the other of the computing devices, reproducing the data token that was input to the first one of the computing devices, as a function of the specific audio segment that was identified by the other of the computing devices.
- Dependent claims (41, 42):
  - 41. The method of claim 40, further comprising the step of enabling the other computing device to transmit a response to the computing device that transmitted the audio segment.
  - 42. The method of claim 40, wherein the plurality of known audio segments comprise an audio segment corresponding to each letter of the alphabet, such that each letter of the alphabet is a data token.
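
Claims 40 and 42 describe two devices that share a table of audio segments and their DFTs, one segment per data token such as a letter. The sketch below simulates both ends in one process under several assumptions: each letter is encoded as a pure tone in its own DFT bin, the audio link is lossless, and the closest known DFT is found by squared distance between magnitude spectra. `segment_for`, `transmit`, `receive`, and the constants are hypothetical.

```python
import cmath
import math
from typing import Dict, List

N = 64  # samples per audio segment (illustrative value)
TOKENS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def dft_magnitudes(samples: List[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2, computed directly from the definition."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def segment_for(index: int) -> List[float]:
    """Hypothetical audio segment: a tone whose DFT bin encodes the token."""
    return [math.sin(2 * math.pi * (index + 2) * t / N) for t in range(N)]


# Step (a): both devices hold the same segments and the DFTs made from them.
SEGMENTS: Dict[str, List[float]] = {
    tok: segment_for(i) for i, tok in enumerate(TOKENS)
}
KNOWN_DFTS: Dict[str, List[float]] = {
    tok: dft_magnitudes(seg) for tok, seg in SEGMENTS.items()
}


def transmit(text: str) -> List[List[float]]:
    """Steps (b)-(d): look up and send the segment for each input token."""
    return [SEGMENTS[ch] for ch in text]


def distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def receive(segments: List[List[float]]) -> str:
    """Steps (e)-(h): DFT each segment, match it, reproduce the token."""
    out = []
    for seg in segments:
        spectrum = dft_magnitudes(seg)
        out.append(
            min(KNOWN_DFTS, key=lambda tok: distance(spectrum, KNOWN_DFTS[tok]))
        )
    return "".join(out)


decoded = receive(transmit("HELLO"))
```

A real audio link would add noise and attenuation, so a practical receiver would also need a rejection threshold rather than always accepting the nearest known DFT.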

43. A method for enabling two computing devices to communicate using audio signals, comprising the steps of:
- (a) providing each computing device with a plurality of known discrete Fourier transforms (DFTs), each known DFT being produced from a different specific audio signal;
  
  (b) providing one of the computing devices with a text input;
  
  (c) processing the text input, such that the computing device receiving the text input produces a phonetic spelling of the text input, letters comprising the phonetic spelling each having corresponding DFTs;
  
  (d) transmitting the phonetic spelling to the other computing device as a series of audio signals, each audio signal representing the phonetic spelling of a different letter of the text input;
  
  (e) receiving the series of audio signals with the other computing device;
  
  (f) generating DFTs for the series of audio signals received from said one of the computing devices;
  
  (g) with said other of the computing devices, determining specific audio signals corresponding to the DFTs that were generated, by comparing the DFTs that were generated with the known DFTs; and
  
  (h) reproducing the text input provided to said one of the computing devices on said other of the computing devices, as a function of the specific audio signals that were determined by said other of the computing devices.

Current Assignee: Avaya LLC (Avaya Incorporated), Avaya Management L.P. (Avaya Incorporated)
Original Assignee: GotVoice Incorporated (Avaya Incorporated)
Inventors: Dunsmuir, Martin R.M.
Granted Patent: US 7,330,538 B2
Field of Search
US Class Current

704/270
CPC Class Codes

G10L 15/26   Speech to text systems G10L...

H04M 2201/40   using speech recognition sp...

H04M 3/4936   Speech interaction details ...

H04M 3/53325   Interconnection arrangement...

H04M 3/53333   Message receiving aspects

H04M 7/006   Networks other than PSTN/IS...
