Conversational networking via transport, coding and control conversational protocols

US 7,529,675 B2
Filed: 08/18/2005
Issued: 05/05/2009
Est. Priority Date: 11/01/2000
Status: Expired due to Fees

Abstract

A system and method for implementing conversational protocols for distributed conversational networking architectures and/or distributed conversational applications, as well as real-time conversational computing between network-connected pervasive computing devices and/or servers over a computer network. The implementation of distributed conversational systems/applications according to the present invention is based, in part, on suitably defined conversational coding, transport and control protocols. The control protocols include session control protocols, protocols for exchanging speech meta-information, and speech engine remote control protocols.

152 Citations

18 Claims

1. A computer readable medium embodying instructions executable by a processor to perform a method for distributed speech processing, the method comprising:
- executing an application by a client; and
- executing audio processing engines by a server, wherein the audio processing engines are asynchronously programmed by the application to process an audio stream to perform an audio processing function, and wherein the audio processing engines process the audio stream by exchanging audio and control messages between the audio processing engines in an audio processing control flow that is decoupled from, and independent of, application control and application level exchanges, wherein the server is a speech rendering browser for receiving the audio stream from the client, performing the audio processing function for determining a resulting audio stream, and returning the resulting audio stream for playback by the client.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The computer readable medium of claim 1, wherein the application generates control messages that configure and control the audio processing engines in a manner that renders the exchange of control messages independent of the application model and location of the engines.
  - 3. The computer readable medium of claim 1, wherein the audio processing engines comprise Web services.
  - 4. The computer readable medium of claim 3, wherein the Web services are described and accessed using WSDL (Web Services Description Language).
  - 5. The computer readable medium of claim 1, wherein the control and application level exchanges are implemented using SOAP (simple object access protocol).
  - 6. The computer readable medium of claim 1, wherein WSFL (Web Services flow language) is used to specify processing flow.
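Claim 1 turns on two ideas: the client application programs the server-side engines asynchronously, and the engines then move audio and control messages among themselves in a control flow that does not involve the application. The following is a minimal sketch of that separation, not the patented implementation. The `Engine` class, `run_server`, `client_application`, the message tuples, and the toy "trim" and "echo" processing functions are all hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional, Tuple

# A message is ("control", settings_dict) or ("audio", audio_bytes).
Message = Tuple[str, object]


@dataclass
class Engine:
    """Hypothetical server-side audio processing engine."""
    name: str
    process: Callable[[bytes, Dict[str, object]], bytes]
    downstream: Optional["Engine"] = None
    config: Dict[str, object] = field(default_factory=dict)
    inbox: Deque[Message] = field(default_factory=deque)
    output: List[bytes] = field(default_factory=list)

    def post(self, message: Message) -> None:
        """Queue a message and return at once; the sender never waits."""
        self.inbox.append(message)

    def step(self) -> bool:
        """Handle one queued message; return False if the inbox is empty."""
        if not self.inbox:
            return False
        kind, payload = self.inbox.popleft()
        if kind == "control":
            self.config.update(payload)  # the engine is (re)programmed
        else:
            result = self.process(payload, self.config)
            if self.downstream is not None:
                # Engine-to-engine exchange; the application is not involved.
                self.downstream.post(("audio", result))
            else:
                self.output.append(result)
        return True


def run_server(engines: List[Engine]) -> None:
    """Server-side control flow: keep stepping engines until all are idle."""
    while any([engine.step() for engine in engines]):
        pass


def client_application(audio_stream: bytes) -> bytes:
    """Client side: post control and audio messages, then collect the result."""
    trim = Engine("trim", lambda audio, cfg: audio.strip(cfg.get("silence", b"\x00")))
    echo = Engine("echo", lambda audio, cfg: audio * cfg.get("repeat", 1))
    trim.downstream = echo

    # Asynchronous programming: these calls only enqueue messages.
    trim.post(("control", {"silence": b"\x00"}))
    echo.post(("control", {"repeat": 2}))
    trim.post(("audio", audio_stream))

    run_server([trim, echo])  # stands in for the remote server
    return echo.output[0]     # the resulting audio stream for playback
```

In this sketch the application only enqueues messages; the order in which the engines consume them and hand audio to one another is decided inside `run_server`, which is the decoupling the claim describes.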

7. A distributed speech processing system, comprising:
- a client computing device executing a conversational application; and
- a server device in communication with the client computing device over a network, the server device providing an audio I/O processing Web service, which is programmable by control messages generated by the conversational application to provide audio I/O services for the conversational application and a speech engine Web service, which is programmable by control messages generated by the conversational application to provide speech processing services for the conversational application, wherein the audio I/O and speech processing Web services are programmed to perform audio processing tasks specified by the control messages but wherein the audio I/O and speech processing Web services perform said audio processing tasks by executing a flow control that is decoupled and independent from control of the conversational application.
- Dependent claims (8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18):
  - 8. The system of claim 7, wherein the control messages are encoded using XML (eXtensible Markup Language) and wherein the control messages are exchanged using SOAP (Simple Object Access Protocol).
  - 9. The system of claim 7, wherein each service comprises interfaces that are described using WSDL (Web Services Description Language).
  - 10. The system of claim 9, wherein WSFL (Web Services flow language) is used to specify processing flow of the system.
  - 11. The system of claim 7, wherein the speech engine service provides one of automatic speech processing (ASR) services, text-to-speech (TTS) synthesis services, natural language understanding (NLU) services, and a combination thereof.
  - 12. The system of claim 7, wherein the audio I/O processing service provides speech encoding/decoding services, audio recording services, audio playback services, and a combination thereof.
  - 13. The system of claim 7, further comprising a load manager that allocates and assigns the services for the conversational application, based on control messages generated by the conversational application.
  - 14. The system of claim 7, wherein the services are programmed to negotiate uplink and downlink audio codecs for generating RTP-based audio streams.
  - 15. The system of claim 7, wherein the speech engine services are allocated to the conversational application on one of a call, session, utterance and persistent basis.
  - 16. The system of claim 7, wherein the audio I/O processing and speech engine Web services are discoverable using UDDI (Universal Description, Discovery and Integration).
  - 17. The system of claim 7, wherein services provided by the speech engine service and audio I/O processing service are defined as a collection of ports.
  - 18. The system of claim 17, wherein types of ports comprise audio in, audio out, control in, and control out.
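Claims 8, 17 and 18 describe control messages encoded in XML, exchanged over SOAP, and addressed to services defined as collections of ports. As a rough illustration only, the sketch below wraps a control message in a SOAP 1.1 envelope using Python's standard `xml.etree.ElementTree`. The SOAP envelope namespace is the standard one; the `urn:example:speech-control` namespace, the `Configure` and `Param` element names, and both function names are assumptions made for this example and do not come from the patent.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CTRL_NS = "urn:example:speech-control"                 # hypothetical namespace

# Port types listed in claim 18.
PORT_TYPES = ("audio in", "audio out", "control in", "control out")


def build_control_message(service: str, port: str, settings: dict) -> bytes:
    """Encode a control message as XML inside a SOAP envelope."""
    if port not in PORT_TYPES:
        raise ValueError(f"unknown port type: {port!r}")
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # "Configure" and "Param" are illustrative element names.
    configure = ET.SubElement(
        body, f"{{{CTRL_NS}}}Configure", {"service": service, "port": port}
    )
    for name, value in settings.items():
        param = ET.SubElement(configure, f"{{{CTRL_NS}}}Param", {"name": name})
        param.text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_control_message(data: bytes) -> tuple:
    """Decode a message built above into (service, port, settings)."""
    envelope = ET.fromstring(data)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    configure = body.find(f"{{{CTRL_NS}}}Configure")
    settings = {
        param.get("name"): param.text
        for param in configure.findall(f"{{{CTRL_NS}}}Param")
    }
    return configure.get("service"), configure.get("port"), settings
```

A conversational application could, under these assumptions, call `build_control_message("speech-engine", "control in", {"grammar": "digits"})` and send the resulting bytes to the service, which would recover the same fields with `parse_control_message`.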

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Maes, Stephane H.
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US11/206,557
Publication Number: US 20060041431A1
Time in Patent Office: 1,356 Days
Field of Search: 704/270.1, 709/227
US Class Current: 704/270.1
CPC Class Codes:
- G10L 15/30   Distributed recognition, e....
- H04L 65/1101   Session protocols
- H04L 65/1104   Session initiation protocol...
- H04L 65/1106   Call signalling protocols; ...
- H04L 65/65   Network streaming protocols...
- H04L 65/70   Media network packetisation
- H04L 67/10   in which an application is ...
- H04L 69/32   Architecture of open system...
- H04L 69/329   in the application layer [O...
- H04M 3/4938   comprising a voice browser ...
- H04N 21/6437   Real-time Transport Protoco...
- H04N 21/8106   involving special audio dat...
