Multimodal stream processing-based cognitive collaboration system
Abstract
A collaboration system includes a stream processing engine and a Bot subsystem. The stream processing engine performs cognitive processing of multimodal input streams originated at one or more user devices in a communication session supported by a collaboration service to derive user-intent-based user requests and transmit the user requests over one or more networks. The Bot subsystem includes a stream receptor that directs the multimodal input streams from the user devices to the stream processing engine to enable the stream processing engine to derive the user requests. The Bot subsystem also includes a cognitive action interpreter to translate the user requests to action requests and issue the action requests to the collaboration service so as to initiate actions with respect to the communication session. The Bot subsystem also includes a cognitive responder to transmit, in response to the user requests, multimodal user responses to the one or more user devices.
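The first half of the flow described in the abstract (a stream receptor directing each input stream to a processor module, and the engine deriving a user request from the resulting text) can be sketched as follows. This is a minimal illustration, not the patented implementation: every name here (`StreamChunk`, `UserRequest`, `StreamProcessingEngine`, `StreamReceptor`, `derive_intent`) is hypothetical, the speech-to-text step is a stub that treats the payload as an already recognized transcript, and the intent logic is a toy keyword match standing in for a natural language processor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class StreamChunk:
    user: str
    modality: str  # "audio", "video", or "text"
    payload: str   # stand-in for real media data (assumption)


@dataclass
class UserRequest:
    user: str
    intent: str
    text: str


def speech_to_text(chunk: StreamChunk) -> str:
    # Stub: a real module would run speech recognition on audio samples.
    return chunk.payload


def passthrough_text(chunk: StreamChunk) -> str:
    # Input text needs no conversion before intent derivation.
    return chunk.payload


def derive_intent(text: str) -> Optional[str]:
    # Toy keyword matcher standing in for a natural language processor.
    lowered = text.lower()
    if "mute" in lowered:
        return "mute_participant"
    if "record" in lowered:
        return "start_recording"
    return None


class StreamProcessingEngine:
    def __init__(self) -> None:
        # One processor module per modality; video is omitted in this sketch.
        self.modules: Dict[str, Callable[[StreamChunk], str]] = {
            "audio": speech_to_text,
            "text": passthrough_text,
        }

    def derive_request(self, chunk: StreamChunk, text: str) -> Optional[UserRequest]:
        intent = derive_intent(text)
        if intent is None:
            return None
        return UserRequest(user=chunk.user, intent=intent, text=text)


class StreamReceptor:
    def __init__(self, engine: StreamProcessingEngine) -> None:
        self.engine = engine

    def receive(self, chunk: StreamChunk) -> Optional[UserRequest]:
        # Direct the chunk to the processor module registered for its modality.
        module = self.engine.modules.get(chunk.modality)
        if module is None:
            return None
        return self.engine.derive_request(chunk, module(chunk))


receptor = StreamReceptor(StreamProcessingEngine())
request = receptor.receive(StreamChunk("alice", "audio", "Please mute Bob"))
```

In this sketch the receptor owns only the routing decision, while intent derivation stays inside the engine, mirroring the division of roles the abstract describes.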
20 Claims
1. A system comprising:
a stream processing engine including a plurality of processor modules configured to perform cognitive processing of multimodal input streams including input audio, input video, and input text, originated at one or more user devices associated with a communication session supported by a collaboration service, wherein the plurality of processor modules includes a speech-to-text module to convert speech in the input audio to converted text, and a natural language processor to derive user intent from the input text and the converted text is available, and wherein the stream processing engine is further configured to derive user requests associated with the communication session based on the user intent and to transmit the user requests over one or more networks; and
a Bot subsystem configured to communicate with the stream processing engine, the collaboration service, and the one or more user devices over the one or more networks, the Bot subsystem including a collection of Bots configured as computer programs that run automated tasks over the one or more networks to implement:
a stream receptor to receive the multimodal input streams from the one or more user devices and direct the multimodal input streams to an appropriate one or ones of the plurality of processor modules of the stream processing engine to enable the stream processing engine to derive the user requests;
a cognitive action interpreter to translate the user requests to corresponding action requests and issue the action requests to the collaboration service so as to initiate actions with respect to the communication session; and
a cognitive responder to transmit, in response to the user requests, multimodal user responses, including audio, video, and text user responses to the one or more user devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method comprising:
at a plurality of processor modules of a stream processing engine, performing cognitive processing of multimodal input streams including input audio, input video, and input text, originated at one or more user devices associated with a communication session supported by a collaboration service, wherein the performing cognitive processing includes converting speech in the input audio to converted text, deriving user intent from the input text and the converted text is available, deriving user requests associated with the communication session based on the user intent, and transmitting the user requests over one or more networks; and
at a Bot subsystem configured to communicate with the stream processing engine, the collaboration service, and the one or more user devices over the one or more networks, the Bot subsystem including a collection of Bots configured as computer programs that run automated tasks over the one or more networks:
receiving the multimodal input streams from the one or more user devices and directing the multimodal input streams to an appropriate one or ones of the plurality of processor modules of the stream processing engine to enable the stream processing engine to derive the user requests;
translating the user requests to corresponding action requests and issuing the action requests to the collaboration service so as to initiate actions with respect to the communication session; and
transmitting, in response to the user requests, multimodal user responses, including audio, video, and text user responses to the one or more user devices.
Dependent claims: 13, 14, 15, 18
16. One or more non-transitory processor readable media storing instructions that, when executed by a processor, cause the processor to:
implement a stream processing engine including a plurality of processor modules, the instructions to cause the processor to implement the stream processing engine including instructions to cause the processor to perform cognitive processing of multimodal input streams including input audio, input video, and input text, originated at one or more user devices associated with a communication session supported by a collaboration service, wherein the instructions to cause the processor to implement the stream processing engine include instructions to cause the processor to convert speech in the input audio to converted text, derive user intent from the input text and the converted text is available, derive user requests associated with the communication session based on the user intent, and transmit the user requests; and
implement a collection of Bots of a Bot subsystem configured to communicate with the stream processing engine, the collaboration service, and the one or more user devices, the instructions to cause the processor to implement the collection of Bots configured as computer programs that run automated tasks over the one or more networks, including instructions to cause the processor to:
receive the multimodal input streams from the one or more user devices and direct the multimodal input streams to an appropriate one or ones of the plurality of processor modules of the stream processing engine to enable the stream processing engine to derive the user requests;
translate the user requests to corresponding action requests and issue the action requests to the collaboration service so as to initiate actions with respect to the communication session; and
transmit, in response to the user requests, multimodal user responses, including audio, video, and text user responses to the one or more user devices.
Dependent claims: 17, 19, 20
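The last two steps recited in the independent claims (translating a user request into an action request issued to the collaboration service, then transmitting a multimodal response to the user device) might look like the sketch below. It is illustrative only and assumes nothing about the actual system: `FakeCollaborationService`, `CognitiveActionInterpreter`, `CognitiveResponder`, the `INTENT_TO_ACTION` table, and the action names are all hypothetical, and the audio and video responses are placeholder strings rather than real media.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UserRequest:
    user: str
    intent: str
    target: str


@dataclass
class ActionRequest:
    action: str
    session_id: str
    target: str


class FakeCollaborationService:
    """In-memory stand-in for a collaboration service (assumption)."""

    def __init__(self) -> None:
        self.received: List[ActionRequest] = []

    def submit(self, action: ActionRequest) -> None:
        self.received.append(action)


# Hypothetical mapping from derived intents to service-level actions.
INTENT_TO_ACTION: Dict[str, str] = {
    "mute_participant": "MUTE",
    "start_recording": "RECORD_START",
}


class CognitiveActionInterpreter:
    def __init__(self, service: FakeCollaborationService, session_id: str) -> None:
        self.service = service
        self.session_id = session_id

    def handle(self, request: UserRequest) -> Optional[ActionRequest]:
        # Translate the user request, then issue it to the service.
        action_name = INTENT_TO_ACTION.get(request.intent)
        if action_name is None:
            return None
        action = ActionRequest(action_name, self.session_id, request.target)
        self.service.submit(action)
        return action


class CognitiveResponder:
    def respond(self, request: UserRequest, action: Optional[ActionRequest]) -> Dict[str, str]:
        if action is None:
            message = f"Sorry {request.user}, I could not handle that request."
        else:
            message = f"{action.action} applied to {action.target}."
        # Placeholder strings stand in for synthesized audio and rendered video.
        return {
            "text": message,
            "audio": f"<speech: {message}>",
            "video": f"<overlay: {message}>",
        }


service = FakeCollaborationService()
interpreter = CognitiveActionInterpreter(service, session_id="session-1")
responder = CognitiveResponder()

req = UserRequest(user="alice", intent="mute_participant", target="bob")
act = interpreter.handle(req)
reply = responder.respond(req, act)
```

Keeping the intent-to-action table separate from the interpreter is a design choice of this sketch; it lets new actions be added without touching the translation logic.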
Specification