Remote voice recognition
Abstract
According to one or more aspects of the present disclosure, operations related to performing captioning may include receiving, from a first user device, first audio data. The operations may further include directing the first audio data to a remotely located call-assistant device and receiving, from the call-assistant device, second audio data that is related to the first audio data and that is derived from speech of a call assistant. The operations may also include accessing, with a captioning software application, voice profile data of the call assistant and generating caption data that includes a transcription of the second audio data. The operations may also include generating, based on the transcription, screen data related to the captioning software application, in which the screen data includes the transcription. In addition, the operations may include directing the screen data to the call-assistant device and directing the caption data to the first user device.
23 Claims
1. A system comprising:
a first user device of a first user, the first user device being configured to perform operations related to a captioning session;
a call-assistant device of a call assistant of a captioning system, the call-assistant device being remotely located from the first user device; and
an administrative system communicatively coupled to and remotely located from the call-assistant device and the first user device, the administrative system being configured to:
spin up a virtual computing environment based on a golden image that is a template for the virtual computing environment, the virtual computing environment being configured to run a captioning software application and being dedicated to the call assistant;
receive, from the first user device, a request to initiate a captioning session;
establish the captioning session with the first user device;
assign the captioning session to the call assistant;
receive, from the first user device and in response to establishing the captioning session, first audio data that is derived from a second user device that is performing a communication session with the first user device;
direct the first audio data to the call-assistant device in response to the captioning session being assigned to the call assistant;
receive, from the call-assistant device, second audio data that is related to the first audio data and that is derived from speech of the call assistant;
access, with the captioning software application, voice profile data of the call assistant based on the captioning session being assigned to the call assistant and based on the virtual computing environment being dedicated to the call assistant;
generate, with the captioning software application, caption data that includes a transcription of the second audio data, the captioning software application being configured to use the accessed voice profile data to generate the caption data;
generate, based on the transcription, screen data related to the captioning software application, the screen data including the transcription;
direct the screen data to the call-assistant device; and
direct the caption data to the first user device.

Dependent claims: 2, 3, 4, 5, 6
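The provisioning step recited in claim 1 — spinning up a per-assistant virtual computing environment from a golden image — can be sketched roughly as follows. This is a minimal illustration only; the names `GOLDEN_IMAGE` and `spin_up` are assumptions, not taken from the patent.

```python
from copy import deepcopy

# The golden image is a read-only template; each spin-up clones it into an
# independent environment dedicated to exactly one call assistant.
GOLDEN_IMAGE = {
    "apps": ["captioning_software"],
    "dedicated_to": None,  # filled in per call assistant at spin-up
}

def spin_up(golden_image, call_assistant_id):
    """Clone the template so the new environment is isolated from it."""
    env = deepcopy(golden_image)
    env["dedicated_to"] = call_assistant_id
    return env

env_a = spin_up(GOLDEN_IMAGE, "assistant-1")
env_b = spin_up(GOLDEN_IMAGE, "assistant-2")
```

The deep copy keeps the template pristine: each spin-up yields a distinct environment, and dedicating one to an assistant never mutates the golden image.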
7. A system comprising:
one or more processors; and
one or more non-transitory computer-readable storage media communicatively coupled to the one or more processors and configured to store instructions that, when executed by the one or more processors, cause the system to perform operations related to a captioning session, the operations comprising:
receive, from a first user device, first audio data that is derived from a second user device that is performing a communication session with the first user device, the first user device being configured to perform operations related to a captioning session;
direct the first audio data to a remotely located call-assistant device;
receive, from the call-assistant device, second audio data that is related to the first audio data and that is derived from speech of a call assistant of the call-assistant device;
access, with a captioning software application running in a virtual computing environment, voice profile data of the call assistant;
spin up the virtual computing environment as a customized virtual computing environment for the call assistant based on the voice profile data and based on a golden image that is a template for the virtual computing environment;
generate, with the captioning software application, caption data that includes a transcription of the second audio data, the captioning software application being configured to use the accessed voice profile data to generate the caption data;
generate, based on the transcription, screen data related to the captioning software application, the screen data including the transcription;
direct the screen data to the call-assistant device; and
direct the caption data to the first user device.

Dependent claims: 8, 9, 10, 11, 12
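The data flow in claim 7 is a revoicing pipeline: far-end ("first") audio is routed to the call assistant, who re-speaks it, and the resulting "second" audio is transcribed with that assistant's voice profile. A minimal sketch, assuming stub functions throughout (all names here are illustrative, not from the patent):

```python
# Voice profile data per call assistant; a real system would hold
# speaker-dependent recognition models rather than a label.
VOICE_PROFILES = {"ca-1": {"speaker": "ca-1"}}

def transcribe(second_audio, profile):
    # Stand-in for speaker-dependent recognition tuned by the profile.
    return f"{second_audio} (profile: {profile['speaker']})"

def caption_session(first_audio, revoice, assistant_id):
    second_audio = revoice(first_audio)          # assistant hears and re-speaks
    profile = VOICE_PROFILES[assistant_id]       # access voice profile data
    caption = transcribe(second_audio, profile)  # generate caption data
    screen = {"transcription": caption}          # screen data for the CA device
    return caption, screen                       # caption goes to the user device
```

The `revoice` callable stands in for the round trip to the remote call-assistant device; passing an identity function models an assistant who repeats the audio verbatim.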
13. A method of performing captioning operations, the method being performed by a computing system and comprising:
spinning up a first virtual computing environment based on a golden image that is a template for the first virtual computing environment and based on first profile data of a first call assistant of a captioning system, the first virtual computing environment being dedicated to the first call assistant and being configured to run a first instance of a captioning software application;
spinning up a second virtual computing environment based on the golden image and based on second profile data of a second call assistant of the captioning system, the second virtual computing environment being dedicated to the second call assistant and being configured to run a second instance of the captioning software application;
assigning a first captioning session to the first call assistant;
assigning the first captioning session to the first virtual computing environment based on the first virtual computing environment being dedicated to the first call assistant and based on the first captioning session being assigned to the first call assistant;
assigning a second captioning session to the second call assistant;
assigning the second captioning session to the second virtual computing environment based on the second virtual computing environment being dedicated to the second call assistant and based on the second captioning session being assigned to the second call assistant;
receiving, by the first instance of the captioning software application, first audio data from a remotely located first call-assistant device of the first call assistant, the first audio data being derived from speech of the first call assistant;
receiving, by the second instance of the captioning software application, second audio data from a remotely located second call-assistant device of the second call assistant, the second audio data being derived from speech of the second call assistant;
generating, with the first instance of the captioning software application, first caption data that includes a first transcription of the first audio data, the first instance of the captioning software application being configured to use the first profile data to generate the first caption data;
generating, with the second instance of the captioning software application, second caption data that includes a second transcription of the second audio data, the second instance of the captioning software application being configured to use the second profile data to generate the second caption data;
generating, based on the first transcription, first screen data related to the first instance of the captioning software application, the first screen data including the first transcription;
generating, based on the second transcription, second screen data related to the second instance of the captioning software application, the second screen data including the second transcription;
directing the first screen data to the first call-assistant device;
directing the second screen data to the second call-assistant device;
directing the first caption data to a first user device participating in a first communication session with a first other user device; and
directing the second caption data to a second user device participating in a second communication session with a second other user device.

Dependent claims: 14, 15, 16, 17, 18, 19
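Claim 13 hinges on a two-step assignment: a session is first assigned to a call assistant, and then to the virtual environment dedicated to that assistant, so both lookups stay consistent. A sketch of that bookkeeping, assuming simple dict-based registries (illustrative names only):

```python
# Each call assistant has exactly one dedicated environment,
# spun up from the shared golden image.
env_of_assistant = {"ca-1": "vm-1", "ca-2": "vm-2"}

assistant_of_session = {}
env_of_session = {}

def assign(session_id, assistant_id):
    # Step 1: session -> call assistant.
    assistant_of_session[session_id] = assistant_id
    # Step 2: session -> that assistant's dedicated environment,
    # derived from step 1 rather than chosen independently.
    env_of_session[session_id] = env_of_assistant[assistant_id]

assign("session-A", "ca-1")
assign("session-B", "ca-2")
```

Deriving the environment from the assistant assignment (rather than picking it separately) is what keeps a session from ever landing on a virtual environment dedicated to a different assistant.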
20. A method of performing captioning operations, the method being performed by a computing system and comprising:
spinning up a virtual computing environment as a customized virtual computing environment for a call assistant based on voice profile data of the call assistant and based on a golden image that is a template for the virtual computing environment;
receiving, from a first user device, first audio data that is derived from a second user device participating in a communication session with the first user device, the first user device being configured to perform operations related to a captioning session;
directing the first audio data to a remotely located call-assistant device;
receiving, from the call-assistant device, second audio data that is related to the first audio data and that is derived from speech of the call assistant;
accessing, with a captioning software application running in the virtual computing environment, the voice profile data of the call assistant;
generating, with the captioning software application, caption data that includes a transcription of the second audio data, the captioning software application being configured to use the accessed voice profile data to generate the caption data; and
directing the caption data to the first user device.

Dependent claims: 21, 22, 23
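Claim 20 differs from claim 1 in that the environment is customized with the assistant's voice profile at spin-up time, rather than the profile being accessed only after a session is assigned. A minimal sketch of that variant, with illustrative names not taken from the patent:

```python
from copy import deepcopy

def spin_up_customized(golden_image, assistant_id, voice_profile):
    """Clone the golden image and bake the voice profile in up front,
    before any captioning session is established."""
    env = deepcopy(golden_image)
    env["dedicated_to"] = assistant_id
    env["voice_profile"] = voice_profile
    return env

base = {"apps": ["captioning_software"], "dedicated_to": None}
env = spin_up_customized(base, "ca-1", {"model": "speaker-dependent"})
```

Pre-loading the profile at spin-up trades a slower provisioning step for a session start that needs no profile lookup.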
Specification