Conference transcription based on conference data
First Claim
1. A method comprising:
receiving, with a collaboration server device hosting a current conference for a plurality of conference participants, conference data from at least one of the plurality of conference participants, the conference data comprising a shared material that is shared among the plurality of conference participants during the current conference and that includes data indicative of words or phrases discussed among the plurality of conference participants during the current conference when referencing the shared material, the shared material being other than data identifying the conference participants and other than audio generated during the current conference;
after receiving the shared material, sending, with the collaboration server device, the words or phrases of the shared material to a speech recognition engine to update a language model of the speech recognition engine with the words or phrases in order to improve an accuracy of a transcription of an output media stream of the current conference generated by the speech recognition engine upon receiving the output media stream;
receiving, with the collaboration server device, a plurality of input media streams from the plurality of conference participants generated during the current conference;
generating, with the collaboration server device, the output media stream from the plurality of input media streams;
sending, with the collaboration server device, the output media stream to the speech recognition engine for generation of the transcription of the output media stream using the updated language model;
receiving, with the collaboration server device, from one of the plurality of conference participants, mode data indicating whether the updated language model is to be used for only the current conference or also for a future conference; and
when the mode data indicates that the updated language model is to be used also for the future conference, sending, with the collaboration server device, a command to the speech recognition engine, the command indicating to the speech recognition engine to store the updated language model for the future conference.
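The claimed method can be illustrated with a minimal sketch. All class and function names below (`SpeechRecognitionEngine`, `host_conference`, the `"current_and_future"` mode value) are hypothetical stand-ins, not terms from the patent; the set-based "language model" is a deliberate simplification of a real statistical model.

```python
# Hypothetical sketch of the claim-1 flow: a collaboration server extracts
# terms from shared (non-audio) material, pushes them into a speech
# recognition engine's language model before transcription, and persists
# the updated model when participant-supplied mode data says to.
import re


class SpeechRecognitionEngine:
    def __init__(self):
        self.language_model = set()   # stand-in for a real language model
        self.stored_models = {}       # models persisted for future conferences

    def update_language_model(self, words):
        """Bias the model toward vocabulary from the shared material."""
        self.language_model.update(words)

    def transcribe(self, media_stream):
        # A real engine would decode audio; here we simply show which
        # tokens the updated model's vocabulary would recognize.
        return [w for w in media_stream if w in self.language_model]

    def store_model(self, conference_id):
        self.stored_models[conference_id] = set(self.language_model)


def host_conference(engine, conference_id, shared_material, input_streams, mode):
    # 1. Extract words/phrases from the shared (non-audio) material.
    words = set(re.findall(r"[A-Za-z]+", shared_material.lower()))
    # 2. Update the engine's language model before transcription.
    engine.update_language_model(words)
    # 3. Mix the participants' input streams into one output media stream.
    output_stream = [token for stream in input_streams for token in stream]
    # 4. Transcribe the output stream using the updated model.
    transcript = engine.transcribe(output_stream)
    # 5. Persist the updated model if mode data covers future conferences.
    if mode == "current_and_future":
        engine.store_model(conference_id)
    return transcript
```

For example, sharing a slide titled "Project Falcon roadmap" seeds the model with "project", "falcon", and "roadmap" before any audio is transcribed, which is the accuracy improvement the claim is directed to.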
1 Assignment
0 Petitions
Abstract
In one implementation, a collaboration server is a conference bridge or other network device configured to host an audio and/or video conference among a plurality of conference participants. The collaboration server sends conference data and a media stream including speech to a speech recognition engine. The conference data may include the conference roster or text extracted from documents or other files shared in the conference. The speech recognition engine updates a default language model according to the conference data and transcribes the speech in the media stream based on the updated language model. In one example, the performance of the default language model, the updated language model, or both may be tested using a confidence interval or submitted for approval by a conference participant.
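The model-testing step mentioned at the end of the abstract can be sketched as a simple comparison of the two models' outputs against a reference. The position-wise token-accuracy metric and the `margin` parameter below are illustrative assumptions standing in for whatever confidence measure an implementation actually uses.

```python
# Hypothetical sketch of comparing the default and updated language models:
# score each model's transcription against a reference and keep the updated
# model only if it wins by a margin.

def token_accuracy(hypothesis, reference):
    """Fraction of reference tokens the hypothesis matched, position-wise."""
    matches = sum(1 for h, r in zip(hypothesis, reference) if h == r)
    return matches / len(reference) if reference else 0.0


def pick_language_model(default_out, updated_out, reference, margin=0.05):
    """Select 'updated' only when it beats the default by at least `margin`."""
    default_score = token_accuracy(default_out, reference)
    updated_score = token_accuracy(updated_out, reference)
    return "updated" if updated_score >= default_score + margin else "default"
```

The margin guards against adopting an updated model on noise alone, echoing the abstract's confidence-interval test.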
20 Claims
1. A method comprising:
receiving, with a collaboration server device hosting a current conference for a plurality of conference participants, conference data from at least one of the plurality of conference participants, the conference data comprising a shared material that is shared among the plurality of conference participants during the current conference and that includes data indicative of words or phrases discussed among the plurality of conference participants during the current conference when referencing the shared material, the shared material being other than data identifying the conference participants and other than audio generated during the current conference;
after receiving the shared material, sending, with the collaboration server device, the words or phrases of the shared material to a speech recognition engine to update a language model of the speech recognition engine with the words or phrases in order to improve an accuracy of a transcription of an output media stream of the current conference generated by the speech recognition engine upon receiving the output media stream;
receiving, with the collaboration server device, a plurality of input media streams from the plurality of conference participants generated during the current conference;
generating, with the collaboration server device, the output media stream from the plurality of input media streams;
sending, with the collaboration server device, the output media stream to the speech recognition engine for generation of the transcription of the output media stream using the updated language model;
receiving, with the collaboration server device, from one of the plurality of conference participants, mode data indicating whether the updated language model is to be used for only the current conference or also for a future conference; and
when the mode data indicates that the updated language model is to be used also for the future conference, sending, with the collaboration server device, a command to the speech recognition engine, the command indicating to the speech recognition engine to store the updated language model for the future conference.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
12. A collaboration server device comprising:
a memory storing conference data associated with a current conference received from at least one of a plurality of conference participants, the conference data comprising a shared material that is shared among the plurality of conference participants during the current conference and that includes data indicative of words or phrases discussed among the plurality of conference participants during the current conference when referencing the shared material, the shared material being other than data identifying the conference participants and other than audio generated during the current conference;
a collaboration server controller configured to:
host the current conference and allow the shared material to be shared among the plurality of conference participants during the current conference;
obtain words or phrases from the shared material stored in the memory;
generate an output media stream from a plurality of input media streams received from the plurality of conference participants;
send, via a communication interface, the output media stream and the words or phrases to a speech recognition engine to update a language model of the speech recognition engine with the words or phrases in order to improve an accuracy of a transcription of the output media stream generated by the speech recognition engine;
receive mode data indicating to store the updated language model for a future conference; and
in response to the mode data indicating to store the updated language model for a future conference, index the updated language model by conference topic with one or more prior language models based on a comparison of the shared material shared during the current conference with prior shared material shared during one or more prior conferences, wherein the memory is configured to store the updated language model according to the index.
- View Dependent Claims (13, 14, 15, 16, 17, 18)
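Claim 12's indexing limitation, comparing the current conference's shared material against prior conferences' material to file the updated model under a topic, can be sketched as follows. Jaccard similarity over term sets, the `threshold` parameter, and the `"new-topic"` fallback are all illustrative assumptions; the claim does not specify a particular comparison.

```python
# Hypothetical sketch of indexing an updated language model by conference
# topic: compare the current shared material's terms with terms shared in
# prior conferences and reuse the closest topic index if similar enough.

def jaccard(a, b):
    """Set-overlap similarity between two term collections, in [0, 1]."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def index_model_by_topic(model, current_terms, prior_conferences, threshold=0.3):
    """Return the topic key under which to store the updated model.

    `prior_conferences` maps topic -> terms shared in prior conferences.
    """
    best_topic, best_score = None, 0.0
    for topic, prior_terms in prior_conferences.items():
        score = jaccard(current_terms, prior_terms)
        if score > best_score:
            best_topic, best_score = topic, score
    # Reuse an existing topic index when similar enough; otherwise a new one.
    return best_topic if best_score >= threshold else "new-topic"
```

Storing the model under the returned topic key then satisfies the "wherein the memory is configured to store the updated language model according to the index" recitation.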
19. A non-transitory computer readable storage medium comprising computer-executable instructions comprising:
instructions executable by a collaboration server device hosting a current conference to receive shared data that is shared among a plurality of conference participants during the current conference, the shared data received from at least one of the plurality of conference participants and including data indicative of words or phrases discussed among the plurality of conference participants during the current conference when referencing the shared data and being other than data identifying the conference participants and other than audio generated during the current conference;
instructions executable by the collaboration server device to extract the words or phrases from the shared data;
instructions executable by the collaboration server device to send the words or phrases to a speech recognition engine to update a default language model with the words or phrases in order to improve an accuracy of a transcription of an output media stream from the current conference;
instructions executable by the collaboration server device to receive from one of the plurality of conference participants mode data indicating whether the updated language model is to be used for only the current conference or also for a future conference; and
instructions executable by the collaboration server device to send a command to the speech recognition engine when the mode data indicates that the updated language model is to be used also for the future conference, the command indicating to the speech recognition engine to store the updated language model for the future conference.
- View Dependent Claims (20)
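The extraction step recited in claim 19 can be sketched as a small function that pulls both individual words and multi-word phrases out of the shared data's text. The capitalized-run heuristic for phrases is an illustrative assumption, not a technique specified by the patent.

```python
# Hypothetical sketch of extracting words and phrases from shared
# (non-audio) material, e.g. text from a shared slide deck, so they can
# be sent to the speech recognition engine's language model.
import re


def extract_words_and_phrases(shared_text):
    """Return individual lowercase words plus capitalized multi-word phrases."""
    words = set(re.findall(r"[A-Za-z]+", shared_text.lower()))
    # Treat runs of capitalized words ("Project Falcon") as phrases that
    # participants are likely to speak when referencing the material.
    phrases = set(re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+", shared_text))
    return words | phrases
```

Note the input is textual shared data only; per the claim, participant-identifying data and conference audio are outside the extraction's scope.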
Specification