System and method for generating videoconference transcriptions
Abstract
A method for generating a transcription of a videoconference includes matching human speech of a videoconference to writable symbols. The human speech is encoded in audio data of the videoconference. The writable symbols are parsed into a plurality of statements. For each statement of the plurality of statements, user profile data stored in computer-readable memory is used to determine which participant of a plurality of participants of the videoconference is most likely the source of the statement. A transcription of the videoconference is generated that identifies for each statement the determination of which participant of the plurality of participants of the videoconference is most likely the source of the statement.
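The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how statements are delimited or how profile data is compared, so the sentence-based splitting rule, the `score` callable, and every name below are hypothetical.

```python
import re
from typing import Any, Callable

def split_statements(symbols: str) -> list[str]:
    """Parse recognized text into statements.

    Assumption: one statement per sentence, split after '.', '!' or '?'.
    """
    parts = re.split(r"(?<=[.!?])\s+", symbols.strip())
    return [p for p in parts if p]

def attribute_statements(
    statements: list[str],
    profiles: dict[str, Any],
    score: Callable[[Any, str], float],
) -> list[tuple[str, str]]:
    """Pair each statement with the participant whose profile scores highest.

    `profiles` maps a participant name to stored profile data, and
    `score(profile, statement)` is a hypothetical likelihood function.
    """
    transcript = []
    for statement in statements:
        speaker = max(profiles, key=lambda name: score(profiles[name], statement))
        transcript.append((speaker, statement))
    return transcript
```

The scoring function is passed in as a parameter because the abstract leaves the comparison method open; any model that returns a comparable likelihood per participant would fit this shape.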
20 Claims
1. A method for generating a transcription of a videoconference, comprising:
matching human speech of a videoconference to writable symbols, the human speech encoded in audio data of the videoconference;
determining a probability that a portion of the human speech matches a profile of a participant of a plurality of participants of the videoconference, the profile stored in tangible computer-readable memory;
if the probability is less than a predetermined threshold, using video data of the videoconference to determine which participant of the plurality of participants of the videoconference is the most likely source of the portion of the human speech; and
generating a transcription of the videoconference that identifies an association of the portion of the human speech and the participant of the plurality of participants of the videoconference determined to be the most likely source of the portion of human speech.
Dependent claims: 2, 3, 4, 5, 6, 15, 16
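The threshold step recited in claim 1 can be sketched as below. This is only an illustration: the claim does not specify how either probability is computed, so the two score dictionaries, the default threshold of 0.8, and the function name are assumptions made for the example.

```python
def identify_speaker(
    voice_scores: dict[str, float],
    video_scores: dict[str, float],
    threshold: float = 0.8,
) -> tuple[str, str]:
    """Return (participant, evidence used) for one portion of speech.

    `voice_scores` holds the probability that the portion matches each
    participant's stored profile; `video_scores` holds a hypothetical
    video-derived likelihood for the same portion.
    """
    best = max(voice_scores, key=voice_scores.get)
    # Fall back to video only when the best profile match is below the threshold.
    if voice_scores[best] < threshold:
        return max(video_scores, key=video_scores.get), "video"
    return best, "audio"
```

Returning the evidence type alongside the name is a design choice of this sketch; it lets a transcript mark which attributions relied on the video fallback.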
7. A non-transitory computer-readable memory storing logic, the logic operable when executed by one or more processors to:
match human speech of a videoconference to writable symbols, the human speech encoded in audio data of the videoconference;
determine a probability that a portion of the human speech matches a profile of a participant of a plurality of participants of the videoconference, the profile stored in tangible computer-readable memory;
if the probability is less than a predetermined threshold, use video data of the videoconference to determine which participant of the plurality of participants of the videoconference is the most likely source of the portion of the human speech; and
generate a transcription of the videoconference that identifies for each statement the determination of which participant of the plurality of participants of the videoconference is most likely the source of the statement.
Dependent claims: 8, 9, 10, 11, 12, 17, 18
13. A method for generating a transcription of a videoconference, comprising:
matching human speech of a videoconference to writable symbols, the human speech encoded in an audio data stream of the videoconference;
determining a probability that a portion of the human speech matches a voice profile of a participant of a plurality of participants of the videoconference, the voice profile stored in tangible computer-readable memory;
if the probability is less than a predetermined threshold, using video data of the videoconference to determine which participant of the plurality of participants of the videoconference is the most likely source of the portion of the human speech, the video data corresponding to the portion of the human speech; and
generating a transcription of the videoconference that identifies an association of the portion of the human speech and the participant of the plurality of participants determined to be the most likely source of the portion of the human speech.
Dependent claims: 14, 19, 20
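Claim 13 adds that the video data used corresponds to the portion of speech in question. One way to picture that correspondence is a time-window selection over video frames, sketched below. The `Frame` type, the per-participant `activity` score, and the summing rule are all hypothetical; the claim does not state what is measured in the video or how frames are aligned to audio.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Frame:
    timestamp: float            # seconds from the start of the videoconference
    activity: dict[str, float]  # hypothetical per-participant visual speaking score

def frames_for_portion(frames: list[Frame], start: float, end: float) -> list[Frame]:
    """Keep only the frames whose timestamps fall within the speech portion."""
    return [f for f in frames if start <= f.timestamp <= end]

def most_likely_from_video(
    frames: list[Frame], start: float, end: float
) -> Optional[str]:
    """Sum each participant's score over the matching frames and pick the highest.

    Returns None when no frame overlaps the portion.
    """
    totals: dict[str, float] = {}
    for frame in frames_for_portion(frames, start, end):
        for name, value in frame.activity.items():
            totals[name] = totals.get(name, 0.0) + value
    if not totals:
        return None
    return max(totals, key=totals.get)
```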
Specification