System and method for generating videoconference transcriptions

US 8,630,854 B2
Filed: 08/31/2010
Issued: 01/14/2014
Est. Priority Date: 08/31/2010
Status: Active Grant

Abstract

A method for generating a transcription of a videoconference includes matching human speech of a videoconference to writable symbols. The human speech is encoded in audio data of the videoconference. The writable symbols are parsed into a plurality of statements. For each statement of the plurality of statements, user profile data stored in computer-readable memory is used to determine which participant of a plurality of participants of the videoconference is most likely the source of the statement. A transcription of the videoconference is generated that identifies for each statement the determination of which participant of the plurality of participants of the videoconference is most likely the source of the statement.
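The per-statement attribution described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Several details are assumptions made only for illustration: statements are split at sentence-ending punctuation, each statement carries a hypothetical voice feature vector, each participant's profile is a reference vector, and cosine similarity stands in for whatever matching the system actually performs. The names `Statement`, `split_statements`, `most_likely_source`, and `transcribe` are hypothetical.

```python
import math
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Statement:
    text: str
    features: List[float]  # hypothetical voice features for this statement


def split_statements(symbols: str) -> List[str]:
    """Parse writable symbols into statements.

    Assumed rule: a statement ends at '.', '?' or '!'.
    """
    parts = re.split(r"(?<=[.?!])\s+", symbols.strip())
    return [p for p in parts if p]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_likely_source(statement: Statement, profiles: Dict[str, List[float]]) -> str:
    """Pick the participant whose stored profile best matches the statement."""
    return max(profiles, key=lambda name: cosine(statement.features, profiles[name]))


def transcribe(statements: List[Statement], profiles: Dict[str, List[float]]) -> List[str]:
    """Produce one transcript line per statement, labeled with its likely source."""
    return [f"{most_likely_source(s, profiles)}: {s.text}" for s in statements]


if __name__ == "__main__":
    profiles = {"Alice": [1.0, 0.0], "Bob": [0.0, 1.0]}
    texts = split_statements("Hello all. Can you hear me?")
    statements = [Statement(texts[0], [0.9, 0.1]), Statement(texts[1], [0.2, 0.8])]
    for line in transcribe(statements, profiles):
        print(line)
```

In this sketch every statement is always assigned to some participant; the claims add a confidence threshold and a video-based fallback on top of this basic attribution step.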

20 Claims

1. A method for generating a transcription of a videoconference, comprising:
- matching human speech of a videoconference to writable symbols, the human speech encoded in audio data of the videoconference;
  
  determining a probability that a portion of the human speech matches a profile of a participant of a plurality of participants of the videoconference, the profile stored in tangible computer-readable memory;
  
  if the probability is less than a predetermined threshold, using video data of the videoconference to determine which participant of the plurality of participants of the videoconference is the most likely source of the portion of the human speech; and
  
  generating a transcription of the videoconference that identifies an association of the portion of the human speech and the participant of the plurality of participants of the videoconference determined to be the most likely source of the portion of human speech.
- Dependent claims (2, 3, 4, 5, 6, 15, 16):
  - 2. The method of claim 1, wherein the profile comprises voice profile data of the participant of the plurality of participants of the videoconference.
  - 3. The method of claim 1, wherein the profile comprises visual profile data of the participant of the plurality of participants of the videoconference.
  - 4. The method of claim 1, further comprising generating at least a portion of the profile during the videoconference.
  - 5. The method of claim 1, further comprising generating at least a portion of the profile before the videoconference begins.
  - 6. The method of claim 1, further comprising generating at least a portion of the profile after the videoconference has terminated.
  - 15. The method of claim 1, wherein the profile comprises a location of the participant.
  - 16. The method of claim 1, wherein the profile comprises an address of the participant.
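The threshold step of claim 1 can be sketched as follows. This is one possible reading, not the claimed implementation: the profile matcher and the video analysis are assumed to have already produced per-participant scores, the best profile match is the probability compared against the threshold, and the threshold value 0.7 is arbitrary (the claim only requires a predetermined threshold). `Segment`, `attribute`, and `THRESHOLD` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical value; the claim does not specify one.
THRESHOLD = 0.7


@dataclass
class Segment:
    text: str
    # Assumed output of a profile matcher: probability per participant.
    profile_probs: Dict[str, float]
    # Assumed output of video analysis (e.g. a lip-motion score) per participant.
    video_scores: Dict[str, float]


def attribute(segment: Segment, threshold: float = THRESHOLD) -> Tuple[str, str]:
    """Return (participant, evidence), where evidence is 'profile' or 'video'."""
    name, prob = max(segment.profile_probs.items(), key=lambda kv: kv[1])
    if prob < threshold:
        # The profile match is too weak, so fall back to the video data.
        name = max(segment.video_scores, key=lambda n: segment.video_scores[n])
        return name, "video"
    return name, "profile"


if __name__ == "__main__":
    clear = Segment("Let's begin.", {"Alice": 0.92, "Bob": 0.05}, {"Alice": 0.5, "Bob": 0.5})
    unclear = Segment("I agree.", {"Alice": 0.40, "Bob": 0.45}, {"Alice": 0.8, "Bob": 0.1})
    for seg in (clear, unclear):
        speaker, evidence = attribute(seg)
        print(f"{speaker} ({evidence}): {seg.text}")
```

A probability exactly equal to the threshold is treated as sufficient here, since the claim triggers the video step only when the probability is less than the threshold.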

7. A non-transitory computer-readable memory storing logic, the logic operable when executed by one or more processors to:
- match human speech of a videoconference to writable symbols, the human speech encoded in audio data of the videoconference;
  
  determine a probability that a portion of the human speech matches a profile of a participant of a plurality of participants of the videoconference, the profile stored in tangible computer-readable memory;
  
  if the probability is less than a predetermined threshold, use video data of the videoconference to determine which participant of the plurality of participants of the videoconference is the most likely source of the portion of the human speech; and
  
  generate a transcription of the videoconference that identifies for each statement the determination of which participant of the plurality of participants of the videoconference is most likely the source of the statement.
- Dependent claims (8, 9, 10, 11, 12, 17, 18):
  - 8. The non-transitory computer-readable memory of claim 7, wherein the profile comprises voice profile data of the participant of the plurality of participants of the videoconference.
  - 9. The non-transitory computer-readable memory of claim 7, wherein the profile comprises visual profile data of the participant of the plurality of participants of the videoconference.
  - 10. The non-transitory computer-readable memory of claim 7, wherein the logic is further operable when executed by the one or more processors to generate at least a portion of the profile during the videoconference.
  - 11. The non-transitory computer-readable memory of claim 7, wherein the logic is further operable when executed by the one or more processors to generate at least a portion of the profile before the videoconference begins.
  - 12. The non-transitory computer-readable memory of claim 7, wherein the logic is further operable when executed by the one or more processors to generate at least a portion of the profile after the videoconference has terminated.
  - 17. The non-transitory computer readable memory of claim 7, wherein the profile comprises a location of the participant.
  - 18. The non-transitory computer readable memory of claim 7, wherein the profile comprises an address of the participant.

13. A method for generating a transcription of a videoconference, comprising:
- matching human speech of a videoconference to writable symbols, the human speech encoded in an audio data stream of the videoconference;
  
  determining a probability that a portion of the human speech matches a voice profile of a participant of a plurality of participants of the videoconference, the voice profile stored in tangible computer-readable memory;
  
  if the probability is less than a predetermined threshold, using video data of the videoconference to determine which participant of the plurality of participants of the videoconference is the most likely source of the portion of the human speech, the video data corresponding to the portion of the human speech; and
  
  generating a transcription of the videoconference that identifies an association of the portion of the human speech and the participant of the plurality of participants determined to be the most likely source of the portion of the human speech.
- Dependent claims (14, 19, 20):
  - 14. The method of claim 13, further comprising generating the voice profile using the audio data stream of the videoconference.
  - 19. The method of claim 13, wherein the voice profile comprises a location of the participant.
  - 20. The method of claim 13, wherein the voice profile comprises an address of the participant.
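Claim 14 recites generating the voice profile from the audio data stream itself. One way this could work, shown purely as an assumption, is to keep a running mean of feature vectors extracted from audio already attributed to a participant. The claims do not say how the profile is represented or updated; the `VoiceProfile` class and its feature vectors are hypothetical.

```python
from typing import List


class VoiceProfile:
    """Running mean of voice feature vectors (an assumed profile representation)."""

    def __init__(self, dim: int) -> None:
        self.mean: List[float] = [0.0] * dim
        self.count = 0

    def update(self, features: List[float]) -> None:
        """Fold one more feature vector from the audio stream into the profile."""
        if len(features) != len(self.mean):
            raise ValueError("feature vector has the wrong dimension")
        self.count += 1
        for i, x in enumerate(features):
            # Incremental mean: new_mean = old_mean + (x - old_mean) / n
            self.mean[i] += (x - self.mean[i]) / self.count


if __name__ == "__main__":
    profile = VoiceProfile(dim=2)
    profile.update([1.0, 3.0])
    profile.update([3.0, 5.0])
    print(profile.mean, profile.count)
```

Because the update is incremental, the same routine could run during the videoconference or over a recording afterwards, without storing the earlier feature vectors.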

Current Assignee: Fujitsu Limited
Original Assignee: Fujitsu Limited
Inventors: Marvit, David L.
Primary Examiner(s): Neway, Samuel G
Application Number: US12/872,268
Publication Number: US 20120053936A1
Time in Patent Office: 1,232 Days
Field of Search: 348/14.08-14.09, 704/246-250
US Class Current: 704/246
CPC Class Codes:
G10L 15/26   Speech to text systems G10L...
G10L 17/08   Use of distortion metrics o...
G10L 17/10   Multimodal systems, i.e. ba...
H04N 7/147   Communication arrangements,...
