Method and apparatus for transcribing speech when a plurality of speakers are participating

US 6,996,526 B2
Filed: 01/02/2002
Issued: 02/07/2006
Est. Priority Date: 01/02/2002
Status: Expired due to Term

First Claim

1. A method for transcribing speech of a plurality of speakers, comprising:

providing said speech to a plurality of speech decoders, each of said decoders using a speaker model corresponding to a different one of said speakers and generating a confidence score for each decoded output;

selecting a decoded output based on said confidence score; and

presenting said decoded output as a string of words for the decoded output having the highest confidence score and as phones or syllables for all other decoded outputs.
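The three steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `DecodedOutput` record, the function names, and the 0-to-1 confidence scale are hypothetical, and the decoders themselves are assumed to have already run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DecodedOutput:
    """Hypothetical record for one decoder's result on a stretch of speech."""
    speaker: str        # speaker model the decoder used
    words: List[str]    # word-level hypothesis
    phones: List[str]   # phone-level hypothesis for the same audio
    confidence: float   # decoder confidence score (scale is an assumption)


def select_best(outputs: List[DecodedOutput]) -> DecodedOutput:
    """Select the decoded output with the highest confidence score."""
    if not outputs:
        raise ValueError("at least one decoded output is required")
    return max(outputs, key=lambda o: o.confidence)


def present(outputs: List[DecodedOutput]) -> List[str]:
    """Render the best output as words and every other output as phones."""
    best = select_best(outputs)
    lines = []
    for out in outputs:
        if out is best:
            lines.append(f"{out.speaker}: " + " ".join(out.words))
        else:
            # Phones are wrapped in slashes purely as a display convention.
            lines.append(f"{out.speaker}: /" + " ".join(out.phones) + "/")
    return lines
```

Calling `present` on one output per speaker model returns one display line per decoder, with only the highest-scoring line shown as words.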

Abstract

A method and apparatus are disclosed for transcribing speech when a number of speakers are participating. A number of different speech recognition systems, each with a different speaker model, are executed in parallel. When the identities of all participating speakers are known and a speaker model is available for each participant, each speech recognition system employs a different speaker model suitable for a corresponding participant. Each speech recognition system decodes the speech and generates a corresponding confidence score. The decoded output having the highest confidence score is selected for presentation to a user. When not all participating speakers are known, or when there are too many participants to implement a unique speaker model for each participant, a speaker-independent speech recognition system is employed together with a speaker-specific speech recognition system. A controller selects between the decoded outputs of the speaker-independent and speaker-specific speech recognition systems based on information received from a speaker identification system and a speaker change detector.
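The abstract does not spell out the controller's decision rule, so the following Python sketch shows only one plausible policy. The `Segment` and `Controller` names, the `enrolled` set, and the rule "switch models only when a change is detected" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Segment:
    """Hypothetical bundle of signals available for one stretch of speech."""
    si_text: str               # output of the speaker-independent recognizer
    sd_text: str               # output of the speaker-specific recognizer
    identified: Optional[str]  # speaker identification result, None if unknown
    speaker_changed: bool      # flag from the speaker change detector


class Controller:
    """Chooses between the two recognizer outputs (assumed policy)."""

    def __init__(self, enrolled: Set[str]):
        self.enrolled = enrolled           # speakers with an available model
        self.active: Optional[str] = None  # model currently in use, if any

    def choose(self, seg: Segment) -> str:
        if seg.speaker_changed:
            # On a detected change, adopt the new speaker's model only if
            # that speaker was identified and has a model available.
            self.active = seg.identified if seg.identified in self.enrolled else None
        # Prefer the speaker-specific output while a matching model is active.
        return seg.sd_text if self.active is not None else seg.si_text
```

Under this policy the speaker-independent output acts as the fallback whenever the current speaker is unknown or has no model.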

9 Claims

1. A method for transcribing speech of a plurality of speakers, comprising:

providing said speech to a plurality of speech decoders, each of said decoders using a speaker model corresponding to a different one of said speakers and generating a confidence score for each decoded output;

selecting a decoded output based on said confidence score; and

presenting said decoded output as a string of words for the decoded output having the highest confidence score and as phones or syllables for all other decoded outputs.

2. The method of claim 1, further comprising the step of aligning each of said decoded outputs in time.

3. The method of claim 1, wherein one or more of said speech decoders are on a remote server.

4. The method of claim 1, further comprising the step of presenting said selected decoded output to a user.

5. The method of claim 1, further comprising the step of manually selecting an alternate decoded output if said assigned output is incorrect.

6. The method of claim 5, further comprising the step of adapting said selecting step based on said manual selection.

7. The method of claim 1, further comprising the step of presenting several decoded outputs to a user with an indication of said corresponding confidence score.

8. The method of claim 1, further comprising the step of presenting said decoded output as a string of words if said corresponding confidence score exceeds a certain threshold and as a string of phones if said corresponding confidence score is below a certain threshold.

9. The method of claim 1, wherein said selecting step further comprises the step of determining if a decoded output includes an isolated word from a second speaker in a string of words from a first speaker.
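Two of the dependent claims lend themselves to a short illustration: the threshold-based presentation of claim 8 and the isolated-word check of claim 9. The Python sketch below is one possible reading, not the claimed method itself. The function names, the default threshold of 0.5, the treatment of a score exactly at the threshold, and the per-word speaker labels are all assumptions.

```python
from typing import List


def render(words: List[str], phones: List[str],
           confidence: float, threshold: float = 0.5) -> str:
    """Show words when confidence exceeds the threshold, phones otherwise.

    The claim does not say what happens at exactly the threshold; this
    sketch treats that case as low confidence.
    """
    if confidence > threshold:
        return " ".join(words)
    return "/" + " ".join(phones) + "/"


def isolated_words(labels: List[str]) -> List[int]:
    """Return indices of single words attributed to one speaker that sit
    between two words attributed to a different, common speaker.

    `labels` holds one hypothetical speaker label per decoded word.
    """
    hits = []
    for i in range(1, len(labels) - 1):
        # Both neighbours agree with each other but not with word i.
        if labels[i - 1] == labels[i + 1] != labels[i]:
            hits.append(i)
    return hits
```

A selection step could use the indices from `isolated_words` to flag a lone word for review or reassignment; how the flagged word is then handled is left open here.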

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Basson, Sara H.; Kanevsky, Dimitri; Fairweather, Peter Gustav; Faisman, Alexander; Sorensen, Jeffery Scott
Primary Examiner(s): Young, W. R.
Assistant Examiner(s): Vo, Huyen X.
Application Number: US10/040,406
Publication Number: US 20030125940A1
Time in Patent Office: 1,497 Days
Field of Search: 704/255-256, 704/253, 704/257, 704/235, 704/243-244, 704/239, 704/251
US Class Current: 704/231
CPC Class Codes:
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
G10L 17/00   Speaker identification or v...
