SAMPLE CLUSTERING TO REDUCE MANUAL TRANSCRIPTIONS IN SPEECH RECOGNITION SYSTEM

US 20120158399A1
Filed: 12/21/2010
Published: 06/21/2012
Est. Priority Date: 12/21/2010
Status: Active Grant

Abstract

Techniques for grouping a plurality of samples automatically transcribed from a plurality of utterances. The method comprises forming clusters from the plurality of samples, wherein the clusters include two or more of the plurality of samples. One or more samples are selected from a cluster and manually-processed data samples for the one or more samples are obtained. A weighting factor may be assigned to the data samples based, at least in part, on the number of samples in the cluster associated with the selected data sample.
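
The flow described in the abstract (form clusters, select a sample per cluster, obtain its manual transcription, weight it by cluster size) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes clusters are formed from identical automatic transcriptions only, that the weighting factor is simply the cluster size, and that singleton groups are kept with weight 1. The names `cluster_and_weight` and `transcribe` are hypothetical, and `transcribe` stands in for the human transcription step.

```python
from collections import defaultdict


def cluster_and_weight(samples, transcribe):
    """Group identical samples, manually process one per group,
    and weight the result by the size of its group.

    `transcribe` is a hypothetical callable standing in for the
    manual transcription step; it is called once per group.
    """
    clusters = defaultdict(list)
    for sample in samples:
        clusters[sample].append(sample)

    weighted = []
    for members in clusters.values():
        representative = members[0]  # assumption: pick the first member
        weighted.append((transcribe(representative), len(members)))
    return weighted


if __name__ == "__main__":
    auto = ["pay my bill", "pay my bill", "check balance", "pay my bill"]
    # str.upper is only a placeholder for a human transcriber.
    print(cluster_and_weight(auto, str.upper))
```

In this sketch four automatic transcriptions require only two manual transcriptions, and the weight records how many original samples each manual result represents.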

194 Citations

26 Claims

1. A method of processing a plurality of training samples for an automatic speech recognition (ASR) application, the method comprising acts of:
- forming at least one cluster from the plurality of training samples, the at least one cluster including a number of the plurality of training samples, wherein the number equals two or more;
  
  selecting at least one training sample from the at least one cluster;
  
  obtaining at least one manually-processed data sample resulting from manual processing of the selected at least one training sample in the at least one cluster; and
  
  assigning, to the at least one manually-processed data sample, a weighting factor based, at least in part, on the number of training samples in the cluster associated with the selected at least one manually-processed data sample.
  - 2. The method of claim 1, wherein selecting at least one training sample comprises selecting a single training sample from the at least one cluster.
  - 3. The method of claim 1, wherein selecting at least one training sample comprises selecting a plurality of training samples from the at least one cluster.
  - 4. The method of claim 3, further comprising:
    - determining a number of training samples to select from the at least one cluster based, at least in part, on the number of training samples in the at least one cluster.
  - 5. The method of claim 1, wherein selecting at least one training sample from the at least one cluster comprises selecting at least m training samples from the at least one cluster having N training samples in the cluster, wherein m is determined according to the formula:
    - m=log(N).
  - 6. The method of claim 1, wherein the act of forming at least one cluster comprises:
    - forming the at least one cluster based, at least in part, on a similarity of at least two training samples.
  - 7. The method of claim 1, wherein the act of forming at least one cluster comprises:
    - identifying identical training samples in the plurality of training samples; and
      
      forming the at least one cluster to include the identified identical training samples.
  - 8. The method of claim 1, wherein the act of forming at least one cluster comprises:
    - removing at least one non-content word from at least one of the plurality of training samples prior to forming the at least one cluster.
  - 9. The method of claim 1, wherein the act of forming at least one cluster comprises:
    - identifying, based, at least in part, on stored data indicating at least one mapping sequence, a simple synonym for a word and/or phrase in at least one of the plurality of transcriptions;
      
      replacing the identified word and/or phrase with the simple synonym in accordance with the stored at least one mapping sequence; and
      
      grouping together in the at least one cluster, training samples that match after replacement of the word and/or phrase with the simple synonym.
  - 10. The method of claim 1, wherein the act of forming at least one cluster comprises:
    - associating saliency values with words and/or phrases in the plurality of training samples; and
      
      forming the at least one cluster based, at least in part, on the saliency values associated with the words and/or phrases in the plurality of training samples.
  - 11. The method of claim 1, wherein the act of forming at least one cluster comprises:
    - forming a plurality of clusters by processing the plurality of training samples.
  - 12. The method of claim 1, wherein the at least one manually-processed data sample includes a manual annotation for the at least one manually-processed data sample.
  - 13. The method of claim 1, further comprising:
    - generating a training corpus, wherein the training corpus includes the at least one manually-processed data sample and its assigned weighting factor.
  - 14. The method of claim 13, further comprising:
    - training at least one statistical language model using the training corpus.
  - 15. The method of claim 13, further comprising:
    - training at least one call-routing application using the training corpus.
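
Claims 5, 8 and 9 can be illustrated together: samples are normalized before clustering by dropping non-content words and mapping words to a simple synonym, and the number of samples to transcribe from a cluster of N samples follows m = log(N). The sketch below rests on several assumptions that the claims do not state: the word lists are invented for illustration, synonym mapping is done per word rather than per phrase, the logarithm is natural, and m is rounded up with a floor of one.

```python
import math
from collections import defaultdict

# Hypothetical illustration data; the claims do not list specific words.
NON_CONTENT_WORDS = {"uh", "um", "please", "the", "a"}
SYNONYMS = {"invoice": "bill", "statement": "bill"}


def cluster_key(text):
    """Normalize a sample: drop non-content words, then map synonyms."""
    words = [w for w in text.lower().split() if w not in NON_CONTENT_WORDS]
    return " ".join(SYNONYMS.get(w, w) for w in words)


def form_clusters(samples):
    """Group samples whose normalized forms match."""
    clusters = defaultdict(list)
    for sample in samples:
        clusters[cluster_key(sample)].append(sample)
    return dict(clusters)


def samples_to_select(n):
    """m = log(N), using the natural log, rounded up, never below 1.

    The log base, the rounding, and the floor are assumptions.
    """
    return max(1, math.ceil(math.log(n)))


if __name__ == "__main__":
    clusters = form_clusters(["pay the bill", "um pay invoice", "check balance"])
    for key, members in clusters.items():
        print(key, len(members), samples_to_select(len(members)))
```

Because m grows logarithmically, a cluster of 1000 matching samples would need only 7 manual transcriptions under these assumptions, which is where the reduction in manual effort comes from.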

16. At least one non-transitory computer readable storage medium encoded with a plurality of instructions that, when executed by a computer, perform a method of processing a plurality of training samples for an automatic speech recognition (ASR) application, the method comprising acts of:
- forming at least one cluster from the plurality of training samples, the at least one cluster including a number of the plurality of training samples, wherein the number equals two or more;
  
  selecting at least one training sample from the at least one cluster;
  
  obtaining at least one manually-processed data sample resulting from manual processing of the selected at least one training sample in the at least one cluster; and
  
  assigning, to the at least one manually-processed data sample, a weighting factor based, at least in part, on the number of training samples in the cluster associated with the selected at least one manually-processed data sample.
  - 17. The at least one non-transitory computer readable storage medium of claim 16, wherein the method further comprises:
    - determining a number of training samples to select from the at least one cluster based, at least in part, on the number of training samples in the at least one cluster.

18. A computer system, comprising:
- at least one storage device configured to store a plurality of instructions; and
  
  at least one processor programmed to execute the plurality of instructions to perform a method comprising acts of:
  
  forming at least one cluster from the plurality of training samples, the at least one cluster including a number of the plurality of training samples, wherein the number equals two or more;
  
  selecting at least one training sample from the at least one cluster;
  
  obtaining at least one manually-processed data sample resulting from manual processing of the selected at least one training sample in the at least one cluster; and
  
  assigning, to the at least one manually-processed data sample, a weighting factor based, at least in part, on the number of training samples in the cluster associated with the selected at least one manually-processed data sample.
  - 19. The computer system of claim 18, wherein the method further comprises an act of:
    - generating a training corpus, wherein the training corpus includes the at least one manually-processed data sample and its assigned weighting factor.
  - 20. The computer system of claim 19, wherein the method further comprises acts of:
    - implementing at least one call routing application; and
      
      training the at least one call routing application and/or a language model associated with the at least one call routing application with the training corpus.
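
One way a weighted training corpus might feed a statistical language model, as in claims 13, 14, 19 and 20, is to let each manually-processed sample contribute to the word counts in proportion to its weighting factor. The sketch below assumes a unigram model and a corpus represented as (transcription, weight) pairs; neither choice comes from the claims, and `weighted_unigram_model` is a hypothetical name.

```python
from collections import Counter


def weighted_unigram_model(corpus):
    """Estimate unigram probabilities from (transcription, weight) pairs.

    Each word occurrence counts `weight` times, so one manual
    transcription stands in for every sample in its cluster.
    """
    counts = Counter()
    for text, weight in corpus:
        for word in text.split():
            counts[word] += weight
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


if __name__ == "__main__":
    corpus = [("pay bill", 3), ("check balance", 1)]
    print(weighted_unigram_model(corpus))
```

With weights 3 and 1, the words of the first transcription receive three times the probability mass of the words in the second, approximating what training on all four original samples would have produced.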

21. A method for updating a grammar using a plurality of data samples, the method comprising:
- forming, with at least one processor, a cluster including at least two data samples of the plurality of data samples based, at least in part, on a similarity between the at least two data samples;
  
  selecting at least one data sample from the cluster;
  
  determining whether the at least one data sample is covered by the grammar; and
  
  updating the grammar based, at least in part, on the at least one data sample, when it is determined that the at least one data sample is not covered by the grammar.
  - 22. The method of claim 21, further comprising:
    - receiving a manual transcription of the at least one data sample; and
      
      updating the grammar based, at least in part, on the manual transcription.
  - 23. The method of claim 22, further comprising:
    - identifying at least one substring in the manual transcription; and
      
      updating the grammar based, at least in part, on the at least one substring.
  - 24. The method of claim 21, wherein the plurality of data samples are transcriptions of audio samples.
  - 25. The method of claim 21, further comprising:
    - transcribing a plurality of training samples to produce the plurality of data samples.
  - 26. The method of claim 21, wherein the method further comprises:
    - forming a plurality of clusters by processing the plurality of data samples.
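
The grammar-update method of claim 21 can be sketched in the same spirit. The example assumes a deliberately simple setting that the claims do not prescribe: the grammar is a set of accepted phrases, similarity means equality after lowercasing and whitespace normalization, and only clusters with at least two samples trigger an update. The function name `update_grammar` is hypothetical.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace (assumed similarity measure)."""
    return " ".join(text.lower().split())


def update_grammar(grammar, samples):
    """Add to `grammar` one phrase per cluster it does not yet cover.

    `grammar` is a set of phrases and is modified in place.
    Returns the list of phrases that were added.
    """
    clusters = defaultdict(list)
    for sample in samples:
        clusters[normalize(sample)].append(sample)

    added = []
    for members in clusters.values():
        if len(members) < 2:
            continue  # claim 21 concerns clusters of at least two samples
        phrase = normalize(members[0])  # selected sample from the cluster
        if phrase not in grammar:       # coverage check
            grammar.add(phrase)
            added.append(phrase)
    return added


if __name__ == "__main__":
    grammar = {"pay bill"}
    data = ["Pay bill", "pay bill", "cancel service", "Cancel  service", "hello"]
    print(update_grammar(grammar, data), sorted(grammar))
```

A production grammar would be a rule set rather than a phrase list, and the coverage check would be a parse attempt; the set membership test here only marks where that check would sit.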

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Tremblay, Real; Tremblay, Jerome; Andreevskaia, Alina

Granted Patent: US 8,666,726 B2
US Class Current: 704/9
CPC Class Codes:
- G10L 15/063 Training
- G10L 2015/0631 Creating reference template...
