SYSTEM AND METHOD FOR OPTIMIZING SPEECH RECOGNITION AND NATURAL LANGUAGE PARAMETERS WITH USER FEEDBACK

US 20120290298A1
Filed: 05/09/2011
Published: 11/15/2012
Est. Priority Date: 05/09/2011
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.

20 Claims

1. A method comprising:
- receiving from a sender, via a processor, a speech document;
  
  capturing a context of the speech document;
  
  weighting an automatic speech recognition model based at least in part on the context of the speech document, yielding a weighted automatic speech recognition model;
  
  converting the speech document to text using the weighted automatic speech recognition model, yielding a transcript;
  
  receiving from a user a judgment of perceived accuracy of the transcript;
  
  updating the weighted automatic speech recognition model based on the judgment.
- Dependent claims 2-10:
  - 2. The method of claim 1, wherein the context of the speech document includes at least one of a name of the sender, a location of the sender, a time sent, and a subject.
  - 3. The method of claim 1, wherein weighting the automatic speech model is further based at least in part on a user profile.
  - 4. The method of claim 3, wherein the user profile includes at least one of a previous communication history and a list of contexts.
  - 5. The method of claim 4, wherein each context in the list of contexts includes an importance ranking.
  - 6. The method of claim 3, wherein the transcript receives a score based at least in part on predicted errors in conversion, the user profile, and the context of the speech document.
  - 7. The method of claim 1, wherein the weighted automatic speech recognition model assigns a saliency weight to a set of at least one word, based at least in part on a frequency of the set of at least one word, the context of the speech document, and a user profile, yielding a weighted set.
  - 8. The method of claim 7, wherein a high saliency weight indicates high predicted importance to the user.
  - 9. The method of claim 7, wherein converting the speech document to text is based on the saliency weight of at least a portion of the text.
  - 10. The method of claim 9, wherein the weighted automatic speech recognition model directs the processor to spend more effort converting high saliency text to text.
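
The weighting and updating steps of claims 1 and 7 to 10 can be pictured with a minimal sketch. The Python below is an illustration only and is not taken from the specification: `WeightedModel`, `weight_model`, `update_model`, the `boost` and `rate` parameters, and the convention that a judgment of -1 means "perceived inaccurate" are all assumptions introduced here. The recognizer itself is omitted, so the transcript is passed in as a plain list of words.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class WeightedModel:
    # Hypothetical stand-in for a weighted ASR model: word -> saliency weight.
    weights: dict = field(default_factory=dict)


def weight_model(word_counts, context_terms, boost=2.0):
    """Build saliency weights from word frequency and the captured context."""
    total = sum(word_counts.values())
    model = WeightedModel()
    for word, count in word_counts.items():
        saliency = 1.0 - count / total  # assumption: rarer words are more salient
        if word in context_terms:
            saliency *= boost           # assumption: context terms get a fixed boost
        model.weights[word] = saliency
    return model


def update_model(model, transcript_words, judgment, rate=0.1):
    """Adjust weights from a user judgment: +1 (accurate) or -1 (inaccurate).

    Assumption: a negative judgment raises the saliency of the words in the
    transcript so that later conversions give them more attention.
    """
    for word in set(transcript_words):
        if word in model.weights:
            model.weights[word] *= 1.0 - rate * judgment
    return model


counts = Counter({"the": 8, "meeting": 1, "boston": 1})
model = weight_model(counts, context_terms={"boston"})
update_model(model, ["boston", "meeting"], judgment=-1)
```

In this sketch a high weight plays the role of the "high predicted importance" of claim 8; how a real recognizer would spend more effort on such words (claim 10) is left open.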

11. A system, comprising:
- a processor;
  
  a first module configured to control the processor to receive, from a sender, a speech document;
  
  a second module configured to control the processor to capture a context of the speech document;
  
  a third module configured to filter an automatic speech recognition model based at least in part on the context of the speech document and word frequency, yielding a filtered automatic speech recognition model;
  
  a fourth module configured to convert the speech document to text applying the filtered automatic speech recognition model, yielding a transcript;
  
  a fifth module configured to receive from a user a judgment of perceived accuracy of the transcript;
  
  a sixth module configured to update the filtered automatic speech recognition model based on the judgment.
- Dependent claims 12-16:
  - 12. The system of claim 11, wherein the filtered automatic speech recognition model assigns a saliency weight to a set of at least one word, based at least in part on a frequency of the set of at least one word, the context of the speech document, and a user profile.
  - 13. The system of claim 12, wherein a high frequency of the set of at least one word yields a low saliency weight and a low frequency of the set of at least one word yields a high saliency weight.
  - 14. The system of claim 11, wherein the third module configured to filter an automatic speech recognition model is further based at least in part on a likelihood that a set of at least one word was erroneously recognized.
  - 15. The system of claim 14, wherein the likelihood that a set of at least one word was erroneously recognized is determined based on at least one of a word insertion error rate, a word deletion error rate, and a word substitution error rate.
  - 16. The system of claim 11, wherein the fifth module receives the judgment through means including at least one of a keyboard, a vocal response, a pointing device, and a touch screen.
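
Claim 15 refers to word insertion, deletion, and substitution error rates. One common way to obtain such rates, shown here as a sketch under stated assumptions rather than as the method of the specification, is a minimum-edit-distance alignment between a reference word sequence and the recognizer output. The function name `error_rates` and the choice to normalize by reference length are assumptions made for illustration.

```python
def error_rates(reference, hypothesis):
    """Return (insertion, deletion, substitution) rates per reference word."""
    ref, hyp = reference.split(), hypothesis.split()
    # cost[i][j] = (edits, insertions, deletions, substitutions)
    # for aligning ref[:i] with hyp[:j].
    cost = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)   # only deletions
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, j, 0, 0)   # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            e, a, d, s = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (e, a, d, s)          # words match, no edit
            else:
                diagonal = (e + 1, a, d, s + 1)  # substitution
            e, a, d, s = cost[i][j - 1]
            insertion = (e + 1, a + 1, d, s)
            e, a, d, s = cost[i - 1][j]
            deletion = (e + 1, a, d + 1, s)
            # Tuples compare by total edits first, so min() keeps
            # a lowest-cost alignment.
            cost[i][j] = min(diagonal, insertion, deletion)
    _, ins, dels, subs = cost[len(ref)][len(hyp)]
    n = max(len(ref), 1)
    return ins / n, dels / n, subs / n


rates = error_rates("call me at noon tomorrow", "call me noon tomorrow please")
```

Rates of this kind could feed the likelihood of erroneous recognition named in claim 14; how the filter would actually consume them is not stated in the claims.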

17. A non-transitory computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to improve an automatic speech recognition model, the instructions comprising:
- receiving from a sender a speech document;
  
  capturing, via a processor, a context of a speech document;
  
  weighting the automatic speech recognition model based at least in part on the context of the speech document, yielding a weighted automatic speech recognition model;
  
  converting the speech document to text using the weighted automatic speech recognition model, yielding a transcript;
  
  receiving from a user a judgment of perceived accuracy of the transcript;
  
  updating the weighted automatic speech recognition model based on the judgment.
- Dependent claims 18-20:
  - 18. The non-transitory computer-readable storage medium of claim 17, wherein weighting the automatic speech model is further based at least in part on a user profile, the frequency within the speech document of a set of at least one word, and a geographical location associated with the speech document.
  - 19. The non-transitory computer-readable storage medium of claim 17, the instructions further comprising:
    - storing the judgment in a database, yielding stored judgments;
      
      updating the automatic speech recognition model based at least in part on the stored judgments.
  - 20. The non-transitory computer-readable storage medium of claim 19, the instructions further comprising:
    - providing the stored judgments to a manufacturer of the automatic speech recognition model.
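
Claims 19 and 20 describe storing judgments in a database and using the stored judgments later. The sketch below shows one possible shape for that storage using Python's standard `sqlite3` module. The table name `judgments`, its columns, the numeric `score` scale, and the helper names are all hypothetical; the claims do not specify a schema or a database product.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a judgment store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS judgments ("
        "transcript_id TEXT, context TEXT, score REAL)"
    )
    return conn


def store_judgment(conn, transcript_id, context, score):
    """Persist one user judgment of perceived transcript accuracy."""
    conn.execute(
        "INSERT INTO judgments VALUES (?, ?, ?)",
        (transcript_id, context, score),
    )
    conn.commit()


def mean_score_by_context(conn):
    """Aggregate stored judgments, e.g. as input to a later model update."""
    rows = conn.execute(
        "SELECT context, AVG(score) FROM judgments GROUP BY context"
    )
    return dict(rows.fetchall())


conn = open_store()
store_judgment(conn, "t1", "voicemail", 0.5)
store_judgment(conn, "t2", "voicemail", 1.0)
store_judgment(conn, "t3", "meeting", 0.0)
summary = mean_score_by_context(conn)
```

A per-context average is only one possible aggregate; the same table could equally be exported row by row when the stored judgments are provided to another party, as in claim 20.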

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Andrej Ljolje; Diamantino Antonio Caseiro; Mazin Gilbert; Vincent Goffin; Taniya Mishra

Granted Patent: US 8,738,375 B2
US Class Current: 704/235
CPC Class Codes

G10L 15/063   Training

G10L 15/197   Probabilistic grammars, e.g...

G10L 15/22   Procedures used during a sp...
