System and method for optimizing speech recognition and natural language parameters with user feedback

US 9,396,725 B2
Filed: 05/27/2014
Issued: 07/19/2016
Est. Priority Date: 05/09/2011
Status: Active Grant


Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
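
The abstract does not say how the saliency weights enter the model. As a minimal sketch only, assuming the weights are applied when rescoring candidate transcripts, the idea might look like the following. The function name `rescore`, the additive bonus, and the neutral weight of 1.0 are illustrative assumptions, not details from the patent.

```python
from typing import Dict, List, Tuple


def rescore(
    hypotheses: List[Tuple[str, float]],
    saliency: Dict[str, float],
    neutral: float = 1.0,
) -> str:
    """Return the candidate transcript with the best saliency-adjusted score.

    hypotheses: (text, base_score) pairs, higher base_score is better.
    saliency:   hypothetical per-word weights; words not listed are neutral.
    """

    def adjusted(item: Tuple[str, float]) -> float:
        text, base_score = item
        # Assumption: each word shifts the score by how far its saliency
        # weight sits above (or below) the neutral weight.
        bonus = sum(saliency.get(w, neutral) - neutral for w in text.lower().split())
        return base_score + bonus

    return max(hypotheses, key=adjusted)[0]
```

With candidates `("call me at noon", -4.0)` and `("call me at five", -4.2)` and a weight of 1.5 on "five", the second candidate wins after adjustment even though its base score is lower.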

20 Claims

1. A method comprising:
- weighting a first automatic speech recognition model, to yield a weighted first automatic speech recognition model;
  
  weighting a second automatic speech recognition model, to yield a weighted second automatic speech recognition model;
  
  converting, via a processor, a speech document to text using the weighted first automatic speech recognition model, to yield a first transcript;
  
  converting, via the processor, the speech document to text using the weighted second automatic speech recognition model, to yield a second transcript;
  
  receiving, from a user, a judgment of perceived accuracy of the first transcript and the second transcript; and
  
  updating, via the processor, the weighted first automatic speech recognition model and the weighted second automatic speech recognition model based on the judgment.
- Dependent claims:
  - 2. The method of claim 1, wherein the weighting of the first automatic speech recognition model and the weighting of the second automatic speech recognition model is based on a context of the speech document.
  - 3. The method of claim 2, wherein the context of the speech document comprises one of a name of an originator of the speech document.
  - 4. The method of claim 1, wherein the weighting of the first automatic speech recognition model and the weighting of the second automatic speech recognition model is based on a user profile.
  - 5. The method of claim 4, wherein the user profile comprises a list of contexts.
  - 6. The method of claim 4, wherein the user profile comprises a previous communication history.
  - 7. The method of claim 1, wherein the weighted first automatic speech recognition model and the weighted second automatic speech recognition model each contain saliency weights to words in the speech document.
  - 8. The method of claim 7, wherein a high saliency weight indicates a high predicted importance to the user.
  - 9. The method of claim 8, wherein the processor spends additional effort converting high saliency text.
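
The steps of claim 1 can be sketched as a single feedback round. This is a toy illustration under stated assumptions, not the patented implementation: the speech document is stood in for by a list of slots, each mapping candidate words to a recognizer score; each "weighted model" is reduced to a dictionary of per-word weights; and the user judgment is a callable that returns 0 or 1 to say which transcript seemed more accurate. The names `transcribe`, `update`, and `feedback_round` and the step size are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Slot = Dict[str, float]  # candidate word -> toy recognizer score


def transcribe(document: List[Slot], weights: Dict[str, float]) -> List[str]:
    """Pick, for each slot, the word with the highest weighted score."""
    return [max(slot, key=lambda w: slot[w] * weights.get(w, 1.0)) for slot in document]


def update(weights: Dict[str, float], preferred: List[str], other: List[str],
           step: float = 0.1) -> None:
    """Raise weights of words unique to the preferred transcript, lower the rest."""
    for good, bad in zip(preferred, other):
        if good != bad:
            weights[good] = weights.get(good, 1.0) + step
            weights[bad] = max(0.0, weights.get(bad, 1.0) - step)


def feedback_round(
    document: List[Slot],
    weights_a: Dict[str, float],
    weights_b: Dict[str, float],
    judge: Callable[[List[str], List[str]], int],
) -> Tuple[List[str], List[str]]:
    first = transcribe(document, weights_a)    # first transcript
    second = transcribe(document, weights_b)   # second transcript
    choice = judge(first, second)              # 0: first judged better, 1: second
    preferred, other = (first, second) if choice == 0 else (second, first)
    for weights in (weights_a, weights_b):     # both models are updated
        update(weights, preferred, other)
    return first, second
```

In this sketch the judgment is a simple preference between the two transcripts; the claim language only requires a judgment of perceived accuracy, so ratings or per-word corrections would fit equally well.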

10. A system comprising:
- a processor; and
  
  a computer-readable storage device having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  weighting a first automatic speech recognition model, to yield a weighted first automatic speech recognition model;
  
  weighting a second automatic speech recognition model, to yield a weighted second automatic speech recognition model;
  
  converting a speech document to text using the weighted first automatic speech recognition model, to yield a first transcript;
  
  converting the speech document to text using the weighted second automatic speech recognition model, to yield a second transcript;
  
  receiving, from a user, a judgment of perceived accuracy of the first transcript and the second transcript; and
  
  updating the weighted first automatic speech recognition model and the weighted second automatic speech recognition model based on the judgment.
- Dependent claims:
  - 11. The system of claim 10, wherein the weighting of the first automatic speech recognition model and the weighting of the second automatic speech recognition model is based on a context of the speech document.
  - 12. The system of claim 11, wherein the context of the speech document comprises one of a name of an originator of the speech document.
  - 13. The system of claim 10, wherein the weighting of the first automatic speech recognition model and the weighting of the second automatic speech recognition model is based on a user profile.
  - 14. The system of claim 13, wherein the user profile comprises a list of contexts.
  - 15. The system of claim 13, wherein the user profile comprises a previous communication history.
  - 16. The system of claim 10, wherein the weighted first automatic speech recognition model and the weighted second automatic speech recognition model each contain saliency weights to words in the speech document.
  - 17. The system of claim 16, wherein a high saliency weight indicates a high predicted importance to the user.
  - 18. The system of claim 17, wherein the processor spends additional effort converting high saliency words.
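
The dependent claims above tie the weights to a context (such as the originator's name) and to a user profile (a list of contexts, a previous communication history), and say the processor spends additional effort on high-saliency words. The sketch below is one naive way to picture that, with every specific an assumption: saliency is seeded from word frequency in the history, context terms and the originator's name receive a fixed boost, and "additional effort" is modeled as a wider search beam. None of these choices comes from the claims themselves.

```python
from collections import Counter
from typing import Dict, Iterable


def saliency_from_profile(
    history: Iterable[str],
    contexts: Iterable[str],
    originator: str,
    boost: float = 0.5,
) -> Dict[str, float]:
    """Build hypothetical per-word saliency weights from a user profile."""
    counts = Counter(w for message in history for w in message.lower().split())
    total = sum(counts.values()) or 1
    # Assumption: words seen more often in past communications matter more.
    weights = {w: 1.0 + c / total for w, c in counts.items()}
    # Assumption: context terms and the originator's name get a fixed boost.
    for term in list(contexts) + originator.split():
        key = term.lower()
        weights[key] = weights.get(key, 1.0) + boost
    return weights


def beam_width(word: str, weights: Dict[str, float],
               base: int = 4, threshold: float = 1.4) -> int:
    """Double the search beam for words at or above the saliency threshold."""
    return base * 2 if weights.get(word, 1.0) >= threshold else base
```

A frequency-based seed will also inflate common function words, so a practical system would need a better signal; the point here is only the shape of the data flow from profile to weights to decoding effort.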

19. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- weighting a first automatic speech recognition model, to yield a weighted first automatic speech recognition model;
  
  weighting a second automatic speech recognition model, to yield a weighted second automatic speech recognition model;
  
  converting a speech document to text using the weighted first automatic speech recognition model, to yield a first transcript;
  
  converting the speech document to text using the weighted second automatic speech recognition model, to yield a second transcript;
  
  receiving, from a user, a judgment of perceived accuracy of the first transcript and the second transcript; and
  
  updating the weighted first automatic speech recognition model and the weighted second automatic speech recognition model based on the judgment.
- Dependent claims:
  - 20. The computer-readable storage device of claim 19, wherein the weighting of the first automatic speech recognition model and the weighting of the second automatic speech recognition model is based on a context of the speech document.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Ljolje, Andrej; Caseiro, Diamantino Antonio; Gilbert, Mazin; Goffin, Vincent; Mishra, Taniya
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Chavez, Rodrigo A
Application Number: US14/287,866
Publication Number: US 20150348540A1
Time in Patent Office: 784 Days
Field of Search: 704/231, 704/235, 704/250, 704/251, 704/255, 704/256
US Class Current: 1/1
CPC Class Codes:
G10L 15/01   Assessment or evaluation of...
G10L 15/063   Training
G10L 15/18   using natural language mode...
G10L 15/26   Speech to text systems G10L...
G10L 2015/0635   updating or merging of old ...
