System and method for applying dynamic contextual grammars and language models to improve automatic speech recognition accuracy

US 20070233488A1
Filed: 03/29/2006
Published: 10/04/2007
Est. Priority Date: 03/29/2006
Status: Active Grant

Abstract

The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
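
The fallback behavior described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class `SectionGrammars`, its method names, the 0.8 threshold, and the use of string similarity in place of acoustic matching are all assumptions made for illustration. The text hypothesis stands in for what a general-vocabulary recognizer would return.

```python
from difflib import SequenceMatcher


class SectionGrammars:
    """Toy store of per-section value lists (all names are hypothetical)."""

    def __init__(self):
        self._values = {}   # section name -> set of known values
        self.active = None  # currently loaded section grammar, if any

    def add_value(self, section, value):
        # Values may be added incrementally to an existing set.
        self._values.setdefault(section, set()).add(value.lower())

    def load(self, section):
        # Loading one section grammar replaces whichever was active.
        self.active = section

    def unload(self):
        self.active = None

    def recognize(self, hypothesis, threshold=0.8):
        """Return (text, source), where source is 'section' or 'general'."""
        candidates = self._values.get(self.active, ())
        best, best_score = None, 0.0
        for value in sorted(candidates):
            # String similarity is an assumed stand-in for a recognizer score.
            score = SequenceMatcher(None, hypothesis.lower(), value).ratio()
            if score > best_score:
                best, best_score = value, score
        if best is not None and best_score >= threshold:
            return best, "section"
        # Poor or missing match: keep the unconstrained, general result.
        return hypothesis, "general"


grammars = SectionGrammars()
grammars.add_value("allergies", "no known drug allergies")
grammars.add_value("allergies", "penicillin")
grammars.load("allergies")
print(grammars.recognize("no known drug allergy"))
print(grammars.recognize("patient reports chest pain"))
```

The first call snaps to a stored section value because the similarity clears the threshold; the second matches nothing in the section closely enough, so the general result is kept.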

19 Claims

1. A method for loading and unloading dynamically constructed and identified language model or grammar data in an automatic speech recognition system having a structured report organization, the method comprising the steps of:
- determining sections used for the structured data input;
- determining content within said sections for the structured data input;
- based on said content, creating a recognition language model data;
- determining a section status for said structured section input;
- based on said section status, loading a corresponding recognition language model or grammar data into the automatic speech recognition system, and conducting speech recognition of the structured data input using said corresponding recognition language model or grammar data.

Dependent claims (2-19):
  - 2. The method according to claim 1 wherein determining a section status further includes identifying text document sections from the structured report organization.
  - 3. The method according to claim 2 further comprising collecting the text from the identified document sections.
  - 4. The method according to claim 1 further comprising assembling sections of a document in said speech recognized structured data input.
  - 5. The method according to claim 3 further comprising determining automatic section headings.
  - 6. The method according to claim 5 further comprising combining the collected text from the identified document sections and the determined automatic section headings.
  - 7. The method according to claim 6 further comprising conducting training of section language models and section grammars based on the combined text from the identified document sections and the determined automatic section headings.
  - 8. The method according to claim 7 further comprising conducting speech recognition based on the combined text from the identified document sections and the determined automatic section headings.
  - 9. The method according to claim 8 further comprising assembling training data.
  - 10. The method according to claim 9, based upon the assembled training data, creating either a smoothed section language model, an unsmoothed section language model or section grammars list.
  - 11. The method according to claim 10, for a created smoothed section language model, conducting speech recognition with said smoothed section language model.
  - 12. The method according to claim 9, for a created unsmoothed section language model or a created section grammars list, conducting speech recognition with said unsmoothed section language model.
  - 13. The method according to claim 12 further comprising the step of conducting a confidence level evaluation.
  - 14. The method according to claim 13, where the confidence level evaluation meets a pre-determined threshold value, assembling the identified documents sections and determined automatic section headings into at least one finished document.
  - 15. The method according to claim 13, where the confidence level evaluation does not meet a predetermined threshold value, inputting a generic language model.
  - 16. The method according to claim 15 where the generic language model may be derived from a factory, site or user specific language model.
  - 17. The method according to claim 16 further comprising the step of conducting speech recognition with said generic language model.
  - 18. The method according to claim 17 further comprising the step of comparing speech recognition results from the generated section language model or section grammar list and speech recognition results from the generic language model.
  - 19. The method according to claim 18 further comprising the step of, based upon said comparison, assembling the identified documents sections and determined automatic section headings into at least one finished document.
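
Claims 9 through 18 walk from assembled training data to a smoothed or unsmoothed section language model, a confidence check, and a comparison against a generic model. The sketch below illustrates that flow under loud assumptions: the claims do not specify a model type, so a bigram model with add-one smoothing is swapped in; the function names `train_bigram_model` and `choose_result` are hypothetical; the average log-probability per bigram is an assumed stand-in for the confidence level; and a single text hypothesis is scored by both models, whereas a real system would compare separate recognition results.

```python
import math
from collections import Counter


def train_bigram_model(sentences, smoothed):
    """Return a scoring function built from section text (toy example)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(tokens[1:])
    vocab_size = len(vocab) + 1  # one extra slot for unseen words

    def log_prob(sentence):
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        total = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            count, context = bigrams[(prev, word)], contexts[prev]
            if smoothed:
                # Add-one smoothing: unseen bigrams keep a small probability.
                total += math.log((count + 1) / (context + vocab_size))
            elif count == 0:
                # Unsmoothed: an unseen bigram rules the sentence out.
                return float("-inf")
            else:
                total += math.log(count / context)
        return total

    return log_prob


def choose_result(hypothesis, section_model, generic_model, threshold):
    """Threshold check first, then comparison with the generic model."""
    n = len(hypothesis.split()) + 1  # number of bigrams scored
    section_score = section_model(hypothesis) / n
    if section_score >= threshold:
        return "section", section_score
    generic_score = generic_model(hypothesis) / n
    if generic_score > section_score:
        return "generic", generic_score
    return "section", section_score


section_text = ["no known drug allergies", "allergic to penicillin"]
generic_text = section_text + ["the patient reports chest pain"]
section_lm = train_bigram_model(section_text, smoothed=False)
generic_lm = train_bigram_model(generic_text, smoothed=True)
print(choose_result("no known drug allergies", section_lm, generic_lm, -1.0))
print(choose_result("the patient reports chest pain", section_lm, generic_lm, -1.0))
```

The unsmoothed model behaves like a strict grammar list: anything outside the section text scores negative infinity and is handed to the generic model, while the smoothed variant degrades gradually instead.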

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Dictaphone Corporation (Microsoft Corporation)
Inventors: Lapshina, Larissa; Carus, Alwin; Vemula, Raghu
Granted Patent: US 8,301,448 B2
US Class Current: 704/257
CPC Class Codes: G10L 15/183 using context dependencies,...
