System and methods for improving accuracy of speech recognition

US 20060074671A1
Filed: 10/05/2004
Published: 04/06/2006
Est. Priority Date: 10/05/2004
Status: Active Grant

Abstract

The invention provides a system and method for improving speech recognition, together with a computer software system that implements them. A user may speak to the system directly, and the system may reply in spoken language. When the system is implemented for a particular application, grammar rules may be generated automatically from sample utterances. Dynamic grammar rules may also be generated during interaction between the user and the system. In addition to arranging the search order of grammar files according to a predetermined hierarchy, the system may generate a search order dynamically from the history of contexts within a single conversation, further improving speech recognition. Dialogue between the system and the user may be recorded and extracted for use by a speech recognition engine to refine or create language models, so that the accuracy of speech recognition relevant to a particular knowledge area may be improved.

35 Claims

1. A speech recognition system for providing a textual output from an audible signal representative of spoken words, said system comprising:
- a natural language processor for parsing a partially recognized sentence into a sentence type and an associated ordered list of recognized words and unrecognized sound groupings, said sentence type having an ordered list of concepts, said partially recognized sentence corresponding to the audible signal;
  
  a grammar rule generator for expanding each of said concepts at a location corresponding to one of said unrecognized sound groupings into a plurality of related words;
  
  a speech recognition engine for converting the audible signal to the textual output, said speech recognition engine being operatively connected to said plurality of related words for resolving the one of said unrecognized sound groupings.

2. A system for improving recognition accuracy of an audible signal representative of spoken words, the audible signal being converted to a textual output by a speech recognition engine, said system comprising:
- a natural language processor for parsing a sentence in a textual format into an ordered list of keywords;
  
  a grammar rule generator for expanding each keyword of said ordered list into a plurality of related words to obtain a grammar rule from said ordered list of keywords;
  
  wherein said speech recognition engine is operatively connected to said grammar rule for resolving unrecognized sound groupings in the audible signal into the corresponding spoken words in the textual output.
  - 3. The system of claim 2, further comprising an editor for preparing concept to keywords mappings, wherein said expansion of each keyword into said plurality of related words corresponds to matching each said keyword to a concept and replacing said concept with keywords using a corresponding concept to keywords mapping.
  - 4. The system of claim 2, wherein said grammar rule has a context designation assigned thereto.
  - 5. The system of claim 3, wherein said system is operable to determine a conversation context of the spoken words, and said speech recognition engine is operable to select said grammar rule if said context designation matches said conversation context.

6. A method of generating a grammar rule for use by a speech recognition engine, said method comprising the steps of:
- parsing a sample sentence using a natural language processor into an ordered list of keywords;
  
  matching each keyword of said ordered list to a concept using a concept to keywords mapping; and
  
  producing the grammar rule from said ordered list by replacing each said concept with a list of keywords using the concept to keywords mapping.
  - 7. The method according to claim 6, further comprising the step of assigning a context designation to said grammar rule.
  - 8. The method according to claim 6, wherein said concept to keywords mapping has a context attribute and the context designation assigned to said grammar rule corresponds to said context attribute.
  - 9. The method according to claim 6, further comprising the step of preparing a plurality of concept to keywords mappings.
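
The three steps of claim 6 (parse, match, replace) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the `CONCEPT_KEYWORDS` table, the concept names, and the simple word filter standing in for the natural language processor are hypothetical and are not taken from the patent.

```python
import re
from typing import Dict, List

# Hypothetical concept to keywords mapping; concept names and words are illustrative.
CONCEPT_KEYWORDS: Dict[str, List[str]] = {
    "GREETING": ["hello", "hi"],
    "WEATHER": ["weather", "forecast"],
}


def parse_keywords(sentence: str) -> List[str]:
    """Stand-in for the natural language processor: keep, in order,
    only the words that appear in some concept to keywords mapping."""
    known = {w for words in CONCEPT_KEYWORDS.values() for w in words}
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return [t for t in tokens if t in known]


def match_concept(keyword: str) -> str:
    """Return the first concept whose keyword list contains the keyword."""
    for concept, words in CONCEPT_KEYWORDS.items():
        if keyword in words:
            return concept
    raise KeyError(keyword)


def generate_grammar_rule(sentence: str) -> List[List[str]]:
    """Produce a rule as an ordered list of slots, each slot holding
    every keyword of the concept matched at that position."""
    keywords = parse_keywords(sentence)
    concepts = [match_concept(k) for k in keywords]
    return [list(CONCEPT_KEYWORDS[c]) for c in concepts]


rule = generate_grammar_rule("Hi, what is the forecast?")
print(rule)
```

In this sketch a rule is simply a list of alternative-word lists, so a sample sentence containing "hi" also yields a rule that accepts "hello" in the same position.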

10. A speech recognition method for resolving unrecognized sound groups in a partially recognized speech using concept to keywords mappings and sentence types, each sentence type having a plurality of associated grammar rules, the grammar rules being expressed in concepts, said method comprising the steps of:
- parsing the partially recognized speech using a natural language processor into a pre-determined sentence type and an associated ordered list of recognized words and the unrecognized sound groups;
  
  selecting a list of grammar rules associated with the sentence type from a plurality of grammar rules, each grammar rule of said list having a plurality of constituent concepts, each of said constituent concepts corresponding to one of the recognized words and the unrecognized sound groups;
  
  for each said unrecognized sound group, merging said corresponding constituent concepts in all said selected grammar rules into a list of concepts;
  
  expanding said list of merged concepts using the concept to keywords mappings to produce a list of candidate words; and
  
  resolving each said unrecognized sound group using the list of candidate words.
  - 11. The speech recognition method of claim 10, further comprising the step of preparing a plurality of concept to keywords mappings prior to the step of expansion.
  - 12. The speech recognition method of claim 10, wherein the step of selecting said list of grammar rules includes the steps of comparing the partially recognized speech with each of the plurality of grammar rules and discarding any grammar rules that do not match the partially recognized speech.
  - 13. The speech recognition method of claim 12, wherein the step of comparing includes comparing sentence types and the step of discarding includes discarding grammar rules that do not have the same sentence type as the partially recognized speech.
  - 14. The speech recognition method of claim 12, wherein the step of comparing includes comparing the partially recognized speech with said constituent concepts of each of the plurality of grammar rules and the step of discarding includes discarding grammar rules that do not match any recognized words in the partially recognized speech.
  - 15. The speech recognition method of claim 12, further comprising the step of determining a conversation context of the partially recognized speech, wherein each of said selected grammar rules further has a context designation and the step of comparing includes comparing the context designation with the conversation context and the step of discarding includes discarding grammar rules that do not have the conversation context matching the context designation.
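
The selection, merging, and expansion steps of claims 10 to 14 might look roughly like the following Python sketch. It rests on several assumptions not stated in the claims: a partially recognized sentence is represented as a list of tokens with `None` marking an unrecognized sound group, each rule has one concept per token position, and the rule set, concept names, and keyword lists are hypothetical. The final resolving step is left to the speech recognition engine and is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical concept to keywords mappings.
CONCEPT_KEYWORDS: Dict[str, List[str]] = {
    "ACTION": ["play", "stop"],
    "MEDIA": ["music", "radio"],
    "PLACE": ["kitchen", "garage"],
}


@dataclass
class GrammarRule:
    sentence_type: str
    concepts: List[str]  # one constituent concept per token position


# Hypothetical grammar rules expressed in concepts.
RULES = [
    GrammarRule("command", ["ACTION", "MEDIA"]),
    GrammarRule("command", ["ACTION", "PLACE"]),
    GrammarRule("question", ["MEDIA", "PLACE"]),
]


def candidate_words(sentence_type: str,
                    tokens: List[Optional[str]]) -> Dict[int, List[str]]:
    """Map each unrecognized position to its list of candidate words."""
    # Select: discard rules with another sentence type, another length,
    # or a concept that does not cover a recognized word.
    selected = [
        r for r in RULES
        if r.sentence_type == sentence_type
        and len(r.concepts) == len(tokens)
        and all(t is None or t in CONCEPT_KEYWORDS[c]
                for t, c in zip(tokens, r.concepts))
    ]
    result: Dict[int, List[str]] = {}
    for i, tok in enumerate(tokens):
        if tok is not None:
            continue
        # Merge: collect the concept at this position from every selected rule.
        merged: List[str] = []
        for r in selected:
            if r.concepts[i] not in merged:
                merged.append(r.concepts[i])
        # Expand: replace each merged concept with its keywords.
        result[i] = [w for c in merged for w in CONCEPT_KEYWORDS[c]]
    return result


print(candidate_words("command", ["play", None]))
```

Restricting the engine to the candidate list for each gap is what narrows the search; how the engine scores the candidates against the audio is outside this sketch.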

16. A method for generating a dynamic grammar file for use by a speech recognition engine to resolve unrecognized sound groups in a speech using concept to keywords mappings and sentence types, each sentence type having a plurality of associated grammar rules, the grammar rules being expressed in concepts, said method comprising the steps of:
- parsing the partially recognized speech using a natural language processor into a pre-determined sentence type and an associated ordered list of recognized words and the unrecognized sound groups;
  
  selecting a list of grammar rules associated with the sentence type from a plurality of grammar rules, each grammar rule of said list having a plurality of constituent concepts, each of said constituent concepts corresponding to one of the recognized words and the unrecognized sound groups;
  
  for each said unrecognized sound group, merging said corresponding constituent concepts in all said selected grammar rules into a list of concepts; and
  
  generating the dynamic grammar rule from said ordered list by replacing each concept of said list of merged concepts with a list of keywords using the concept to keywords mappings.
  - 17. The method according to claim 16, further comprising the step of assigning a context designation to said dynamic grammar rule.
  - 18. The method according to claim 16, wherein said concept to keywords mapping has a context attribute and the context designation assigned to said dynamic grammar rule corresponds to said context attribute.
  - 19. The method according to claim 16, further comprising the step of preparing a plurality of concept to keywords mappings.
  - 20. The method according to claim 16, wherein the step of selecting said list of grammar rules includes the steps of comparing the partially recognized speech with each of the plurality of grammar rules and discarding any grammar rules that do not match the partially recognized speech.
  - 21. The method according to claim 16, wherein the step of comparing includes comparing sentence types and the step of discarding includes discarding grammar rules that do not have the same sentence type as the partially recognized speech.
  - 22. The method according to claim 16, wherein the step of comparing includes comparing the partially recognized speech with said constituent concepts of each of the plurality of grammar rules and the step of discarding includes discarding grammar rules that do not match any recognized words in the partially recognized speech.
  - 23. The method according to claim 16, further comprising the step of determining a conversation context of the partially recognized speech, wherein each of said selected grammar rules further has a context designation and the step of comparing includes comparing the context designation with the conversation context and the step of discarding includes discarding grammar rules that do not have the conversation context matching the context designation.
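
As a rough illustration of the last step of claim 16, the Python sketch below turns merged concepts into a single dynamic rule string. The alternation syntax `(a | b)` is illustrative only and is not a specific engine's grammar format; keeping recognized words as literals, the `None` marker for an unrecognized sound group, and the sample mappings are likewise assumptions.

```python
from typing import Dict, List, Optional

# Hypothetical concept to keywords mappings.
CONCEPT_KEYWORDS: Dict[str, List[str]] = {
    "MEDIA": ["music", "radio"],
    "PLACE": ["kitchen", "garage"],
}


def dynamic_rule(tokens: List[Optional[str]],
                 merged: Dict[int, List[str]]) -> str:
    """Build a rule string: recognized words stay literal, and each
    unrecognized position becomes an alternation of the keywords of
    its merged concepts."""
    parts: List[str] = []
    for i, tok in enumerate(tokens):
        if tok is not None:
            parts.append(tok)
        else:
            words = [w for c in merged[i] for w in CONCEPT_KEYWORDS[c]]
            parts.append("(" + " | ".join(words) + ")")
    return " ".join(parts)


# Position 1 was not recognized; the selected rules allow MEDIA or PLACE there.
print(dynamic_rule(["play", None], {1: ["MEDIA", "PLACE"]}))
```

A rule generated this way covers only the current utterance, which is why it can be much smaller than a static grammar file.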

24. A method of speech recognition, said method comprising the steps of:
- preparing a plurality of grammar rules, each of said plurality of grammar rules having a context designation assigned thereto;
  
  determining a conversation context of a speech being recognized by a speech recognition engine and recording said conversation context in a context history;
  
  if said conversation context corresponds to one of said context designations, assigning a ranking order to said context designation in a search sequence as a function of said context history; and
  
  directing said speech recognition engine to search said plurality of grammar rules following said search sequence.
  - 25. The method according to claim 24, wherein said ranking order correlates to how recently said conversation context appears in said context history.
  - 26. The method according to claim 24, wherein said ranking order correlates to how frequently said conversation context appears in said context history.
  - 27. The method according to claim 24, wherein said ranking order correlates to total length of time said conversation context represents in said context history.
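
The ranking in claims 24 to 26 can be sketched in Python as a sort over context designations. This is a simplified illustration under stated assumptions: the context history is a plain list of context names in chronological order, the function and parameter names are hypothetical, and the time-based ranking of claim 27 is omitted because it would need timestamps that this representation does not carry.

```python
from collections import Counter
from typing import Callable, List


def search_sequence(context_history: List[str],
                    designations: List[str],
                    by: str = "recency") -> List[str]:
    """Order context designations so that the grammar rules of the
    highest-ranked context are searched first."""
    score: Callable[[str], int]
    if by == "recency":
        # Index of the latest appearance; contexts never seen rank last.
        last_seen = {c: i for i, c in enumerate(context_history)}
        score = lambda d: last_seen.get(d, -1)
    elif by == "frequency":
        counts = Counter(context_history)
        score = lambda d: counts[d]
    else:
        raise ValueError(f"unknown ranking: {by}")
    # sorted() is stable, so ties keep their original relative order.
    return sorted(designations, key=score, reverse=True)


history = ["billing", "weather", "billing", "sports"]
contexts = ["weather", "billing", "sports", "news"]
print(search_sequence(history, contexts, by="recency"))
print(search_sequence(history, contexts, by="frequency"))
```

With this history, ranking by recency puts "sports" first, while ranking by frequency puts "billing" first, which shows how the two criteria can direct the engine to different grammar rules.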

28. A method of compiling a corpus for use by a language model generator, said method comprising the steps of:
- storing text of user input from a user and response to said user input generated by a knowledge base system in a log file;
  
  extracting a thread of conversation between said user and said knowledge base system, said thread of conversation containing literal texts of said user input and said system response; and
  
  adding said thread of conversation to said corpus.
  - 29. The method according to claim 28, further comprising the step of:
    - recognizing said user input as a speech using a speech recognition engine, wherein the step of storing includes storing text of the recognized speech of said user.
  - 30. The method according to claim 28, wherein said system response is extracted from a database of pre-programmed responses.
  - 31. The method according to claim 28, further comprising the step of:
    - preparing a plurality of pre-programmed responses; and
      
      adding all said pre-programmed responses to said corpus.
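
A minimal Python sketch of the corpus compilation in claims 28 and 31 follows. The log is modeled as an in-memory list of dictionaries rather than a file, and the `session`, `user`, and `system` field names are hypothetical; the claims do not specify a log format.

```python
from typing import Dict, List

# Hypothetical log entries: one user input and one system response per entry.
LOG: List[Dict[str, str]] = [
    {"session": "a1", "user": "what is the weather", "system": "It is sunny."},
    {"session": "b2", "user": "play music", "system": "Playing music."},
    {"session": "a1", "user": "and tomorrow", "system": "Rain is expected."},
]


def extract_thread(log: List[Dict[str, str]], session_id: str) -> List[str]:
    """Collect the literal user and system texts of one conversation, in order."""
    thread: List[str] = []
    for entry in log:
        if entry["session"] == session_id:
            thread.append(entry["user"])
            thread.append(entry["system"])
    return thread


def compile_corpus(log: List[Dict[str, str]],
                   session_ids: List[str],
                   preprogrammed: List[str]) -> List[str]:
    """Build a corpus from the selected threads plus all pre-programmed responses."""
    corpus: List[str] = []
    for sid in session_ids:
        corpus.extend(extract_thread(log, sid))
    corpus.extend(preprogrammed)
    return corpus


print(compile_corpus(LOG, ["a1"], ["Goodbye."]))
```

The resulting list of sentences is the kind of input a language model generator could consume; how the model is trained from it is outside the scope of the claims shown here.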

32. A method for improving recognition accuracy of a speech recognition system, the speech recognition system having a speech recognition engine for converting an audible signal representative of spoken words into a textual output, the method comprising the steps of:
- selecting a first plurality of concepts;
  
  preparing a second plurality of concept to keywords mappings, each concept of said first plurality of concepts having at least one concept to keywords mapping;
  
  defining a third plurality of sentence types, each sentence type being associated with an ordered list of concepts, said ordered list of concepts being formed from said first plurality of concepts;
  
  providing said first plurality of concepts, said second plurality of concept to keywords mappings and said third plurality of sentence types, together with said associated ordered lists of concepts, to the speech recognition system for resolving unrecognized sound groupings in the audible signal.
  - 33. The method of claim 32, further comprising the steps of:
    - entering a sample utterance;
      
      parsing said sample utterance into a sentence type and an associated ordered list of concepts using a natural language processor;
      
      generating a grammar rule from said sentence type and said associated ordered list of concepts using a grammar rule generator; and
      
      providing said grammar rule to the speech recognition engine to resolve unrecognized sound groupings in the audible signal.
  - 34. The method of claim 32, further comprising the steps of:
    - entering a plurality of sample utterances;
      
      parsing each of said sample utterances into a sentence type and an associated second ordered list of concepts using a natural language processor;
      
      generating a grammar rule from said sentence type and said associated second ordered list of concepts using a grammar rule generator; and
      
      providing said plurality of grammar rules to the speech recognition engine to resolve unrecognized sound groupings in the audible signal.
  - 35. The method of claim 34, further comprising the steps of:
    - providing a text input corresponding to a partially recognized audible signal to a natural language processor;
      
      parsing said text input into a second sentence type and an associated ordered list of recognized words and unrecognized sound groupings using the natural language processor;
      
      selecting a list of grammar rules associated with the second sentence type from said plurality of grammar rules, each grammar rule of said list having a plurality of constituent concepts;
      
      expanding each of said constituent concepts at a location corresponding to one of said unrecognized sound groupings into a plurality of related words; and
      
      providing said plurality of related words to the speech recognition engine to resolve the one of said unrecognized sound groupings.
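
The three collections that claim 32 supplies to the speech recognition system (concepts, concept to keywords mappings, and sentence types with ordered concept lists) can be pictured as one data structure. The Python sketch below is an assumed representation, not the patent's: the `RecognitionConfig` class, its field names, and the consistency checks are hypothetical and only restate the constraints the claim places on the data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecognitionConfig:
    concepts: List[str]                    # the selected concepts
    concept_keywords: Dict[str, List[str]]  # concept to keywords mappings
    sentence_types: Dict[str, List[str]]    # sentence type to ordered concepts

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the data is consistent."""
        problems: List[str] = []
        # Each concept needs at least one concept to keywords mapping.
        for c in self.concepts:
            if not self.concept_keywords.get(c):
                problems.append(f"concept {c!r} has no keywords mapping")
        # Each sentence type may only use the selected concepts.
        for name, ordered in self.sentence_types.items():
            for c in ordered:
                if c not in self.concepts:
                    problems.append(
                        f"sentence type {name!r} uses unknown concept {c!r}")
        return problems


config = RecognitionConfig(
    concepts=["ACTION", "MEDIA"],
    concept_keywords={"ACTION": ["play", "stop"]},
    sentence_types={"command": ["ACTION", "MEDIA", "PLACE"]},
)
print(config.validate())
```

Running the check before handing the data to the recognition system catches the two gaps in this example: "MEDIA" has no keywords, and "PLACE" was never selected as a concept.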

Current Assignee: iNAGO Corporation
Original Assignee: iNAGO Corporation
Inventors: Leonard, Huw; Dicarlantonio, Ron; Farmaner, Gary
Granted Patent: US 7,925,506 B2
US Class Current: 704/257
CPC Class Codes:
G10L 15/183 using context dependencies,...
G10L 15/193 Formal grammars, e.g. finit...
