Method for context driven speech recognition and processing

US 9,460,721 B2
Filed: 01/16/2015
Issued: 10/04/2016
Est. Priority Date: 01/16/2015
Status: Expired due to Fees

Abstract

The invention is a system and method to recognize speech vocalizations using context-specific grammars and vocabularies. The system and method allow increased accuracy of recognized utterances by eliminating all language encodings irrelevant to the current context and allowing identification of appropriate context transitions. The system and method create a context-dependent speech recognition system with multiple supported contexts, each with a specific grammar and vocabulary, and each identifying the potential context transitions allowed. The system and method also include programmatic integration between the context-dependent speech recognition system and other systems to make use of the recognized speech.
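
The abstract's idea of contexts, each with its own vocabulary and its own allowed transitions, can be pictured as a small lookup table. The sketch below is illustrative only: the `Context` class, the `step` function, the context names, and the phrases are all hypothetical and do not come from the patent, and plain text stands in for recognized audio.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """One supported context: its phrases and where each phrase may lead."""
    name: str
    # Maps an accepted phrase to the name of the context it transitions to.
    transitions: dict[str, str] = field(default_factory=dict)

    def vocabulary(self) -> set[str]:
        # Only phrases relevant to this context are ever considered.
        return set(self.transitions)


# Hypothetical contexts for a simple checklist application (assumption).
CONTEXTS = {
    "idle": Context("idle", {"start checklist": "checklist"}),
    "checklist": Context("checklist", {"next item": "checklist", "stop": "idle"}),
}


def step(current: str, phrase: str) -> str:
    """Return the next context name, or stay put if the phrase is out of context."""
    return CONTEXTS[current].transitions.get(phrase, current)
```

In this sketch, a phrase such as "stop" is simply ignored while the system is in the "idle" context, which mirrors the abstract's point that language irrelevant to the current context is eliminated from consideration.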

8 Citations

9 Claims

1. A method for electronically recognizing and processing speech comprising:
- creating a first set of grammar rules;
  
  loading the first set of grammar rules into a speech recognizer;
  
  receiving a first transmitted audio stream containing a first utterance of speech to be recognized;
  
  running a language script in the speech recognizer;
  
  comparing language in the first transmitted audio stream to language in the first set of grammar rules to determine whether the language in the first transmitted audio stream matches language in the first set of grammar rules;
  
  producing a textual representation of the language of the first transmitted audio stream, using language of one of the grammar rules of the first set of grammar rules, to create consumable data when a match between the language of the first transmitted audio stream and the language of one of the first set of grammar rules is found;
  
  transmitting the consumable data to a processor;
  
  determining which grammar rule has language that most likely matches the language of the first transmitted audio stream, when multiple possible matches are found;
  
  producing a textual representation of the language of the first transmitted audio stream using language of a best matched grammar rule of the first set of grammar rules to create consumable data;
  
  transmitting the consumable data to the processor;
  
  creating a subsequent set of grammar rules when no match is found between the language of the first transmitted audio stream and the language of the first set of grammar rules;
  
  repeating the loading, comparing and determining steps with language of the subsequent set of grammar rules and language of the first transmitted audio stream until a match is found;
  
  producing a textual representation of the language of the first transmitted audio stream using language of a best matched grammar rule of the subsequent set of grammar rules, to create consumable data;
  
  transmitting the consumable data to the processor; and
  
  creating a separate set of grammar rules to recognize and process a second transmitted audio stream containing a second utterance separate and distinct from the first utterance in the first audio stream where the separate set of grammar rules is based on the consumable data transmitted from the first transmitted audio stream and repeating the recognizing and processing speech steps, whereby, results of a current recognition event impacts future recognition events by producing consumable data from the loaded set of grammar rules to determine the next appropriate set of rules.

2. The method of claim 1, wherein choosing the first grammar rule is based on prior knowledge of language most likely to be used in the first transmitted audio stream.

3. The method of claim 1, wherein the grammar rules define words that are expected to be contained in vocal utterances.

4. The method of claim 1, wherein the transmitted audio streams are spoken language.

5. The method of claim 1, wherein creating a subsequent set of grammar rules is based on audio stream language most likely to be used next based on a context of previously transmitted audio streams.

6. The method of claim 1, wherein the grammar rules are written using a grammar specification language such as grXML or the like.

7. The method of claim 1, wherein the first and subsequent sets of grammar rules are subsets of an entire grammar.

8. The method of claim 1, wherein the size and number of the first and subsequent sets of grammar rules are limited by the context of language in the transmitted audio streams.
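
The flow recited in claim 1 (try a first grammar set, pick the best rule when several match, fall back to a subsequent set when none match, then choose the next grammar set from the result) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented implementation: text strings stand in for audio, similarity is scored with Python's `difflib.SequenceMatcher`, the fallback sets come from a fixed list rather than being created on demand, and the names `best_match`, `recognize`, `next_grammar`, and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(utterance, rules, threshold=0.8):
    """Return the rule most similar to the utterance, or None if none is close enough."""
    scored = [(SequenceMatcher(None, utterance, rule).ratio(), rule) for rule in rules]
    scored = [pair for pair in scored if pair[0] >= threshold]
    # When several rules pass the threshold, keep the highest-scoring one.
    return max(scored)[1] if scored else None


def recognize(utterance, grammar_sets):
    """Try each grammar set in order until one yields a match."""
    for rules in grammar_sets:
        rule = best_match(utterance, rules)
        if rule is not None:
            return rule  # text of the matched rule, i.e. the "consumable data"
    return None  # no candidate set matched


def next_grammar(consumable, followups, default):
    """Choose the grammar set for the next utterance from the previous result."""
    return followups.get(consumable, default)
```

A possible use: `recognize("open valv", [["open valve", "close valve"]])` resolves to the rule text `"open valve"`, and `next_grammar` can then look up a follow-on rule set (for example, confirmation phrases) keyed by that result.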

9. A system for context driven speech recognition comprising:
- a processor;
  
  a user interface electronically connected to the processor;
  
  a voice input device electronically connected to the user interface;
  
  a speech recognizer electronically connected to the processor;
  
  a memory electronically connected to the processor;
  
  an application framework stored in the memory;
  
  a logic function stored in the memory; and
  
  a configuration interface stored in the memory, wherein
  
  a first set of grammar rules, stored in the memory, is loaded into the speech recognizer;
  
  the user interface receives a first transmitted audio stream containing a first utterance of speech to be recognized;
  
  the speech recognizer runs a language script to compare language in the first transmitted audio stream to language in the first set of grammar rules to determine whether the language in the first transmitted audio stream matches language in the first set of grammar rules;
  
  the logic function produces a textual representation of the language of the first transmitted audio stream, using language of one of the grammar rules of the first set of grammar rules, to create consumable data when a match between the language of the first transmitted audio stream and the language of one of the first set of grammar rules is found, and transmits the consumable data to the processor;
  
  the processor determines which grammar rule of the first set of grammar rules has language that best matches the language of the first transmitted audio stream, when multiple matches are found;
  
  the application framework produces a textual representation of the language of the first transmitted audio stream using language of the best matched grammar rule of the first set of grammar rules to create consumable data and transmits the consumable data to the processor;
  
  the logic function creates a subsequent set of grammar rules when no match is found between the language of the first transmitted audio stream and the language of the first set of grammar rules to recognize and process a second transmitted audio stream containing a second utterance separate and distinct from the first utterance in the first audio stream where the separate set of grammar rules is based on the consumable data transmitted from the first transmitted audio stream and repeating the recognizing and processing speech steps, whereby, results of a current recognition event impacts future recognition events by producing consumable data from the loaded set of grammar rules to determine the next appropriate set of rules.
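
One way to read claim 9 is as a division of labor among components: grammar sets live in memory, are loaded into the recognizer, and matches are handed on as consumable data. The sketch below is a loose illustration of that wiring only. The class and method names are hypothetical, a case-insensitive string comparison stands in for acoustic matching, and the separate roles of the logic function and application framework are collapsed into a single `handle` method for brevity.

```python
class SpeechRecognizer:
    """Holds whichever grammar set is currently loaded."""

    def __init__(self):
        self.rules = []

    def load(self, rules):
        # Replace the active grammar set with the one supplied.
        self.rules = list(rules)

    def matches(self, utterance):
        # Stand-in for acoustic matching: case-insensitive exact comparison.
        return [rule for rule in self.rules if rule.lower() == utterance.lower()]


class System:
    """Wires memory, recognizer, and a record of data delivered to the processor."""

    def __init__(self, memory):
        self.memory = memory      # grammar sets keyed by name
        self.recognizer = SpeechRecognizer()
        self.received = []        # consumable data handed to the processor

    def handle(self, grammar_name, utterance):
        self.recognizer.load(self.memory[grammar_name])
        found = self.recognizer.matches(utterance)
        if found:
            self.received.append(found[0])
            return found[0]
        return None  # caller would load a subsequent grammar set here
```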

Current Assignee: US Department of The Navy (U.S. Department Of Defense)
Original Assignee: The United States of America as represented by the Secretary of the Navy
Inventors: Ouakil, Lisa; Ouakil, Abdelhamid; Mouri, Ouns; Smith, Peter
Primary Examiner(s): Riley, Marcus T
Application Number: US14/598,958
Publication Number: US 20160210968A1
Time in Patent Office: 627 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
G06F 40/253   Grammatical analysis; Style...
G06F 40/279   Recognition of textual enti...
G06F 40/30   Semantic analysis
G10L 15/183   using context dependencies,...
G10L 2015/228   of application context
