System and method for recognizing speech

US 9,159,317 B2
Filed: 06/14/2013
Issued: 10/13/2015
Est. Priority Date: 06/14/2013
Status: Active Grant

Abstract

A system and a method recognize speech including a sequence of words. A set of interpretations of the speech is generated using an acoustic model and a language model, and, for each interpretation, a score representing correctness of an interpretation in representing the sequence of words is determined to produce a set of scores. Next, the set of scores is updated based on a consistency of each interpretation with a constraint determined in response to receiving a word sequence constraint.
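
The rescoring flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Interpretation` class, the `constraint_factor` and `rescore` helpers, the additive log-domain scoring, and the weight value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interpretation:
    words: List[str]
    score: float  # assumed: log-domain score from the acoustic and language models


def constraint_factor(words, required_word, weight=5.0):
    """Hypothetical constraint factor: +weight if the word is present, -weight if not."""
    return weight if required_word in words else -weight


def rescore(nbest, required_word):
    """Return new interpretations whose scores include the constraint factor."""
    return [
        Interpretation(h.words, h.score + constraint_factor(h.words, required_word))
        for h in nbest
    ]


# Made-up N-best list; the recognizer initially prefers "austin".
nbest = [
    Interpretation("i want to go to austin".split(), -10.0),
    Interpretation("i want to go to boston".split(), -11.5),
    Interpretation("i went to boston".split(), -14.0),
]

# Word sequence constraint (assumed to come from the user): "boston" must appear.
updated = rescore(nbest, required_word="boston")
best = max(updated, key=lambda h: h.score)
print(" ".join(best.words))
```

With these made-up scores, the constraint moves the second interpretation ahead of the one the recognizer originally ranked first.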

18 Claims

1. A method for recognizing speech including a sequence of words, comprising:
- generating a set of interpretations of the speech using an acoustic model and a language model;
  
  determining, for each interpretation, a score representing correctness of an interpretation in representing the sequence of words to produce a set of scores;
  
  determining a constraint for recognizing the speech subject to a word sequence constraint;
  
  determining a constraint factor indicating a degree of the consistency with the word sequence constraint;
  
  determining a constrained scoring function based on the constraint factor for updating the set of scores; and
  
  updating the set of scores using the scoring function; and
  
  selecting, according to the updated set of scores, a best interpretation from the set of interpretations as the recognized speech, wherein steps of the method are performed by a processor.
- Dependent claims 2 to 14:
  - 2. The method of claim 1, wherein the word sequence constraint includes one or combinations of a number of words in the sequence of words, a presence or absence of a specific word or a sequence of words, a time of utterance of the specific word, an order of at least two specific words in the sequence of words, a connection or separation of the two specific words in the sequence of words, a topic of the speech input.
  - 3. The method of claim 1, wherein the determining the constraint comprises:
    - communicating a subset of the set of interpretations to a user;
      
      receiving the word sequence constraint in response to the communicating;
      
      determining a type of the constraint based on the word sequence constraint; and
      
      determining the constraint based on the type.
  - 4. The method of claim 3, wherein the type is a language type and the determining the constraint comprises:
    - updating the language model based on the word sequence constraint.
  - 5. The method of claim 4, wherein the word sequence constraint is a topic of the speech.
  - 6. The method of claim 3, wherein the type is an acoustic type and the determining the constraint comprises:
    - updating the acoustic model based on the word sequence constraint.
  - 7. The method of claim 6, wherein the acoustic model includes an alignment between words in the speech and acoustic features of the acoustic model.
  - 8. The method of claim 7, the constraint includes that there is only one word within a particular time region.
  - 9. The method of claim 3, wherein the type is a context type and the determining the constraint comprises:
    - determining a scoring function testing presence or absence of a specific word in each interpretation.
  - 10. The method of claim 9, wherein the scoring function tests for presence of the specific word, further comprising:
    - determining a direction of the speech based on the language model; and
      
      updating the scoring function with a test for presence of words preceding and following the specific word according to the direction of the speech.
  - 11. The method of claim 1, wherein the scoring function S′(W|X) is
  - 12. The method of claim 11, further comprising:
    - determining an indicator function using the word sequence constraint; and
      
      determining the constraint factor as a linear function of the indicator function with weight parameters that determine a degree of constraint satisfaction.
  - 13. The method of claim 1, wherein the constraint includes metadata of the sequence of words.
  - 14. The method of claim 1, further comprising:
    - determining the interpretation with the largest score as the recognized speech.
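
Claims 2 and 12 together suggest building indicator functions from word sequence constraints and combining them linearly into a constraint factor. A hedged sketch of that idea is below; the helper names (`has_word`, `word_count_is`, `in_order`, `linear_constraint_factor`), the weights, and the bias term are hypothetical choices for illustration, and the claims do not specify them.

```python
def has_word(word):
    """Indicator: 1.0 if the specific word is present, else 0.0."""
    return lambda words: 1.0 if word in words else 0.0


def word_count_is(n):
    """Indicator: 1.0 if the interpretation has exactly n words, else 0.0."""
    return lambda words: 1.0 if len(words) == n else 0.0


def in_order(first, second):
    """Indicator: 1.0 if both words occur and `first` precedes `second`."""
    def indicator(words):
        if first in words and second in words:
            # index() returns the first occurrence of each word
            return 1.0 if words.index(first) < words.index(second) else 0.0
        return 0.0
    return indicator


def linear_constraint_factor(words, indicators, weights, bias=0.0):
    """Assumed form: bias plus a weighted sum of indicator values."""
    return bias + sum(w * ind(words) for ind, w in zip(indicators, weights))


indicators = [has_word("boston"), word_count_is(6), in_order("go", "boston")]
weights = [2.0, 1.0, 0.5]  # illustrative weights: larger means a stricter constraint

f_full = linear_constraint_factor("i want to go to boston".split(), indicators, weights)
f_short = linear_constraint_factor("i went to boston".split(), indicators, weights)
f_wrong = linear_constraint_factor("i want to go to austin".split(), indicators, weights)
print(f_full, f_short, f_wrong)
```

In this sketch the weights play the role of the parameters that "determine a degree of constraint satisfaction": a small weight nudges the ranking, while a large one effectively filters out inconsistent interpretations.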

15. A method for recognizing speech of a user, comprising:
- recognizing the speech to generate a set of interpretations associated with a corresponding set of scores representing correctness of each interpretation in representing the speech; and
  
  updating iteratively the set of scores subject to at least one constraint, such that, for each iteration, a score of each interpretation is increased if the interpretation is consistent with the constraint, and is decreased if the interpretation is inconsistent with the constraint; and
  
  selecting, according to the updated set of scores, an interpretation from the set of interpretations as the recognized speech, wherein steps of the method are performed by a processor, wherein the updating comprises;
  
  communicating a subset of the set of interpretations to a user;
  
  receiving a word sequence constraint in response to the communicating;
  
  determining a type of the constraint based on the word sequence constraint, wherein the type is a context type;
  
  determining a scoring function testing presence or absence of a specific word in each interpretation;
  
  determining a direction of the speech based on a language model;
  
  updating the scoring function with a test for presence of words preceding and following the specific word according to the direction of the speech; and
  
  determining the constraint based on the type.

16. A system for recognizing speech, comprising:
- a processor implementing a speech recognition module and an error correction module, wherein the speech recognition module generates a set of interpretations of the speech input using an acoustic model and a language model, determines, for each interpretation, a score representing correctness of an interpretation in representing the speech and selects, according to the score of each interpretation, a best interpretation from the set of interpretation as the recognized speech; and
  
  wherein the error correction module determines a constraint for recognizing the speech, and updates the score of each interpretation based on a consistency of the interpretation with the constraint, wherein the constraint is determined by;
  
  communicating a subset of the set of interpretations to a user;
  
  receiving a word sequence constraint in response to the communicating;
  
  determining a type of the constraint based on the word sequence constraint, wherein the type is a context type;
  
  determining a scoring function testing presence or absence of a specific word in each interpretation;
  
  determining a direction of the speech based on a language model;
  
  updating the scoring function with a test for presence of words preceding and following the specific word according to the direction of the speech; and
  
  determining the constraint based on the type.
- Dependent claims 17 and 18:
  - 17. The system of claim 16, further comprising:
    - an audio interface for receiving the speech representing a sequence of words;
      
      a controller for communicating at least a subset of the set of interpretations to the user and for receiving a word sequence constraint from the user, wherein the processor determines the constraint based on the word sequence constraint.
  - 18. The system of claim 16, wherein the system for recognizing the speech is embedded in an instrumental panel of a vehicle.

Current Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Original Assignee: Mitsubishi Electric Research Laboratories, Inc. (Mitsubishi Electric Corporation)
Inventors: Harsham, Bret; Hershey, John R.
Primary Examiner(s): VO, HUYEN X
Application Number: US13/917,884
Publication Number: US 20140372120A1
Time in Patent Office: 851 Days
Field of Search: 704/251, 704/255, 704/257, 704/236, 704/243, 704/275, 704/246, 704/1-10, 704/252, 704/239, 704/240, 704/235
US Class Current: 1/1
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/183   using context dependencies,...
G10L 15/22   Procedures used during a sp...
G10L 2015/088   Word spotting
G10L 2015/228   of application context
