Automated learning for speech-based applications

US 9,418,652 B2
Filed: 01/30/2015
Issued: 08/16/2016
Est. Priority Date: 09/11/2008
Status: Active Grant

Abstract

Systems and methods for modifying a computer-based speech recognition system. A speech utterance is processed with the computer-based speech recognition system using a set of internal representations, which may comprise parameters for recognizing speech in a speech utterance, such as parameters of an acoustic model and/or a language model. The computer-based speech recognition system may perform a first task in response to the processed speech utterance. The utterance may also be provided to a human who performs a second task based on the utterance. Data indicative of the first task, performed by the computer system, is compared to data indicative of a second task, performed by the human in response to the speech utterance. Based on the comparison, the set of internal representations may be updated or modified to improve the speech recognition performance and capabilities of the speech recognition system.
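The compare, modify, re-check, and replace cycle described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the "internal representations" are reduced to a hypothetical table of per-token task weights, the speech input is assumed to arrive already tokenized, and the "acceptable margin of error" is simplified to exact equality. The names `identify_task` and `adapt` are invented for this example.

```python
from collections import defaultdict
from copy import deepcopy


def identify_task(weights, tokens):
    """Toy recognizer: return the task with the highest summed token weight."""
    scores = defaultdict(float)
    for tok in tokens:
        for task, w in weights.get(tok, {}).items():
            scores[task] += w
    return max(scores, key=scores.get) if scores else None


def adapt(weights, tokens, human_task, step=1.0):
    """Run one compare / modify / re-check / replace cycle.

    Returns the set of representations to use from now on: the modified
    copy if it reproduces the human-identified task, else the original.
    """
    first_task = identify_task(weights, tokens)
    if first_task == human_task:
        return weights  # machine and human agree, nothing to correct

    # Modify a copy so the original set stays available as a fallback.
    modified = deepcopy(weights)
    for tok in tokens:
        per_task = modified.setdefault(tok, {})
        per_task[human_task] = per_task.get(human_task, 0.0) + step

    # Re-process the same input with the modified set (the "third task").
    third_task = identify_task(modified, tokens)

    # Assumption: "within an acceptable margin of error" means equality here.
    return modified if third_task == human_task else weights
```

Starting from `{"flight": {"account_info": 1.0}}`, calling `adapt` with the tokens `["flight", "status"]` and the human-identified task `"flight_info"` returns a modified table, which is then used to process later inputs; the original table is left unchanged.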

19 Claims

1. A method comprising:
- receiving, by a computer-based speech recognition system, a speech input and a first task associated with the speech input, the first task being determined by processing the speech input using an original set of internal representations for the computer-based speech recognition system, the original set of internal representations comprising one or more parameters for recognizing speech in the speech input, the computer-based speech recognition system comprising at least one processor and at least one memory device;
  
  comparing the first task with a second task associated with the speech input, the second task being identified by a human in response to hearing the speech input;
  
  based at least in part on the comparison, modifying the original set of internal representations to create a modified set of internal representations;
  
  processing, by the at least one processor of the computer-based speech recognition system and based at least partly on the modified set of internal representations, the speech input to identify a third task for the speech input;
  
  comparing the third task to the second task to determine that the third task is within an acceptable margin of error to the second task;
  
  in response to determining that the third task is within an acceptable margin of error to the second task, replacing the original set of internal representations of the computer-based speech recognition system with the modified set of internal representations;
  
  receiving, by the computer-based speech recognition system, another speech input; and
  
  processing, by the at least one processor and based at least partly on the modified set of internal representations, the other speech input to identify a particular task for the other speech input.
  - 2. The method of claim 1, wherein the third task is the same as the second task.
  - 3. The method of claim 1, wherein the internal representations comprise statistical representations of sounds that make up words.
  - 4. The method of claim 1, wherein the speech input is received as part of a voice communication.
  - 5. The method of claim 1, wherein the original set of internal representations and the modified set of internal representations include a parameter of an acoustic model that is used by the computer-based speech recognition system.
  - 6. The method of claim 1, wherein:
    - the first task and the third task are machine transcriptions of the speech input; and
      
      the second task is a human transcription of the speech input.
  - 7. The method of claim 1, further comprising:
    - determining that the original set of internal representations misinterpreted the speech input when processing the speech input to determine the first task.

8. A system, comprising:
- one or more processors; and
  
  memory, communicatively coupled to the one or more processors, containing instructions that, when executed, configure the one or more processors to perform operations comprising:
  
  receiving a speech utterance and a first task for the speech utterance, the first task being determined by processing the speech utterance using an original set of statistical representations stored in the memory of the system, the original set of statistical representations comprising one or more parameters for recognizing speech in the speech utterance;
  
  comparing the first task with a second task for the speech utterance, the second task being identified by a human in response to hearing the speech utterance;
  
  based at least in part on the comparison, updating the original set of statistical representations to create an updated set of statistical representations;
  
  processing, with the updated set of statistical representations, the speech input to identify a third task for the speech utterance;
  
  comparing the third task to the second task to determine that the third task is within an acceptable margin of error to the second task;
  
  in response to determining that the third task is within an acceptable margin of error, replacing the original set of statistical representations of the system with the updated set of statistical representations;
  
  receiving another speech utterance; and
  
  processing, by the one or more processors and based at least partly on the updated set of statistical representations, the other speech utterance to identify a particular task for the other speech utterance.
  - 9. The system of claim 8, wherein:
    - the system comprises a speech recognition module; and
      
      the statistical representations include a parameter of an acoustic model of the speech recognition module.
  - 10. The system of claim 9, wherein:
    - the system comprises a language model; and
      
      the language model includes probabilities of sequences of words.
  - 11. The system of claim 9, wherein receiving the speech utterance further comprises:
    - identifying one or more acoustic features contained in the speech utterance;
      
      extracting the one or more acoustic features contained in the speech utterance;
      
      comparing the one or more acoustic features with the parameter of the acoustic model of the speech recognition module to identify one or more words contained in the speech utterance; and
      
      comparing the one or more words with the probabilities of sequences of words of the language model to determine a probability that the one or more words were spoken, the determining being based at least in part on context of the sequence of words.
  - 12. The system of claim 8, wherein the operations further comprise:
    - determining that the original set of statistical representations misinterpreted the speech utterance when processing the speech utterance to determine the first task.
  - 13. The system of claim 8, wherein the third task is the same as the second task.
  - 14. The system of claim 8, wherein the speech utterance is received as part of a telephone call or other voice communication.
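Claims 10 and 11 refer to a language model that holds probabilities of sequences of words and is used to judge how likely a candidate word sequence is in context. A minimal sketch of that idea, assuming a bigram model stored as a plain dictionary, is shown below. The function names, the `<s>` start marker, and the `floor` value for unseen word pairs are all assumptions made for illustration; the claims do not specify a model order or a smoothing scheme.

```python
import math


def sequence_log_prob(bigram_probs, words, floor=1e-6):
    """Sum log P(word | previous word) over the sequence.

    bigram_probs maps (previous_word, word) to a probability; pairs that
    are missing from the table fall back to the small `floor` value.
    """
    log_p = 0.0
    prev = "<s>"  # hypothetical start-of-sequence marker
    for w in words:
        log_p += math.log(bigram_probs.get((prev, w), floor))
        prev = w
    return log_p


def pick_best(bigram_probs, candidates):
    """Return the candidate word sequence the language model scores highest."""
    return max(candidates, key=lambda ws: sequence_log_prob(bigram_probs, ws))
```

Given acoustically similar candidates such as "check my fright" and "check my flight", a table that assigns a higher probability to the pair ("my", "flight") than to ("my", "fright") leads `pick_best` to select the second candidate.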

15. One or more non-transitory computer-readable storage media storing instructions that, when executed by one or more processors, configure the processor to perform acts comprising:
- identifying a speech input and a first task associated with the speech input, the first task being determined by processing the speech input using an original set of internal representations of a computer-based speech recognition system, the original set of internal representations comprising one or more parameters for recognizing speech in the speech input;
  
  comparing the first task with a second task associated with the speech input, the second task being identified by a human in response to hearing the speech input;
  
  based at least in part on the comparison, modifying the original set of internal representations to create a modified set of internal representations;
  
  processing, by the one or more processors and using the modified set of internal representations, the speech input to identify a third task for the speech input;
  
  comparing the third task to the second task to determine whether the third task is within an acceptable margin of error to the second task;
  
  in response to determining that the third task is within an acceptable margin of error, replacing the original set of internal representations of the computer-based speech recognition system with the modified set of internal representations;
  
  receiving another speech input; and
  
  processing, by the one or more processors and using the modified set of internal representations, the other speech input to identify a particular task for the other speech input.
  - 16. The one or more non-transitory computer-readable storage media of claim 15, wherein at least one of the first task or the second task comprises at least one of accessing account information, accessing flight information, accessing an employee directory, opening a file, or inputting GPS information.
  - 17. The one or more non-transitory computer-readable storage media of claim 15, wherein:
    - the first task and third task are a machine transcription of the speech input; and
      
      the second task is a human transcription of the speech input.
  - 18. The one or more non-transitory computer-readable storage media of claim 17, wherein:
    - comparing the first task with the second task comprises identifying differences between the machine transcription and the human transcription; and
      
      modifying the original set of internal representations to create a modified set of internal representations comprises adjusting the set of internal representations based on the identified differences.
  - 19. The one or more non-transitory computer-readable storage media of claim 15, the acts further comprising:
    - determining that the original set of statistical representations misinterpreted the speech input when processing the speech input to determine the first task.
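Claim 18 describes identifying differences between the machine transcription and the human transcription. One way to illustrate that comparison is a word-level diff using `difflib.SequenceMatcher` from the Python standard library. This is a sketch under assumptions: the patent does not name a diff algorithm, whitespace tokenization is a simplification, and `transcript_differences` is a hypothetical helper. How the identified differences would drive the adjustment of the internal representations is outside what this example shows.

```python
from difflib import SequenceMatcher


def transcript_differences(machine, human):
    """List word-level edits that turn the machine transcript into the human one.

    Each entry is (operation, machine_words, human_words), where operation
    is "replace", "delete", or "insert" as reported by SequenceMatcher.
    """
    m_words, h_words = machine.split(), human.split()
    matcher = SequenceMatcher(a=m_words, b=h_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, m_words[i1:i2], h_words[j1:j2]))
    return diffs
```

An empty result means the two transcriptions match word for word, which corresponds to the case in which no modification is needed.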

Current Assignee: Verint Americas Incorporated (Verint Systems Incorporated)
Original Assignee: Next IT Corporation (Verint Systems Incorporated)
Inventors: Wooters, Charles C
Primary Examiner(s): JACKSON, JAKIEDA R
Application Number: US14/610,891
Publication Number: US 20150213795A1
Time in Patent Office: 564 Days
Field of Search: 704/235, 704/9
US Class Current: 1/1
CPC Class Codes:
G10L 15/065   Adaptation
G10L 15/08   Speech classification or se...
G10L 15/18   using natural language mode...
G10L 15/22   Procedures used during a sp...
G10L 15/26   Speech to text systems G10L...
G10L 2015/0638   Interactive procedures
G10L 2015/223   Execution procedure of a sp...
