Speech recognition repair using contextual information

US 8,812,316 B1
Filed: 06/05/2014
Issued: 08/19/2014
Est. Priority Date: 09/28/2011
Status: Expired due to Fees

Abstract

A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.
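
As a rough illustration of the last step the abstract describes, the sketch below hands a repaired transcription to an application chosen by its command word. The `APPLICATIONS` registry, the `dial` handler, and the `CONTACTS` table are hypothetical names introduced here for the example; the abstract does not say how the application is selected or how contacts are stored.

```python
from typing import Callable, Dict, List

# Hypothetical contact data; not taken from the patent.
CONTACTS = {("mom", "home"): "555-0100"}

def dial(args: List[str]) -> str:
    """Hypothetical telephone-dialer handler; expects e.g. ['mom', 'at', 'home']."""
    if not args:
        return "no contact given"
    name, label = args[0], args[-1]
    number = CONTACTS.get((name, label))
    return f"dialing {number}" if number else f"no number for {name} ({label})"

# Hypothetical registry mapping a command word to the application that handles it.
APPLICATIONS: Dict[str, Callable[[List[str]], str]] = {"call": dial}

def dispatch(repaired_transcription: str) -> str:
    """Send the repaired transcription to the application matching its command."""
    tokens = repaired_transcription.lower().split()
    if not tokens or tokens[0] not in APPLICATIONS:
        return "no application for this command"
    return APPLICATIONS[tokens[0]](tokens[1:])

print(dispatch("call mom at home"))  # dialing 555-0100
```

Keeping the registry as a plain dictionary makes it easy to see that the command word alone picks the application; a real system would presumably need a richer command grammar.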

20 Claims

1. A method for transcribing speech, the method comprising:
- at an electronic device:
  - receiving a transcription of a spoken user request from a speech recognition system;
  - parsing the transcription into a plurality of tokens representing words in the spoken user request;
  - using a first interpreter, determining a first confidence level of a first alternative token for one of the plurality of tokens;
  - using a second interpreter, determining a second confidence level of a second alternative token for the one of the plurality of tokens; and
  - generating a repaired transcription by replacing the one of the plurality of tokens with the first alternative token or the second alternative token based on the first confidence level and the second confidence level.
- 2. The method of claim 1, wherein determining the first confidence level comprises:
  - searching a database for tokens matching the one of the plurality of tokens.
- 3. The method of claim 2, wherein determining the first confidence level further comprises:
  - generating the first confidence level based on an edit distance between the one of the plurality of tokens and the first alternative token.
- 4. The method of claim 1, wherein determining the first confidence level comprises:
  - generating the first confidence level based on a context of the spoken user request.
- 5. The method of claim 4, wherein the context comprises a history of spoken user requests.
- 6. The method of claim 1, wherein the first interpreter uses a first algorithm to determine the first confidence level and the second interpreter uses a second algorithm to determine the second confidence level, wherein the first algorithm is different than the second algorithm.
- 7. The method of claim 1, wherein the spoken user request is transcribed using an acoustic model and a language model of the speech recognition system.
- 8. The method of claim 1, wherein the spoken user request comprises a command; and wherein the method further comprises:
  - transmitting the repaired transcription to an application corresponding to the command.
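
The steps of claim 1 can be sketched roughly as follows. This is only an illustration under assumed details: the interpreter signature, the two example interpreters, their lookup tables, and the fixed confidence values are all hypothetical, since the claim does not prescribe how an interpreter arrives at its confidence level.

```python
from typing import Callable, List, Optional, Tuple

# Assumed shape for this sketch: an interpreter takes (tokens, index) and
# returns (alternative_token, confidence), or None if it has no suggestion.
Proposal = Optional[Tuple[str, float]]
Interpreter = Callable[[List[str], int], Proposal]

def contacts_interpreter(tokens: List[str], i: int) -> Proposal:
    """Hypothetical first interpreter: maps misheard words to contact names."""
    known = {"mum": ("mom", 0.9), "tom": ("mom", 0.4)}
    return known.get(tokens[i])

def places_interpreter(tokens: List[str], i: int) -> Proposal:
    """Hypothetical second interpreter: maps misheard words to location labels."""
    known = {"mum": ("mall", 0.3), "hone": ("home", 0.8)}
    return known.get(tokens[i])

def repair(transcription: str, first: Interpreter, second: Interpreter) -> str:
    tokens = transcription.split()  # parse the transcription into tokens
    for i in range(len(tokens)):
        proposals = [p for p in (first(tokens, i), second(tokens, i)) if p is not None]
        if proposals:
            # Replace the token with whichever alternative has the higher confidence.
            tokens[i] = max(proposals, key=lambda p: p[1])[0]
    return " ".join(tokens)

print(repair("call mum at hone", contacts_interpreter, places_interpreter))
# call mom at home
```

Here "mum" receives proposals from both interpreters and the higher-confidence "mom" wins, while "hone" is repaired by the second interpreter alone.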

9. A non-transitory computer-readable storage medium comprising computer-executable instructions for performing a method comprising:
- receiving a transcription of a spoken user request from a speech recognition system;
- parsing the transcription into a plurality of tokens representing words in the spoken user request;
- using a first interpreter, determining a first confidence level of a first alternative token for one of the plurality of tokens;
- using a second interpreter, determining a second confidence level of a second alternative token for the one of the plurality of tokens; and
- generating a repaired transcription by replacing the one of the plurality of tokens with the first alternative token or the second alternative token based on the first confidence level and the second confidence level.
- 10. The computer-readable storage medium of claim 9, wherein determining the first confidence level comprises:
  - searching a database for tokens matching the one of the plurality of tokens.
- 11. The computer-readable storage medium of claim 10, wherein determining the first confidence level further comprises:
  - generating the first confidence level based on an edit distance between the one of the plurality of tokens and the first alternative token.
- 12. The computer-readable storage medium of claim 9, wherein determining the first confidence level comprises:
  - generating the first confidence level based on a context of the spoken user request.
- 13. The computer-readable storage medium of claim 12, wherein the context comprises a history of spoken user requests.
- 14. The computer-readable storage medium of claim 9, wherein the first interpreter uses a first algorithm to determine the first confidence level and the second interpreter uses a second algorithm to determine the second confidence level, wherein the first algorithm is different than the second algorithm.
- 15. The computer-readable storage medium of claim 9, wherein the spoken user request comprises a command; and wherein the method further comprises:
  - transmitting the repaired transcription to an application corresponding to the command.
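
Claims 12 and 13 tie the confidence level to context, such as a history of spoken user requests. One hypothetical way to turn a request history into a confidence is relative frequency: the share of earlier requests that contain the candidate token. The function name and the scoring rule below are assumptions made for illustration, not a method the claims specify.

```python
from typing import Iterable

def history_confidence(candidate: str, history: Iterable[str]) -> float:
    """Fraction of past requests containing the candidate token (0.0 if no history)."""
    requests = [set(r.lower().split()) for r in history]
    if not requests:
        return 0.0
    hits = sum(1 for words in requests if candidate.lower() in words)
    return hits / len(requests)

history = ["call mom at home", "text mom", "call bob at work", "call mom"]
print(history_confidence("mom", history))  # 0.75
print(history_confidence("tom", history))  # 0.0
```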

16. A system for transcribing speech, the system comprising:
- a memory; and
- a processor capable of executing a method comprising:
  - receiving a transcription of a spoken user request from a speech recognition system;
  - parsing the transcription into a plurality of tokens representing words in the spoken user request;
  - using a first interpreter, determining a first confidence level of a first alternative token for one of the plurality of tokens;
  - using a second interpreter, determining a second confidence level of a second alternative token for the one of the plurality of tokens; and
  - generating a repaired transcription by replacing the one of the plurality of tokens with the first alternative token or the second alternative token based on the first confidence level and the second confidence level.
- 17. The system of claim 16, wherein determining the first confidence level comprises:
  - searching a database for tokens matching the one of the plurality of tokens; and
  - generating the first confidence level based on an edit distance between the one of the plurality of tokens and the first alternative token.
- 18. The system of claim 16, wherein determining the first confidence level comprises:
  - generating the first confidence level based on a history of spoken user requests.
- 19. The system of claim 16, wherein the first interpreter uses a first algorithm to determine the first confidence level and the second interpreter uses a second algorithm to determine the second confidence level, wherein the first algorithm is different than the second algorithm.
- 20. The system of claim 16, wherein the spoken user request comprises a command; and wherein the method further comprises:
  - transmitting the repaired transcription to an application corresponding to the command.
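
Claim 17 combines a database search with a confidence level based on edit distance. A minimal sketch, assuming the database is an in-memory list of known tokens and that confidence is defined as 1 minus the Levenshtein distance divided by the length of the longer token. Both the normalization and the function names are choices made here; the claim only says the confidence is "based on" an edit distance.

```python
from typing import List, Optional, Tuple

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def best_match(token: str, database: List[str]) -> Optional[Tuple[str, float]]:
    """Return the closest database token and an assumed confidence in [0, 1]."""
    best: Optional[Tuple[str, float]] = None
    for entry in database:
        longest = max(len(token), len(entry))
        conf = 1.0 if longest == 0 else 1.0 - edit_distance(token, entry) / longest
        if best is None or conf > best[1]:
            best = (entry, conf)
    return best

print(edit_distance("kitten", "sitting"))              # 3
print(best_match("hone", ["home", "work", "mobile"]))  # ('home', 0.75)
```

A linear scan is enough for a short list; a large token database would likely call for an index or a bounded-distance search instead.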

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Chen, Lik Harry
Primary Examiner(s): GUERRA-ERAZO, EDGAR X
Application Number: US14/297,473
Time in Patent Office: 75 Days
Field of Search: 704/9, 704/10, 704/235, 704/245, 704/246, 704/251, 704/265, 704/270, 704/270.1, 704/275
US Class Current: 704/235
CPC Class Codes:

G10L 15/183   using context dependencies,...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...
