Free-speech command classification for car navigation system

US 8,359,204 B2
Filed: 10/27/2008
Issued: 01/22/2013
Est. Priority Date: 10/26/2007
Status: Active Grant


Abstract

The present invention provides a system and method for associating freeform speech commands with one or more predefined commands from a set of predefined commands. The set of predefined commands is stored, and alternate forms associated with each predefined command are retrieved from an external data source. The external data source receives the alternate forms associated with each predefined command from multiple sources, so the alternate forms represent paraphrases of the predefined command. A representation including words from the predefined command and the alternate forms of the predefined command, such as a vector representation, is generated for each predefined command. A similarity value between received speech data and each representation of a predefined command is computed, and the speech data is classified as the predefined command whose representation has the highest similarity value to the speech data.
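
The classification flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the command names, the paraphrases, the whitespace tokenizer and the smoothed IDF formula are all assumptions chosen for illustration. Each predefined command and its alternate forms are pooled into one document, turned into a sparse TFIDF vector, and compared with the term frequency vector of the utterance by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical commands: each maps to the command text plus paraphrases.
COMMANDS = {
    "turn on radio": ["turn on radio", "play some music", "switch the radio on"],
    "lower temperature": ["lower temperature", "make it cooler", "turn down the heat"],
}

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def build_command_vectors(commands):
    # One document per command: the command words plus all alternate forms.
    docs = {name: Counter(w for p in phrasings for w in tokenize(p))
            for name, phrasings in commands.items()}
    n_docs = len(docs)
    doc_freq = Counter(w for tf in docs.values() for w in tf)
    # Smoothed IDF (an assumption; the source does not give a formula).
    idf = {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}
    # Sparse vectors stored as dicts: only words that occur get an entry.
    return {name: {w: c * idf[w] for w, c in tf.items()} for name, tf in docs.items()}

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(utterance, vectors):
    # Term frequency vector of the recognized speech.
    tf = Counter(tokenize(utterance))
    scores = {name: cosine(tf, vec) for name, vec in vectors.items()}
    return max(scores, key=scores.get), scores

vectors = build_command_vectors(COMMANDS)
best, scores = classify("please play music", vectors)
print(best, scores)
```

In this toy setup the utterance shares the words "play" and "music" only with the paraphrases of the radio command, so that command receives the highest similarity value.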

27 Claims

1. A computer-implemented method for classifying speech data including one or more words, the method comprising the steps of:
- storing a plurality of predefined commands, each predefined command including one or more words and each predefined command associated with an action;
  
  receiving one or more alternate formats associated with each predefined command from a data source, each alternate format including one or more words and an identifier associating the alternate format with a predefined command;
  
  for each predefined command, representing the predefined command with a sparse vector comprising the term frequency-inverse document frequency (“TFIDF”) weights for each word of the predefined command and the one or more alternate formats associated with the predefined command;
  
  receiving the speech data;
  
  generating a term frequency vector associated with the speech data;
  
  determining, for each predefined command, the probability that the speech data is associated with the predefined command based on the sparse vector associated with each predefined command and the term frequency vector associated with the speech data;
  
  associating the speech data with a predefined command based on the determined probabilities; and
  
  executing the action associated with the predefined command associated with the speech data.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
  - 2. The computer-implemented method of claim 1, wherein the data source comprises a website including data obtained from one or more sources using distributed data capture techniques.
  - 3. The computer-implemented method of claim 1, wherein the data source comprises a lexical database including one or more words and one or more synonyms associated with each word.
  - 4. The computer-implemented method of claim 1, wherein the data source comprises a website including data obtained from one or more sources using distributed data capture techniques and a lexical database including one or more words and one or more synonyms associated with each word.
  - 5. The computer-implemented method of claim 1, wherein the determined probability is based on a cosine similarity value between the term frequency vector associated with the speech data and a sparse vector associated with a predefined command.
  - 6. The computer-implemented method of claim 1, wherein determining the probability that a predefined command is associated with the speech data comprises:
    - applying a Naïve Bayes model to the term frequency vector associated with the speech data and each sparse vector associated with a predefined command to determine a probability of each predefined command being associated with the speech data.
  - 7. The computer-implemented method of claim 1, wherein determining the probability that a predefined command is associated with the speech data comprises:
    - calculating a probability that each sparse vector associated with a predefined command includes the term frequency vector associated with the speech data using a unigram model.
  - 8. The computer-implemented method of claim 7, wherein the unigram model mixes a sparse vector associated with the predefined command and the set of sparse vectors according to a mixing parameter.
  - 9. The computer-implemented method of claim 1, wherein associating the speech data with a predefined command comprises:
    - determining a largest probability associated with a first predefined command;
      
      determining a second largest probability associated with a second predefined command;
      
      calculating a difference between the largest probability and the second largest probability;
      
      responsive to the difference not exceeding a threshold value, generating a request for an input selecting the first predefined command or the second predefined command; and
      
      associating the speech data with the first predefined command or the second predefined command responsive to the input.
  - 10. The computer-implemented method of claim 9, wherein associating the speech data with a predefined command further comprises:
    - responsive to the difference exceeding the threshold value, associating the speech data with the first predefined command.
  - 11. The computer-implemented method of claim 1, further comprising:
    - generating a language model for each predefined command;
      
      generating a language model for the set of predefined commands; and
      
      determining, for each predefined command, the probability that the speech data is associated with the predefined command further based on an estimation of the probability that each term in the term frequency vector associated with the speech data is associated with the predefined command based on the language model for the predefined command and the language model for the set of predefined commands.
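
The disambiguation step of claims 9 and 10 can be sketched as follows. This is an illustrative reading, not the patented implementation: the function name `resolve`, the `ask_user` callback and the example probabilities are hypothetical. The two largest probabilities are compared; if their difference exceeds the threshold the top command is chosen directly, otherwise the user is asked to pick between the two candidates.

```python
def resolve(probabilities, threshold, ask_user):
    """Pick a command from a dict of command -> probability.

    ask_user is a hypothetical callback that receives the two best
    candidates and returns the one the user selected.
    """
    if len(probabilities) < 2:
        raise ValueError("need at least two predefined commands")
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    (first, p_first), (second, p_second) = ranked[0], ranked[1]
    # Claim 10: a clear margin means the top command is accepted as is.
    if p_first - p_second > threshold:
        return first
    # Claim 9: otherwise request an input selecting one of the two.
    choice = ask_user(first, second)
    if choice not in (first, second):
        raise ValueError("selection must be one of the two candidates")
    return choice

asked = []

def record_and_pick_second(first, second):
    # Stand-in for a spoken or on-screen prompt; always picks the second.
    asked.append((first, second))
    return second

clear = resolve({"a": 0.7, "b": 0.2, "c": 0.1}, 0.3, record_and_pick_second)
close = resolve({"a": 0.45, "b": 0.40, "c": 0.15}, 0.1, record_and_pick_second)
print(clear, close, asked)
```

The threshold trades off convenience against misclassification: a larger value prompts the user more often, a smaller one accepts the top-ranked command more readily.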

12. A computer program product, comprising a non-transitory computer readable storage medium storing computer executable code classifying speech data including one or more words, the computer executable code performing the steps of:
- storing a plurality of predefined commands, each predefined command including one or more words and each predefined command associated with an action;
  
  receiving one or more alternate formats associated with each predefined command from a data source, each alternate format including one or more words and an identifier associating the alternate format with a predefined command;
  
  for each predefined command, representing the predefined command with a sparse vector comprising the term frequency-inverse document frequency (“TFIDF”) weights for each word of the predefined command and the one or more alternate formats associated with the predefined command;
  
  receiving the speech data;
  
  generating a term frequency vector associated with the speech data;
  
  determining, for each predefined command, the probability that the speech data is associated with the predefined command based on the sparse vector associated with each predefined command and the term frequency vector associated with the speech data;
  
  associating the speech data with a predefined command based on the determined probabilities; and
  
  executing the action associated with the predefined command associated with the speech data.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 27):
  - 13. The computer program product of claim 12, wherein the data source comprises a website including data obtained from one or more sources using distributed data capture techniques.
  - 14. The computer program product of claim 12, wherein the data source comprises a lexical database including one or more words and one or more synonyms associated with each word.
  - 15. The computer program product of claim 12, wherein the data source comprises a website including data obtained from one or more sources using distributed data capture techniques and a lexical database including one or more words and one or more synonyms associated with each word.
  - 16. The computer program product of claim 12, wherein the determined probability is based on a cosine similarity value between the term frequency vector associated with the speech data and a sparse vector associated with a predefined command.
  - 17. The computer program product of claim 12, wherein determining the probability that a predefined command is associated with the speech data comprises:
    - applying a Naïve Bayes model to the term frequency vector associated with the speech data and each sparse vector associated with a predefined command to determine a probability of each predefined command being associated with the speech data.
  - 18. The computer program product of claim 12, wherein determining the probability that a predefined command is associated with the speech data comprises:
    - calculating a probability that each sparse vector associated with a predefined command includes the term frequency vector associated with the speech data using a unigram model.
  - 19. The computer program product of claim 18, wherein the unigram model mixes a sparse vector associated with the predefined command and the set of sparse vectors according to a mixing parameter.
  - 20. The computer program product of claim 12, wherein associating the speech data with a predefined command comprises:
    - determining a largest probability associated with a first predefined command;
      
      determining a second largest probability associated with a second predefined command;
      
      calculating a difference between the largest probability and the second largest probability;
      
      responsive to the difference not exceeding a threshold value, generating a request for an input selecting the first predefined command or the second predefined command; and
      
      associating the speech data with the first predefined command or the second predefined command responsive to receiving the input.
  - 21. The computer program product of claim 20, wherein associating the speech data with a predefined command further comprises:
    - responsive to the difference exceeding the threshold value, associating the speech data with the first predefined command.
  - 22. The computer program product of claim 12, the computer executable code further performing the steps of:
    - generating a language model for each predefined command;
      
      generating a language model for the set of predefined commands; and
      
      determining, for each predefined command, the probability that the speech data is associated with the predefined command further based on an estimation of the probability that each term in the term frequency vector associated with the speech data is associated with the predefined command based on the language model for the predefined command and the language model for the set of predefined commands.
  - 27. The system of claim 12, further comprising a generative modeling module configured to:
    - generate a language model for each predefined command;
      
      generate a language model for the set of predefined commands; and
      
      determine, for each predefined command, the probability that the speech data is associated with the predefined command further based on an estimation of the probability that each term in the term frequency vector associated with the speech data is associated with the predefined command based on the language model for the predefined command and the language model for the set of predefined commands.
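
The unigram model with a mixing parameter (claims 7, 8, 18 and 19) and the per-command and collection-wide language models (claims 11, 22 and 27) can be read as a linear interpolation of two word distributions. The sketch below follows that reading under stated assumptions: it uses raw word counts rather than TFIDF weights, the mixing value 0.7 is arbitrary, the commands and paraphrases are hypothetical, and words unseen in every command are skipped because they affect all commands equally.

```python
import math
from collections import Counter

# Hypothetical commands with their alternate forms.
COMMANDS = {
    "turn on radio": ["turn on radio", "play some music", "switch the radio on"],
    "lower temperature": ["lower temperature", "make it cooler", "turn down the heat"],
}

def unigram_log_likelihoods(utterance, commands, mix=0.7):
    # Language model for each command: word counts over all its phrasings.
    docs = {name: Counter(w for p in phrasings for w in p.lower().split())
            for name, phrasings in commands.items()}
    # Language model for the whole set of commands.
    background = Counter()
    for tf in docs.values():
        background.update(tf)
    bg_total = sum(background.values())

    query = Counter(utterance.lower().split())
    scores = {}
    for name, tf in docs.items():
        total = sum(tf.values())
        log_likelihood = 0.0
        for word, count in query.items():
            p_command = tf.get(word, 0) / total
            p_background = background.get(word, 0) / bg_total
            # Mix the command model with the collection model.
            p = mix * p_command + (1.0 - mix) * p_background
            if p == 0.0:
                continue  # word unseen in every command: no evidence either way
            log_likelihood += count * math.log(p)
        scores[name] = log_likelihood
    return scores

scores = unigram_log_likelihoods("play music", COMMANDS)
best = max(scores, key=scores.get)
print(best, scores)
```

Mixing in the collection-wide model keeps a command from receiving zero probability just because one word of the utterance never appeared in its paraphrases.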

23. A system for classifying speech data received by a vehicle navigation system comprising:
- a predefined command store including a plurality of predefined commands, each predefined command including one or more words and each predefined command associated with an action modifying a vehicle system;
  
  an external data source including one or more alternate formats associated with each predefined command, each alternate format including one or more words and an identifier associating the alternate format with a predefined command;
  
  a sparse vector module configured to, for each predefined command, represent the predefined command with a sparse vector comprising the term frequency-inverse document frequency (“TFIDF”) weights for each word of the predefined command and the one or more alternate formats associated with the predefined command;
  
  a speech recognition module for receiving the speech data and generating a term frequency vector associated with the speech data; and
  
  an interpretation module coupled to the predefined command store, the external data source and the speech recognition module, the interpretation module for receiving the term frequency vector associated with the speech data, determining for each predefined command the probability that the speech data is associated with a predefined command based on the sparse vector associated with the predefined command and the term frequency vector associated with the speech data, and associating the speech data with a predefined command responsive to the determined probabilities.
- Dependent claims (24, 25, 26):
  - 24. The system of claim 23, wherein the data source comprises a website including data obtained from one or more sources using distributed data capture techniques.
  - 25. The system of claim 23, wherein the data source comprises a lexical database including one or more words and one or more synonyms associated with each word.
  - 26. The system of claim 23, wherein the data source comprises a website including data obtained from one or more sources using distributed data capture techniques and a lexical database including one or more words and one or more synonyms associated with each word.

Current Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Original Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Inventors: Gupta, Rakesh
Primary Examiner(s): Godbold, Douglas
Application Number: US12/259,196
Publication Number: US 20090112605A1
Time in Patent Office: 1,548 Days
Field of Search: 704/270, 704/275
US Class Current: 704/275
CPC Class Codes:
B60R 16/0373   Voice control in general G10L
G01C 21/3608   using speech input, e.g. us...
G06F 3/167   Audio in a user interface, ...
G10L 15/1815   Semantic context, e.g. disa...
G10L 2015/223   Execution procedure of a sp...
G10L 2015/228   of application context
