Dynamic language and command recognition

US 10,418,026 B2
Filed: 07/15/2016
Issued: 09/17/2019
Est. Priority Date: 07/15/2016
Status: Active Grant

Abstract

Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user's spoken command.

21 Claims

1. A method comprising:
- receiving, from an input device, audio data corresponding to a multi-lingual user command;
  
  determining, by one or more processors, a plurality of acoustic models, wherein at least a first acoustic model of the plurality of acoustic models corresponds to a first language, and a second acoustic model of the plurality of acoustic models corresponds to a second language;
  
  generating, based on the audio data, a first transcript of the multi-lingual user command using the first acoustic model, and a second transcript of the multi-lingual user command using the second acoustic model;
  
  generating, from the first transcript, a first plurality of phrases corresponding to the first language;
  
  generating, from the second transcript, a second plurality of phrases corresponding to the second language;
  
  determining a phrase classifier for each phrase of the first plurality of phrases and the second plurality of phrases;
  
  determining, based at least in part on the determined phrase classifiers, one or more match phrases, wherein each match phrase comprises:
  
  one or more phrases of the first plurality of phrases corresponding to the first language; and
  
  one or more phrases of the second plurality of phrases corresponding to the second language; and
  
  sending, to a computing device and based on response scores for the one or more match phrases, an operational command.
- - 2. The method of claim 1, wherein determining the plurality of acoustic models further comprises:
    - receiving user input indicating the first acoustic model.
  - 3. The method of claim 1, wherein determining the plurality of acoustic models further comprises:
    - receiving user input indicating the first language.
  - 4. The method of claim 1, further comprising:
    - comparing each phrase of the first plurality of phrases and the second plurality of phrases to one or more command patterns; and
      
      determining, based on the comparison of each phrase of the first plurality of phrases and the second plurality of phrases to the one or more command patterns, an operational command.
  - 5. The method of claim 1, wherein determining the plurality of acoustic models further comprises:
    - determining, based at least in part on one of a media content consumption history of a user or a location of the user, an acoustic model to process the audio data.
  - 6. The method of claim 1, wherein the first acoustic model comprises a Spanish acoustic model, and the second acoustic model comprises an English acoustic model.
  - 7. The method of claim 1, wherein determining the phrase classifier for each phrase of the first plurality of phrases and the second plurality of phrases further comprises:
    - determining that a first phrase, of the first plurality of phrases, corresponds to a first type of phrase classifier; and
      
      determining a second type of phrase classifier for one or more remaining phrases of the first plurality of phrases.
  - 8. The method of claim 7, wherein the first type of phrase classifier comprises an action entity indicating a device command.
  - 9. The method of claim 1, wherein generating the first transcript of the multi-lingual user command using the first acoustic model further comprises:
    - comparing a first portion of the audio data to a database of voice templates; and
      
      determining, based on the first acoustic model, a first voice template corresponding to the first portion of the audio data.
  - 10. The method of claim 1, wherein sending the operational command further comprises:
    - determining a response score for each match phrase of the one or more match phrases; and
      
      determining, based on the one or more response scores, a first match phrase; and
      
      sending, to the computing device, the operational command corresponding to the determined first match phrase.
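
The steps of claims 1 and 10 can be illustrated with a minimal Python sketch. It is not the patented implementation: the acoustic models are assumed to have already produced one transcript per language, and the lexicons, the n-gram phrase generator, the word-count response score, and every function name below are hypothetical placeholders chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical per-language lexicons: phrase -> (classifier, canonical value).
LEXICONS = {
    "es": {"pon": ("action", "play"), "graba": ("action", "record")},
    "en": {"game of thrones": ("entity", "Game of Thrones"),
           "play": ("action", "play")},
}


@dataclass(frozen=True)
class Phrase:
    text: str
    language: str
    classifier: str = "unknown"
    value: str = ""


def generate_phrases(transcript, language, max_words=3):
    """Split one transcript into candidate phrases (word n-grams)."""
    words = transcript.lower().split()
    return [Phrase(" ".join(words[i:i + n]), language)
            for n in range(1, max_words + 1)
            for i in range(len(words) - n + 1)]


def classify(phrase):
    """Attach a phrase classifier using the lexicon of the phrase's language."""
    label, value = LEXICONS[phrase.language].get(phrase.text, ("unknown", ""))
    return Phrase(phrase.text, phrase.language, label, value)


def match_phrases(first, second):
    """Pair an action phrase with an entity phrase from the other language."""
    phrases = first + second
    actions = [p for p in phrases if p.classifier == "action"]
    entities = [p for p in phrases if p.classifier == "entity"]
    return [(a, e) for a, e in product(actions, entities)
            if a.language != e.language]


def response_score(match):
    """Placeholder score: number of words the match phrase accounts for."""
    return sum(len(p.text.split()) for p in match)


def operational_command(first, second):
    """first and second are (language, transcript) pairs, one per acoustic model."""
    phrase_sets = [[classify(p) for p in generate_phrases(text, lang)]
                   for lang, text in (first, second)]
    matches = match_phrases(*phrase_sets)
    if not matches:
        return None
    action, entity = max(matches, key=response_score)
    return {"action": action.value, "content": entity.value}


# A mixed Spanish/English utterance as each (assumed) model might transcribe it.
command = operational_command(("es", "pon gueim of trons"),
                              ("en", "pawn game of thrones"))
print(command)
```

In this sketch each model transcribes only the words of its own language reliably, so the selected match phrase combines the Spanish action phrase with the English content phrase, which is the situation the claim's "match phrase" language describes.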

11. A method comprising:
- receiving, from an input device, audio data corresponding to a multi-lingual user command;
  
  determining, by one or more processors, a plurality of acoustic models, each acoustic model corresponding to a different language;
  
  for each acoustic model of the plurality of acoustic models:
  
  generating, based on the audio data, a transcript; and
  
  generating, from the transcript, a plurality of phrases;
  
  determining one or more match phrases, wherein each match phrase comprises at least:
  
  one or more phrases of the plurality of phrases generated from a first acoustic model corresponding to a first language; and
  
  one or more phrases of the plurality of phrases generated from a second acoustic model corresponding to a second language; and
  
  sending, to a computing device and based on response scores for the one or more match phrases, an operational command.
- - 12. The method of claim 11, wherein determining the one or more match phrases further comprises:
    - determining a phrase classifier for each phrase of the plurality of generated phrases; and
      
      determining, based at least in part on the determined phrase classifiers, the one or more match phrases.
  - 13. The method of claim 11, wherein determining the plurality of acoustic models further comprises:
    - determining, based at least in part on one of a media content consumption history of a user or a location of the user, an acoustic model.
  - 14. The method of claim 11, further comprising:
    - comparing each phrase of the plurality of generated phrases to a first command pattern; and
      
      determining, based on the comparison of each phrase of the plurality of generated phrases to the first command pattern, an operational command.
  - 15. The method of claim 14, wherein the first command pattern comprises a first content entity indicating a content item and a first action entity indicating a first device command, the method further comprising:
    - determining that the content item is unavailable; and
      
      modifying the first command pattern to comprise a second action entity indicating a second device command.
  - 16. The method of claim 11, wherein generating the transcript further comprises:
    - comparing a first portion of the audio data to a database of voice templates; and
      
      determining, based on the first acoustic model, a first voice template corresponding to the first portion of the audio data.
  - 17. The method of claim 16, further comprising:
    - receiving user input indicating the first acoustic model of the plurality of acoustic models.
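
Claims 14 and 15 describe a command pattern that pairs a content entity with an action entity and is modified when the content item is unavailable. The following Python sketch shows one way that could look. The catalog, the fallback table (here, falling back to a search command), and all names are assumptions for illustration; the claims do not specify what the second device command is.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CommandPattern:
    action: str   # action entity: a device command
    content: str  # content entity: a content item


# Hypothetical fallback table: which second device command replaces the first
# when the requested content item cannot be served.
FALLBACK_ACTION = {"play": "search", "record": "search"}


def resolve(pattern, available_items):
    """Return the pattern unchanged if its content is available; otherwise
    return a modified pattern carrying a second action entity."""
    if pattern.content in available_items:
        return pattern
    second_action = FALLBACK_ACTION.get(pattern.action, pattern.action)
    return replace(pattern, action=second_action)


catalog = {"Game of Thrones"}  # assumed set of currently available items
kept = resolve(CommandPattern("play", "Game of Thrones"), catalog)
modified = resolve(CommandPattern("play", "Westworld"), catalog)
print(kept, modified)
```

Using an immutable dataclass keeps the first command pattern intact, so a caller can still compare the original and modified patterns after the fallback is applied.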

18. A method comprising:
- receiving, from a first computing device, audio data indicating a multi-lingual command;
  
  generating, by one or more processors, a plurality of phrases corresponding to the multi-lingual command, wherein the plurality of phrases comprises one or more phrases corresponding to at least a first language and a second language;
  
  for each phrase of the plurality of phrases:
  
  determining, for the phrase, at least one of an action classifier or an entity classifier;
  
  determining, based on the determined classifiers, a plurality of match phrases, each match phrase corresponding to a prospective operational command;
  
  determining a response score for each of the plurality of match phrases; and
  
  sending, to a second computing device, a first operational command corresponding to a first match phrase satisfying a threshold score value.
- - 19. The method of claim 18, further comprising:
    - determining, by the one or more processors, a ranking for each phrase of the plurality of generated phrases, wherein the at least one of the action classifier or the entity classifier is determined based on the determined ranking.
  - 20. The method of claim 19, wherein determining the ranking for each phrase of the plurality of generated phrases further comprises:
    - determining, based at least in part on a media content consumption history of a first user, the ranking.
  - 21. The method of claim 18, wherein determining the response score for each match phrase further comprises:
    - comparing each phrase, of the plurality of match phrases, to a first command pattern; and
      
      determining, based on the comparison, the response score.
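
The scoring and threshold steps of claims 18 and 21 can be sketched as follows. This is an illustrative assumption, not the method disclosed in the specification: a command pattern is modeled as a tuple of required classifier types, the response score is the fraction of those slots a match phrase fills, and the threshold value is arbitrary.

```python
from collections import Counter

# Hypothetical command pattern: one action classifier and one entity classifier.
COMMAND_PATTERN = ("action", "entity")


def response_score(match, pattern=COMMAND_PATTERN):
    """Fraction of pattern slots filled by the classifiers in a match phrase.

    A match phrase is a list of (phrase text, classifier) pairs.
    """
    have = Counter(classifier for _, classifier in match)
    need = Counter(pattern)
    filled = sum(min(have[slot], count) for slot, count in need.items())
    return filled / len(pattern)


def select_match(matches, threshold=0.75):
    """Return the best-scoring match phrase that satisfies the threshold,
    or None when no match phrase does."""
    best = max(matches, key=response_score, default=None)
    if best is not None and response_score(best) >= threshold:
        return best
    return None


candidates = [
    [("pon", "action")],                                  # fills 1 of 2 slots
    [("pon", "action"), ("game of thrones", "entity")],   # fills 2 of 2 slots
]
selected = select_match(candidates)
print(selected)
```

Returning `None` when nothing clears the threshold leaves room for the caller to ask the user to repeat the command rather than sending a low-confidence operational command.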

Current Assignee
Comcast Cable Communications LLC (Comcast Corporation)
Original Assignee
Comcast Cable Communications LLC (Comcast Corporation)
Inventors
Des Jardins, George Thomas; Sagar, Vikrant
Primary Examiner(s)
Roberts, Shaun

Application Number
US15/211,328
Publication Number
US 20180018959A1
Time in Patent Office
1,159 Days
Field of Search
704/231, 704/243, 704/275
US Class Current
CPC Class Codes

G10L 15/005   Language recognition

G10L 15/08   Speech classification or se...

G10L 15/183   using context dependencies,...

G10L 15/22   Procedures used during a sp...

G10L 15/24   Speech recognition using no...

G10L 15/32   Multiple recognisers used i...

G10L 2015/088   Word spotting

G10L 2015/223   Execution procedure of a sp...
