Dynamic language and command recognition
Abstract
Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user's spoken command.
21 Claims
1. A method comprising:
receiving, from an input device, audio data corresponding to a multi-lingual user command;
determining, by one or more processors, a plurality of acoustic models, wherein at least a first acoustic model of the plurality of acoustic models corresponds to a first language, and a second acoustic model of the plurality of acoustic models corresponds to a second language;
generating, based on the audio data, a first transcript of the multi-lingual user command using the first acoustic model, and a second transcript of the multi-lingual user command using the second acoustic model;
generating, from the first transcript, a first plurality of phrases corresponding to the first language;
generating, from the second transcript, a second plurality of phrases corresponding to the second language;
determining a phrase classifier for each phrase of the first plurality of phrases and the second plurality of phrases;
determining, based at least in part on the determined phrase classifiers, one or more match phrases, wherein each match phrase comprises:
one or more phrases of the first plurality of phrases corresponding to the first language; and
one or more phrases of the second plurality of phrases corresponding to the second language; and
sending, to a computing device and based on response scores for the one or more match phrases, an operational command.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method comprising:
receiving, from an input device, audio data corresponding to a multi-lingual user command;
determining, by one or more processors, a plurality of acoustic models, each acoustic model corresponding to a different language;
for each acoustic model of the plurality of acoustic models:
generating, based on the audio data, a transcript; and
generating, from the transcript, a plurality of phrases;
determining one or more match phrases, wherein each match phrase comprises at least:
one or more phrases of the plurality of phrases generated from a first acoustic model corresponding to a first language; and
one or more phrases of the plurality of phrases generated from a second acoustic model corresponding to a second language; and
sending, to a computing device and based on response scores for the one or more match phrases, an operational command.
Dependent claims: 12, 13, 14, 15, 16, 17
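The per-model loop recited in claim 11 can be illustrated with a minimal Python sketch. It is not the patented implementation: the acoustic models are canned stand-in functions, the lexicons and the n-gram phrase generator are assumptions made for illustration, and every name (`english_model`, `spanish_model`, `LEXICONS`, `generate_phrases`, `match_phrases`) is hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for acoustic models: each maps audio to a transcript.
# A real system would run speech recognition here; these return canned text.
def english_model(audio: bytes) -> str:
    return "play la casa de papel"

def spanish_model(audio: bytes) -> str:
    return "plei la casa de papel"

MODELS: Dict[str, Callable[[bytes], str]] = {
    "en": english_model,
    "es": spanish_model,
}

# Assumed per-language lexicons of phrases the system knows about.
LEXICONS: Dict[str, set] = {
    "en": {"play", "pause"},
    "es": {"la casa de papel", "casa"},
}

def generate_phrases(transcript: str, lexicon: set, max_words: int = 4) -> List[str]:
    """Return contiguous word n-grams of the transcript found in the lexicon."""
    words = transcript.split()
    phrases = []
    for start in range(len(words)):
        last = min(start + max_words, len(words))
        for end in range(start + 1, last + 1):
            candidate = " ".join(words[start:end])
            if candidate in lexicon:
                phrases.append(candidate)
    return phrases

def match_phrases(audio: bytes, first: str, second: str) -> List[Tuple[str, str]]:
    """Run every model on the same audio, then pair phrases across two languages."""
    phrases_by_language = {}
    for language, model in MODELS.items():
        transcript = model(audio)
        phrases_by_language[language] = generate_phrases(transcript, LEXICONS[language])
    # Each match phrase holds at least one phrase from each of the two languages.
    return list(product(phrases_by_language[first], phrases_by_language[second]))

candidates = match_phrases(b"", "en", "es")
```

With the canned transcripts above, `candidates` pairs the English phrase `play` with each Spanish phrase found in the second transcript. Ranking those pairs by response score is a separate step and is not shown here.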
18. A method comprising:
receiving, from a first computing device, audio data indicating a multi-lingual command;
generating, by one or more processors, a plurality of phrases corresponding to the multi-lingual command, wherein the plurality of phrases comprises one or more phrases corresponding to at least a first language and a second language;
for each phrase of the plurality of phrases:
determining, for the phrase, at least one of an action classifier or an entity classifier;
determining, based on the determined classifiers, a plurality of match phrases, each match phrase corresponding to a prospective operational command;
determining a response score for each of the plurality of match phrases; and
sending, to a second computing device, a first operational command corresponding to a first match phrase satisfying a threshold score value.
Dependent claims: 19, 20, 21
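The classification, scoring, and threshold steps of claim 18 can be sketched in the same hedged way. The claim does not say how a response score is computed, so the product of per-phrase confidences used below is purely an assumption, as are the lookup tables and all names (`ACTIONS`, `ENTITIES`, `Phrase`, `MatchPhrase`, `classify`, `build_match_phrases`, `select_command`).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical lookup tables standing in for action and entity classifiers.
ACTIONS = {"play": "PLAY", "record": "RECORD"}
ENTITIES = {"la casa de papel": "series:la-casa-de-papel", "casa": "channel:casa"}

@dataclass(frozen=True)
class Phrase:
    text: str
    language: str
    confidence: float  # assumed recognizer confidence in [0, 1]

@dataclass(frozen=True)
class MatchPhrase:
    action: str
    entity: str
    score: float  # the "response score"; its formula here is an assumption

def classify(phrase: Phrase) -> Optional[Tuple[str, str]]:
    """Return ("action", label) or ("entity", label), or None if unclassified."""
    if phrase.text in ACTIONS:
        return ("action", ACTIONS[phrase.text])
    if phrase.text in ENTITIES:
        return ("entity", ENTITIES[phrase.text])
    return None

def build_match_phrases(phrases: List[Phrase]) -> List[MatchPhrase]:
    """Pair every action phrase with every entity phrase and score the pair."""
    actions, entities = [], []
    for phrase in phrases:
        result = classify(phrase)
        if result is None:
            continue  # phrase carries neither an action nor an entity
        kind, label = result
        (actions if kind == "action" else entities).append((phrase, label))
    return [
        MatchPhrase(a_label, e_label, a.confidence * e.confidence)
        for a, a_label in actions
        for e, e_label in entities
    ]

def select_command(matches: List[MatchPhrase], threshold: float) -> Optional[str]:
    """Return the best-scoring prospective command if it meets the threshold."""
    if not matches:
        return None
    best = max(matches, key=lambda m: m.score)
    return f"{best.action} {best.entity}" if best.score >= threshold else None

phrases = [
    Phrase("play", "en", 0.9),
    Phrase("la casa de papel", "es", 0.8),
    Phrase("casa", "es", 0.5),
    Phrase("plei", "es", 0.3),
]
matches = build_match_phrases(phrases)
command = select_command(matches, threshold=0.6)
```

Here the unclassified phrase `plei` is dropped, two match phrases are scored, and only the pairing of `play` with `la casa de papel` clears the assumed threshold of 0.6. Raising the threshold above every score makes `select_command` return `None`, meaning no operational command would be sent.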
Specification