Truly handsfree speech recognition in high noise environments
First Claim
1. A method comprising:
receiving a first audio signal;
identifying the first audio signal;
selecting one of a plurality of recognition sets to recognize one or more predetermined utterances based on the identified first audio signal;
configuring a recognizer to recognize the one or more predetermined utterances in the presence of a non-random information bearing background audio signal having particular audio characteristics, said configuring desensitizing the recognizer to signals having said particular audio characteristics;
receiving, in the recognizer, a composite signal comprising the first audio signal and a spoken utterance of a user, wherein the first audio signal is generated by an electronic speaker, wherein the first audio signal comprises said particular audio characteristics used to configure the recognizer so that the recognizer is desensitized to the first audio signal;
recognizing the spoken utterance in the presence of the first audio signal when the spoken utterance of the user is one of the predetermined utterances;
executing a command corresponding to a particular one of the predetermined utterances having been recognized; and
performing an operation corresponding to the command on the first audio signal in response to the command, wherein when different audio signals are identified, different recognition sets are dynamically selected and used to configure the recognizer so that the recognizer is desensitized to the identified audio signals, and wherein the different audio signals are associated with different commands executed when an utterance is recognized.
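The control flow recited in the claim (identify the audio, select a recognition set, recognize a predetermined utterance, execute the associated command on the audio) can be illustrated with a minimal Python sketch. Everything here is hypothetical: the `RecognitionSet` class, the `"music"` and `"alarm"` labels, and the command phrases are invented for illustration and do not come from the patent. The sketch also assumes the utterance has already been transcribed to text, so it does not model the acoustic desensitization itself; it only shows how different identified audio signals could map to different recognition sets and commands.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class RecognitionSet:
    """Hypothetical container pairing a background profile with its valid commands."""
    # Label for the background audio the recognizer would be desensitized to.
    background_profile: str
    # Predetermined utterance -> operation performed on the identified audio.
    commands: Dict[str, Callable[[str], str]]


# Illustrative recognition sets keyed by identified audio signal (names are assumptions).
RECOGNITION_SETS: Dict[str, RecognitionSet] = {
    "music": RecognitionSet("music", {
        "pause": lambda audio: f"paused {audio}",
        "next song": lambda audio: f"skipped {audio}",
    }),
    "alarm": RecognitionSet("alarm", {
        "snooze": lambda audio: f"snoozed {audio}",
        "turn off": lambda audio: f"stopped {audio}",
    }),
}


def select_recognition_set(identified_audio: str) -> RecognitionSet:
    """Pick the recognition set that matches the identified audio signal."""
    return RECOGNITION_SETS[identified_audio]


def recognize_and_execute(identified_audio: str, utterance: str) -> Optional[str]:
    """Run the command only if the utterance is in the selected set; otherwise return None."""
    rec_set = select_recognition_set(identified_audio)
    operation = rec_set.commands.get(utterance.strip().lower())
    if operation is None:
        return None  # not one of the predetermined utterances for this audio
    return operation(identified_audio)
```

In this sketch, "pause" triggers an operation while music is identified but is ignored while an alarm is identified, which mirrors the claim language that different audio signals are associated with different commands.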
1 Assignment
0 Petitions
Abstract
Embodiments of the present invention improve content manipulation systems and methods using speech recognition. In one embodiment, the present invention includes a method comprising configuring a recognizer to recognize utterances in the presence of a background audio signal having particular audio characteristics. A composite signal comprising a first audio signal and a spoken utterance of a user is received by the recognizer, where the first audio signal comprises the particular audio characteristics used to configure the recognizer so that the recognizer is desensitized to the first audio signal. The spoken utterance is recognized in the presence of the first audio signal when the spoken utterance is one of the predetermined utterances. An operation is performed on the first audio signal.
23 Citations
22 Claims
1. A method comprising:
receiving a first audio signal;
identifying the first audio signal;
selecting one of a plurality of recognition sets to recognize one or more predetermined utterances based on the identified first audio signal;
configuring a recognizer to recognize the one or more predetermined utterances in the presence of a non-random information bearing background audio signal having particular audio characteristics, said configuring desensitizing the recognizer to signals having said particular audio characteristics;
receiving, in the recognizer, a composite signal comprising the first audio signal and a spoken utterance of a user, wherein the first audio signal is generated by an electronic speaker, wherein the first audio signal comprises said particular audio characteristics used to configure the recognizer so that the recognizer is desensitized to the first audio signal;
recognizing the spoken utterance in the presence of the first audio signal when the spoken utterance of the user is one of the predetermined utterances;
executing a command corresponding to a particular one of the predetermined utterances having been recognized; and
performing an operation corresponding to the command on the first audio signal in response to the command, wherein when different audio signals are identified, different recognition sets are dynamically selected and used to configure the recognizer so that the recognizer is desensitized to the identified audio signals, and wherein the different audio signals are associated with different commands executed when an utterance is recognized.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
13. An apparatus comprising:
a processor;
a recognizer, the recognizer configured to recognize one or more predetermined utterances in the presence of a non-random information bearing background audio signal having particular audio characteristics to desensitize the recognizer to signals having said particular audio characteristics; and
a microphone to receive a composite signal comprising a first audio signal and a spoken utterance of a user, wherein the first audio signal is generated by an electronic speaker, wherein the first audio signal comprises said particular audio characteristics used to configure the recognizer so that the recognizer is desensitized to the first audio signal, wherein the spoken utterance is recognized in the presence of the first audio signal when the spoken utterance of the user is one of the predetermined utterances, wherein a command is executed by said processor corresponding to a particular one of the predetermined utterances having been recognized; and
wherein an operation corresponding to the command is performed on the first audio signal in response to the command, wherein, before configuring the recognizer, the first audio signal is identified, and wherein one of a plurality of recognition sets is selected to recognize said one or more predetermined utterances based on the identified first audio signal, and wherein when different audio signals are identified, different recognition sets are dynamically selected and used to configure the recognizer so that the recognizer is desensitized to the identified audio signals, and wherein the different audio signals are associated with different commands executed when an utterance is recognized.
Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21, 22)
Specification