Controlling an apparatus based on speech
First Claim
1. A system for controlling an apparatus on basis of speech, comprising:
a microphone array, comprising multiple microphones for receiving respective audio signals;
a beam forming module for extracting a speech signal of a user, from the audio signals as received by the microphones, by means of enhancing first components of the audio signals which represent an utterance originating from a first position of the user relative to the microphone array;
a speech recognition unit for creating an instruction for the apparatus based on recognized speech items of the speech signal; and
a keyword recognition system for recognition of a predetermined keyword that is spoken by the user and which is represented by a particular audio signal;
a speech control unit being arranged to control the beam forming module, on basis of the recognition of the predetermined keyword, in order to enhance second components of the audio signals which represent a subsequent utterance originating from a second position of the user relative to the microphone array;
wherein the recognition of the predetermined keyword at the second position calibrates the beam forming module to follow the user from the first position to the second position so that the subsequent utterance originating from the second position are accepted while utterances of other users at other positions are discarded, the second position including an orientation and a distance relative to the microphone array, and the speech control unit being configured to discriminate between sounds originating from users who are located in front of each other relative the microphone array;
wherein the subsequent utterance originating from the second position will be discarded if not preceded by the recognition of the predetermined keyword originating from the second position; and
wherein the keyword recognition system is arranged to recognize the predetermined keyword that is spoken by another user and the speech control unit being arranged to control the beam forming module, on basis of this recognition, in order to enhance third components of the audio signals which represent another utterance originating from a third orientation of the other user relative to the microphone array.
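The keyword-gated behaviour recited in the claim can be illustrated with a small sketch. It is only a toy model under stated assumptions, not the patented implementation: the class name `SpeechControl`, the `(x, y)` position representation, the `tolerance_m` radius and the single-focus simplification are all hypothetical and do not come from the claim text.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple

# Hypothetical representation: (x, y) in metres relative to the microphone array.
# It captures both orientation and distance, as the claim requires of a position.
Position = Tuple[float, float]


@dataclass
class SpeechControl:
    """Toy model of keyword-gated steering (all names are illustrative)."""
    tolerance_m: float = 0.5          # assumed radius that counts as "the same position"
    focus: Optional[Position] = None  # position the beam forming module is steered to

    def _near_focus(self, pos: Position) -> bool:
        if self.focus is None:
            return False
        dx, dy = pos[0] - self.focus[0], pos[1] - self.focus[1]
        return hypot(dx, dy) <= self.tolerance_m

    def on_utterance(self, pos: Position, is_keyword: bool) -> str:
        # A recognized keyword recalibrates the beam to the speaker's position,
        # whether it comes from the same user after moving or from another user.
        if is_keyword:
            self.focus = pos
            return "recalibrated"
        # Other utterances are accepted only from the calibrated position.
        if self._near_focus(pos):
            return "accepted"
        return "discarded"
```

For example, after a keyword at `(1.0, 0.0)`, an utterance at `(1.1, 0.0)` is accepted, while one at `(3.0, 0.0)` is discarded even though it comes from the same orientation, because the distance differs. Keeping a single focus is a simplification; the claim does not say whether the first user remains tracked after another user speaks the keyword.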
Abstract
An apparatus with a speech control unit includes a microphone array having multiple microphones for receiving respective audio signals, and a beam forming module for extracting a speech signal of a user, from the audio signals. A keyword recognition system recognizes a predetermined keyword that is spoken by the user and which is represented by a particular audio signal and is arranged to control the beam forming module, on basis of the recognition. A speech recognition unit creates an instruction for the apparatus based on recognized speech items of the speech signal. As a consequence, the speech control unit is more selective for those parts of the audio signals for speech recognition which correspond to speech items spoken by the user.
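The abstract does not say how the beam forming module enhances signal components from a given position. One common technique is delay-and-sum beamforming, sketched below purely as an illustration. The function names, the 2-D geometry, the whole-sample delays and the speed-of-sound constant are assumptions made for this example, not details taken from the patent.

```python
from math import hypot

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def steering_delays(mic_positions, source, sample_rate):
    """Delay of each microphone, in whole samples, relative to the one nearest the source."""
    dists = [hypot(source[0] - mx, source[1] - my) for mx, my in mic_positions]
    nearest = min(dists)
    return [round((d - nearest) / SPEED_OF_SOUND * sample_rate) for d in dists]


def delay_and_sum(signals, delays):
    """Advance each channel by its delay, then average the aligned samples.

    Sound from the steered position adds coherently across channels;
    sound from other positions is misaligned and therefore attenuated.
    """
    length = min(len(s) - d for s, d in zip(signals, delays))
    return [
        sum(s[n + d] for s, d in zip(signals, delays)) / len(signals)
        for n in range(length)
    ]
```

In such a scheme, recognition of the keyword would supply the `source` position passed to `steering_delays`, so the delays are recomputed whenever the user speaks the keyword from a new position.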
19 Claims
14. A method of controlling an apparatus on basis of speech, comprising the acts of:
receiving respective audio signals by means of a microphone array, comprising multiple microphones;
extracting a speech signal of a user, from the audio signals as received by the microphones, by means of enhancing first components of the audio signals which represent an utterance originating from a first position of the user relative to the microphone array;
recognizing a predetermined keyword that is spoken by the user based on a particular audio signal and controlling the extraction of the speech signal of the user, on basis of the recognition of the predetermined keyword, in order to enhance second components of the audio signals which represent a subsequent utterance originating from a second position of the user relative to the microphone array while discarding utterances of other users at other positions, the second position including an orientation and a distance relative to the microphone array so that sounds originating from users who are located in front of each other relative the microphone array are discriminated;
creating an instruction for the apparatus based on recognized speech items of the speech signal;
discarding the subsequent utterance originating from the second position if not preceded by the recognition of the predetermined keyword originating from the second position; and
recognizing the predetermined keyword that is spoken by another user and extracting a speech signal of the user, on basis of this recognition, in order to enhance third components of the audio signals which represent another utterance originating from a third orientation of the other user relative to the microphone array.
Specification