Speech recognition apparatus and method
First Claim
1. A speech recognition apparatus, comprising:
a camera configured to capture a plurality of images of a user;
a microphone;
a control unit configured to:
track at least one eye of the user based on the plurality of images of the user;
determine a reference time at which the tracked at least one eye of the user is directed toward the microphone;
determine whether a nonlexical word is detected in a first speech signal received via the microphone during a period of time beginning from the reference time at which the tracked at least one eye of the user is directed toward the microphone; and
based on a determination that the nonlexical word is detected in the first speech signal during the period of time beginning from the reference time, determine a second speech signal received via the microphone subsequent to the detected nonlexical word; and
a speech recognition unit configured to recognize a speech of the user from the second speech signal.
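The reference-time element of the claim can be pictured with a minimal sketch. It assumes that eye tracking already yields, for each camera frame, a timestamp, an eye position, and a gaze direction in a shared coordinate system, and that the eye counts as "directed toward the microphone" when the gaze direction lies within a small angle of the line from the eye to the microphone. The names `GazeSample` and `find_reference_time` and the 10-degree threshold are hypothetical and do not come from the specification.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class GazeSample:
    """One eye-tracking result derived from a single camera image (assumed format)."""
    timestamp: float      # seconds since capture started
    eye_position: Vec3    # eye location in camera coordinates
    gaze_direction: Vec3  # direction in which the eye is looking


def angle_between(a: Vec3, b: Vec3) -> float:
    """Return the angle in radians between two 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return math.pi  # a degenerate vector never counts as "looking at" anything
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def find_reference_time(
    samples: Sequence[GazeSample],
    mic_position: Vec3,
    max_angle_deg: float = 10.0,  # assumed tolerance, not from the patent
) -> Optional[float]:
    """Return the timestamp of the first sample whose gaze points at the microphone."""
    limit = math.radians(max_angle_deg)
    for sample in samples:
        to_mic = tuple(m - e for m, e in zip(mic_position, sample.eye_position))
        if angle_between(sample.gaze_direction, to_mic) <= limit:
            return sample.timestamp
    return None  # the eye was never directed toward the microphone
```

An angular tolerance is only one possible criterion; a real system might instead require the gaze to dwell on the microphone for several consecutive frames before fixing the reference time.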
Abstract
The present specification relates to a speech recognition apparatus and method that can accurately recognize a user's speech in an easy and convenient manner, without requiring the user to operate a speech recognition start button or the like. The speech recognition apparatus according to embodiments of the present specification comprises: a camera for capturing a user image; a microphone; a control unit that detects a preset user gesture from the user image and, if a nonlexical word is detected in the speech signal input through the microphone from the point in time at which the user gesture was detected, determines the speech signal detected after the nonlexical word to be an effective speech signal; and a speech recognition unit for recognizing the effective speech signal.
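The gating rule described in the abstract can be sketched as follows. The sketch assumes the speech signal has already been segmented into time-stamped words sorted by start time, that the reference time (the moment the gesture was detected) is supplied by the caller, and that nonlexical words are matched against a small fixed list of fillers. The names `Word`, `NONLEXICAL_WORDS`, and `effective_speech`, the filler list, and the 3-second window are hypothetical; the source does not specify them.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Assumed examples of nonlexical (filler) words; the actual set is not given here.
NONLEXICAL_WORDS = frozenset({"um", "uh", "er", "hmm"})


@dataclass(frozen=True)
class Word:
    """A word-like segment of the microphone signal (assumed format)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def effective_speech(
    words: Sequence[Word],
    reference_time: float,
    window: float = 3.0,  # assumed length of the detection period, in seconds
) -> List[Word]:
    """Return the words that follow a nonlexical word detected within the window.

    `words` must be sorted by start time. An empty list means no nonlexical
    word was found in the period beginning at `reference_time`, so nothing is
    passed on for recognition.
    """
    deadline = reference_time + window
    for index, word in enumerate(words):
        if word.start < reference_time:
            continue  # spoken before the gesture was detected
        if word.start > deadline:
            break     # the detection period has expired
        if word.text.lower() in NONLEXICAL_WORDS:
            # Everything after the filler is treated as the effective speech signal.
            return list(words[index + 1:])
    return []


if __name__ == "__main__":
    utterance = [
        Word("hello", 0.0, 0.4),
        Word("um", 1.2, 1.5),
        Word("call", 1.6, 1.9),
        Word("mom", 2.0, 2.3),
    ]
    print([w.text for w in effective_speech(utterance, reference_time=1.0)])
```

Working on word segments keeps the example short; an actual apparatus would more likely detect the nonlexical sound directly in the audio and forward the subsequent samples to the recognizer.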
18 Claims
1. A speech recognition apparatus, comprising:
a camera configured to capture a plurality of images of a user;
a microphone;
a control unit configured to:
track at least one eye of the user based on the plurality of images of the user;
determine a reference time at which the tracked at least one eye of the user is directed toward the microphone;
determine whether a nonlexical word is detected in a first speech signal received via the microphone during a period of time beginning from the reference time at which the tracked at least one eye of the user is directed toward the microphone; and
based on a determination that the nonlexical word is detected in the first speech signal during the period of time beginning from the reference time, determine a second speech signal received via the microphone subsequent to the detected nonlexical word; and
a speech recognition unit configured to recognize a speech of the user from the second speech signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A speech recognition method, comprising:
capturing, via a camera, a plurality of images of a user;
tracking at least one eye of the user based on the plurality of images of the user;
determining a reference time at which the tracked at least one eye of the user is directed toward a microphone;
determining whether a nonlexical word is detected in a first speech signal received via the microphone during a period of time beginning from the reference time at which the tracked at least one eye of the user is directed toward the microphone;
based on a determination that the nonlexical word is detected in the first speech signal during the period of time beginning from the reference time, determining a second speech signal received via the microphone subsequent to the detected nonlexical word; and
recognizing, via a speech recognition unit, a speech of the user from the second speech signal.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification