Voice input correction using non-audio based input
First Claim
1. A method, comprising:
accepting, at an audio receiver of an information handling device, voice input of a user and capturing, using a sensor, non-audio based input correlated with the voice input;
generating, using one or more speech recognition engines, an initial interpretation by interpreting the voice input without utilizing the non-audio based input for the initial interpretation;
identifying, using the one or more speech recognition engines, an ambiguous voice input comprising at least one ambiguity in the initial interpretation, wherein the identifying comprises identifying that at least a portion of the initial interpretation is associated with a confidence score meeting a predetermined low confidence threshold, wherein the confidence score is based in part on a condition of the user;
thereafter augmenting the one or more speech recognition engines and re-interpreting the ambiguous voice input by accessing, using the one or more speech recognition engines, based upon the confidence score meeting the predetermined low confidence threshold, stored non-audio based input matched in time with the ambiguous voice input, wherein the accessing is based upon a policy associated with a confidence level of interpretation, wherein the confidence level of interpretation is based on a device usage history, wherein the re-interpreting comprises mapping the stored non-audio based input to known features of the user while providing voice input correlated with the voice input; and
adjusting the initial interpretation of the voice input using non-audio based input, wherein the adjusting comprises changing the initial interpretation using the non-audio based input.
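The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Word` and `NonAudioSample` types, the 0.5 threshold, and the idea that a non-audio sample carries a ready-made word `hint` (for example, from lip movement) are all assumptions made for illustration. The claim itself says only that the threshold is "predetermined" and does not specify how the non-audio input is mapped to a corrected word.

```python
from dataclasses import dataclass

# Assumed value; the claim only requires a "predetermined" low confidence threshold.
LOW_CONFIDENCE_THRESHOLD = 0.5


@dataclass
class Word:
    """One word of the initial (audio-only) interpretation. Hypothetical type."""
    text: str
    confidence: float  # score from the speech recognition engine, 0.0 to 1.0
    start: float       # start time in seconds
    end: float         # end time in seconds


@dataclass
class NonAudioSample:
    """Stored non-audio input, e.g. derived from a camera. Hypothetical type."""
    timestamp: float   # capture time in seconds
    hint: str          # word suggested by the non-audio input (assumption)


def find_ambiguities(words, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Return indices of words whose confidence meets the low threshold."""
    return [i for i, w in enumerate(words) if w.confidence <= threshold]


def adjust_interpretation(words, samples, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Re-interpret only the ambiguous words, using time-matched non-audio input."""
    adjusted = [w.text for w in words]
    for i in find_ambiguities(words, threshold):
        w = words[i]
        # Non-audio input "matched in time" with the ambiguous voice input.
        matches = [s for s in samples if w.start <= s.timestamp <= w.end]
        if matches:
            adjusted[i] = matches[0].hint  # change the initial interpretation
    return " ".join(adjusted)


words = [Word("call", 0.9, 0.0, 0.4), Word("bob", 0.3, 0.4, 0.8)]
samples = [NonAudioSample(0.6, "Rob")]
print(adjust_interpretation(words, samples))
```

In this sketch the non-audio input is consulted only after the audio-only pass flags a low-confidence word, mirroring the claim's ordering ("thereafter ... re-interpreting"). High-confidence words are left untouched.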
Abstract
An embodiment provides a method, including: accepting, at an audio receiver of an information handling device, voice input of a user; interpreting, using a processor, the voice input; identifying, using a processor, at least one ambiguity in interpreting the voice input; thereafter accessing stored non-audible input associated in time with the at least one ambiguity; and adjusting an interpretation of the voice input using non-audible input. Other aspects are described and claimed.
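The abstract's step of "accessing stored non-audible input associated in time with the at least one ambiguity" implies some form of time-indexed storage. The sketch below shows one way such a lookup might work in Python, using the standard `bisect` module. The class name `NonAudibleBuffer`, its methods, and the string payloads are hypothetical. The source does not describe the storage structure.

```python
import bisect


class NonAudibleBuffer:
    """Hypothetical time-ordered store for non-audible input samples."""

    def __init__(self):
        self._timestamps = []  # kept sorted, in seconds
        self._payloads = []    # payload at the same index as its timestamp

    def store(self, timestamp, payload):
        """Insert a sample while keeping the buffer sorted by timestamp."""
        i = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(i, timestamp)
        self._payloads.insert(i, payload)

    def between(self, start, end):
        """Return payloads captured within [start, end], inclusive."""
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return self._payloads[lo:hi]


buf = NonAudibleBuffer()
buf.store(0.9, "gesture: pointing")
buf.store(0.1, "lips: closed")
buf.store(0.5, "lips: rounded")

# An ambiguity detected between 0.4 s and 0.6 s retrieves only the overlapping sample.
print(buf.between(0.4, 0.6))
```

Keeping the timestamps sorted lets the lookup for each ambiguity run as two binary searches rather than a scan of everything captured during the utterance.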
18 Claims
10. An information handling device, comprising:
an audio receiver;
a sensor that captures input;
one or more processors; and
a memory storing instructions that are executed by processor to:
accept, at the audio receiver, voice input of a user and capture, using the sensor, non-audio based input correlated with the voice input;
generate, using a speech recognition engine, an initial interpretation by interpreting the voice input without utilizing the non-audio based input for the initial interpretation;
identify an ambiguous voice input comprising at least one ambiguity in the initial interpretation, wherein the identifying comprises identifying that at least a portion of the initial interpretation is associated with a confidence score meeting a predetermined low confidence threshold, wherein the confidence score is based in part on a condition of the user;
thereafter augmenting the speech recognition engine and re-interpreting the ambiguous voice input by accessing, using the one or more processors, based upon the confidence score meeting the predetermined low confidence threshold, stored non-audio based input matched in time with the ambiguous voice input, wherein the accessing is based upon a policy associated with a confidence level of interpretation, wherein the confidence level of interpretation is based on a device usage history, wherein the re-interpreting comprises mapping the stored non-audio based input to known features of the user while providing voice input correlated with the voice input; and
adjust the initial interpretation of the voice input using non-audio based input derived from the sensor, wherein to adjust comprises to change the initial interpretation using the non-audio based input.
18. A product, comprising:
a storage medium having device readable code stored therewith, the device readable code being executable by a processor and comprising:
code that accepts voice input of a user and code that captures non-audio based input correlated with the voice input;
code that generates, using a speech recognition engine, an initial interpretation by interpreting the voice without utilizing the non-audio based input for the initial interpretation;
code that identifies an ambiguous voice input comprising at least one ambiguity in the initial interpretation, wherein the identifying comprises identifying that at least a portion of the initial interpretation is associated with a confidence score meeting a predetermined low confidence threshold, wherein the confidence score is based in part on a condition of the user;
code that thereafter augmenting the speech recognition engine and re-interpreting the ambiguous voice input by accessing, based upon the confidence score meeting the predetermined low confidence threshold, stored non-audio based input matched in time with the ambiguous voice input, wherein the accessing is based upon a policy associated with a confidence level of interpretation, wherein the confidence level of interpretation is based on a device usage history, wherein the re-interpreting comprises mapping the stored non-audio based input to known features of the user while providing voice input correlated with the voice input; and
code that adjusts the initial interpretation of the voice input using non-audio based input, wherein the code that adjusts comprises code that changes the initial interpretation using the non-audio based input.
Specification