Selective speech recognition for chat and digital personal assistant systems
First Claim
1. A method for speech recognition in a Chat Information System (CIS), the method comprising:
receiving, by a processor operatively coupled with a memory, a first audio input, the first audio input captured by a microphone of a device of a user;
recognizing, by a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, the processor being configured to select an output from the plurality of outputs based on the confidence levels;
providing, by the processor, a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user;
determining, by the processor, a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response;
receiving, by the processor, a second audio input that follows the response;
based on the determined response type of the response provided utilizing the CIS, selecting, by the processor, a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response;
recognizing, by the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and
providing, by the processor, a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.
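The selection logic recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the `ResponseType` values, the stub recognizers, and the function names `best_hypothesis`, `select_recognizer`, and `recognize` are all assumptions made for illustration, since the claim does not enumerate response types or define any API. The sketch shows two ideas only: picking one output from several by confidence level, and choosing the recognizer for the next audio input from the type of the response just given.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, List


class ResponseType(Enum):
    """Hypothetical response types; the claim does not enumerate them."""
    OPEN_QUESTION = auto()    # a free-form reply is expected next
    YES_NO_QUESTION = auto()  # a short, pattern-like reply is expected next


@dataclass
class Hypothesis:
    text: str
    confidence: float


# A recognizer maps raw audio to several scored hypotheses.
Recognizer = Callable[[bytes], List[Hypothesis]]


def best_hypothesis(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Select the output with the highest confidence level."""
    return max(hypotheses, key=lambda h: h.confidence)


def select_recognizer(
    response_type: ResponseType,
    table: Dict[ResponseType, Recognizer],
    default: Recognizer,
) -> Recognizer:
    """Pick the recognizer for the next input from the response type."""
    return table.get(response_type, default)


def recognize(audio: bytes, recognizer: Recognizer) -> str:
    """Run one recognizer and keep its most confident output."""
    return best_hypothesis(recognizer(audio)).text


# Stub recognizers standing in for real speech engines (assumed outputs).
def free_diction(audio: bytes) -> List[Hypothesis]:
    return [Hypothesis("call bob", 0.6), Hypothesis("call rob", 0.3)]


def yes_no(audio: bytes) -> List[Hypothesis]:
    return [Hypothesis("yes", 0.9), Hypothesis("guess", 0.2)]


RECOGNIZER_TABLE: Dict[ResponseType, Recognizer] = {
    ResponseType.YES_NO_QUESTION: yes_no,
}
```

In this sketch, a response such as "Do you want me to call Bob?" would be tagged `YES_NO_QUESTION`, so the second audio input goes to the `yes_no` stub; any response type missing from the table falls back to the default free-diction recognizer.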
Abstract
Disclosed are computer-implemented methods and systems for dynamic selection of speech recognition systems for use in Chat Information Systems (CIS) based on multiple criteria and the context of human-machine interaction. Specifically, once a first user audio input is received, it is analyzed to locate specific triggers, determine the context of the interaction, or predict subsequent user audio inputs. Based on at least one of these criteria, one of a free-diction recognizer, a pattern-based recognizer, an address-book-based recognizer, or a dynamically created recognizer is selected for recognizing the subsequent user audio input. The methods described herein increase the accuracy of automatic recognition of user voice commands, thereby enhancing the overall user experience of CIS, chat agents, and similar digital personal assistant systems.
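As a rough illustration of the trigger-based choice among the four recognizer kinds named in the abstract, the following sketch maps a recognized first input to a recognizer kind. The trigger words, the `choose_recognizer` function, and the rule that a supplied option list takes priority are assumptions for illustration; the abstract does not specify any of them.

```python
from enum import Enum
from typing import Optional, Sequence


class RecognizerKind(Enum):
    FREE_DICTION = "free-diction"
    PATTERN_BASED = "pattern-based"
    ADDRESS_BOOK = "address-book-based"
    DYNAMIC = "dynamically created"


# Hypothetical triggers; the abstract does not list any.
ADDRESS_BOOK_TRIGGERS = ("call", "text", "email")
PATTERN_TRIGGERS = ("set alarm", "what time", "weather")


def choose_recognizer(
    first_input: str,
    dynamic_vocabulary: Optional[Sequence[str]] = None,
) -> RecognizerKind:
    """Choose the recognizer kind for the subsequent audio input."""
    text = first_input.lower().strip()
    words = text.split()
    # Assumed rule: if the dialogue has offered a closed set of options,
    # a recognizer built from those options is preferred.
    if dynamic_vocabulary:
        return RecognizerKind.DYNAMIC
    # Assumed rule: a leading contact-related verb predicts a name next.
    if words and words[0] in ADDRESS_BOOK_TRIGGERS:
        return RecognizerKind.ADDRESS_BOOK
    # Assumed rule: a known command prefix predicts a templated reply.
    if any(text.startswith(prefix) for prefix in PATTERN_TRIGGERS):
        return RecognizerKind.PATTERN_BASED
    # Otherwise fall back to unconstrained dictation.
    return RecognizerKind.FREE_DICTION
```

A production system would presumably derive these rules from dialogue context rather than fixed keyword lists; the fixed lists here only make the decision order easy to follow.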
6 Claims
1. A method for speech recognition in a Chat Information System (CIS), the method comprising:
receiving, by a processor operatively coupled with a memory, a first audio input, the first audio input captured by a microphone of a device of a user;
recognizing, by a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, the processor being configured to select an output from the plurality of outputs based on the confidence levels;
providing, by the processor, a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user;
determining, by the processor, a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response;
receiving, by the processor, a second audio input that follows the response;
based on the determined response type of the response provided utilizing the CIS, selecting, by the processor, a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response;
recognizing, by the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and
providing, by the processor, a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.
Dependent claims: 2, 3, 4, 5
6. A Chat Information System (CIS), the CIS comprising:
a machine-readable medium storing instructions;
one or more hardware processors executing the stored instructions to:
receive a first audio input, the first audio input captured by a microphone;
recognize, using a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, one or more of the processors being configured to select an output from the plurality of outputs based on the confidence levels;
provide a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user;
determine a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response;
receive a second audio input that follows the response;
based on the determined response type of the response provided utilizing the CIS, select a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response;
recognize, using the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and
provide a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.
Specification