Server-side ASR adaptation to speaker, device and noise condition via non-ASR audio transmission
Abstract
A mobile device is adapted for automatic speech recognition (ASR). A user interface for interaction with a user includes an input microphone for obtaining speech inputs from the user for automatic speech recognition, and an output interface for system output to the user based on ASR results that correspond to the speech input. A local controller obtains a sample of non-ASR audio from the input microphone and provides a representation of the non-ASR audio to a remote ASR server for server-side adaptation to channel-specific ASR characteristics. The controller then provides a representation of an unknown ASR speech input from the input microphone to the remote ASR server for determining ASR results corresponding to the unknown ASR speech input, and provides the system output to the output interface.
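The abstract does not say how the server adapts to channel-specific characteristics. Purely as an illustration, the sketch below assumes that the transmitted "representation" is a list of per-frame feature vectors, and that adaptation means estimating a per-channel feature mean from the non-ASR audio and subtracting it from later ASR frames (a simple mean normalization). The technique and the names `ChannelAdapter`, `adapt`, and `normalize` are assumptions, not taken from the patent text.

```python
from typing import Dict, List

Frame = List[float]  # one feature vector per audio frame (assumed representation)


class ChannelAdapter:
    """Hypothetical server-side store of per-channel feature means."""

    def __init__(self) -> None:
        self._means: Dict[str, Frame] = {}

    def adapt(self, channel_id: str, non_asr_frames: List[Frame]) -> None:
        """Estimate the channel mean from frames of non-ASR audio."""
        if not non_asr_frames:
            raise ValueError("need at least one frame to adapt")
        dim = len(non_asr_frames[0])
        sums = [0.0] * dim
        for frame in non_asr_frames:
            for i, value in enumerate(frame):
                sums[i] += value
        self._means[channel_id] = [s / len(non_asr_frames) for s in sums]

    def normalize(self, channel_id: str, asr_frames: List[Frame]) -> List[Frame]:
        """Subtract the stored channel mean; pass frames through if unadapted."""
        mean = self._means.get(channel_id)
        if mean is None:
            return [list(frame) for frame in asr_frames]
        return [[v - m for v, m in zip(frame, mean)] for frame in asr_frames]
```

In this sketch the non-ASR audio only updates the per-channel state; it is never decoded as speech, which mirrors the separation between the adaptation sample and the unknown ASR speech input described above.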
20 Claims
1. A mobile device adapted for automatic speech recognition (ASR) and employing at least one hardware implemented computer processor, the mobile device comprising:
- an input microphone for obtaining speech inputs from a user for automatic speech recognition;
- an output interface for providing a system output to the user; and
- a local controller configured to:
  - obtain a sample comprising non-ASR audio from the input microphone,
  - provide a representation of the non-ASR audio to a remote ASR server for server-side adaptation to channel-specific ASR characteristics,
  - obtain a sample comprising an unknown ASR speech input from the input microphone,
  - provide a representation of the unknown ASR speech input to the remote ASR server,
  - receive, from the remote ASR server, ASR results corresponding to the unknown ASR speech input, and
  - provide, based on the ASR results corresponding to the unknown ASR speech input, the system output to the output interface.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A method comprising:
- obtaining, by an input microphone on a mobile device, a sample comprising non-automatic speech recognition (ASR) audio;
- transmitting, by the mobile device and to a server, a representation of the non-ASR audio for server-side adaptation to channel-specific ASR characteristics;
- receiving, by the input microphone on the mobile device, a sample comprising unknown ASR speech input;
- transmitting, by the mobile device and to the server, a representation of the unknown ASR speech input;
- receiving, from the server, ASR results corresponding to the unknown ASR speech input; and
- outputting, by the mobile device, the ASR results.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A non-transitory computer-readable medium having computer-executable program instructions stored thereon that, when executed by a processor, cause the processor to:
- obtain, using a microphone, a sample comprising non-automatic speech recognition (ASR) audio;
- transmit, to a server, a representation of the non-ASR audio for server-side adaptation to channel-specific ASR characteristics;
- obtain, using the microphone, a sample comprising unknown ASR speech input;
- transmit, to the server, a representation of the unknown ASR speech input;
- receive, from the server, ASR results corresponding to the unknown ASR speech input; and
- output the ASR results.
Dependent claims: 16, 17, 18, 19, 20.
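The steps of method claim 8 can be read as a fixed client-side sequence. The following is a minimal sketch of that ordering only, not the patented implementation: `FakeServer`, `run_session`, and the use of raw bytes as the "representation" are hypothetical stand-ins, and the in-memory server merely records what it receives so the flow can be run locally.

```python
from typing import Callable, List, Tuple


class FakeServer:
    """In-memory stand-in for the remote ASR server (hypothetical)."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, bytes]] = []

    def send_adaptation_audio(self, representation: bytes) -> None:
        # Non-ASR audio, used only for server-side channel adaptation.
        self.log.append(("adapt", representation))

    def recognize(self, representation: bytes) -> str:
        # A real server would decode speech; this returns a placeholder string.
        self.log.append(("asr", representation))
        return f"<result for {len(representation)} bytes>"


def run_session(
    capture: Callable[[], bytes],
    server: FakeServer,
    output: Callable[[str], None],
) -> None:
    """Run one session in the order the method claim lists its steps."""
    non_asr = capture()                    # obtain the non-ASR audio sample
    server.send_adaptation_audio(non_asr)  # transmit its representation
    speech = capture()                     # obtain the unknown ASR speech input
    result = server.recognize(speech)      # transmit it and receive ASR results
    output(result)                         # output the ASR results
```

A caller would supply its own microphone capture function and output routine; for example, passing `print` as `output` would display the placeholder result on the console.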
Specification