SPEECH RECOGNITION DEPENDENT ON TEXT MESSAGE CONTENT
Abstract
A method of automatic speech recognition. An utterance is received from a user in reply to a text message, via a microphone that converts the reply utterance into a speech signal. The speech signal is processed using at least one processor to extract acoustic data from the speech signal. An acoustic model is identified from a plurality of acoustic models to decode the acoustic data, using a conversational context associated with the text message. The acoustic data is decoded using the identified acoustic model to produce a plurality of hypotheses for the reply utterance.
17 Claims
1. A method of automatic speech recognition, comprising the steps of:
a) receiving from a user, an utterance in reply to a text message via a microphone that converts the reply utterance into a speech signal;
b) pre-processing the speech signal using at least one processor to extract acoustic data from the speech signal;
c) identifying an acoustic model of a plurality of acoustic models to decode the acoustic data, using a conversational context associated with the text message; and
d) decoding the acoustic data using the identified acoustic model to produce a plurality of hypotheses for the reply utterance.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
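Steps c) and d) of claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: the `AcousticModel` class, the template-matching scorer, the context labels, and the `"general"` fallback are all hypothetical stand-ins chosen only to show a conversational context selecting one model out of several, and that model returning several ranked hypotheses.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class AcousticModel:
    """Toy stand-in for an acoustic model (hypothetical, not from the patent).

    Each candidate phrase is represented by one template feature vector.
    """
    name: str
    templates: Dict[str, Sequence[float]]

    def decode(self, acoustic_data: Sequence[float],
               n_best: int = 3) -> List[Tuple[str, float]]:
        """Return up to n_best (phrase, score) hypotheses, best first."""
        scored = []
        for phrase, template in self.templates.items():
            # Negative squared distance: closer templates score higher.
            dist = sum((a - t) ** 2 for a, t in zip(acoustic_data, template))
            scored.append((phrase, -dist))
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:n_best]


def identify_acoustic_model(context: str,
                            models: Dict[str, AcousticModel],
                            default: str = "general") -> AcousticModel:
    """Step c): pick the model keyed by the conversational context.

    Falling back to a default model is an assumption of this sketch.
    """
    return models.get(context, models[default])


# Hypothetical models, one per conversational context.
MODELS = {
    "general": AcousticModel("general", {"yes": (1.0, 0.0), "no": (0.0, 1.0)}),
    "scheduling": AcousticModel("scheduling", {
        "see you at five": (0.9, 0.1),
        "running late": (0.1, 0.9),
        "tomorrow works": (0.5, 0.5),
    }),
}

# Step d): decode the extracted acoustic data with the selected model.
model = identify_acoustic_model("scheduling", MODELS)
hypotheses = model.decode((0.8, 0.2))
```

Here `hypotheses` is an N-best list of `(phrase, score)` pairs; a real recognizer would score against trained models rather than fixed templates.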
11. A method of automatic speech recognition, comprising the steps of:
a) receiving a text message at a speech recognition client device;
b) processing the text message with conversational context-specific language models stored on the client device using at least one processor of the client device to identify a conversational context corresponding to the text message;
c) synthesizing speech from the text message;
d) communicating the synthesized speech via a loudspeaker of the client device to a user of the client device;
e) receiving a reply utterance from the user via a microphone of the client device that converts the reply utterance into a speech signal;
f) pre-processing the speech signal using the at least one processor to extract acoustic data from the received speech signal;
g) communicating the extracted acoustic data and the identified conversational context to a speech recognition server;
h) identifying an acoustic model of a plurality of acoustic models stored at the server to decode the acoustic data, using the identified conversational context;
i) decoding the acoustic data using the identified acoustic model to produce a plurality of hypotheses for the reply utterance; and
j) post-processing the plurality of hypotheses to identify one of the hypotheses as the reply utterance.
Dependent claims: 12, 13, 14, 15, 16
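Step b) of claim 11 identifies a conversational context by processing the text message with context-specific language models. The claim does not say what kind of language model is used, so the following is only a sketch under stated assumptions: one add-one-smoothed unigram model per context, trained on tiny made-up corpora, with the highest-scoring context winning. All function names, context labels, and training sentences are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def train_unigram(corpus: List[str]) -> Counter:
    """Count word occurrences across the sentences of one context."""
    return Counter(word for line in corpus for word in tokenize(line))


def log_prob(message: str, counts: Counter, vocab_size: int) -> float:
    """Add-one smoothed unigram log-probability of the message."""
    total = sum(counts.values())
    return sum(math.log((counts[w] + 1) / (total + vocab_size))
               for w in tokenize(message))


def identify_context(message: str, models: Dict[str, Counter]) -> str:
    """Return the context whose language model scores the message highest."""
    vocab = set().union(*(m.keys() for m in models.values()))
    vocab_size = len(vocab) + 1  # one extra slot for unseen words
    return max(models, key=lambda ctx: log_prob(message, models[ctx], vocab_size))


# Hypothetical context-specific models stored on the client device.
CONTEXT_MODELS = {
    "scheduling": train_unigram(["what time is the meeting",
                                 "are you free tomorrow at noon"]),
    "dining": train_unigram(["where should we eat dinner",
                             "do you want pizza tonight"]),
}

context = identify_context("Are you free for the meeting?", CONTEXT_MODELS)
```

The resulting label is what steps g) and h) would pass to the server to select an acoustic model.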
17. A method of automatic speech recognition, comprising the steps of:
a) receiving a text message at a speech recognition client device;
b) processing the text message with conversational context-specific language models stored on the client device using at least one processor of the client device to identify a conversational context corresponding to the text message;
c) synthesizing speech from the text message;
d) communicating the synthesized speech via a loudspeaker of the client device to a user of the client device;
e) receiving a reply utterance from the user via a microphone of the client device that converts the reply utterance into a speech signal;
f) pre-processing the speech signal using the at least one processor to extract acoustic data from the received speech signal;
g) identifying an acoustic model of a plurality of acoustic models to decode the acoustic data, using the identified conversational context associated with the text message;
h) decoding the acoustic data using the identified acoustic model to produce a plurality of hypotheses for the reply utterance;
i) determining whether a confidence value associated with at least one of the plurality of hypotheses for the reply utterance is greater or less than a confidence threshold;
j) communicating the extracted acoustic data and the conversational context to a speech recognition server, if the confidence value is determined to be less than the confidence threshold, otherwise post-processing the plurality of hypotheses to identify one of the hypotheses as the reply utterance, and outputting from the client device the identified hypothesis as at least part of a reply text message;
k) identifying at the server, an acoustic model of a plurality of acoustic models stored at the server to decode the acoustic data, using the identified conversational context;
l) decoding the acoustic data using the acoustic model identified at the server to produce a plurality of hypotheses for the reply utterance;
m) post-processing the plurality of hypotheses to identify one of the hypotheses as the reply utterance; and
n) outputting from the server the identified hypothesis as at least part of a reply text message.
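The client/server split in steps i) through n) of claim 17 amounts to a confidence-gated fallback, sketched below. The sketch makes several assumptions the claim leaves open: the confidence compared against the threshold is that of the top local hypothesis, post-processing simply picks the highest-confidence hypothesis, and the threshold value of 0.6 is arbitrary. The decoder functions are stubs with made-up outputs; every name here is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Tuple[str, float]  # (text, confidence in [0, 1])
Decoder = Callable[[Sequence[float], str], List[Hypothesis]]


def pick_best(hypotheses: List[Hypothesis]) -> str:
    """Post-processing stand-in: keep the highest-confidence hypothesis."""
    return max(hypotheses, key=lambda h: h[1])[0]


def recognize_reply(acoustic_data: Sequence[float],
                    context: str,
                    local_decode: Decoder,
                    server_decode: Decoder,
                    threshold: float = 0.6) -> Tuple[str, str]:
    """Return (reply text, which side produced it)."""
    local = local_decode(acoustic_data, context)
    # Step i): compare the best local confidence with the threshold.
    if max(conf for _, conf in local) < threshold:
        # Steps j) to n): hand the acoustic data and context to the server.
        return pick_best(server_decode(acoustic_data, context)), "server"
    # Step j), "otherwise" branch: finish on the client.
    return pick_best(local), "client"


def fake_local(data: Sequence[float], ctx: str) -> List[Hypothesis]:
    """Stub client decoder returning two low-confidence hypotheses."""
    return [("see you at five", 0.42), ("see you at nine", 0.40)]


def fake_server(data: Sequence[float], ctx: str) -> List[Hypothesis]:
    """Stub server decoder returning one high-confidence hypothesis."""
    return [("see you at five", 0.91), ("sea view at five", 0.05)]


result = recognize_reply((0.1, 0.2), "scheduling", fake_local, fake_server)
```

With these stubs the local confidence falls below the default threshold, so the reply text comes from the server path; lowering `threshold` to 0.4 keeps the decision on the client.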
Specification