System, method and program product for providing automatic speech recognition (ASR) in a shared resource environment
Abstract
A speech recognition system, method of recognizing speech and a computer program product therefor. A client device identified with a context for an associated user selectively streams audio to a provider computer, e.g., a cloud computer. Speech recognition receives streaming audio, maps utterances to specific textual candidates and determines a likelihood of a correct match for each mapped textual candidate. A context model selectively winnows candidates to resolve recognition ambiguity according to context whenever multiple textual candidates are recognized as potential matches for the same mapped utterance. Matches are used to update the context model, which may be used for multiple users in the same context.
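The feedback loop summarized in the abstract (generate candidates, winnow them by context, select one, update the context model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `ContextModel`, `winnow`, `update` and `recognize` are hypothetical, the context model is reduced to per-context match counts, and add-one smoothing stands in for whatever likelihood adjustment an actual implementation would use.

```python
from collections import defaultdict


class ContextModel:
    """Hypothetical context model: per-context counts of earlier matches."""

    def __init__(self):
        # counts[context][text] = times `text` was selected in `context`
        self.counts = defaultdict(lambda: defaultdict(int))

    def winnow(self, context, candidates):
        """Rescale recognizer probabilities by earlier matches in this context."""
        seen = self.counts[context]
        total = sum(seen.values())
        # Add-one smoothing (an assumption) keeps unseen candidates selectable.
        weighted = {
            text: prob * (seen.get(text, 0) + 1) / (total + len(candidates))
            for text, prob in candidates.items()
        }
        norm = sum(weighted.values()) or 1.0
        return {text: w / norm for text, w in weighted.items()}

    def update(self, context, text):
        """Add a new match, or raise the count of a previously matched text."""
        is_new = text not in self.counts[context]
        self.counts[context][text] += 1
        return is_new


def recognize(model, context, candidates):
    """Select one candidate for an utterance and feed the match back."""
    # Only ambiguous utterances (several candidates) are winnowed.
    if len(candidates) > 1:
        candidates = model.winnow(context, candidates)
    best = max(candidates, key=candidates.get)
    model.update(context, best)
    return best


model = ContextModel()
# Two earlier matches in a hypothetical "dictation" context.
model.update("dictation", "recognize speech")
model.update("dictation", "recognize speech")

candidates = {"recognize speech": 0.48, "wreck a nice beach": 0.52}
in_dictation = recognize(model, "dictation", candidates)
in_travel = recognize(model, "travel", candidates)
```

In this sketch the same ambiguous utterance resolves differently per context. The "dictation" context, which already holds matches for "recognize speech", selects that candidate even though the recognizer scored it slightly lower. The "travel" context has no history, so it falls back to the recognizer's top candidate. Because counts are keyed by context rather than by user, several users sharing a context would share the same history.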
25 Claims
1. An Automatic Speech Recognition (ASR) method comprising:
extracting utterances from each of one or more audio streams, each audio stream being associated with a particular context;
one or more computers extracting said utterances;
generating textual candidates for each extracted utterance, one or more utterances having a plurality of textual candidates generated as potential matches;
winnowing potential matches automatically with a context model within each particular context to adjust the likelihood of potential matches;
selecting a single textual candidate for said each extracted utterance as a match, any selected textual candidate not having been previously matched for the current context being a new match; and
updating said context model responsive to each match, wherein updating either adds the selected said textual candidate to said context model for the new match, or increases the likelihood for the previously selected said single textual candidate for the same particular context in the updated said context model, and said updated context model is used for winnowing subsequently extracted utterances.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

11. An Automatic Speech Recognition (ASR) method for recognizing speech without prior training, said ASR method comprising:
receiving one or more audio streams from one or more client devices, each client device being associated with a particular context;
extracting utterances from each of said one or more audio streams;
generating textual candidates for each extracted utterance and a probability that each candidate is a match, every utterance having a single textual candidate generated is matched, remaining utterances having a plurality of textual candidates generated are unmatched, each of said plurality of textual candidates being a potential match;
winnowing said plurality of textual candidates automatically for each of said remaining utterances with a context model within each said particular context to adjust likelihood of potential matches;
selecting a single textual candidate for said extracted utterance as a match, any selected textual candidate not having been previously matched for the current context being a new match; and
updating said context model responsive to each match, wherein updating either adds the selected said textual candidate to said context model for the new match, or increases the likelihood for the previously selected said single textual candidate for the same particular context in the updated said context model and said updated context model is used for winnowing subsequently extracted utterances.
Dependent claims: 12, 13, 14, 15, 16

17. A computer program product for Automatic Speech Recognition (ASR), said computer program product comprising a non-transitory computer usable medium having computer readable program code stored thereon, said computer readable program code causing one or more computers executing said code to:
extract utterances from each of one or more audio streams, each audio stream being associated with a particular context;
generate textual candidates for each extracted utterance, each textual candidate including a probability of matching said each extracted utterance, one or more utterances having a plurality of textual candidates generated as potential matches;
winnow potential matches for said one or more utterances automatically with a context model within each respective said particular context for the respective utterance to adjust the likelihood of each of the respective potential matches;
select a single textual candidate for said each extracted utterance as a match, any selected textual candidate not having been previously matched for the current context being a new match; and
update said context model responsive to each match, wherein updating either adds the selected said textual candidate to said context model for the new match, or increases the likelihood for the previously selected said single textual candidate for the same particular context in the updated said context model and said updated context model is used for winnowing subsequently extracted utterances.
Dependent claims: 18, 19, 20, 21

22. A computer program product for Automatic Speech Recognition (ASR), said computer program product comprising a non-transitory computer usable medium having computer readable program code stored thereon, said computer readable program code causing a plurality of computers including provider computers executing said code to:
receive one or more audio streams from one or more client devices, each client device being associated with a particular context;
extract utterances from each of said one or more audio streams;
generate textual candidates for each extracted utterance and a probability that each candidate is a match, every utterance having a single textual candidate generated is matched, remaining utterances having a plurality of textual candidates generated are unmatched, each of said plurality of textual candidates being a potential match;
winnow said plurality of textual candidates for each of said remaining utterances automatically with a context model within each said particular context to adjust likelihood of potential matches;
select a single textual candidate for said extracted utterance as a match, any selected textual candidate not having been previously matched for the current context being a new match; and
update said context model responsive to each match, wherein updating either adds the selected said textual candidate to said context model for the new match, or increases the likelihood for the previously selected said single textual candidate for the same particular context in the updated said context model and said updated context model is used for winnowing subsequently extracted utterances.
Dependent claims: 23, 24, 25

Specification