Systems and Methods for Concurrent Signal Recognition
Abstract
Methods and systems for recognition of concurrent, superimposed, or otherwise overlapping signals are described. A Markov Selection Model is introduced that, together with probabilistic decomposition methods, enables recognition of simultaneously emitted signals from various sources. For example, a signal mixture may include overlapping speech from different persons. In some instances, recognition may be performed without the need to separate signals or sources. As such, some of the techniques described herein may be useful in automatic transcription, noise reduction, teaching, electronic games, audio search and retrieval, medical and scientific applications, etc.
20 Claims
1. A non-transitory computer-readable storage medium having instructions stored thereon that, upon execution by a computer system, cause the computer system to perform operations comprising:
identifying a plurality of models, wherein each model corresponds to a respective utterance spoken by a person;
receiving a representation of a speech mixture including utterances concurrently spoken by at least two persons;
combining spectral vectors of the plurality of models into a set of spectral vectors;
calculating mixture weights for one or more vectors of the set of spectral vectors based, at least in part, on the representation of the speech mixture; and
identifying a concurrently spoken utterance in the speech mixture based, at least in part, on the mixture weights.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
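As an illustration of the operations recited in claim 1, the following Python sketch combines the spectral vectors of several models into one set, estimates mixture weights for a single frame of the mixture, and ranks the models by total weight. The claim does not say how the weights are calculated. The EM update used here (a probabilistic decomposition with fixed spectral vectors), the single-frame simplification, and every name in the code (`combine_models`, `estimate_mixture_weights`, `identify_utterances`, the toy spectra) are assumptions made for illustration, not the patented implementation.

```python
def combine_models(models):
    """Flatten {model_name: [spectral_vector, ...]} into two parallel lists."""
    owners, vectors = [], []
    for name, model_vectors in models.items():
        for vector in model_vectors:
            owners.append(name)
            vectors.append(vector)
    return owners, vectors


def estimate_mixture_weights(frame, vectors, iterations=200):
    """EM estimate of weights w so that sum_k w[k] * vectors[k] approximates
    the normalized frame. Assumes each vector is non-negative and sums to 1."""
    total = sum(frame)
    if total <= 0:
        raise ValueError("frame must contain some energy")
    target = [x / total for x in frame]
    bins = range(len(target))
    count = len(vectors)
    weights = [1.0 / count] * count  # uniform starting point
    for _ in range(iterations):
        # Spectrum predicted by the current weights.
        model = [sum(weights[k] * vectors[k][f] for k in range(count))
                 for f in bins]
        # Each weight is rescaled by the share of observed energy it explains.
        weights = [
            weights[k] * sum(target[f] * vectors[k][f] / model[f]
                             for f in bins if model[f] > 0)
            for k in range(count)
        ]
    return weights


def identify_utterances(frame, models, top_n=2):
    """Rank models by the total weight assigned to their spectral vectors."""
    owners, vectors = combine_models(models)
    weights = estimate_mixture_weights(frame, vectors)
    scores = {}
    for owner, weight in zip(owners, weights):
        scores[owner] = scores.get(owner, 0.0) + weight
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n], scores


# Hypothetical toy data: one 4-bin spectral vector per utterance model.
models = {
    "utterance_a": [[0.7, 0.3, 0.0, 0.0]],
    "utterance_b": [[0.0, 0.0, 0.4, 0.6]],
    "utterance_c": [[0.0, 0.5, 0.5, 0.0]],
}
# Mixture frame built as 2 * utterance_a + 1 * utterance_b.
frame = [1.4, 0.6, 0.4, 0.6]
top, scores = identify_utterances(frame, models)
print(top, scores)
```

In this toy case the frame is built only from `utterance_a` and `utterance_b`, so those two models should collect nearly all of the weight. A fuller treatment would process many frames and, per the abstract, use a Markov Selection Model over time, which this sketch omits.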
10. A method, comprising:
performing, by one or more computing devices:
identifying a first model corresponding to a first signal emitted by a first source, wherein the first model includes a first set of dictionaries and each of the first set of dictionaries includes a first set of spectral vectors;
identifying a second model corresponding to a second signal emitted by a second source, wherein the second model includes a second set of dictionaries and each of the second set of dictionaries includes a second set of spectral vectors;
receiving a representation of a signal mixture, wherein the signal mixture includes signals emitted by the first and second sources at least partially simultaneously;
combining spectral vectors of the first and second models into a superset of spectral vectors;
calculating a weight for each spectral vector of the superset of spectral vectors with respect to the signal mixture; and
recognizing at least one of the first and second signals within the signal mixture based, at least in part, on the calculated weights.
(Dependent claims: 11, 12, 13, 14, 15)
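The model structure in claim 10 (each model holds a set of dictionaries, each dictionary holds a set of spectral vectors) can be sketched as a small bookkeeping exercise: build the superset while remembering where each vector came from, then total the calculated weights per source and dictionary. The Python below is a minimal sketch under stated assumptions. The names `Entry`, `build_superset`, and `dominant_dictionaries`, the example spectra, and the example weights are all hypothetical. The weights are taken as given rather than calculated, and treating the highest-weighted dictionary as the recognized one is only one plausible reading of "based, at least in part, on the calculated weights".

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """One spectral vector of the superset, tagged with its origin."""
    source: str
    dictionary: int
    vector: tuple


def build_superset(models):
    """models maps a source name to a list of dictionaries; each dictionary
    is a list of spectral vectors. Returns a flat list of tagged entries."""
    superset = []
    for source, dictionaries in models.items():
        for d_index, dictionary in enumerate(dictionaries):
            for vector in dictionary:
                superset.append(Entry(source, d_index, tuple(vector)))
    return superset


def dominant_dictionaries(superset, weights):
    """For each source, return the index of the dictionary whose vectors
    received the largest total weight (an assumed recognition rule)."""
    if len(weights) != len(superset):
        raise ValueError("need exactly one weight per superset vector")
    totals = defaultdict(float)
    for entry, weight in zip(superset, weights):
        totals[(entry.source, entry.dictionary)] += weight
    best = {}
    for (source, d_index), total in totals.items():
        if source not in best or total > best[source][1]:
            best[source] = (d_index, total)
    return {source: d_index for source, (d_index, _) in best.items()}


# Hypothetical models: two sources, two dictionaries each, 2-bin spectra.
models = {
    "first": [[[0.9, 0.1], [0.8, 0.2]], [[0.5, 0.5]]],
    "second": [[[0.1, 0.9]], [[0.3, 0.7], [0.2, 0.8]]],
}
superset = build_superset(models)

# Hypothetical weights, one per superset vector, as if already calculated
# with respect to the signal mixture.
weights = [0.30, 0.25, 0.05, 0.02, 0.20, 0.18]
recognized = dominant_dictionaries(superset, weights)
print(recognized)
```

Keeping the origin tags alongside the vectors means the weights can be attributed back to a source without ever reconstructing that source's signal, which is in the spirit of recognition without separation described in the abstract.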
16. A system, comprising:
at least one processor; and
a memory coupled to the at least one processor, wherein the memory stores program instructions, and wherein the program instructions are executable by the at least one processor to perform operations including:
receiving a representation of a signal mixture including a first signal emitted by a first source and a second signal emitted by a second source, wherein, within the signal mixture, the first and second signals overlap in time; and
recognizing the first signal within the signal mixture without separating the first signal from the second signal.
(Dependent claims: 17, 18, 19, 20)
Specification