Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition
Abstract
Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.
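The classification and model-selection steps summarized in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `MetaData` fields, the single `noise_db` feature, the thresholds, the environment labels, and the model names in `ACOUSTIC_MODELS`. A real system would use richer acoustic features and a trained classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaData:
    """Hypothetical meta-data record; field names are assumptions."""
    hour: int          # previously recorded time of day (0-23)
    speed_kmh: float   # previously recorded speed of the caller


# Hypothetical registry: one acoustic model per predefined environment.
ACOUSTIC_MODELS = {
    "car": "am_car",
    "street": "am_street",
    "quiet": "am_quiet",
}


def classify_environment(noise_db: float, meta: MetaData) -> str:
    """Toy rule-based classifier; thresholds are illustrative only."""
    if meta.speed_kmh > 30 and noise_db > 50:
        return "car"
    if noise_db > 60:
        return "street"
    return "quiet"


def select_acoustic_model(environment: str) -> str:
    """Pick the model generated for the given environment class."""
    return ACOUSTIC_MODELS[environment]
```

In this sketch the recognizer would then decode the received audio with whichever model `select_acoustic_model` returns.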
22 Claims
1. A method comprising:
analyzing acoustic features of a received audio signal from a caller of a communication device;
identifying, based on a previously recorded time and a previously recorded speed of the caller of the communication device, in combination with the acoustic features, a repeating pattern of meta-data associated with the acoustic features;
classifying a background environment of the caller based on the acoustic features and the repeating pattern of meta-data, to yield a background environment classification;
prompting the caller to perform one of: speaking more slowly, speaking more clearly, and moving to a quieter location based on the background environment classification;
selecting an acoustic model matched to the background environment classification from a plurality of acoustic models, each of the plurality of acoustic models being generated for a particular predefined background environment classification; and
performing speech recognition on the received audio signal using the acoustic model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
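One way to read the "repeating pattern of meta-data" element of claim 1 is as a recurrence check over past calls. The sketch below is only an illustration under stated assumptions: the `CallRecord` fields, the speed buckets, the `min_repeats` threshold, and the function names are all hypothetical, and the claim does not prescribe this procedure.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional, Tuple


class CallRecord(NamedTuple):
    """Hypothetical record of one earlier call."""
    hour: int             # previously recorded time of day (0-23)
    speed_kmh: float      # previously recorded speed of the caller
    acoustic_label: str   # label derived from that call's acoustic features


def speed_bucket(speed_kmh: float) -> str:
    """Coarse speed classes; the boundaries are illustrative assumptions."""
    if speed_kmh < 2:
        return "stationary"
    if speed_kmh < 10:
        return "walking"
    return "driving"


def find_repeating_pattern(
    history: Iterable[CallRecord], hour: int, min_repeats: int = 3
) -> Optional[Tuple[str, str]]:
    """Return the (speed bucket, acoustic label) pair that recurs at this
    hour at least min_repeats times, or None if nothing repeats enough."""
    counts = Counter(
        (speed_bucket(r.speed_kmh), r.acoustic_label)
        for r in history
        if r.hour == hour
    )
    if not counts:
        return None
    pattern, n = counts.most_common(1)[0]
    return pattern if n >= min_repeats else None
```

A pattern such as `("driving", "engine_noise")` recurring at the same hour could then be passed to the classifier alongside the current acoustic features.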
12. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, result in the processor performing operations comprising:
analyzing acoustic features of a received audio signal from a caller of a communication device;
identifying, based on a previously recorded time and a previously recorded speed of the caller of the communication device, in combination with the acoustic features, a repeating pattern of meta-data associated with the acoustic features;
classifying a background environment of the caller based on the acoustic features and the repeating pattern of meta-data to yield a background environment classification;
prompting the caller to perform one of: speaking more slowly, speaking more clearly, and moving to a quieter location based on the background environment classification;
selecting an acoustic model matched to the background environment classification from a plurality of acoustic models, each of the plurality of acoustic models being generated for a particular background environment; and
performing speech recognition on the received audio signal using the acoustic model.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21
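The prompting operation recited in the independent claims ties the choice of prompt to the background environment classification. A minimal sketch of that mapping follows. The environment labels, the prompt wording, and the pairing of each label with a prompt are assumptions chosen for illustration; the claims only require that one of the three prompts be issued based on the classification.

```python
from typing import Optional

# Hypothetical mapping from environment class to one of the three prompts
# named in the claims. Labels and pairings are illustrative assumptions.
PROMPTS = {
    "car": "Please speak more slowly.",
    "crowd": "Please speak more clearly.",
    "street": "Please move to a quieter location.",
}


def choose_prompt(environment: str) -> Optional[str]:
    """Return the prompt for this environment, or None when the
    environment (for example a quiet room) needs no prompt."""
    return PROMPTS.get(environment)
```

Returning `None` for unlisted environments is a design choice of this sketch, so that callers in a quiet environment are not interrupted.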
22. A computer readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
analyzing acoustic features of a received audio signal from a caller of a communication device;
identifying, based on a previously recorded time and a previously recorded speed of the caller of the communication device, in combination with the acoustic features, a repeating pattern of meta-data associated with the acoustic features;
classifying a background environment of the caller based on the acoustic features and the repeating pattern of meta-data, to yield a background environment classification;
prompting the caller to perform one of: speaking more slowly, speaking more clearly, and moving to a quieter location based on the background environment classification;
selecting an acoustic model matched to the background environment classification from a plurality of acoustic models, each of the plurality of acoustic models being generated for a particular background environment classification; and
performing speech recognition on the received audio signal using the acoustic model.
Specification