Conversational data mining
First Claim
1. A method for collecting, in a data warehouse, data associated with a voice of a voice system user, said method comprising the steps of:
(a) conducting a conversation with the voice system user via at least one of a human operator and a voice-enabled machine system;
(b) capturing a speech waveform associated with utterances spoken by the voice system user during said conversation;
(c) digitizing said speech waveform to provide a digitized speech waveform;
(d) extracting, from said digitized speech waveform, at least one acoustic feature which is correlated with at least one user attribute, said at least one user attribute including at least one of:
(d-1) gender of the user;
(d-2) age of the user;
(d-3) accent of the user;
(d-4) native language of the user;
(d-5) dialect of the user;
(d-6) socioeconomic classification of the user;
(d-7) educational level of the user; and
(d-8) emotional state of the user;
(e) storing attribute data corresponding to said acoustic feature which is correlated with said at least one user attribute, together with at least one identifying indicia, in the data warehouse in a form to facilitate subsequent data mining thereon;
(f) repeating steps (a)-(e) for a plurality of additional conversations, with additional users, to provide a collection of stored data including the attribute data and identifying indicia; and
(g) mining the collection of stored data to provide information for modifying underlying business logic of the voice system.
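Steps (a) through (g) describe a feature-extraction and warehousing pipeline. The following is a minimal sketch of that pipeline, not the patented implementation: SQLite stands in for the data warehouse, synthetic noise stands in for captured audio, and the threshold classifier in `infer_attributes` is an illustrative placeholder for the trained models a real system would use. All function and column names are hypothetical.

```python
import math
import random
import sqlite3

def extract_acoustic_features(waveform):
    """Step (d): derive simple acoustic features from a digitized waveform.
    Zero-crossing rate and RMS energy stand in for the richer features
    (pitch, MFCCs) a production acoustic front end would compute."""
    crossings = sum(1 for a, b in zip(waveform, waveform[1:]) if (a >= 0) != (b >= 0))
    return {
        "zcr": crossings / (len(waveform) - 1),
        "energy": math.sqrt(sum(s * s for s in waveform) / len(waveform)),
    }

def infer_attributes(features):
    """Toy correlation of a feature with one user attribute (emotional
    state); a real system would apply trained classifiers here."""
    return {"emotional_state": "excited" if features["zcr"] > 0.1 else "calm"}

def store(conn, indicia, features, attributes):
    """Step (e): store attribute data together with an identifying indicia."""
    conn.execute(
        "INSERT INTO warehouse VALUES (?, ?, ?, ?)",
        (indicia, features["zcr"], features["energy"], attributes["emotional_state"]),
    )

def mine(conn):
    """Step (g): aggregate stored attributes to inform business logic."""
    cur = conn.execute(
        "SELECT emotional_state, COUNT(*) FROM warehouse GROUP BY emotional_state")
    return dict(cur.fetchall())

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse (indicia TEXT, zcr REAL, energy REAL, emotional_state TEXT)")

random.seed(0)
for indicia in ("call-001", "call-002", "call-003"):       # step (f): repeat per conversation
    waveform = [random.gauss(0, 1) for _ in range(16000)]  # steps (b)-(c): captured, digitized audio
    features = extract_acoustic_features(waveform)
    store(conn, indicia, features, infer_attributes(features))

print(mine(conn))  # counts of each detected emotional state across all calls
```

The mining step here is a single aggregate query; the claim covers any analysis of the stored attribute data that feeds back into the voice system's business logic.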
Abstract
A method for collecting data associated with the voice of a voice system user includes conducting a plurality of conversations with a plurality of voice system users. For each conversation, a speech waveform is captured and digitized, and at least one acoustic feature is extracted. The features are correlated with at least one attribute such as gender, age, accent, native language, dialect, socioeconomic classification, educational level and emotional state. Attribute data and at least one identifying indicia are stored for each user in a data warehouse, in a form to facilitate subsequent data mining thereon. The resulting collection of stored data is then mined to provide information for modifying underlying business logic of the voice system. An apparatus suitable for carrying out the method includes a dialog management unit, an audio capture module, an acoustic front end, a processing module and a data warehouse. Appropriate method steps can be implemented by a digital computer running a suitable program stored on a program storage device.
463 Citations
31 Claims
1. A method for collecting, in a data warehouse, data associated with a voice of a voice system user, said method comprising the steps of:
(a) conducting a conversation with the voice system user via at least one of a human operator and a voice-enabled machine system;
(b) capturing a speech waveform associated with utterances spoken by the voice system user during said conversation;
(c) digitizing said speech waveform to provide a digitized speech waveform;
(d) extracting, from said digitized speech waveform, at least one acoustic feature which is correlated with at least one user attribute, said at least one user attribute including at least one of:
(d-1) gender of the user;
(d-2) age of the user;
(d-3) accent of the user;
(d-4) native language of the user;
(d-5) dialect of the user;
(d-6) socioeconomic classification of the user;
(d-7) educational level of the user; and
(d-8) emotional state of the user;
(e) storing attribute data corresponding to said acoustic feature which is correlated with said at least one user attribute, together with at least one identifying indicia, in the data warehouse in a form to facilitate subsequent data mining thereon;
(f) repeating steps (a)-(e) for a plurality of additional conversations, with additional users, to provide a collection of stored data including the attribute data and identifying indicia; and
(g) mining the collection of stored data to provide information for modifying underlying business logic of the voice system.

View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
(h) modifying, in real time, behavior of the voice system based on said at least one user attribute.
10. The method of claim 9, wherein said modifying in step (h) comprises at least one of:
real-time changing of business logic of the voice system; and
real-time modifying of the voice system response, as compared to an expected response of the voice system without said modifying.
11. The method of claim 3, further comprising the additional steps of:
examining said at least one emotional state feature to determine if the user is in a jovial emotional state; and
offering the user at least one of a product and a service in response to said jovial emotional state.
12. The method of claim 11, further comprising the additional steps of:
determining at least one user attribute other than emotional state; and
tailoring said at least one of a product and a service in response to said at least one user attribute other than emotional state.
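Claims 11 and 12 condition an offer on a detected jovial state and then tailor it by a second attribute. A sketch of that selection logic, where the specific attribute names ("age") and offer strings are illustrative, not from the patent:

```python
def choose_offer(attributes):
    """Claims 11-12 as selection logic: offer a product or service only
    when the detected emotional state is jovial, then tailor the offer
    by at least one other user attribute (here, an assumed age field)."""
    if attributes.get("emotional_state") != "jovial":
        return None  # claim 11: the offer is conditioned on a jovial state
    if attributes.get("age", 0) >= 65:  # claim 12: tailor by another attribute
        return "senior-rate calling plan"
    return "standard service upgrade"

print(choose_offer({"emotional_state": "jovial", "age": 70}))  # senior-rate calling plan
print(choose_offer({"emotional_state": "angry", "age": 70}))   # None
```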
13. The method of claim 3, further comprising the additional steps of:
examining said at least one emotional state feature to determine if the user is in a jovial emotional state; and
performing a marketing study on the user in response to said jovial emotional state.
14. The method of claim 13, further comprising the additional steps of:
determining at least one user attribute other than emotional state; and
tailoring said marketing study in response to said at least one user attribute other than emotional state.
15. The method of claim 3, wherein the voice system is a substantially automatic interactive voice response (IVR) system, further comprising the additional steps of:
examining said at least one emotional state feature to determine if the user is in at least one of a disgusted, contemptuous, fearful and angry emotional state; and
switching said user from said IVR to a human operator in response to said at least one of a disgusted, contemptuous, fearful and angry emotional state.
16. The method of claim 3, wherein the voice system is a hybrid interactive voice response (IVR) system, further comprising the additional steps of:
examining said at least one emotional state feature to determine if the user is in at least one of a disgusted, contemptuous, fearful and angry emotional state; and
switching said user from a low-level human operator to a higher-level human supervisor in response to said at least one of a disgusted, contemptuous, fearful and angry emotional state.
17. The method of claim 3, wherein the voice system is a substantially automatic interactive voice response (IVR) system, further comprising the additional steps of:
examining said at least one emotional state feature to determine if the user is in a confused emotional state; and
switching said user from said IVR to a human operator in response to said confused emotional state.
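Claims 15 through 17 amount to an emotion-driven routing rule: negative or confused callers leave a fully automatic IVR for a human operator, and negative callers in a hybrid system are escalated past the low-level operator. A sketch of that rule, where the `system_kind` labels are illustrative shorthand for the two system types the claims name:

```python
# Claims 15-17: the emotional states that trigger escalation.
NEGATIVE_STATES = {"disgusted", "contemptuous", "fearful", "angry"}

def route_caller(emotional_state, system_kind):
    """Route a caller based on detected emotional state. system_kind is
    'automatic' for a substantially automatic IVR (claims 15 and 17) or
    'hybrid' for an IVR backed by human operators (claim 16)."""
    if system_kind == "automatic":
        # claims 15 and 17: negative or confused callers leave the IVR
        if emotional_state in NEGATIVE_STATES or emotional_state == "confused":
            return "human operator"
        return "IVR"
    if system_kind == "hybrid":
        # claim 16: negative callers skip the low-level operator
        if emotional_state in NEGATIVE_STATES:
            return "human supervisor"
        return "low-level operator"
    raise ValueError(f"unknown system kind: {system_kind}")

print(route_caller("angry", "automatic"))    # human operator
print(route_caller("confused", "automatic")) # human operator
print(route_caller("fearful", "hybrid"))     # human supervisor
```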
18. An apparatus for collecting data associated with a voice of a user, said apparatus comprising:
(a) a dialog management unit which conducts a conversation with the user;
(b) an audio capture module which is coupled to said dialog management unit and which captures a speech waveform associated with utterances spoken by the user during the conversation;
(c) an acoustic front end which is coupled to said audio capture module and which is configured to:
receive and digitize the speech waveform to provide a digitized speech waveform; and
extract, from the digitized speech waveform, at least one acoustic feature which is correlated with at least one user attribute, said at least one user attribute including at least one of:
(c-1) gender of the user;
(c-2) age of the user;
(c-3) accent of the user;
(c-4) native language of the user;
(c-5) dialect of the user;
(c-6) socioeconomic classification of the user;
(c-7) educational level of the user; and
(c-8) emotional state of the user;
(d) a processing module which is coupled to said acoustic front end and which analyzes said at least one acoustic feature to determine said at least one user attribute; and
(e) a data warehouse which is coupled to said processing module and which stores said at least one user attribute, together with at least one identifying indicia, in a form for subsequent data mining thereon;
wherein:
said dialog management unit is configured to conduct a plurality of additional conversations with additional users;
said audio capture module is configured to capture a plurality of additional speech waveforms associated with utterances spoken by said additional users during said plurality of additional conversations;
said acoustic front end is configured to receive and digitize said plurality of additional speech waveforms to provide a plurality of additional digitized speech waveforms, and is further configured to extract, from said plurality of additional digitized speech waveforms, a plurality of additional acoustic features, each correlated with at least one attribute of one of said additional users;
said processing module is configured to analyze said additional acoustic features to determine a plurality of additional user attributes;
said data warehouse is configured to store said plurality of additional user attributes, each together with at least one additional identifying indicia, in said form for said subsequent data mining; and
said processing module and said data warehouse are configured to mine the stored user attributes and identifying indicia to provide information for modifying underlying business logic of the apparatus.

View Dependent Claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29)
a speaker clusterer and classifier;
a speech recognizer; and
an accent identifier.
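The apparatus of claim 18 is a chain of coupled elements: front end feeds processing module feeds data warehouse. A minimal sketch of that wiring in which the dialog management unit and audio capture module are abstracted into the waveform passed to `handle_conversation`; the class names, the peak-amplitude feature, and the threshold rule are all illustrative stand-ins, not the patented components:

```python
from dataclasses import dataclass, field

class AcousticFrontEnd:
    """Element (c): digitizes the waveform and extracts a feature."""
    def process(self, waveform):
        digitized = [round(s, 3) for s in waveform]  # stand-in for A/D conversion
        return {"peak_amplitude": max(abs(s) for s in digitized)}

class ProcessingModule:
    """Element (d): analyzes features to determine a user attribute
    (a toy threshold where a real system would use trained models)."""
    def analyze(self, features):
        loud = features["peak_amplitude"] > 0.8
        return {"emotional_state": "angry" if loud else "calm"}

@dataclass
class DataWarehouse:
    """Element (e): stores attributes with an identifying indicia."""
    rows: list = field(default_factory=list)
    def store(self, indicia, attributes):
        self.rows.append({"indicia": indicia, **attributes})

class Apparatus:
    """Couples the claimed elements for one conversation at a time."""
    def __init__(self):
        self.front_end = AcousticFrontEnd()
        self.processing = ProcessingModule()
        self.warehouse = DataWarehouse()
    def handle_conversation(self, indicia, waveform):
        features = self.front_end.process(waveform)     # elements (b)-(c)
        attributes = self.processing.analyze(features)  # element (d)
        self.warehouse.store(indicia, attributes)       # element (e)

apparatus = Apparatus()
apparatus.handle_conversation("call-001", [0.1, -0.9, 0.4])
apparatus.handle_conversation("call-002", [0.05, 0.1, -0.2])
print(apparatus.warehouse.rows)
```

Claim 25's speaker clusterer/classifier, speech recognizer, and accent identifier would all live inside `ProcessingModule` in this decomposition.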
26. The apparatus of claim 25, further comprising a post processor which is coupled to said data warehouse and which is configured to transcribe user utterances and to perform keyword spotting thereon.
27. The apparatus of claim 18, wherein said processing module is configured to modify behavior of the apparatus, in real time, based on said at least one user attribute.
28. The apparatus of claim 27, wherein said processing module modifies behavior of the apparatus, at least in part, by prompting a human operator thereof.
29. The apparatus of claim 27, wherein said processing module comprises a processor portion of an interactive voice response (IVR) system and wherein said processing module modifies behavior of the apparatus, at least in part, by modifying business logic of the IVR.
30. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for collecting, in a data warehouse, data associated with a voice of a voice system user, said method steps comprising:
(a) reading digital data corresponding to a speech waveform associated with utterances spoken by the voice system user during a conversation between the voice system user and at least one of a human operator and a voice-enabled machine system;
(b) extracting, from said digital data, at least one acoustic feature which is correlated with at least one user attribute, said at least one user attribute including at least one of:
(b-1) gender of the user;
(b-2) age of the user;
(b-3) accent of the user;
(b-4) native language of the user;
(b-5) dialect of the user;
(b-6) socioeconomic classification of the user;
(b-7) educational level of the user; and
(b-8) emotional state of the user; and
(c) storing attribute data corresponding to said acoustic feature which is correlated with said at least one user attribute, together with at least one identifying indicia, in the data warehouse in a form to facilitate subsequent data mining thereon;
(d) repeating steps (a)-(c) for a plurality of additional conversations, with additional users, to provide a collection of stored data including suitable attribute data and identifying indicia for each conversation; and
(e) mining the collection of stored data to provide information for modifying underlying business logic of the voice system.

View Dependent Claims (31)
(f) modifying behavior of the voice system, in real time, based on said at least one user attribute.
Specification