Identification of candidate training utterances from human conversations with an intelligent interactive assistant
First Claim
1. A method for creating a binary classification model, the method comprising:
receiving a plurality of intents;
for each intent, receiving a plurality of associated training utterances;
combining all of the received training utterances into a training utterance collection;
generating all n-grams included in all of the training utterances;
assigning each n-gram a unique numeric identifier;
for a first intent, included in the plurality of intents;
assigning an entry for each of the training utterances included in the training utterance collection, wherein;
each entry comprises the unique identifiers of each n-gram included in the training utterances;
each unique identifier, included in each entry, is accompanied by a number of times the n-gram appears in the training utterance; and
each entry is assigned either a “one” notation or a “zero” notation, the “one” notation indicating that the training utterance is associated with the first intent, the “zero” notation indicating that the training utterance is disassociated from the first intent;
converting, by a support vector machine (“SVM”), each entry into a vector representation;
separating, by the SVM, the vector representations into two groups, a first group and a second group, the first group being identified by the vector representations that are assigned the “one” notation, the second group being identified by the vector representations that are assigned to the “zero” notation;
defining, by the SVM, a vector representation of a line of demarcation between the vector representations of the first group and the vector representations of the second group;
creating a binary classification model for the first intent, said binary classification model comprising;
the first group of vector representations;
the second group of vector representations; and
the vector representation of the line of demarcation.
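The data-preparation steps of the claim (generating n-grams, assigning each a unique numeric identifier, and building one labeled entry per training utterance) may be sketched as follows. This is a minimal illustration, not the patented implementation: the intent names, the sample utterances, the whitespace tokenizer, and the `max_n=2` limit are all assumptions made for the example.

```python
from collections import Counter


def ngrams(utterance, max_n=2):
    """Return word n-grams of length 1..max_n (max_n is an assumed limit)."""
    tokens = utterance.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_vocabulary(utterances, max_n=2):
    """Assign each distinct n-gram a unique numeric identifier."""
    vocab = {}
    for utterance in utterances:
        for gram in ngrams(utterance, max_n):
            vocab.setdefault(gram, len(vocab))
    return vocab


def build_entries(training, first_intent, vocab, max_n=2):
    """Build one entry per utterance: ({ngram_id: count}, label).

    The label is 1 when the utterance belongs to first_intent, else 0.
    """
    entries = []
    for intent, utterance in training:
        counts = Counter(vocab[g] for g in ngrams(utterance, max_n))
        label = 1 if intent == first_intent else 0
        entries.append((dict(counts), label))
    return entries


# Hypothetical intents and utterances, invented for illustration only.
training = [
    ("check_balance", "what is my balance"),
    ("check_balance", "show my balance"),
    ("transfer", "send money to my savings"),
]
vocab = build_vocabulary([text for _, text in training])
entries = build_entries(training, "check_balance", vocab)
```

Each entry pairs the identifier counts with the “one” or “zero” notation for the chosen first intent; repeating `build_entries` with a different `first_intent` would yield the labeled entries for another intent's model.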
Abstract
A method for creating binary classification models and using the binary classification models to select candidate training utterances from a plurality of live utterances is provided. The method may include receiving a plurality of intents and associated training utterances. The method may include creating, from the training utterances, a binary classification model for each intent. The binary classification model may include a vector representation of a line of demarcation between utterances associated with the intent and utterances disassociated from the intent. The method may also include receiving live utterances. An intent may be determined for each live utterance. The method may include creating a vector representation of the live utterance. The method may include selecting candidate training utterances based on a comparison between the vector representation of the live utterance and the vector representation included in the binary classification model of the intent determined for the live utterance.
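The comparison between a live utterance's vector and the model's line of demarcation may be illustrated with a signed-distance check. The abstract does not state the selection rule, so the rule used here (keep utterances that fall within a threshold distance of the boundary) is an assumption, as are the weights, the bias, the threshold, and the sample utterances.

```python
import math


def signed_distance(vector, weights, bias):
    """Signed distance from a sparse vector {id: count} to w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0.0:
        raise ValueError("weights must not all be zero")
    dot = sum(weights.get(i, 0.0) * count for i, count in vector.items())
    return (dot + bias) / norm


def select_candidates(live, weights, bias, threshold=0.5):
    """Return utterances lying within `threshold` of the boundary.

    Assumed rule: utterances near the line of demarcation are the ones
    the model is least sure about, so they are flagged for review.
    """
    return [
        text
        for text, vector in live
        if abs(signed_distance(vector, weights, bias)) <= threshold
    ]


# Hypothetical model parameters and live utterances, for illustration only.
weights = {0: 3.0, 1: 4.0}
bias = -5.0
live = [
    ("how much is in my account", {0: 1, 1: 1}),  # distance 0.4
    ("balance balance balance", {0: 5}),          # distance 2.0
    ("account please", {1: 1}),                   # distance -0.2
]
candidates = select_candidates(live, weights, bias, threshold=0.5)
```

Under these assumed numbers, the first and third utterances fall inside the threshold and would be proposed as candidate training utterances, while the second lies well inside the intent's region and is skipped.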
13 Citations
14 Claims
1. A method for creating a binary classification model, the method comprising:
receiving a plurality of intents;
for each intent, receiving a plurality of associated training utterances;
combining all of the received training utterances into a training utterance collection;
generating all n-grams included in all of the training utterances;
assigning each n-gram a unique numeric identifier;
for a first intent, included in the plurality of intents;
assigning an entry for each of the training utterances included in the training utterance collection, wherein;
each entry comprises the unique identifiers of each n-gram included in the training utterances;
each unique identifier, included in each entry, is accompanied by a number of times the n-gram appears in the training utterance; and
each entry is assigned either a “one” notation or a “zero” notation, the “one” notation indicating that the training utterance is associated with the first intent, the “zero” notation indicating that the training utterance is disassociated from the first intent;
converting, by a support vector machine (“SVM”), each entry into a vector representation;
separating, by the SVM, the vector representations into two groups, a first group and a second group, the first group being identified by the vector representations that are assigned the “one” notation, the second group being identified by the vector representations that are assigned to the “zero” notation;
defining, by the SVM, a vector representation of a line of demarcation between the vector representations of the first group and the vector representations of the second group;
creating a binary classification model for the first intent, said binary classification model comprising;
the first group of vector representations;
the second group of vector representations; and
the vector representation of the line of demarcation.
View Dependent Claims (2, 3, 4, 5, 6, 7)
8. An apparatus for creating a binary classification model, the apparatus comprising:
a receiver operable to receive;
a plurality of intents;
for each intent, a plurality of associated training utterances;
a processor operable to;
combine all of the received training utterances into a training utterance collection;
generate all n-grams included in all of the training utterances;
assign each generated n-gram a unique numeric identifier;
for a first intent, included in the plurality of intents;
assign an entry for each of the training utterances included in the training utterance collection, wherein each entry comprises the unique identifiers of each n-gram, wherein each unique identifier within each entry is accompanied by a number of times the n-gram appears in the training utterance, wherein each entry is assigned either a “one” notation or a “zero” notation, the “one” notation indicating that the training utterance is associated with the first intent, the “zero” notation indicating that the training utterance is disassociated from the first intent;
a support vector machine (“SVM”) operable to;
convert each entry into a vector representation;
separate the vector representations into two groups, a first group and a second group, the first group being identified by the vector representations that are assigned the “one” notation, the second group being identified by the vector representations that are assigned the “zero” notation;
determine a line of demarcation between the vector representations of the first group and the vector representations of the second group;
determine a vector representation of the line of demarcation;
create a binary classification model, said binary classification model comprising;
the first group of vector representations;
the second group of vector representations; and
the vector representation of the line of demarcation.
View Dependent Claims (9, 10, 11, 12, 13, 14)
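The SVM operations recited above (separating the labeled vectors into two groups and determining a line of demarcation between them) may be approximated with a small linear SVM. The sketch below reads the line of demarcation as a separating hyperplane `w.x + b = 0` and fits it by sub-gradient descent on the hinge loss; that training procedure, the hyperparameters, and the toy entries are assumptions chosen for brevity, and a production system would more likely use an established SVM library.

```python
def train_linear_svm(entries, dim, epochs=200, lr=0.1, reg=0.01):
    """Fit weights w and bias b of a hyperplane w.x + b = 0.

    entries: list of ({ngram_id: count}, label) with label 1 or 0.
    Uses plain sub-gradient descent on the L2-regularized hinge loss
    (an assumed, simplified stand-in for a full SVM solver).
    """
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for counts, label in entries:
            y = 1.0 if label == 1 else -1.0
            score = sum(w[i] * c for i, c in counts.items()) + b
            # L2 regularization shrinks the weights a little on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge loss: update only when the margin is violated.
            if y * score < 1.0:
                for i, c in counts.items():
                    w[i] += lr * y * c
                b += lr * y
    return w, b


def predict(counts, w, b):
    """Return 1 if the vector falls on the intent's side, else 0."""
    score = sum(w[i] * c for i, c in counts.items()) + b
    return 1 if score >= 0.0 else 0


# Toy labeled entries over four hypothetical n-gram identifiers.
entries = [
    ({0: 1, 1: 1}, 1),
    ({0: 2}, 1),
    ({2: 1, 3: 1}, 0),
    ({3: 2}, 0),
]
w, b = train_linear_svm(entries, dim=4)
predictions = [predict(counts, w, b) for counts, _ in entries]
```

In this reading, the binary classification model for the intent would consist of the two labeled groups of vectors together with the fitted pair `(w, b)`, which stands in for the vector representation of the line of demarcation.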
Specification