Methods and systems to train classification models to classify conversations

US 10,409,913 B2
Filed: 10/01/2015
Issued: 09/10/2019
Est. Priority Date: 10/01/2015
Status: Active Grant

Abstract

Methods and systems for training a conversation-classification model are disclosed. A first set of conversations in a source domain and a second set of conversations in a target domain are received. Each of the first set of conversations has an associated predetermined tag. One or more features are extracted from the first set of conversations and from the second set of conversations. Based on the similarity of content in the first set of conversations and the second set of conversations, a first weight is assigned to each conversation of the first set of conversations. Further, a second weight is assigned to the one or more features of the first set of conversations based on the similarity of the one or more features of the first set of conversations and of the second set of conversations. A conversation-classification model is trained based on the first weight and the second weight.
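The two weights described in the abstract can be illustrated with a minimal sketch. It is not the patented computation: the ratio referenced in the claims is not reproduced in this text, so the sketch swaps in cosine similarity for the first (per-conversation) weight and a frequency ratio for the second (per-feature) weight. The function names `bag_of_words`, `cosine`, `instance_weights`, and `feature_weights` are hypothetical, and plain unigram counts stand in for the extracted features.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased unigram counts; an assumed stand-in for the extracted features."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def instance_weights(source_texts, target_texts):
    """First weight: one value per source conversation, higher when the
    conversation resembles the pooled target-domain content.
    Assumption: cosine similarity replaces the ratio named in the claims."""
    target_profile = Counter()
    for text in target_texts:
        target_profile.update(bag_of_words(text))
    return [cosine(bag_of_words(t), target_profile) for t in source_texts]


def feature_weights(source_texts, target_texts):
    """Second weight: one value per source feature, higher when the feature's
    relative frequency is similar in both domains (0.0 if absent from target)."""
    src, tgt = Counter(), Counter()
    for text in source_texts:
        src.update(bag_of_words(text))
    for text in target_texts:
        tgt.update(bag_of_words(text))
    src_total = sum(src.values()) or 1
    tgt_total = sum(tgt.values()) or 1
    weights = {}
    for feat, count in src.items():
        p = count / src_total
        q = tgt[feat] / tgt_total
        weights[feat] = min(p, q) / max(p, q)  # p > 0, so the divisor is never 0
    return weights
```

With a tagged source set about billing and printers and an untagged target set about billing, the billing conversation receives the larger first weight, and source-only words such as "printer" receive a second weight of zero.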

16 Claims

1. A method for training a conversation classification model, the method comprising:
- receiving, by a transceiver, a first set of conversations corresponding to a source domain and a second set of conversations corresponding to a target domain, wherein each conversation in the first set of conversations has one or more predetermined tags, wherein at least one of the one or more predetermined tags corresponds to a status of the first set of conversations, wherein the source domain corresponds to a first technical or business field for which the one or more predetermined tags are associated and the target domain correspond to a second technical or business field, different from the first technical or business field, for which tags are not associated, and wherein each conversation in the first set of conversations and each conversation in the second set of conversations comprises an audio conversation;
  
  generating, by one or more processors, a transcript for each conversation in the first set of conversations and a transcript for each conversation in the second set of conversations based on a speech-to-text conversion technique;
  
  extracting, by the one or more processors, one or more features from the transcript of each of the first set of conversations and the second set of conversations;
  
  assigning, by the one or more processors, a first weight to each conversation in the first set of conversations based on at least a similarity between content of the first set of conversations and content of the second set of conversations, wherein the similarity of the content is determined based on the one or more features extracted from the transcripts of the first set of conversations and the second set of conversations, and based on a ratio defined as;
- - 2. The method of claim 1, wherein the first set of conversations and the second set of conversations comprises text conversations.
  - 3. The method of claim 1, further comprising identifying, by the one or more processors, one or more conversations from the first set of conversations based on the determined similarity between the first set of conversations and the second set of conversations, wherein a value of the first weight assigned to the one or more conversations is higher in comparison to the first weight assigned to other conversations in the first set of conversations.
  - 4. The method of claim 1, wherein the one or more features comprise at least a count of n-gram words in a conversation, a position of a segment in a thread, a position of a segment in a message, a sender of a message, an email of said sender, a count of letters in uppercase, a count of punctuations in the conversation, a measure of positive sentiment, and a measure of a negative sentiment.
  - 5. The method of claim 1, wherein the second weight is assigned to the one or more features such that a feature of the first set of conversations similar to a feature of the second set of conversations is assigned a higher value in comparison to other features in the one or more features of the first set of conversations.
  - 6. The method of claim 1, wherein the at least one of the one or more predetermined tags corresponds to at least one of an open category, a solved category, a closed category or a change channel category.
  - 7. The method of claim 1, wherein each conversation of the second set of conversations of the target domain is not assigned the one or more predetermined tags.
  - 8. The method of claim 1, further comprising transmitting, by the one or more processors, a notification to a first user in the conversation, through a user-computing device, based on the at least one of the one or more predetermined tags to the new conversations in the second set of conversations, wherein the notification corresponds to a recommendation of an action to be performed by the first user.
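Several of the features listed in claim 4 can be computed directly from a transcript. The following is a minimal sketch, assuming a whitespace tokenizer and a toy sentiment lexicon; `POSITIVE`, `NEGATIVE`, and `extract_features` are hypothetical names, and the thread-position and sender features are omitted because they need message metadata that a bare transcript does not carry.

```python
import string
from collections import Counter

# Hypothetical toy lexicons; a real system would use a proper sentiment resource.
POSITIVE = {"thanks", "great", "resolved"}
NEGATIVE = {"angry", "broken", "refund"}


def extract_features(transcript, n=2):
    """Return a dict of simple per-conversation features from one transcript."""
    # Strip surrounding punctuation and lower-case each whitespace-separated token.
    words = [t.strip(string.punctuation).lower() for t in transcript.split()]
    words = [w for w in words if w]
    # Count word n-grams (bigrams by default).
    ngrams = Counter(
        " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
    )
    return {
        "ngram_counts": ngrams,
        "uppercase_letters": sum(ch.isupper() for ch in transcript),
        "punctuation_marks": sum(ch in string.punctuation for ch in transcript),
        "positive_sentiment": sum(w in POSITIVE for w in words),
        "negative_sentiment": sum(w in NEGATIVE for w in words),
    }
```

The sentiment measures here are raw lexicon hit counts; any normalised score could be substituted without changing the shape of the feature dictionary.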

9. A system for training a conversation classification model, said system comprising:
- a transceiver configured to:
  
  receive a first set of conversations corresponding to a source domain and a second set of conversations corresponding to a target domain, wherein each conversation in the first set of conversations has one or more predetermined tags, wherein at least one of the one or more predetermined tags corresponds to a status of the first set of conversations, wherein the source domain corresponds to a first technical or business field for which the one or more predetermined tags are associated and the target domain correspond to a second technical or business field, different from the first technical or business field, for which tags are not associated, and wherein each conversation in the first set of conversations and each conversation in the second set of conversations comprises an audio conversation; and
  
  one or more processors configured to:
  
  generate a transcript for each conversation in the first set of conversations and the second set of conversations based on a speech-to-text conversion technique;
  
  extract one or more features from the transcript of the first set of conversations and from the transcript of the second set of conversations;
  
  assign a first weight to each conversation in the first set of conversations based on at least a similarity between content of the first set of conversations and content of the second set of conversations, wherein the similarity of the content is determined based on the one or more features extracted from the transcript of the first set of conversations and from the transcript of the second set of conversations, and based on a ratio defined as;
- - 10. The system of claim 9, wherein the first set of conversations and the second set of conversations comprises text conversations.
  - 11. The system of claim 10, wherein the one or more processors are further configured to:
    - identify one or more conversations from the first set of conversations based on the determined similarity between the first set of conversations and the second set of conversations, wherein a value of the first weight assigned to the one or more conversations is higher in comparison to the first weight assigned to other conversations in the first set of conversations.
  - 12. The system of claim 9, wherein the one or more features comprise at least a count of n-gram words in a conversation, a position of a segment in a thread, a position of a segment in a message, a sender of a message, an email of the sender, a count of letters in uppercase, a count of punctuations in the conversation, a measure of positive sentiment, and a measure of negative sentiment.
  - 13. The system of claim 9, wherein the second weight is assigned to the one or more features such that a feature of the first set of conversations similar to a feature of the second set of conversations is assigned a higher value in comparison to other features in the one or more features of the first set of conversations.
  - 14. The system of claim 9, wherein the at least one of the one or more predetermined tags corresponds to at least one of an open category, a solved category, a closed category or a change channel category.
  - 15. The system of claim 9, wherein each conversation of the second set of conversations of the target domain is not assigned the one or more predetermined tags.

16. A computer program product for use with a computing device, the computer program product comprising a non-transitory computer readable medium, the non-transitory computer readable medium storing a computer program code for training a conversation classification model, the computer program code being executable by one or more processors in the computing device to:
- receive a first set of conversations corresponding to a source domain and a second set of conversations corresponding to a target domain, wherein each conversation in the first set of conversations has one or more predetermined tags, wherein at least one of the one or more predetermined tags corresponds to a status of the first set of conversations, wherein the source domain corresponds to a first technical or business field for which the one or more predetermined tags are associated and the target domain correspond to a second technical or business field, different from the first technical or business field, for which tags are not associated, and wherein each conversation in the first set of conversations and each conversation in the second set of conversations comprises an audio conversation;
  
  generate a transcript for each conversation in the first set of conversations and each conversation in the second set of conversations based on a speech-to-text conversion technique;
  
  extract one or more features from the transcripts of each of the first set of conversations and the second set of conversations;
  
  assign a first weight to each conversation in the first set of conversations based on at least a similarity between content of the first set of conversations and content of the second set of conversations, wherein the similarity of the content is determined based on the extracted one or more features of each of the first set of conversations and the second set of conversations, and based on a ratio defined as;
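The abstract states that the model is trained based on the first weight and the second weight, but the text here does not say which classifier is used. As one possible illustration, the sketch below fits a weighted multinomial naive Bayes model in which every token occurrence is scaled by the first weight of its conversation times the second weight of the token. The choice of naive Bayes, the names `train_weighted_nb` and `classify`, and the default feature weight of 1.0 are all assumptions; the tags "open" and "solved" are taken from the categories named in the dependent claims.

```python
import math
from collections import defaultdict


def train_weighted_nb(transcripts, tags, instance_w, feature_w):
    """Accumulate naive Bayes counts from tagged source transcripts.

    instance_w: first weight, one float per conversation.
    feature_w:  second weight, dict token -> float (assumed 1.0 if missing).
    """
    tag_mass = defaultdict(float)
    token_mass = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for text, tag, w in zip(transcripts, tags, instance_w):
        tag_mass[tag] += w
        for tok in text.lower().split():
            token_mass[tag][tok] += w * feature_w.get(tok, 1.0)
            vocab.add(tok)
    return {
        "tag_mass": dict(tag_mass),
        "token_mass": {t: dict(m) for t, m in token_mass.items()},
        "vocab": vocab,
    }


def classify(model, transcript):
    """Return the tag with the highest log-posterior, or None if no tag
    has positive weight. Uses add-one smoothing over the source vocabulary."""
    total = sum(model["tag_mass"].values())
    vocab_size = len(model["vocab"])
    best_tag, best_score = None, -math.inf
    for tag, mass in model["tag_mass"].items():
        if mass <= 0:
            continue  # a tag whose conversations all have zero weight is skipped
        counts = model["token_mass"].get(tag, {})
        denom = sum(counts.values()) + vocab_size
        score = math.log(mass / total)
        for tok in transcript.lower().split():
            if tok in model["vocab"]:  # tokens unseen in the source are ignored
                score += math.log((counts.get(tok, 0.0) + 1.0) / denom)
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag
```

A source conversation with a low first weight contributes proportionally less to both the tag prior and the token counts, which is the intended effect of down-weighting source examples that look unlike the target domain.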

Current Assignee: Conduent Business Services, LLC (Conduent, Inc.)
Original Assignee: Conduent Business Services, LLC (Conduent, Inc.)
Inventors: Bhatt, Himanshu Sharad; Roy, Shourya; Patra, Tanmoy
Primary Examiner(s): Baker, Matthew H
Application Number: US14/872,258
Publication Number: US 20170098443A1
Time in Patent Office: 1,440 Days

CPC Class Codes

G06F 16/35   Clustering; Classification

G06F 40/30   Semantic analysis

G10L 15/26   Speech to text systems G10L...
