UNSUPERVISED MESSAGE CLUSTERING

  • US 20120239650A1
  • Filed: 03/18/2011
  • Published: 09/20/2012
  • Est. Priority Date: 03/18/2011
  • Status: Active Grant
First Claim

1. One or more computer-storage media storing computer-useable instructions that, when executed by a computing device, perform a method for clustering messages, comprising:

    receiving a plurality of messages, each message containing about 250 characters or less;

    parsing the messages to form message token vectors for the messages;

    filtering the parsed messages to discard at least one message from the plurality of messages;

    calculating similarity scores for the filtered plurality of messages relative to one or more message clusters, the message clusters having cluster token vectors, the similarity score being based on the message token vectors and the cluster token vectors, the similarity scores being calculated without normalization of the message token vectors relative to a length of the messages;

    adding at least one message to a message cluster based on the at least one message having a similarity score greater than a similarity threshold value; and

    updating the cluster token vector for the message cluster containing the added message.
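
The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: whitespace tokenization, a filter that discards messages with fewer than `MIN_TOKENS` distinct tokens, a similarity score equal to the count of message tokens already present in the cluster token vector (deliberately not divided by message length), the value of `SIMILARITY_THRESHOLD`, and the creation of a new cluster when no existing cluster scores above the threshold. All function and constant names are hypothetical.

```python
from collections import Counter

MAX_CHARS = 250            # claim: "about 250 characters or less"
MIN_TOKENS = 2             # assumed filter rule, not from the claim
SIMILARITY_THRESHOLD = 2   # assumed threshold value, not from the claim


def parse(message):
    """Form a message token vector (token -> count). Whitespace split is an assumption."""
    return Counter(message.lower().split())


def similarity(message_vec, cluster_vec):
    """Count message tokens found in the cluster vector.

    The score is left as a raw count: it is not normalized by message length.
    """
    return sum(1 for token in message_vec if token in cluster_vec)


def cluster_messages(messages):
    clusters = []  # each cluster: {"vector": Counter, "messages": [str]}
    for message in messages:
        if len(message) > MAX_CHARS:
            continue  # assumption: skip messages that exceed the short-message limit
        vec = parse(message)
        if len(vec) < MIN_TOKENS:
            continue  # filtering step: discard messages with too few tokens

        # Score the message against every existing cluster and keep the best match.
        best, best_score = None, 0
        for cluster in clusters:
            score = similarity(vec, cluster["vector"])
            if score > best_score:
                best, best_score = cluster, score

        if best is not None and best_score > SIMILARITY_THRESHOLD:
            best["messages"].append(message)
            best["vector"].update(vec)  # update the cluster token vector
        else:
            # Assumption: an unmatched message seeds a new cluster.
            clusters.append({"vector": Counter(vec), "messages": [message]})
    return clusters


if __name__ == "__main__":
    sample = [
        "earthquake hits city center today",
        "big earthquake hits city today",
        "ok",
        "new phone launch event tonight",
    ]
    for cluster in cluster_messages(sample):
        print(cluster["messages"])
```

Because the score is a raw overlap count, a longer message is not penalized for its extra tokens, which is one way to read the claim's "without normalization" limitation.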
