DATA CLUSTERING

  • US 20180336207A1
  • Filed: 05/18/2017
  • Published: 11/22/2018
  • Est. Priority Date: 05/18/2017
  • Status: Active Grant
First Claim

1. A computer implemented method, comprising:

    receiving a plurality of documents, each of the plurality of documents represented by a vector of words and associated with a point in time;

    dividing the received plurality of documents into first time slices using a first time interval to form a plurality of consecutive sets of documents;

    sub-dividing each of the plurality of consecutive sets of documents into second time slices using respective second time intervals to form one or more subsets of documents;

    identifying a plurality of topics in each of the plurality of consecutive sets of documents and the one or more subsets of documents, each of the plurality of topics represented by a set of most relevant topic keywords;

    clustering each of the plurality of consecutive sets of documents and the one or more subsets of documents in accordance with each of the identified plurality of topics;

    comparing each of the identified plurality of topics with respect to each of the plurality of consecutive sets of documents and the one or more subsets of documents to detect patterns of changes in the set of most relevant topic keywords over time; and

    redefining each of the clustered plurality of consecutive sets of documents and the one or more subsets of documents to form homogenous clusters based on the detected patterns.
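
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`slice_by_interval`, `top_keywords`, `keyword_drift`, `merge_stable_slices`), the frequency-based keyword ranking, the Jaccard-style drift measure, and the 0.5 threshold are all assumptions chosen for illustration, and the claim's topic identification and clustering steps are reduced here to simple word counts.

```python
from collections import Counter
from datetime import datetime, timedelta


def slice_by_interval(docs, start, interval):
    """Group (timestamp, words) documents into time slices of length `interval`.

    Only non-empty slices are returned, in chronological order.
    """
    slices = {}
    for ts, words in docs:
        index = (ts - start) // interval  # timedelta // timedelta yields an int
        slices.setdefault(index, []).append((ts, words))
    return [slices[i] for i in sorted(slices)]


def top_keywords(docs, k=3):
    """Return the k most frequent words (a stand-in for 'most relevant topic keywords')."""
    counts = Counter(w for _, words in docs for w in words)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:k]]


def keyword_drift(prev, curr):
    """Jaccard distance between two keyword sets: 0.0 = identical, 1.0 = disjoint."""
    a, b = set(prev), set(curr)
    if not a and not b:
        return 0.0
    return 1 - len(a & b) / len(a | b)


def merge_stable_slices(slices, k=3, threshold=0.5):
    """Hypothetical 'redefining' step: merge neighbouring slices whose keywords barely change."""
    if not slices:
        return []
    merged = [list(slices[0])]
    for curr in slices[1:]:
        drift = keyword_drift(top_keywords(merged[-1], k), top_keywords(curr, k))
        if drift < threshold:
            merged[-1].extend(curr)   # same topic continues
        else:
            merged.append(list(curr))  # topic changed, start a new group
    return merged


# Toy corpus: each document is (point in time, vector of words).
start = datetime(2017, 1, 1)
docs = [
    (datetime(2017, 1, 1), ["storm", "rain", "flood"]),
    (datetime(2017, 1, 2), ["storm", "rain"]),
    (datetime(2017, 1, 3), ["storm", "flood"]),
    (datetime(2017, 1, 8), ["election", "vote", "storm"]),
    (datetime(2017, 1, 9), ["election", "vote"]),
]

weekly = slice_by_interval(docs, start, timedelta(days=7))      # first time slices
daily = slice_by_interval(weekly[0], start, timedelta(days=1))  # second time slices
week_keywords = [top_keywords(s) for s in weekly]
week_drift = keyword_drift(week_keywords[0], week_keywords[1])
weekly_groups = merge_stable_slices(weekly)
daily_groups = merge_stable_slices(daily)
```

In this toy corpus the two weekly slices share only one of their top keywords, so they stay separate, while the three daily sub-slices of the first week overlap enough to be merged into a single group. A real system would replace the word counts with a topic model and the merge rule with whatever pattern detection the specification describes.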
