Computer-implemented system and method for visually suggesting classification for inclusion-based cluster spines

  • US 9,542,483 B2
  • Filed: 04/28/2014
  • Issued: 01/10/2017
  • Est. Priority Date: 07/28/2009
  • Status: Active Grant
First Claim

1. A computer-implemented system for visually suggesting classification for inclusion-based document cluster spines, comprising:

  • a non-transitory computer readable storage medium comprising program code; and

    a computer processor coupled to the storage medium, wherein the processor is configured to execute the program code to perform steps to:

    designate a set of reference documents each associated with a classification code;

    obtain a different set of uncoded documents;

    combine one or more of the coded reference documents with a plurality of uncoded documents into a combined document set;

    group the documents in the combined document set into clusters;

    organize the clusters along one or more spines, each spine comprising a vector;

    provide a visual suggestion for assigning one of the classification codes to one of the spines comprising visually representing each of the reference concepts in the clusters along that spine;

    identify one of the documents as a center of one of the clusters;

    generate a score vector for the cluster center;

    compare the score vector for the cluster center to score vectors associated with one or more of the reference documents;

    identify a neighborhood of similar reference documents for the cluster based on the comparison; and

    assign one of the classification codes to the cluster based on the neighborhood, comprising:

    determine a distance between the cluster center and the reference documents in the neighborhood; and

    generate the classification code for assignment to the cluster, comprising at least one of:

    identify the reference document with the closest distance to the cluster center and assign the classification code of the reference document with the closest distance as the generated classification code for the cluster;

    calculate an average of the distances between the cluster center and the reference documents associated with each of the classification codes and assign the classification code with the closest average distance as the generated classification code of the cluster; and

    count the reference documents in the neighborhood for each of the classification codes, weigh each count based on the distance between the reference documents with the classification code and the cluster center, and assign the classification code with the highest weighted count as the generated classification code of the cluster.
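
The three alternatives recited at the end of the claim (closest reference document, closest average distance per code, highest distance-weighted count) can be illustrated with a minimal sketch. The claim does not specify a distance measure, a neighborhood size, or a weighting function, so the Euclidean distance, the `k` nearest neighbors, and the `1 / (1 + distance)` weight used here are assumptions for illustration only. All function names, score vectors, and the codes `"A"` and `"B"` are hypothetical.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    # Assumed distance measure; the claim only says "distance".
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighborhood(center, references, k):
    # references: list of (score_vector, classification_code) pairs.
    # Returns the k closest references as sorted (distance, code) pairs.
    scored = sorted((euclidean(center, vec), code) for vec, code in references)
    return scored[:k]

def closest_code(neigh):
    # Alternative 1: code of the single closest reference document.
    return min(neigh)[1]

def closest_average_code(neigh):
    # Alternative 2: code whose reference documents have the
    # smallest average distance to the cluster center.
    by_code = defaultdict(list)
    for dist, code in neigh:
        by_code[code].append(dist)
    return min(by_code, key=lambda c: sum(by_code[c]) / len(by_code[c]))

def weighted_count_code(neigh):
    # Alternative 3: count references per code, weighting each one
    # by 1 / (1 + distance) (an assumed weighting), and take the
    # code with the highest weighted count.
    weights = defaultdict(float)
    for dist, code in neigh:
        weights[code] += 1.0 / (1.0 + dist)
    return max(weights, key=weights.get)

# Hypothetical score vectors for a cluster center and four coded references.
center = (0.0, 0.0)
references = [
    ((1.0, 0.0), "A"),
    ((2.0, 0.0), "B"),
    ((0.0, 2.0), "B"),
    ((0.0, 9.0), "A"),
]
neigh = neighborhood(center, references, k=4)

print(closest_code(neigh))          # A
print(closest_average_code(neigh))  # B
print(weighted_count_code(neigh))   # B
```

With this data the alternatives disagree: the single closest reference carries code `"A"`, while the average-distance and weighted-count alternatives both favor `"B"`. This may be why the claim recites them as alternatives ("at least one of") rather than as a single procedure.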
