
Summarized network graph for semantic similarity graphs of large corpora

  • US 9,836,183 B1
  • Filed: 09/14/2016
  • Issued: 12/05/2017
  • Est. Priority Date: 09/14/2016
  • Status: Active Grant
First Claim

1. A tangible, non-transitory, machine-readable media storing instructions that when executed by one or more computers effectuate operations comprising:

  • obtaining, with one or more processors, a clustered graph, the clustered graph having three or more clusters, each cluster having a plurality of nodes of the graph, the nodes being connected in pairs by one or more respective edges, wherein obtaining the clustered graph comprises;

    ingesting a corpus of documents via a network;

    forming a graph of the documents, the graph having edges determined by ascertaining relationships between the documents based on unstructured natural language text in the documents; and

    clustering the graph with a graph processing algorithm;

    determining, with one or more processors, visual attributes of cluster icons based on amounts of nodes in clusters corresponding to the respective cluster icons;

    determining, with one or more processors, positions of the cluster icons in a graphical visualization of the clustered graph;

    obtaining, with one or more processors, for each cluster, a respective subset of nodes in the respective cluster by selecting representative or anomalous nodes in the respective cluster;

    determining, with one or more processors, visual attributes of node icons based on attributes of corresponding nodes in the subsets of nodes, each node icon representing one of the nodes in the respective subset of nodes;

    determining, with one or more processors, positions of the node icons in the graphical visualization based on the positions of the corresponding cluster icons of clusters having the nodes corresponding to the respective node icons; and

    sending, via a network, with a server, instructions causing the graphical visualization to be displayed on display of a client computing device, wherein the graphical visualization concurrently displays;

    a given cluster icon representing a given cluster among the three or more clusters; and

    a representative node icon representing a first document within the given cluster and determined to be representative of documents within the given cluster and having a first position on the display determined relative to a second position of the given cluster icon;

    or an anomalous node icon representing a second document within the given cluster and determined to be anomalous relative to a distribution of documents within the given cluster and having a third position on the display determined relative to the second position of the given cluster icon.
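
The operations recited in the claim can be illustrated with a minimal sketch. The claim does not name a similarity measure, a clustering algorithm, a selection rule, or a layout scheme, so everything here is an assumption chosen for brevity: bag-of-words cosine similarity for the edges, connected components as a stand-in for the "graph processing algorithm", summed within-cluster edge weight for picking representative and anomalous nodes, and a circular layout for the cluster icons. All function names (`build_graph`, `cluster`, `strength`, `layout`) and parameters are hypothetical, and network ingestion and the server-to-client step are omitted.

```python
import math
from collections import Counter
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_graph(docs, threshold=0.3):
    """Form edges between documents whose text similarity meets a threshold."""
    vecs = [Counter(d.lower().split()) for d in docs]
    edges = {}
    for i, j in combinations(range(len(docs)), 2):
        s = cosine(vecs[i], vecs[j])
        if s >= threshold:
            edges[(i, j)] = s
    return edges

def cluster(n, edges):
    """Cluster via connected components (union-find); a simple stand-in."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

def strength(v, members, edges):
    """Sum of edge weights from v to the other nodes in its cluster."""
    return sum(edges.get((min(v, u), max(v, u)), 0.0)
               for u in members if u != v)

def layout(docs, threshold=0.3):
    """Return one record per cluster icon with its selected node icons."""
    edges = build_graph(docs, threshold)
    clusters = cluster(len(docs), edges)
    icons = []
    for k, members in enumerate(clusters):
        # Cluster icon position: evenly spaced on a circle (assumed layout).
        angle = 2 * math.pi * k / len(clusters)
        cx, cy = 100 * math.cos(angle), 100 * math.sin(angle)
        # Visual attribute based on the amount of nodes in the cluster.
        radius = 10 * math.sqrt(len(members))
        # Weakest-connected node is treated as anomalous, strongest as
        # representative (assumed selection rule).
        ranked = sorted(members, key=lambda v: strength(v, members, edges))
        anomalous, representative = ranked[0], ranked[-1]
        icons.append({
            "cluster": k,
            "size": len(members),
            "pos": (cx, cy),
            "radius": radius,
            # Node icon positions are offsets from the cluster icon position.
            "representative": {"doc": representative,
                               "pos": (cx + radius / 2, cy)},
            "anomalous": {"doc": anomalous, "pos": (cx - radius / 2, cy)},
        })
    return icons
```

Calling `layout` on a list of document strings returns one dictionary per cluster, which a client could render as a cluster icon with its representative and anomalous node icons placed relative to it. In a single-node cluster the same document fills both roles; the claim itself requires each cluster to contain a plurality of nodes.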
