Clustering documents based on common document selections

  • US 8,650,196 B1
  • Filed: 09/30/2011
  • Issued: 02/11/2014
  • Est. Priority Date: 09/30/2011
  • Status: Active Grant
First Claim

1. A method, performed by one or more server devices, the method comprising:

  • receiving, by at least one of one or more server devices, first navigation information identifying a first set of documents that are selected after a first document is provided, the first navigation information identifying a first plurality of documents, of the first set of documents, that are selected, each of the first plurality of documents being selected after the first document is provided, and each of the first plurality of documents being selected based on information associated with the first document, and the first navigation information including information identifying a quantity of selections of the first plurality of documents after the first document is provided;

    receiving, by at least one of the one or more server devices, second navigation information identifying a second set of documents that are selected after a second document is provided, the second navigation information identifying a second plurality of documents, of the second set of documents, that are selected, each of the second plurality of documents being selected after the second document is provided, and each of the second plurality of documents being selected based on information associated with the second document;

    generating, by at least one of the one or more server devices, a first data structure that includes information associating the first document with the first navigation information;

    generating, by at least one of the one or more server devices, a second data structure that includes information associating the second document with the second navigation information;

    comparing, by at least one of the one or more server devices and using the first data structure and the second data structure, the first set of documents to the second set of documents;

    generating, by at least one of the one or more server devices, a similarity score based on the comparing and based on the information identifying the quantity of selections of the first plurality of documents after the first document is provided;

    determining, by at least one of the one or more server devices, based on the similarity score, that the first document is similar to the second document; and

    generating, by at least one of the one or more server devices and based on determining that the first document is similar to the second document, a cluster that includes identification information identifying the first document and the second document.
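
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not specify a scoring formula, a threshold, or any data layout, so the function names (`build_structure`, `similarity_score`, `cluster_if_similar`), the dictionary layout, the overlap-based score, and the 0.5 threshold are all assumptions chosen for illustration. The score here is the fraction of selections made after the first document that went to documents also selected after the second document, which is one possible way to combine the set comparison with the selection quantities.

```python
from collections import Counter


def build_structure(document_id, selections):
    """Associate a document with the documents selected after it was provided.

    `selections` is a list of selected document ids, one entry per selection,
    so repeated ids record the quantity of selections (hypothetical layout).
    """
    return {"document": document_id, "selected": Counter(selections)}


def similarity_score(first, second):
    """Illustrative score, not the formula from the patent.

    Compares the two sets of selected documents, then weights the overlap by
    how often each shared document was selected after the first document.
    """
    first_counts = first["selected"]
    shared = set(first_counts) & set(second["selected"])
    total = sum(first_counts.values())
    if total == 0:
        return 0.0
    return sum(first_counts[doc] for doc in shared) / total


def cluster_if_similar(first, second, threshold=0.5):
    """Return a cluster of both document ids if the score meets the threshold.

    The threshold value is an assumption; returns None when not similar.
    """
    if similarity_score(first, second) >= threshold:
        return {first["document"], second["document"]}
    return None


# Hypothetical navigation data: documents selected after doc_a and doc_b.
first = build_structure("doc_a", ["x", "x", "x", "y", "z"])
second = build_structure("doc_b", ["x", "y", "w"])

print(similarity_score(first, second))
print(cluster_if_similar(first, second))
```

In this example, four of the five selections made after `doc_a` went to documents that were also selected after `doc_b`, so the two documents clear the assumed threshold and are placed in one cluster.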

  • 2 Assignments