
Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects

  • US 10,095,778 B2
  • Filed: 07/06/2015
  • Issued: 10/09/2018
  • Est. Priority Date: 09/27/2005
  • Status: Active Grant
First Claim

1. A computer-implemented method of operating a computerized search engine to identify and rank relevant documents from a corpus comprising multiple millions of citationally-related source documents, said computer-implemented method comprising:

    storing on a computer-readable storage device of the computerized search engine a search index comprising a first set of identification information identifying potential input documents selected from said source documents and, for each said potential input document, a second set of identification information identifying a selected number of citationally-related potential output documents selected from said source documents;

    calculating, via one or more computer-processors coupled to said computer-readable storage device, a first numerical score that is statistically correlated to the probability that a direct citation exists between each corresponding pair of citationally-related potential input document and potential output document and wherein said first numerical score is calculated based at least in part on how many indirect citations exist between each said pair of citationally related documents and, for each indirect citation, how many citation links separate each said pair of citationally-related documents;

    storing said first numerical score for each said pair of citationally related documents on said computer-readable storage device in association with said search index;

    receiving a search query comprising a third set of identification information identifying one or more input documents selected from said source documents;

    using said third set of identification information to ascertain from said search index, via said one or more computer-processors, a fourth set of identification information identifying, for each of said one or more input documents, a selected number of corresponding output documents and, for each pair of input document and corresponding output document, said first numerical score;

    calculating, responsive to receiving said search query, via said one or more computer-processors, a second numerical score that is statistically correlated to the probability that a direct citation exists between any of said one or more input documents and each of said corresponding output documents, and wherein said second numerical score is calculated based at least in part on said first numerical score;

    generating, via said one or more computer-processors, a search query result set comprising identification information identifying one or more of said output documents and wherein said search query result set is sorted or ranked in accordance with said second numerical score; and

    storing said search query result set on said computer-readable storage device.
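
As an illustration only, the steps recited in claim 1 could be sketched in Python roughly as follows. The claim does not specify a scoring formula, so everything numerical here is an assumption made for the example: citation links are treated as undirected, each indirect path of k links is given a hypothetical weight of `decay ** (k - 1)`, and weights are combined with a noisy-OR rule both for the first score (across paths) and for the second score (across input documents). All function names and parameters (`build_index`, `search`, `decay`, `max_links`, `top_n`) are invented for this sketch and do not come from the patent.

```python
from collections import defaultdict


def build_adjacency(citations):
    """Turn {doc: [cited docs]} into a symmetric adjacency map (assumption: links are undirected)."""
    adj = defaultdict(set)
    for src, targets in citations.items():
        for dst in targets:
            adj[src].add(dst)
            adj[dst].add(src)
    return dict(adj)


def count_indirect_paths(adj, start, max_links):
    """Count simple paths of 2..max_links links from start: {target: {links: n_paths}}."""
    counts = defaultdict(lambda: defaultdict(int))
    stack = [(start, (start,))]
    while stack:
        node, path = stack.pop()
        links = len(path) - 1
        if links >= 2:  # one link would be a direct citation, not an indirect one
            counts[node][links] += 1
        if links < max_links:
            for nxt in adj.get(node, ()):
                if nxt not in path:
                    stack.append((nxt, path + (nxt,)))
    return counts


def first_score(path_counts, decay=0.5):
    """Hypothetical first score: noisy-OR over indirect paths, weighted by path length."""
    miss = 1.0
    for links, n_paths in path_counts.items():
        miss *= (1.0 - decay ** (links - 1)) ** n_paths
    return 1.0 - miss


def build_index(citations, max_links=3, top_n=10):
    """Precompute, per potential input document, its top_n output documents and first scores."""
    adj = build_adjacency(citations)
    index = {}
    for doc in adj:
        scored = {
            target: first_score(counts)
            for target, counts in count_indirect_paths(adj, doc, max_links).items()
        }
        best = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        index[doc] = dict(best)
    return index


def search(index, input_docs):
    """Combine stored first scores into a second score per output document and rank the results."""
    inputs = set(input_docs)
    miss = defaultdict(lambda: 1.0)
    for doc in inputs:
        for out, score in index.get(doc, {}).items():
            if out not in inputs:
                miss[out] *= 1.0 - score
    results = [(out, 1.0 - m) for out, m in miss.items()]
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Toy corpus: A and B both cite C and D; E cites C.
citations = {"A": ["C", "D"], "B": ["C", "D"], "E": ["C"]}
index = build_index(citations)
ranked = search(index, ["A", "E"])
```

In this toy corpus, document B shares two cited documents with A and one with E, so it is connected to the query by the most short indirect paths and ranks first. A production system over millions of documents would need a far more efficient path-counting strategy than the exhaustive traversal used here, and would persist both the index and the result set rather than hold them in memory.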
