Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects
First Claim
1. A computer-implemented method for rapidly identifying and ranking relevant documents, said method comprising:
receiving, by a computer system comprising one or more computing devices, a first set of identification information identifying one or more input documents for which relevant output documents are sought, wherein the one or more input documents are identified from a body of data, said body of data comprising identification information identifying multiple millions of citationally related documents;
identifying, by said computer system, a second set of identification information identifying one or more output documents from said body of data that are citationally related to said one or more input documents through one or more direct or indirect citations;
determining, by said computer system, a first numerical score that statistically correlates to a probability that a direct citation exists between each input document relative to each citationally related output document, said first numerical score being determined based at least in part on how many indirect citations exist between each input document and each output document and, for each indirect citation, how many citation links separate each input document from each output document;
determining, by said computer system, a second numerical score that statistically correlates to a probability that a direct citation exists between any input document relative to each output document, said second numerical score being determined based at least in part on said first numerical score;
ranking, by said computer system, said one or more output documents in accordance with said second numerical score; and
displaying, by said computer system, a third set of identification information identifying a selected number of said one or more output documents selected or ranked in accordance with said second numerical score.
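The scoring and ranking steps recited above can be illustrated with a minimal Python sketch. The claim does not give a formula, so everything concrete here is an assumption made for illustration: the names `path_counts`, `first_score`, `second_score` and `rank_outputs` are hypothetical, the citation graph is treated as undirected, the per-path weights are made-up values, and a noisy-OR combination stands in for whatever statistical model the patent actually uses. Only indirect paths (two or more citation links) contribute to the first score in this sketch.

```python
from collections import defaultdict


def path_counts(graph, source, max_links=3):
    """Count simple citation paths from `source` to every other document,
    grouped by the number of citation links in each path (1..max_links)."""
    counts = defaultdict(lambda: defaultdict(int))  # target -> links -> count

    def walk(node, visited, depth):
        if depth == max_links:
            return
        for nxt in graph.get(node, ()):
            if nxt in visited:
                continue
            counts[nxt][depth + 1] += 1
            walk(nxt, visited | {nxt}, depth + 1)

    walk(source, {source}, 0)
    return counts


def first_score(links_to_count, weights):
    """Assumed model: each indirect path with k links is independent evidence
    of a direct citation with probability weights[k]; combine by noisy-OR."""
    miss = 1.0
    for links, n in links_to_count.items():
        if links < 2:  # a single link is a direct citation, not an indirect one
            continue
        miss *= (1.0 - weights.get(links, 0.0)) ** n
    return 1.0 - miss


def second_score(first_scores):
    """Assumed model: probability that at least one input document is
    directly linked to the output document (noisy-OR over the inputs)."""
    miss = 1.0
    for s in first_scores:
        miss *= 1.0 - s
    return 1.0 - miss


def rank_outputs(graph, inputs, weights, top_n=10):
    """Return up to `top_n` (document, second score) pairs, best first."""
    per_output = defaultdict(list)
    for doc in inputs:
        for target, by_links in path_counts(graph, doc, max(weights)).items():
            if target in inputs:
                continue
            per_output[target].append(first_score(by_links, weights))
    ranked = sorted(((second_score(s), t) for t, s in per_output.items()),
                    reverse=True)
    return [(t, s) for s, t in ranked[:top_n]]
```

The recursive path enumeration is only practical for toy graphs; a body of data with millions of documents would need a precomputed or sampled approach, which this sketch does not attempt.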
Abstract
In one embodiment, a method for probabilistically quantifying a degree of relevance between two or more citationally or contextually related data objects, such as patent documents, non-patent documents, web pages, personal and corporate contacts information, product information, consumer behavior, technical or scientific information, address information, and the like, is provided. In another embodiment, a method for visualizing and displaying relevance between two or more citationally or contextually related data objects is provided. In another embodiment, a search input/output interface that utilizes an iterative self-organizing mapping technique to automatically generate a visual map of relevant patents and/or other related documents desired to be explored, searched or analyzed is provided. In another embodiment, a search input/output interface that displays and/or communicates search input criteria and corresponding search results in a way that facilitates intuitive understanding and visualization of the logical relationships between two or more related concepts being searched is provided.
19 Claims
1. A computer-implemented method for rapidly identifying and ranking relevant documents (independent claim; full text appears under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computer system for rapidly identifying and ranking relevant documents from a body of citationally related documents, said computer system comprising:
a computer-accessible index, stored in a physical data store, comprising identification information identifying multiple potential input documents from said body of citationally related documents and, for each said potential input document, identification information identifying a selected number of citationally related potential output documents from said body of citationally related documents, said computer-accessible index further comprising for each possible pair of citationally related potential input document and potential output document a first numerical score that is statistically correlated to the probability that a direct citation exists between said corresponding pair of citationally related documents; wherein said first numerical score is determined based at least in part on how many indirect citations exist between each potential input document and each potential output document and, for each indirect citation, how many citation links separate each potential input document from each potential output document;
an input interface configured to enable a user to select a first set of identification information identifying one or more input documents from said body of citationally related documents for which relevant output documents are sought;
a computer processor configured to:
access, from said computer-accessible index, said first set of identification information to identify a selection of citationally related output documents; and
calculate, for each identified output document, a second numerical score that is statistically correlated to the probability that a direct citation exists between any input document and each said corresponding output document, and wherein said second numerical score is determined based at least in part on said first numerical score; and
an output interface configured to display a second set of identification information identifying a selected number of said identified output documents selected or ranked in accordance with said second numerical score.
Dependent claims: 14, 15, 16, 17, 18, 19
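The index-based arrangement of claim 13 can be pictured with a short Python sketch: first scores are computed ahead of time and stored per input document, and a query only looks them up and combines them. The function names `build_index` and `query`, the in-memory dictionary standing in for the physical data store, the `keep` cutoff, and the noisy-OR combination are all assumptions made for illustration; the claim does not prescribe any of them.

```python
import heapq


def build_index(pair_scores, keep=50):
    """Build a hypothetical index: for each potential input document, retain
    only the `keep` potential output documents with the highest first score.

    pair_scores: {input_doc: {output_doc: first_score}}
    """
    return {
        doc: dict(heapq.nlargest(keep, outs.items(), key=lambda kv: kv[1]))
        for doc, outs in pair_scores.items()
    }


def query(index, inputs, top_n=10):
    """Look up stored first scores for the selected input documents and
    combine them into a second score per output document.

    Assumed combination: noisy-OR, i.e. the probability that at least one
    input document is directly linked to the output document.
    """
    miss = {}  # output_doc -> probability that no input document links to it
    for doc in inputs:
        for out, score in index.get(doc, {}).items():
            if out in inputs:
                continue
            miss[out] = miss.get(out, 1.0) * (1.0 - score)
    ranked = sorted(miss.items(), key=lambda kv: kv[1])  # smallest miss first
    return [(out, 1.0 - m) for out, m in ranked[:top_n]]
```

Precomputing the per-pair scores moves the expensive graph traversal out of the query path, which is one plausible reading of why the claim places the first score in the index and leaves only the second score to the processor.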
Specification