Method and system for probabilistically quantifying and visualizing relevance between two or more citationally or contextually related data objects

US 10,095,778 B2
Filed: 07/06/2015
Issued: 10/09/2018
Est. Priority Date: 09/27/2005
Status: Active Grant

Abstract

In one embodiment, a method for probabilistically quantifying a degree of relevance between two or more citationally or contextually related data objects, such as patent documents, non-patent documents, web pages, personal and corporate contacts information, product information, consumer behavior, technical or scientific information, address information, and the like, is provided. In another embodiment, a method for visualizing and displaying relevance between two or more citationally or contextually related data objects is provided. In another embodiment, a search input/output interface that utilizes an iterative self-organizing mapping technique to automatically generate a visual map of relevant patents and/or other related documents desired to be explored, searched or analyzed is provided. In another embodiment, a search input/output interface that displays and/or communicates search input criteria and corresponding search results in a way that facilitates intuitive understanding and visualization of the logical relationships between two or more related concepts being searched is provided.

183 Citations

18 Claims

1. A computer-implemented method of operating a computerized search engine to identify and rank relevant documents from a corpus comprising multiple millions of citationally-related source documents, said computer-implemented method comprising:
- storing on a computer-readable storage device of the computerized search engine a search index comprising a first set of identification information identifying potential input documents selected from said source documents and, for each said potential input document, a second set of identification information identifying a selected number of citationally-related potential output documents selected from said source documents;
  
  calculating, via one or more computer-processors coupled to said computer-readable storage device, a first numerical score that is statistically correlated to the probability that a direct citation exists between each corresponding pair of citationally-related potential input document and potential output document and wherein said first numerical score is calculated based at least in part on how many indirect citations exist between each said pair of citationally related documents and, for each indirect citation, how many citation links separate each said pair of citationally-related documents;
  
  storing said first numerical score for each said pair of citationally related documents on said computer-readable storage device in association with said search index;
  
  receiving a search query comprising a third set of identification information identifying one or more input documents selected from said source documents;
  
  using said third set of identification information to ascertain from said search index, via said one or more computer-processors, a fourth set of identification information identifying, for each of said one or more input documents, a selected number of corresponding output documents and, for each pair of input document and corresponding output document, said first numerical score;
  
  calculating, responsive to receiving said search query, via said one or more computer-processors, a second numerical score that is statistically correlated to the probability that a direct citation exists between any of said one or more input documents and each of said corresponding output documents, and wherein said second numerical score is calculated based at least in part on said first numerical score;
  
  generating, via said one or more computer-processors, a search query result set comprising identification information identifying one or more of said output documents and wherein said search query result set is sorted or ranked in accordance with said second numerical score; and
  
  storing said search query result set on said computer-readable storage device.
- Dependent Claims (2, 3, 4, 5, 6)
  - 2. The computer-implemented method of claim 1 wherein said search index comprises, for each said potential input document, identification information identifying each citationally-related potential output document extending at least three generations and no more than five generations from each said potential input document.
  - 3. The computer-implemented method of claim 1 wherein calculating said second numerical score comprises calculating, for each output document, the mathematical sum of said first numerical score for each corresponding identified pair of citationally-related input document and output document.
  - 4. The computer-implemented method of claim 1 wherein said search query result set is aggregated by output document.
  - 5. The computer-implemented method of claim 1 further comprising receiving a second search query comprising a fifth set of identification information identifying one or more input documents selected from said search query result set.
  - 6. The computer-implemented method of claim 1 further comprising visually displaying said search query result set in the form of an interactive chart, graph, or map.
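The index-building steps of claim 1 can be illustrated with a minimal sketch. The claim only requires that the first numerical score depend on how many indirect citations connect a pair of documents and on how many citation links each one spans; it does not give a formula. Everything specific below is therefore an assumption made for illustration: the per-link weights in `ASSUMED_WEIGHTS`, the noisy-OR combination in `first_score`, the treatment of citations as undirected links, and all function names.

```python
from collections import defaultdict

# Hypothetical weights (not from the patent): assumed chance that one indirect
# citation spanning `links` citation links signals a direct citation.
ASSUMED_WEIGHTS = {2: 0.10, 3: 0.02}


def build_adjacency(citations):
    """Treat each (citing, cited) pair as an undirected citation link (assumption)."""
    adj = defaultdict(set)
    for citing, cited in citations:
        adj[citing].add(cited)
        adj[cited].add(citing)
    return adj


def count_indirect_paths(adj, source, max_links):
    """Count simple paths of 2..max_links links from `source` to each document.

    Returns {target: {number_of_links: path_count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))

    def walk(node, visited, depth):
        if depth >= 2:                      # depth 1 would be a direct citation
            counts[node][depth] += 1
        if depth == max_links:
            return
        for nxt in adj[node]:
            if nxt not in visited:
                walk(nxt, visited | {nxt}, depth + 1)

    walk(source, {source}, 0)
    return counts


def first_score(path_counts, weights=ASSUMED_WEIGHTS):
    """Combine path counts into one score with a noisy-OR (an assumed formula)."""
    miss = 1.0
    for links, n in path_counts.items():
        miss *= (1.0 - weights.get(links, 0.0)) ** n
    return 1.0 - miss


def build_index(citations, max_links=3, top_k=10):
    """For each potential input document, keep the top_k scored output documents."""
    adj = build_adjacency(citations)
    index = {}
    for doc in list(adj):
        paths = count_indirect_paths(adj, doc, max_links)
        scored = {target: first_score(c) for target, c in paths.items()}
        ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        index[doc] = dict(ranked)
    return index


if __name__ == "__main__":
    toy = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
    print(build_index(toy)["A"])
```

In the toy graph, document C is reached from A by two separate two-link paths, so it scores higher than B or D, which are each reached by a single three-link path. The `max_links` parameter plays the role of the generation limit recited in claim 2; a production index over millions of documents would need a far more efficient traversal than this recursive walk.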

7. A computerized search engine for identifying and ranking relevant documents from a corpus comprising multiple millions of citationally-related source documents, said computerized search engine comprising:
- a computer-accessible storage device containing a search index configured to identify one or more of said source documents that may be relevant to an input search query, said search index comprising:
  
  (i) a first set of identification information identifying potential input documents that may be selected from said source documents;
  
  (ii) a second set of identification information identifying, for each said potential input document, a selected number of citationally-related potential output documents selected from said source documents; and
  
  (iii) a first numerical score that is statistically correlated to the probability that a direct citation exists between each corresponding pair of citationally-related potential input document and potential output document and wherein said first numerical score is calculated based at least in part on how many indirect citations exist between each said pair of citationally related documents and, for each indirect citation, how many citation links separate each said pair of citationally-related documents; and
  
  a computer, communicatively coupled to said computer-accessible storage device, and programmed to:
  
  (i) receive said input search query comprising a third set of identification information identifying one or more input documents selected from said source documents;
  
  (ii) use said third set of identification information to ascertain from said search index a fourth set of identification information identifying, for each of said one or more input documents, a selected number of corresponding output documents and, for each pair of input document and output document, said first numerical score;
  
  (iii) calculate, responsive to receiving said input search query, a second numerical score that is statistically correlated to the probability that a direct citation exists between any of said one or more input documents and each of said output documents, and wherein said second numerical score is calculated based at least in part on said first numerical score;
  
  (iv) generate a search query result set identifying one or more of said output documents and wherein said search query result set is sorted in accordance with said second numerical score; and
  
  (v) store said search query result set in said computer-accessible storage device.
- Dependent Claims (8, 9, 10, 11, 12)
  - 8. The computerized search engine of claim 7 wherein said search index comprises, for each said potential input document, identification information identifying each citationally-related potential output document extending at least three generations and not more than five generations from each said potential input document.
  - 9. The computerized search engine of claim 7 wherein said computer is programmed to calculate said second numerical score for each output document by calculating the mathematical sum of said first numerical score for each corresponding pair of input document and output document.
  - 10. The computerized search engine of claim 7 wherein said computer further comprises a user interface that enables a user to input a second input search query comprising a fifth set of identification information identifying one or more input documents selected from said search query result set.
  - 11. The computerized search engine of claim 10 wherein said user interface further visually displays said search query result set in the form of an interactive self-organizing map.
  - 12. The computerized search engine of claim 7 wherein said computer further comprises a user interface that visually displays said search query result set in the form of an interactive self-organizing map.
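The query-time behavior recited in claim 7, items (i) through (iv), together with the summation of claim 9, might look roughly like the sketch below. The index layout (a dictionary mapping each input document to its output documents and first scores), the function name `run_query`, the document identifiers, and the choice to leave the input documents out of the results are all assumptions for illustration.

```python
def run_query(index, input_docs, limit=None):
    """Rank output documents by the sum of their first scores (as in claim 9).

    index: {input_doc: {output_doc: first_score}}, a hypothetical layout.
    Returns a list of (output_doc, second_score) sorted by descending score.
    """
    inputs = set(input_docs)
    totals = {}
    for doc in inputs:
        # (ii) look up the output documents and first scores for this input
        for out_doc, score in index.get(doc, {}).items():
            if out_doc in inputs:
                continue  # assumption: do not return the inputs themselves
            # (iii) second score = sum of first scores across all inputs
            totals[out_doc] = totals.get(out_doc, 0.0) + score
    # (iv) sort by second score; ties are broken by document id
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


if __name__ == "__main__":
    toy_index = {
        "P1": {"P3": 0.19, "P4": 0.02},
        "P2": {"P3": 0.10, "P5": 0.05},
    }
    for doc_id, score in run_query(toy_index, ["P1", "P2"]):
        print(doc_id, round(score, 2))
```

Because P3 is related to both input documents, its two first scores add up and it ranks first. Summing per output document also yields a result set that is aggregated by output document.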

13. A computer-implemented method of operating a computerized search engine to identify and rank relevant documents from a corpus of citationally-related source documents, said computer-implemented method comprising the following steps executed, in order, by a computing device configured with specific computer-executable instructions:
- (1) receiving an input search query comprising a first set of identification information identifying a set of input documents selected from said corpus;
  
  (2) using said first set of identification information to ascertain, from a search index stored on an associated data storage device, a second set of identification information identifying a set of output documents selected from said corpus, said search index comprising:
  
  (a) a third set of identification information identifying potential input documents that may be selected from said corpus;
  
  (b) a fourth set of identification information identifying, for each said potential input document, a selected number of citationally-related potential output documents selected from said corpus; and
  
  (c) a first numerical score that is statistically correlated to the probability that a direct citation exists between each corresponding pair of citationally-related potential input document and potential output document, and wherein said first numerical score is calculated based on how many indirect citations exist between each said pair of citationally related documents and, for each indirect citation, how many citation links separate each said pair of citationally-related documents;
  
  (3) calculating, responsive to receiving said input search query, a second numerical score that is statistically correlated to the probability that a direct citation exists between any document comprising said set of input documents and each corresponding document comprising said set of output documents, and wherein said second numerical score is calculated based at least in part on said first numerical score;
  
  (4) generating a search query result set comprising identification information identifying one or more of said output documents and wherein said search query result set is sorted or ranked in accordance with said second numerical score; and
  
  (5) storing said search query result set.
- Dependent Claims (14, 15, 16, 17, 18)
  - 14. The computer-implemented method of claim 13 wherein said search index comprises, for each said potential input document, identification information identifying each citationally-related potential output document extending at least three generations and no more than five generations from each said potential input document.
  - 15. The computer-implemented method of claim 13 wherein calculating said second numerical score comprises calculating, for each output document, the mathematical sum of said first numerical score for each corresponding identified pair of citationally-related input document and output document.
  - 16. The computer-implemented method of claim 13 wherein said search query result set is aggregated by output document.
  - 17. The computer-implemented method of claim 13 further comprising receiving a second search query comprising a fifth set of identification information identifying one or more input documents selected from said search query result set.
  - 18. The computer-implemented method of claim 13 further comprising visually displaying said search query result set in the form of an interactive chart, graph, or map.
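Steps (1) through (5) of claim 13, followed by the second query of claim 17 whose input documents are picked from the earlier result set, can be sketched as below. The in-memory list used as the result store, the function names `search` and `follow_up`, the summation used for the second score (taken from claim 15), and the validation rule in `follow_up` are assumptions for illustration only.

```python
def search(index, input_ids, store):
    """Run steps (1) to (5) in order against a prebuilt index.

    index: {input_doc: {output_doc: first_score}}, a hypothetical layout.
    store: a list standing in for the storage device (assumption).
    """
    totals = {}
    for doc_id in input_ids:                                   # (1) query received
        for out_id, first in index.get(doc_id, {}).items():    # (2) index lookup
            totals[out_id] = totals.get(out_id, 0.0) + first   # (3) second score
    result = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))  # (4) rank
    store.append(result)                                       # (5) store result set
    return result


def follow_up(index, previous_result, chosen_ids, store):
    """Second query whose inputs must come from the previous result set."""
    allowed = {doc_id for doc_id, _ in previous_result}
    missing = [d for d in chosen_ids if d not in allowed]
    if missing:
        raise ValueError(f"not in previous result set: {missing}")
    return search(index, chosen_ids, store)


if __name__ == "__main__":
    toy_index = {
        "A": {"C": 0.19, "B": 0.02},
        "C": {"A": 0.19, "E": 0.10},
        "B": {"E": 0.05},
    }
    saved = []
    first = search(toy_index, ["A"], saved)
    second = follow_up(toy_index, first, ["C", "B"], saved)
    print(first)
    print(second)
```

Feeding selected results back in as new inputs lets a user widen or steer the search step by step. Here the second query surfaces document E, which the first query over A alone did not return.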

Current Assignee: PatentRatings, LLC
Original Assignee: PatentRatings, LLC
Inventors: Barney, Jonathan A.
Primary Examiner(s): Jami, Hares

Application Number: US14/792,214
Publication Number: US 20160004768A1
Time in Patent Office: 1,191 Days
Field of Search: 707/705, 707/722, 707/726, 707/728, 707/731, 707/923, 707/930, 707/933, 707/937
CPC Class Codes

G06F 16/14   Details of searching files ...

G06F 16/2228   Indexing structures

G06F 16/24578   using ranking

G06F 16/2465   Query processing support fo...

G06F 16/248   Presentation of query results

G06F 16/26   Visual data mining; Browsin...

G06F 16/334   Query execution G06F16/335 ...

G06F 16/3346   using probabilistic model

G06F 16/34   Browsing; Visualisation the...

G06F 16/382   using citations hypermedia ...

G06F 16/93   Document management systems

G06F 16/95   Retrieval from the web

G06F 16/951   Indexing; Web crawling tech...

G06F 2216/11   Patent retrieval

Y10S 707/912   Applications of a database

Y10S 707/923   Intellectual property

Y10S 707/93   intellectual property analysis

Y10S 707/933   Citation analysis

Y10S 707/937   intellectual property searc...
