System, method, and software for identifying historically related legal cases

US 7,984,053 B2
Filed: 11/16/2009
Issued: 07/19/2011
Est. Priority Date: 04/04/2001
Status: Active Grant


Abstract

In the American legal system, judges and lawyers continually search an ever-expanding body of past judicial opinions, or case law, for the ones most relevant to resolving new disputes. To facilitate these searches, some companies collect and publish the judicial opinions of courts across the United States in both paper and electronic forms, with some of the cases containing references to prior cases from other courts that have previously ruled on all or part of the same dispute. Identifying the prior cases is problematic because, for example, conventional computer text-matching not only suggests too many non-prior cases but also misses too many actual prior cases. Accordingly, the present inventors devised systems, methods, and software that generally facilitate identification of one or more documents that are related to a given document, and particularly facilitate identification of prior cases for a given case. One specific embodiment retrieves prior-case candidates based on information extracted from an input case, and then uses a support vector machine to determine which of the prior-case candidates are most probably prior cases for the input case.

26 Claims

1. A computerized method implemented using a processor and memory, the method comprising:
- extracting information from a first input document;
  
  retrieving one or more second documents based on the extracted information;
  
  identifying one or more of the second documents as more probably related to the first input document than one or more of the other second documents using a learning machine; and
  
  wherein the step of identifying the one or more of the second documents includes defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of a second document to a portion of the first input document and processing the multi-dimensional feature vector using support-vector processing wherein each of the feature vectors is based on a title similarity score for one or more portions of a second document and a said first input document.
- Dependent claims (2, 3, 4):
  - 2. The method of claim 1, wherein the learning machine comprises a support vector machine.
  - 3. The method of claim 1, wherein the documents are legal opinions.
  - 4. The method of claim 1, wherein extracting information from the first input document comprises extracting information identifying one or more persons, places, or legal entities.
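
Claim 1's identifying step can be illustrated with a minimal Python sketch. It is not the patented implementation: the Jaccard token overlap used for the title similarity score, the second "decided no later than" feature, the record fields `title`, `year`, and `id`, and the weights and bias standing in for a trained support vector machine are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens of a title, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def title_similarity(title_a, title_b):
    """Jaccard overlap of title tokens (an assumed similarity metric)."""
    a, b = tokens(title_a), tokens(title_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def feature_vector(input_doc, candidate):
    """Build a feature vector for one (input document, candidate) pair."""
    return [
        # Similarity feature named in the claim: title similarity score.
        title_similarity(input_doc["title"], candidate["title"]),
        # Hypothetical extra feature: candidate decided no later than the input.
        1.0 if candidate["year"] <= input_doc["year"] else 0.0,
    ]


# Hypothetical weights and bias standing in for a trained linear SVM.
WEIGHTS = [4.0, 1.0]
BIAS = -2.0


def decision_score(vector):
    """Linear SVM decision value: w . x + b."""
    return sum(w * x for w, x in zip(WEIGHTS, vector)) + BIAS


def rank_candidates(input_doc, candidates):
    """Return (score, id) pairs, most probably related first."""
    scored = [
        (decision_score(feature_vector(input_doc, c)), c["id"])
        for c in candidates
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    input_case = {"title": "Smith v. Jones", "year": 2000}
    candidate_cases = [
        {"id": "A", "title": "Smith v. Jones", "year": 1998},
        {"id": "B", "title": "Smith v. Acme Corp.", "year": 1997},
        {"id": "C", "title": "Doe v. Roe", "year": 2001},
    ]
    for score, case_id in rank_candidates(input_case, candidate_cases):
        print(case_id, round(score, 2))
```

Ranking by the decision value, rather than only thresholding it, matches the claim's wording that some second documents are "more probably related" than others.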

5. A computerized method for retrieving documents, the method implemented using at least one processor and memory and comprising:
- searching for one or more second documents based on at least one input document;
  
  identifying one or more second documents as more probably related to the at least one input document more than one or more of the other second documents using a learning machine; and
  
  wherein identifying the one or more of the second documents includes defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of a said second document to a portion of the at least one input document and processing the multi-dimensional feature vector using support-vector processing, wherein each of the feature vectors is based on a title similarity score for one or more portions of a said second document and the at least one input document.
- Dependent claims (6, 7, 8, 9, 10):
  - 6. The method of claim 5, wherein the learning machine comprises a support vector machine.
  - 7. The method of claim 5, wherein the documents are legal opinions.
  - 8. The method of claim 5, wherein searching for a set of one or more second documents based on at least one input document, comprises:
    - parsing an input document;
      
      defining one or more queries based [on] results of parsing; and
      
      executing the one or more queries against one or more databases.
  - 9. The method of claim 5, wherein parsing the at least one input document comprises:
    - identifying one or more parties in the document.
  - 10. The method of claim 5, wherein identifying one or more of the second documents as more probably related to the at least one input document more than one or more of the other second documents using a learning machine, comprises:
    - defining a set of feature vectors, with each feature vector based on information related to the at least one input document and a respective one of the second documents; and
      
      communicating the feature vectors to the support vector machine.
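
The searching steps of claims 8 and 9 (parse the input document, identify parties, define queries, execute them against a database) might be sketched as follows. The "A v. B" title pattern, the dictionary query shape, the in-memory list used as the database, and every function name here are hypothetical. The claims do not specify a parser, a query language, or a storage engine.

```python
import re


def parse_parties(title):
    """Split a title of the assumed form 'A v. B' into party names."""
    parts = re.split(r"\s+v\.?\s+", title.strip(), maxsplit=1)
    return [p.strip() for p in parts if p.strip()]


def define_queries(parties):
    """Define one query per identified party (hypothetical query shape)."""
    return [{"term": party.lower()} for party in parties]


def execute_queries(queries, database):
    """Return ids of records whose title contains any query term."""
    hits = []
    for record in database:
        title = record["title"].lower()
        if any(q["term"] in title for q in queries):
            hits.append(record["id"])
    return hits


if __name__ == "__main__":
    # Toy in-memory "database" of case records (illustrative data only).
    database = [
        {"id": "A", "title": "Smith v. Jones"},
        {"id": "B", "title": "Jones v. Acme Corp."},
        {"id": "C", "title": "Doe v. Roe"},
    ]
    parties = parse_parties("Smith v. Jones")
    queries = define_queries(parties)
    print(parties, execute_queries(queries, database))
```

Matching on any single party keeps this retrieval stage deliberately broad. In the claimed arrangement, the learning machine is what then separates the probable related documents from the rest.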

11. A computerized method for identifying related documents, the method implemented using a processor and memory and comprising:
- receiving an input document;
  
  searching at least one database for a set of one or more related second documents based on content of the input document;
  
  identifying one or more of the related second documents as more probably related to the input document than one or more of the other related second documents using a support vector machine; and
  
  wherein identifying the one or more of the related second documents includes defining a multi-dimensional feature vector for each related second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the related second document to a portion of the input document and processing the multi-dimensional feature vector using support-vector processing wherein each of the feature vectors is based on a title similarity score for one or more portions of a said related second document and a said input document.
- Dependent claims (12, 13):
  - 12. The computerized method of claim 11, wherein the input document is a judicial opinion.
  - 13. The computerized method of claim 11, wherein searching the database for a set of one or more related second documents based on content of the input document comprises:
    - extracting one or more party entities from the input document; and
      
      searching the database based on one or more of the party entities.
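
Claim 11 recites a support vector machine but the claims do not say how it is trained. As one possible illustration, the sketch below fits a linear SVM by per-sample subgradient descent on the L2-regularized hinge loss. The training procedure, the hyperparameters, the second feature, and the toy labeled vectors are all assumptions and are not taken from the patent.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    samples: list of feature vectors (lists of floats)
    labels:  +1 for related pairs, -1 for unrelated pairs
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularization shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Sample is inside the margin: move the boundary toward it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def svm_score(w, b, x):
    """Decision value w . x + b; larger means more probably related."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


if __name__ == "__main__":
    # Toy vectors: [title similarity score, hypothetical party-overlap flag].
    samples = [[0.9, 1.0], [0.8, 1.0], [0.2, 0.0], [0.1, 0.0]]
    labels = [1, 1, -1, -1]
    w, b = train_linear_svm(samples, labels)
    for x in samples:
        print(x, "related" if svm_score(w, b, x) > 0 else "unrelated")
```

The decision value can also be used to order candidates, so the same trained model serves both for a yes/no decision and for ranking.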

14. A system comprising:
- means, including a processor and memory, for extracting information from a first input document;
  
  means, including a processor and memory, for retrieving one or more second documents based on the extracted information;
  
  a learning machine for identifying one or more of the second documents as more probably related to the first input document than one or more of the other second documents; and
  
  wherein the learning machine comprises;
  
  support-vector processor means, including a processor and memory, for defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the second document to a portion of the first input document wherein each of the feature vectors is based on a title similarity score for one or more portions of a said second document and a said first input document.
- Dependent claims (15, 16, 17):
  - 15. The system of claim 14, wherein the learning machine comprises a support vector machine.
  - 16. The system of claim 14, wherein the documents are legal opinions.
  - 17. The system of claim 14, wherein the means for extracting information from the first input document extracts information identifying one or more persons, places, or legal entities.

18. A computerized system for retrieving documents, comprising:
- means, including a processor and memory, for searching one or more second documents based on at least one input document;
  
  a learning machine for identifying one or more of the second documents as more probably related to the at least one input document than one or more of the other second documents; and
  
  wherein the learning machine comprises;
  
  support-vector processor means, including a processor and memory, for defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of a said second document to a portion of the at least one input document wherein each of the feature vectors is based on a title similarity score for one or more portions of a said second document and the at least one input document.
- Dependent claims (19, 20, 21, 22, 23):
  - 19. The system of claim 18, wherein the learning machine comprises a support vector machine.
  - 20. The system of claim 18, wherein the documents are legal opinions.
  - 21. The system of claim 18, wherein the means for searching for a set of one or more documents based on a query, comprises:
    - means for parsing the at least one input document;
      
      means for defining one or more queries based on results of parsing; and
      
      means for executing the one or more queries against one or more databases.
  - 22. The system of claim 18, wherein the means for parsing the at least one input document comprises means for identifying one or more parties in the document.
  - 23. The system of claim 18, further comprising means for defining a set of feature vectors, with each feature vector based on information related to the at least one input document and a respective one of the second documents.

24. A system for identifying related documents, the system comprising:
- means, including a processor and memory, for searching at least one database for a set of one or more related second documents based on content of an input document;
  
  a support vector machine for identifying one or more of the related second documents as more probably related to the input document than one or more of the other related second documents; and
  
  wherein the support vector machine comprises;
  
  support-vector processor means, including a processor and memory, for defining a multi-dimensional feature vector for each related second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the related second document to a portion of the input document wherein each of the feature vectors is based on a title similarity score for one or more portions of a said related second document and the input document.
- Dependent claims (25, 26):
  - 25. The system of claim 24, wherein the document is a judicial opinion.
  - 26. The system of claim 24, wherein the means for searching the database for a set of one or more related documents based on content of the input document comprises:
    - means for extracting one or more party entities from the input document; and
      
      means [for] searching the database based on one or more of the party entities.


Current Assignee: Thomson Reuters Enterprise Centre GmbH (The Woodbridge Co. Ltd.)
Original Assignee: West Services
Inventors: Jackson, Peter; Al-Kofahi, Khalid
Primary Examiner(s): Fleurantin, Jean B.
Assistant Examiner(s): Ly, Anh
Application Number: US12/619,056
Publication Number: US 20100125601A1
Time in Patent Office: 610 Days
Field of Search: 707/736, 707/737, 707/738, 707/755, 707/770, 706/7, 706/17, 704/1, 704/2, 704/7, 715/500, 715/531
US Class Current: 707/736
CPC Class Codes:
G06F 16/30   of unstructured textual dat...
G06F 16/3334   Selection or weighting of t...
G06F 16/334   Query execution G06F16/335 ...
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99935   Query augmenting and refini...
