SYSTEM, METHOD, AND SOFTWARE FOR IDENTIFYING HISTORICALLY RELATED LEGAL CASES

US 20100125601A1
Filed: 11/16/2009
Published: 05/20/2010
Est. Priority Date: 04/04/2001
Status: Active Grant

Abstract

In the American legal system, judges and lawyers continually search an ever-expanding body of past judicial opinions, or case law, for the ones most relevant to resolving new disputes. To facilitate these searches, some companies collect and publish the judicial opinions of courts across the United States in both paper and electronic forms, with some of the cases containing references to prior cases from other courts that have previously ruled on all or part of the same dispute. Identifying these prior cases is difficult because, for example, conventional computer text-matching not only suggests too many non-prior cases but also misses too many actual prior cases. Accordingly, the present inventors devised systems, methods, and software that generally facilitate identification of one or more documents that are related to a given document, and particularly facilitate identification of prior cases for a given case. One specific embodiment retrieves prior-case candidates based on information extracted from an input case, and then uses a support vector machine to determine which of the prior-case candidates are most probably prior cases for the input case.
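As a rough illustration of the embodiment summarized above, the following Python sketch retrieves prior-case candidates by party name and then ranks them with a linear decision function of the kind a trained support vector machine produces. The helper names, the two similarity features, the sample cases, and the weights are all hypothetical; none of them are taken from the patent.

```python
import re
from collections import Counter
from math import sqrt

def tokens(text):
    """Lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between the token-count vectors of two strings."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def extract_parties(title):
    """Split a case title such as 'Smith v. Jones' into lowercase party names."""
    return [p.strip().lower() for p in re.split(r"\s+v\.?\s+", title) if p.strip()]

def retrieve(parties, corpus):
    """Return candidate cases whose title mentions any extracted party."""
    return [c for c in corpus if any(p in c["title"].lower() for p in parties)]

def features(case, cand):
    """Two-dimensional feature vector: title similarity and text similarity."""
    return [cosine(case["title"], cand["title"]), cosine(case["text"], cand["text"])]

# Assumption: made-up weights and bias standing in for a trained linear SVM.
WEIGHTS, BIAS = [1.5, 2.0], -1.0

def decision(x):
    """Linear decision value w.x + b; larger means more probably a prior case."""
    return sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS

def rank_prior_cases(case, corpus):
    """Extract parties, retrieve candidates, and sort them by decision value."""
    cands = retrieve(extract_parties(case["title"]), corpus)
    return sorted(((decision(features(case, c)), c["title"]) for c in cands), reverse=True)

# Hypothetical sample data.
input_case = {
    "title": "Smith v. Jones",
    "text": "Appeal from a judgment for breach of a lease contract between Smith and Jones.",
}
corpus = [
    {"title": "Smith v. Jones",
     "text": "Trial court judgment on breach of a lease contract; Smith sued Jones."},
    {"title": "Smith v. Acme Corp.",
     "text": "Patent infringement dispute over widget design."},
    {"title": "Doe v. Roe",
     "text": "Custody dispute."},
]

ranked = rank_prior_cases(input_case, corpus)
for score, title in ranked:
    print(f"{score:+.2f}  {title}")
```

In this sketch the retrieval step deliberately over-generates (any shared party name qualifies), and the scoring step is what separates the likely prior case from a candidate that merely shares a party.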

28 Claims

1. A computerized method implemented using a processor and memory, the method comprising:
- extracting information from a first document;
  
  retrieving a set of one or more second documents based on the extracted information;
  
  identifying one or more of the set of second documents as more probably related to the first document than one or more others of the second documents using a learning machine; and
  
  wherein identifying the one or more of the second set of documents includes defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the second document to a portion of the first document and processing the multi-dimensional feature vector using support-vector processing.
- View Dependent Claims (2, 3, 4)
- - 2. The method of claim 1, wherein the learning machine comprises a support vector machine.
  - 3. The method of claim 1, wherein the documents are legal opinions.
  - 4. The method of claim 1, wherein extracting information from the first document comprises extracting information identifying one or more persons, places, or legal entities.

5. A computerized method for retrieving documents, the method implemented using at least one processor and memory and comprising:
- searching for a set of one or more documents based on a set of one or more queries;
  
  identifying one or more of the set of documents as more probably related to the query than one or more of other of the documents using a learning machine; and
  
  wherein identifying the one or more of the second set of documents includes defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the second document to a portion of the first document and processing the multi-dimensional feature vector using support-vector processing.
- View Dependent Claims (6, 7, 8, 9, 10, 11)
- - 6. The method of claim 5, wherein the learning machine comprises a support vector machine.
  - 7. The method of claim 5, wherein the documents are legal opinions.
  - 8. The method of claim 5, wherein searching for a set of one or more documents based on a query, comprises:
    - parsing an input document;
      
      defining one or more queries based on results of parsing; and
      
      executing the one or more queries against one or more databases.
  - 9. The method of claim 5, wherein parsing the input document comprises:
    - identifying one or more parties in the document.
  - 10. The method of claim 5, wherein identifying one or more of the documents as more probably related to the query than one or more of other of the documents using a learning machine, comprises:
    - defining a set of feature vectors, with each feature vector based on information related to the query and a respective one of the set of documents; and
      
      communicating the feature vectors to the support vector machine.
  - 11. The method of claim 5, wherein the query is based on an input document, and each of the feature vectors is based on similarity score for one or more portions of the document and the respective one of the set of documents.
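To make the feature-vector and support-vector steps in claims 10 and 11 more concrete, here is a minimal Python sketch that trains a linear support vector machine on two-dimensional feature vectors of similarity scores. The training method (batch subgradient descent on the regularized hinge loss), the hyperparameters, and the toy data are assumptions chosen for illustration; the claims do not specify any of them.

```python
def score(w, b, x):
    """Decision value w.x + b for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Minimize (lam/2)*||w||^2 + mean hinge loss by batch subgradient descent.

    X: list of feature vectors; y: labels in {+1, -1}.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0   # gradient of the regularizer
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1:       # point violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def classify(w, b, x):
    """+1 if the candidate looks related, otherwise -1."""
    return 1 if score(w, b, x) > 0 else -1

# Hypothetical training set: each vector is [title similarity, text similarity]
# for a (query document, candidate document) pair.
X = [[0.9, 0.8], [1.0, 0.6], [0.8, 0.9],   # related pairs
     [0.2, 0.1], [0.5, 0.0], [0.1, 0.3]]   # unrelated pairs
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print(classify(w, b, [0.95, 0.85]), classify(w, b, [0.1, 0.1]))
```

Sorting candidates by `score` rather than thresholding with `classify` gives a ranking in which higher-scoring candidates are treated as more probably related.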

12. A computerized method for identifying related documents, the method implemented using a processor and memory and comprising:
- receiving an input document;
  
  searching at least one database for a set of one or more related documents based on content of the input document;
  
  identifying one or more of the related documents as more probably related to the input document than one or more of other of the related documents using a support vector machine; and
  
  wherein identifying the one or more of the second set of documents includes defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the second document to a portion of the first document and processing the multi-dimensional feature vector using support-vector processing.
- View Dependent Claims (13, 14)
- - 13. The computerized method of claim 12, wherein the input document is a judicial opinion.
  - 14. The computerized method of claim 12, wherein searching the database for a set of one or more related documents based on content of the input document comprises:
    - extracting one or more party entities from the input document; and
      
      searching the database based on one or more of the party entities.

15. A system comprising:
- means, including a processor and memory, for extracting information from a first document;
  
  means, including a processor and memory, for retrieving a set of one or more second documents based on the extracted information;
  
  a learning machine for identifying one or more of the set of second documents as more probably related to the first document than one or more others of the second documents; and
  
  wherein the learning machine comprises:
  
  support-vector processor means, including a processor and memory, for defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the second document to a portion of the first document.
- View Dependent Claims (16, 17, 18)
- - 16. The system of claim 15, wherein the learning machine comprises a support vector machine.
  - 17. The system of claim 15, wherein the documents are legal opinions.
  - 18. The system of claim 15, wherein the means for extracting information from the first document extracts information identifying one or more persons, places, or legal entities.

19. A computerized system for retrieving documents, comprising:
- means, including a processor and memory, for searching for a set of one or more documents based on a set of one or more queries;
  
  a learning machine for identifying one or more of the set of documents as more probably related to the query than one or more of other of the documents; and
  
  wherein the learning machine comprises:
  
  support-vector processor means, including a processor and memory, for defining a multi-dimensional feature vector for each second document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the second document to a portion of the first document.
- View Dependent Claims (20, 21, 22, 23, 24, 25)
- - 20. The system of claim 19, wherein the learning machine comprises a support vector machine.
  - 21. The system of claim 19, wherein the documents are legal opinions.
  - 22. The system of claim 19, wherein the means for searching for a set of one or more documents based on a query, comprises:
    - means for parsing an input document;
      
      means for defining one or more queries based on results of parsing; and
      
      means for executing the one or more queries against one or more databases.
  - 23. The system of claim 19, wherein the means for parsing the input document comprises means for identifying one or more parties in the document.
  - 24. The system of claim 19, further comprising means for defining a set of feature vectors, with each feature vector based on information related to the query and a respective one of the set of documents.
  - 25. The system of claim 24, wherein the query is based on an input document, and each of the feature vectors is based on similarity score for one or more portions of the document and the respective one of the set of documents.

26. A system for identifying related documents, the system comprising:
- means, including a processor and memory, for searching at least one database for a set of one or more related documents based on content of an input document;
  
  a support vector machine for identifying one or more of the related documents as more probably related to the input document than one or more of other of the related documents; and
  
  wherein the support vector machine comprises:
  
  support-vector processor means, including a processor and memory, for defining a multi-dimensional feature vector for each related document, with the vector having a set of features including a similarity feature indicating similarity of at least a portion of the related document to a portion of the input document.
- View Dependent Claims (27, 28)
- - 27. The system of claim 26, wherein the document is a judicial opinion.
  - 28. The system of claim 26, wherein the means for searching the database for a set of one or more related documents based on content of the input document comprises:
    - means for extracting one or more party entities from the input document; and
      
      means for searching the database based on one or more of the party entities.

Current Assignee
Thomson Reuters Enterprise Centre GmbH (The Woodbridge Co. Ltd.)
Original Assignee
West Services
Inventors
Jackson, Peter, Al-Kofahi, Khalid

Granted Patent

US 7,984,053 B2
US Class Current

707/780
CPC Class Codes

G06F 16/30   of unstructured textual dat...

G06F 16/3334   Selection or weighting of t...

G06F 16/334   Query execution G06F16/335 ...

Y10S 707/99933   Query processing, i.e. sear...

Y10S 707/99935   Query augmenting and refini...
