System and method for identifying facts and legal discussion in court case law documents

  • US 6,772,149 B1
  • Filed: 09/23/1999
  • Issued: 08/03/2004
  • Est. Priority Date: 09/23/1999
  • Status: Expired due to Term
First Claim

1. A method of gathering large quantities of training data from case law documents and of extracting features that are independent of specific machine learning algorithms needed to accurately classify case law text passages as fact passages or as discussion passages, the method comprising:

    a) partitioning text passages within an opinion segment of a case law document by headings contained therein;

    b) comparing the headings in the document:

    1) to fact headings in a fact heading list, said fact headings in said fact heading list representing a specific set of predefined terms and phrases; and

    2) to discussion headings in a discussion heading list, said discussion headings in said discussion heading list representing a specific set of predefined terms and phrases;

    c) filtering from out of the document:

    1) the headings in said document that match at least one of said fact headings and said discussion headings set forth in said fact heading list and said discussion heading list, respectively; and

    2) text passages that are associated with the filtered headings;

    d) categorizing the text passages as fact training data or as discussion training data based on the filtered headings associated with said text passages, and storing the fact training data and the discussion training data on persistent storage;

    e) determining a relative position of the text passages in said opinion segment;

    f) parsing the text passages into text chunks;

    g) comparing the text chunks to predetermined feature entities for possible matched feature entities, said predetermined feature entities including at least five of:

    i) a Case Cite format;

    ii) a Statute Cite format;

    iii) entities in a Past Tense Verb list;

    iv) a Date format;

    v) entities in a Signal Word list;

    vi) entities in a This Court Phrases list;

    vii) entities in a Lower Court Phrases list;

    viii) entities in a Defendant Words list;

    ix) entities in a Plaintiff Words list; and

    x) entities in a Legal Phrases list;

    h) associating the relative position and matched feature entities with the text passages, for use by one of the learning algorithms; and

    i) classifying each of the text passages as at least one of a fact passage or a discussion passage based on the relative position and matched feature entities.
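
Steps a) through h) of the claim can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the contents of the heading lists and word lists, the regular expressions standing in for the cite and date formats, the rule that an all-uppercase line is a heading, the choice of one line per text chunk, and the definition of relative position as the section index scaled to the range 0 to 1. Step i) is left out because the claim does not tie the features to any particular learning algorithm.

```python
import re

# Hypothetical, abbreviated heading lists (claim steps b and c).
FACT_HEADINGS = {"facts", "background", "factual background"}
DISCUSSION_HEADINGS = {"discussion", "analysis"}

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Illustrative stand-ins for six of the ten feature entities (claim step g).
PATTERNS = {
    "case_cite": re.compile(r"\b\d+\s+[A-Z][\w.]*\s+\d+\b"),
    "statute_cite": re.compile(r"\b\d+\s+U\.S\.C\.\s+§+\s*\d+"),
    "date": re.compile(r"\b(?:" + MONTHS + r")\s+\d{1,2},\s+\d{4}\b"),
    "past_tense_verb": re.compile(r"\b(?:filed|alleged|signed|testified)\b", re.I),
    "this_court_phrase": re.compile(r"\b(?:we hold|we conclude|this court)\b", re.I),
    "defendant_word": re.compile(r"\bdefendants?\b", re.I),
}


def partition(lines):
    """Step a: split opinion lines into (heading, chunks) sections.

    Assumption: a heading is any non-empty line written entirely in uppercase.
    """
    sections, heading, chunks = [], None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.isupper():
            if heading is not None:
                sections.append((heading, chunks))
            heading, chunks = line, []
        elif heading is not None:
            chunks.append(line)  # step f: one line is treated as one chunk
    if heading is not None:
        sections.append((heading, chunks))
    return sections


def extract_features(chunks):
    """Step g: count matches of each feature pattern across the chunks."""
    counts = {name: 0 for name in PATTERNS}
    for chunk in chunks:
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(chunk))
    return counts


def build_training_records(lines):
    """Steps b to e and h: label passages by heading and attach features."""
    sections = partition(lines)
    last_index = max(len(sections) - 1, 1)
    records = []
    for index, (heading, chunks) in enumerate(sections):
        key = heading.lower()
        if key in FACT_HEADINGS:
            label = "fact"
        elif key in DISCUSSION_HEADINGS:
            label = "discussion"
        else:
            continue  # heading is on neither list, so the passage is not used
        records.append({
            "label": label,
            "position": index / last_index,  # assumed definition of relative position
            "features": extract_features(chunks),
        })
    return records


# Invented sample opinion, for demonstration only.
OPINION = [
    "FACTS",
    "The plaintiff filed suit on June 3, 1998.",
    "The defendant signed the lease.",
    "DISCUSSION",
    "We hold that 42 U.S.C. § 1983 applies.",
    "See Roe v. Wade, 410 U.S. 113.",
    "CONCLUSION",
    "Affirmed.",
]

for record in build_training_records(OPINION):
    print(record)
```

Each record pairs a label derived from the heading with a relative position and a set of feature counts, so the same records could be handed to any classifier. This mirrors the claim's point that the features are independent of a specific machine learning algorithm.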
