Systems and methods for predictive coding
First Claim
1. A method for analyzing a plurality of documents, comprising:
receiving the plurality of documents via a computing device;
filtering the plurality of documents to produce a subset of the plurality of documents;
executing instructions stored in memory, wherein execution of the instructions by a processor generates an initial control set based on random sampling of the subset of the plurality of documents;
receiving user input from the computing device, the user input based on an identified subject or category; and
executing instructions stored in memory, wherein execution of the instructions by a processor:
reviews the initial control set to determine at least one seed set parameter associated with the identified subject or category,
automatically codes a first portion of the plurality of documents, based on the initial control set and the at least one seed set parameter associated with the identified subject or category,
automatically codes a second portion of the plurality of documents resulting from an application of user analysis and an adaptive identification cycle, and
adds the coded second portion of the plurality of documents to the initial control set.
Abstract
Systems and methods for analyzing documents are provided herein. A plurality of documents and user input are received via a computing device. The user input includes hard coding of a subset of the plurality of documents, based on an identified subject or category. Instructions stored in memory are executed by a processor to generate an initial control set, analyze the initial control set to determine at least one seed set parameter, automatically code a first portion of the plurality of documents based on the initial control set and the seed set parameter associated with the identified subject or category, analyze the first portion of the plurality of documents by applying an adaptive identification cycle, and retrieve a second portion of the plurality of documents based on a result of the application of the adaptive identification cycle test on the first portion of the plurality of documents.
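The first steps described above (filtering, random sampling into an initial control set, deriving a seed set parameter from user coding, and automatic coding) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `keep` filter, and the choice of per-term weights as the "seed set parameter" are all hypothetical, since the abstract does not say how the parameter is computed.

```python
import random
from collections import Counter


def build_control_set(docs, keep, sample_size, seed=0):
    """Filter the documents, then randomly sample the subset (hypothetical helper)."""
    subset = [d for d in docs if keep(d)]
    rng = random.Random(seed)
    control = rng.sample(subset, min(sample_size, len(subset)))
    return subset, control


def seed_parameters(control_set, labels):
    """Assumed 'seed set parameter': a weight per term, +1 for each relevant
    control document containing it and -1 for each non-relevant one."""
    weights = Counter()
    for doc, relevant in zip(control_set, labels):
        for term in set(doc.lower().split()):
            weights[term] += 1 if relevant else -1
    return weights


def auto_code(docs, weights):
    """Code a document as relevant when its summed term weights are positive."""
    coded = {}
    for doc in docs:
        score = sum(weights[t] for t in set(doc.lower().split()))
        coded[doc] = score > 0
    return coded
```

In this sketch the user input is the `labels` list passed to `seed_parameters`, one relevant/non-relevant decision per control-set document. A real system would likely use a trained classifier in place of the term-weight heuristic.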
28 Claims
1. A method for analyzing a plurality of documents, comprising:
receiving the plurality of documents via a computing device;
filtering the plurality of documents to produce a subset of the plurality of documents;
executing instructions stored in memory, wherein execution of the instructions by a processor generates an initial control set based on random sampling of the subset of the plurality of documents;
receiving user input from the computing device, the user input based on an identified subject or category; and
executing instructions stored in memory, wherein execution of the instructions by a processor:
reviews the initial control set to determine at least one seed set parameter associated with the identified subject or category,
automatically codes a first portion of the plurality of documents, based on the initial control set and the at least one seed set parameter associated with the identified subject or category,
automatically codes a second portion of the plurality of documents resulting from an application of user analysis and an adaptive identification cycle, and
adds the coded second portion of the plurality of documents to the initial control set.
Dependent claims: 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
6. A method for analyzing a plurality of documents, further comprising:
receiving the plurality of documents via a computing device;
filtering the plurality of documents to produce a subset of the plurality of documents;
executing instructions stored in memory, wherein execution of the instructions by a processor generates an initial control set based on random sampling of the subset of the plurality of documents;
receiving user input from the computing device, the user input based on an identified subject or category; and
executing instructions stored in memory, wherein execution of the instructions by a processor:
reviews the initial control set to determine at least one seed set parameter associated with the identified subject or category,
automatically codes a first portion of the plurality of documents, based on the initial control set and the at least one seed set parameter associated with the identified subject or category,
analyzes the first portion of the plurality of documents by applying an adaptive identification cycle, the adaptive identification cycle being based on the initial control set, user validation of the automated coding of the first portion of the plurality of documents and confidence threshold validation, and
retrieves a second portion of the plurality of documents based on a result of the application of the adaptive identification cycle on the first portion of the plurality of documents.
19. A method for analyzing a plurality of documents, comprising:
receiving the plurality of documents via a computing device;
filtering the plurality of documents to produce a subset of the plurality of documents;
executing instructions stored in memory, wherein execution of the instructions by a processor generates an initial control set based on random sampling of the subset of the plurality of documents on a rolling load basis;
receiving user input from the computing device, the user input based on an identified subject or category; and
executing instructions stored in memory, wherein execution of the instructions by a processor:
reviews the initial control set to determine at least one seed set parameter associated with the identified subject or category,
automatically codes a first portion of the plurality of documents, based on the initial control set and the at least one seed set parameter associated with the identified subject or category,
automatically codes a second portion of the plurality of documents resulting from an application of user analysis and an adaptive identification cycle, and
adds the coded second portion of the plurality of documents to the initial control set.
Dependent claims: 20, 21, 22, 24, 25, 26, 27, 28
23. A method for analyzing a plurality of documents, comprising:
receiving the plurality of documents via a computing device;
filtering the plurality of documents to produce a subset of the plurality of documents;
executing instructions stored in memory, wherein execution of the instructions by a processor generates an initial control set based on random sampling of the subset of the plurality of documents on a rolling load basis;
receiving user input from the computing device, the user input based on an identified subject or category; and
executing instructions stored in memory, wherein execution of the instructions by a processor:
reviews the initial control set to determine at least one seed set parameter associated with the identified subject or category,
automatically codes a first portion of the plurality of documents, based on the initial control set and the at least one seed set parameter associated with the identified subject or category,
analyzes the first portion of the plurality of documents by applying an adaptive identification cycle, the adaptive identification cycle being based on the initial control set, user validation of the automated coding of the first portion of the plurality of documents and confidence threshold validation, and
retrieves a second portion of the plurality of documents based on a result of the application of the adaptive identification cycle on the first portion of the plurality of documents.
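The claims recite random sampling "on a rolling load basis" and an adaptive identification cycle built on the control set, user validation, and confidence threshold validation, but they do not spell out an algorithm for either. The sketch below shows one plausible reading, with everything named here being an assumption: reservoir sampling stands in for rolling-load sampling, `overlap_classifier` is a toy confidence scorer, and `threshold`, `batch`, and `max_rounds` are hypothetical parameters.

```python
import random


def reservoir_sample(stream, k, rng):
    """Uniform sample of k items from documents that arrive over time
    (Algorithm R), used here as an assumed model of rolling loads."""
    sample = []
    for i, doc in enumerate(stream):
        if i < k:
            sample.append(doc)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = doc
    return sample


def overlap_classifier(doc, control_set):
    """Toy scorer: compare term overlap with relevant vs. non-relevant
    control documents and return (label, confidence in [0, 1])."""
    terms = set(doc.lower().split())
    pos = sum(len(terms & set(d.lower().split()))
              for d, rel in control_set.items() if rel)
    neg = sum(len(terms & set(d.lower().split()))
              for d, rel in control_set.items() if not rel)
    total = pos + neg
    confidence = abs(pos - neg) / total if total else 0.0
    return pos > neg, confidence


def adaptive_cycle(uncoded, control_set, classify, ask_user,
                   threshold=0.8, batch=2, max_rounds=10):
    """Accept confident automatic codes; route a batch of low-confidence
    documents to the user, add them to the control set, and retry the rest.
    Note: control_set (dict of doc -> bool) is updated in place."""
    coded = {}
    pending = list(uncoded)
    for _ in range(max_rounds):
        if not pending:
            break
        uncertain = []
        for doc in pending:
            label, confidence = classify(doc, control_set)
            if confidence >= threshold:
                coded[doc] = label          # confidence threshold validation
            else:
                uncertain.append(doc)
        for doc in uncertain[:batch]:
            label = ask_user(doc)           # user validation
            control_set[doc] = label        # grow the control set
            coded[doc] = label
        pending = uncertain[batch:]         # re-scored next round
    return coded, pending
```

The design choice worth noting is the feedback loop: each user decision enlarges the control set, so documents that were uncertain in one round may be coded automatically in the next. Documents still uncertain after `max_rounds` are returned as the second element rather than being forced into a category.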
Specification