Data mining by retrieving causally-related documents not individually satisfying search criteria used
Abstract
This patent describes a method and apparatus to automatically and accurately winnow down arbitrarily large amounts of electronic information created by a particular population of actors to only those subsets of particular interest by having a causal relationship, even when retrieved documents containing this information do not individually satisfy the search criteria used. An actor in this context is defined as any entity, single or aggregate, capable of creating, distributing, modifying, or receiving digital information. Once identified, this subset of information may, for example, be processed, analyzed, redacted, or destroyed, depending on the context of the system's use.
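As a rough illustration of the idea in the abstract, and not the patented method itself, the sketch below returns every message in a discussion when any one message matches a keyword, so that replies which do not contain the search term are still retrieved. The toy corpus, the `reply_to` link used as a stand-in for a causal relationship, and the names `root_of` and `search_with_discussions` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy corpus: "reply_to" is an assumed stand-in for a causal link.
DOCS = {
    "m1": {"text": "Can we discuss the Q3 revenue restatement?", "reply_to": None},
    "m2": {"text": "Sure, call me tonight.", "reply_to": "m1"},
    "m3": {"text": "Done. Delete the earlier drafts.", "reply_to": "m2"},
    "m4": {"text": "Lunch on Friday?", "reply_to": None},
}


def root_of(doc_id, docs):
    """Follow reply_to links up to the first message of the discussion."""
    seen = set()
    while docs[doc_id]["reply_to"] is not None and doc_id not in seen:
        seen.add(doc_id)  # guard against accidental cycles in the links
        doc_id = docs[doc_id]["reply_to"]
    return doc_id


def search_with_discussions(keyword, docs):
    """Return all documents in any discussion that has at least one keyword hit."""
    # Group every document under the root message of its discussion.
    threads = defaultdict(list)
    for doc_id in docs:
        threads[root_of(doc_id, docs)].append(doc_id)

    # Plain keyword search: only these documents satisfy the criteria individually.
    hits = {d for d, doc in docs.items() if keyword.lower() in doc["text"].lower()}

    # Expand each hit to its whole discussion.
    hit_roots = {root_of(d, docs) for d in hits}
    return sorted(d for r in hit_roots for d in threads[r])


print(search_with_discussions("restatement", DOCS))  # ['m1', 'm2', 'm3']
```

Here only `m1` contains the search term, yet `m2` and `m3` are returned because they belong to the same reply chain, while the unrelated `m4` is left out.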
28 Claims
1. A computer-aided method of providing analysis in a corpus of documents comprising at least one discussion, the method comprising:
analyzing the discussion to identify a communicative intent of an author of the discussion, including topic, an intended audience, and beginning and ending of a series of communications comprising the discussion even when the retrieved documents do not individually satisfy the search criteria used, the analysis providing a causal relationship between the series of communications.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A computer-aided method comprising:
analyzing a discussion to identify a communicative intent of an author of the discussion, including topic, an intended audience, a beginning, and ending of a series of communications comprising the discussion even when the documents retrieved therefor do not individually satisfy the search criteria used; and determining anomalies in a corpus of documents, wherein the corpus of documents includes multiple data sets from multiple sources, and the anomalies represent substantive differences between the multiple data sets.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
26. A computer-aided method comprising:
analyzing a discussion comprising a plurality of causally related documents to identify a communicative intent of an author of the discussion, including topic, an intended audience, a beginning, and ending of a series of communications comprising the discussion even when the retrieved documents do not individually satisfy the search criteria used; and enabling redaction of a portion of the document, by removing data from a document and replacing the removed data with data of no value.
Dependent claims: 27, 28
Specification