SYSTEM AND METHOD FOR UTILIZING ADVANCED SEARCH AND HIGHLIGHTING TECHNIQUES FOR ISOLATING SUBSETS OF RELEVANT CONTENT DATA
First Claim
1. A method for searching through vast amounts of content data to identify relevant content data, the method comprising the steps of:
executing a search routine based on one or more query terms constructed by an automated routine including highlighting and bookmarking techniques to retrieve a subset of responsive content data;
determining a corresponding probability of relevancy for each unit of content data in the responsive content data; and
removing from the responsive content data, one or more units of content data that do not reach a threshold probability of relevancy.
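The three claimed steps (search, score, remove below threshold) can be illustrated with a minimal sketch. The claim does not say how the probability of relevancy is computed, so the scoring function here (the fraction of query terms found in a document) is purely an assumption for illustration, as are the names `Doc`, `search`, `relevancy`, and `filter_by_threshold`.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical unit of content data."""
    doc_id: str
    text: str


def search(docs, query_terms):
    """Return the documents that contain at least one query term."""
    terms = [t.lower() for t in query_terms]
    return [d for d in docs if any(t in d.text.lower() for t in terms)]


def relevancy(doc, query_terms):
    """Assumed stand-in for a probability of relevancy:
    the fraction of query terms that appear in the document."""
    terms = [t.lower() for t in query_terms]
    if not terms:
        return 0.0
    text = doc.text.lower()
    return sum(t in text for t in terms) / len(terms)


def filter_by_threshold(responsive, query_terms, threshold):
    """Remove documents whose score does not reach the threshold."""
    return [d for d in responsive if relevancy(d, query_terms) >= threshold]
```

For example, with the query terms `["merger", "escrow"]` and a threshold of 0.75, a document that mentions only "merger" scores 0.5 and is removed, while a document that mentions both terms scores 1.0 and is kept.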
Abstract
A system and methods for utilizing advanced automated search techniques, including highlighting capability, for determining subsets of relevant content data (in paper or electronic form) are disclosed. These techniques are advantageous in reviewing vast collections of content data or documents to identify relevant data or documents from the collections. The advanced search techniques are based on query terms, which isolate relevant content data that respond to the query terms. A probability of relevancy can be determined for a unit of content data or document in the returned subset to facilitate exclusion of a document from the subset if it does not reach a threshold probability of relevancy. Documents in a thread of a correspondence (for example, an e-mail) in the responsive documents subset can be added to the responsive documents subset. Further, an attachment to a document in the responsive documents subset can be added to the responsive documents subset. A statistical technique is applied to determine whether remaining documents in the collection meet a predetermined acceptance level.
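The thread and attachment expansion and the acceptance check described in the abstract can be sketched as follows. The abstract does not name the statistical technique, so the random-sample check below is an assumed stand-in, not the disclosed method. The mappings `thread_of` and `attachments_of`, the reviewer callback `is_relevant`, and the parameter `max_relevant_rate` are all hypothetical.

```python
import random


def expand_responsive(responsive_ids, thread_of, attachments_of):
    """Add thread members and attachments of responsive documents.

    thread_of: maps a document id to its thread id (hypothetical).
    attachments_of: maps a document id to its attachment ids (hypothetical).
    """
    expanded = set(responsive_ids)
    for doc_id in responsive_ids:
        thread = thread_of.get(doc_id)
        if thread is not None:
            # Pull in every document that shares the same thread.
            expanded.update(d for d, t in thread_of.items() if t == thread)
        # Pull in attachments of the responsive document itself.
        expanded.update(attachments_of.get(doc_id, ()))
    return expanded


def remainder_meets_acceptance(remaining_ids, is_relevant, sample_size,
                               max_relevant_rate, seed=0):
    """Assumed acceptance check: sample the non-responsive remainder and
    accept it if the share of relevant documents found in the sample
    does not exceed max_relevant_rate."""
    rng = random.Random(seed)
    pool = sorted(remaining_ids)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    if not sample:
        return True  # Nothing left to check.
    missed = sum(1 for d in sample if is_relevant(d))
    return missed / len(sample) <= max_relevant_rate
```

In this sketch a failed acceptance check would suggest refining the query terms and searching again, though the abstract itself does not state what follows a failed check.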
Specification