Providing collection transparency information to an end user to achieve a guaranteed quality document search and production in electronic data discovery

  • US 8,140,494 B2
  • Filed: 01/21/2008
  • Issued: 03/20/2012
  • Est. Priority Date: 01/21/2008
  • Status: Expired due to Fees
First Claim

1. A method for providing collection transparency information to an end user to achieve a guaranteed quality document search and production in electronic data discovery, comprising the steps of:

  • providing a content management system configured for performing the steps of:

    providing a search results page;

    extracting indexing state and text index-ability information from a content repository with regard to files identified on said search results page;

    classifying said files in said content repository with regard to text index-ability and indexing state as follows:

    not indexable: types of files that are known not to be indexable and wherein said content management system does not attempt to perform full-text indexing on said types of files to enhance user experience with time;

    indexable, but not indexed yet: indexable files that have not been indexed yet;

    failed to index: files that are considered to be indexable for which an indexing attempt failed; and

    indexable and indexed: files that were successfully indexed;

    wherein text index-ability is any of: not indexable and indexable and wherein indexing state is any of: indexed, not indexed yet, and failed to index;

    collecting statistical information on what file types are not indexable by collecting statistics of indexing failure per file type, said indexing failure per file type based on whether:

    a file was of an indexable type but was corrupt;

    a file was of an indexable type but indexing failed;

    files of different types use a same file extension; and

    a file was treated as indexable but it was not an indexable file type;

    observing said collected statistics of indexing failure per file type by, over time, collecting the following information per file type:

    how many files of each type were indexed successfully;

    how many files of each type failed to index; and

    how many files of each type have been uploaded;

    based on said observance of said collected statistics of indexing failure per file type, calculating a ratio of indexing failure in accordance with the following formula:

    ratio of failure of a given type = number of failed files of a given type / number of files of a given type uploaded and attempted to index;

    reporting file types having a high indexing failure ratio;

    classifying said file types having a high indexing failure ratio as not indexable and adding said file types to a do not index list;

    displaying indexing and extraction status of files contained in said content repository and pertaining to a given matter or legal request or a particular search query in the processing status area of the search results page based upon said classifying said files, as well as said extracting indexing state and text index-ability information from said content repository; and

    providing a processing status area of said search results page and a file detail information page for displaying a processing status warning next to a file entry to allow a user to see what processing problems occurred with each file;

    wherein the steps of the method are performed on one or more computing devices.
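
The statistics and classification steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation. The names `IndexingTracker`, `TypeStats`, and `Status`, the `threshold` value used to decide what counts as a "high" failure ratio, and the `min_attempts` guard are all assumptions made for illustration; the claim does not specify any of them.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """The four classifications named in the claim."""
    NOT_INDEXABLE = "not indexable"
    NOT_INDEXED_YET = "indexable, but not indexed yet"
    FAILED_TO_INDEX = "failed to index"
    INDEXED = "indexable and indexed"


@dataclass
class TypeStats:
    """Counters kept per file type (hypothetical structure)."""
    uploaded: int = 0
    attempted: int = 0
    indexed: int = 0
    failed: int = 0

    def failure_ratio(self) -> float:
        # failed files of a type / files of that type uploaded and attempted
        return self.failed / self.attempted if self.attempted else 0.0


class IndexingTracker:
    def __init__(self, threshold: float = 0.5, min_attempts: int = 10):
        # Both parameters are assumptions: the claim only says "high" ratio.
        self.threshold = threshold
        self.min_attempts = min_attempts
        self.stats = defaultdict(TypeStats)
        self.do_not_index = set()

    def record_upload(self, file_type: str) -> None:
        self.stats[file_type].uploaded += 1

    def record_attempt(self, file_type: str, succeeded: bool) -> None:
        s = self.stats[file_type]
        s.attempted += 1
        if succeeded:
            s.indexed += 1
        else:
            s.failed += 1

    def high_failure_types(self) -> list:
        """Report file types whose failure ratio meets the threshold."""
        return sorted(
            t for t, s in self.stats.items()
            if s.attempted >= self.min_attempts
            and s.failure_ratio() >= self.threshold
        )

    def update_do_not_index(self) -> list:
        """Add high-failure types to the do-not-index list."""
        flagged = self.high_failure_types()
        self.do_not_index.update(flagged)
        return flagged

    def classify(self, file_type: str, attempted: bool, succeeded: bool) -> Status:
        # Sketch choice: the type-level do-not-index list takes precedence
        # over the per-file indexing state.
        if file_type in self.do_not_index:
            return Status.NOT_INDEXABLE
        if not attempted:
            return Status.NOT_INDEXED_YET
        return Status.INDEXED if succeeded else Status.FAILED_TO_INDEX


tracker = IndexingTracker(threshold=0.5, min_attempts=4)
for i in range(4):
    tracker.record_upload("xyz")
    tracker.record_attempt("xyz", succeeded=(i == 0))  # 3 of 4 attempts fail
    tracker.record_upload("pdf")
    tracker.record_attempt("pdf", succeeded=True)
flagged = tracker.update_do_not_index()
```

In this sketch the denominator is the count of attempted files, following the claim's wording "uploaded and attempted to index". The status returned by `classify` is the kind of value a search results page could display next to each file entry as a processing status warning.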
