Systems and methods for enterprise data search and analysis

  • US 10,372,718 B2
  • Filed: 11/03/2015
  • Issued: 08/06/2019
  • Est. Priority Date: 11/03/2014
  • Status: Active Grant
First Claim

1. A method of analyzing a search of a plurality of text documents by a search system comprising a plurality of computing nodes comprising at least a processor coupled to a non-transitory memory, at least one network-attached storage device coupled to the plurality of computing nodes, a system management module comprising at least a processor coupled to a non-transitory memory, the system management module coupled to the plurality of computing nodes and configured to run at least one system management software, and a network management module coupled to the system management module and configured to communicate with a network, the search resulting in a set of expanded search terms, a search document set, and a plurality of passages of interest, wherein the plurality of passages of interest are portions of the plurality of text documents generated by the search and are divided into a plurality of groups, comprising the steps of:

    obtaining all passages of interest generated by the search;

    determining all unique roots of interest included in each group, wherein each root of interest corresponds to terms wherein the term is the same as the root of interest and terms wherein the root of interest is the root of the term;

    listing of all unique roots of interest for each group and a number of times terms corresponding to the root of interest occur in the group for each unique root of interest;

    ranking of roots of interest in each group in order of occurrence in the group;

    determining all unique repeating term sequences in each group, wherein each repeating term sequence comprises two or more contiguous terms;

    listing of all unique repeating term sequences for each group and a number of times each unique repeating term sequence occurs in the group;

    ranking of all repeating term sequences in each group in order of occurrence in the group;

    determining all concepts of interest in each group, wherein each concept of interest corresponds to a first root term associated with a second different root term, wherein each concept of interest is an occurrence, in one passage of interest, of one term of a first term group occurring in the passage of interest prior to the occurrence of one term of a second term group, the first term group consisting of the first root term and stems of the first root term and the second term group consisting of the second root term and stems of the second root term, wherein the one of the first term group is separated from the one of the second term group by at least one other term and by fewer than a predetermined context window of terms;

    identifying of all unique concepts of interest for each group;

    listing of all unique concepts of interest for each group and a number of times each unique concept of interest occurs in the group;

    ranking of all concepts of interest in each group in order of occurrence in the group;

    determining all unique general identifiers in each group, wherein each general identifier comprises a non-word term in the group;

    listing of all unique general identifiers and a number of times each unique general identifier occurs in the group; and

    ranking of all general identifiers in each group in order of occurrence in the group.
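
The counting and ranking steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the prefix-based root matching (a crude stand-in for real stemming), the rule that a general identifier is any term containing a digit, the `max_len` cap on sequence length, and the reading of the context window as "at least one and fewer than `window` intervening terms" are all assumptions made for illustration. Each function takes one group of passages and returns (item, count) pairs ordered from most to least frequent.

```python
import re
from collections import Counter


def tokenize(passage):
    """Lowercase a passage and split it into terms; hyphenated tokens stay whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", passage.lower())


def match_root(term, roots):
    """Return the first root that the term equals or starts with, else None.

    Assumption: prefix matching approximates "the root of interest is the
    root of the term"; a real system would likely use a proper stemmer.
    """
    for root in roots:
        if term.startswith(root):
            return root
    return None


def rank_roots(group, roots):
    """Count the terms corresponding to each root of interest in the group."""
    counts = Counter()
    for passage in group:
        for term in tokenize(passage):
            root = match_root(term, roots)
            if root is not None:
                counts[root] += 1
    return counts.most_common()


def rank_repeating_sequences(group, max_len=5):
    """Count runs of 2..max_len contiguous terms; keep those seen at least twice.

    Assumption: max_len is a hypothetical cap; the claim sets no upper bound.
    """
    counts = Counter()
    for passage in group:
        terms = tokenize(passage)
        for n in range(2, max_len + 1):
            for i in range(len(terms) - n + 1):
                counts[" ".join(terms[i:i + n])] += 1
    return [(seq, c) for seq, c in counts.most_common() if c >= 2]


def rank_concepts(group, roots, window=5):
    """Count ordered pairs of different roots found within one passage.

    A pair counts when the first term precedes the second with at least one
    and fewer than `window` terms between them (one reading of the claim).
    """
    counts = Counter()
    for passage in group:
        matched = [(i, match_root(t, roots)) for i, t in enumerate(tokenize(passage))]
        matched = [(i, r) for i, r in matched if r is not None]
        for a, (i, first) in enumerate(matched):
            for j, second in matched[a + 1:]:
                gap = j - i - 1  # number of terms between the two matches
                if first != second and 1 <= gap < window:
                    counts[(first, second)] += 1
    return counts.most_common()


def rank_general_identifiers(group):
    """Count non-word terms; assumption: any term containing a digit qualifies."""
    counts = Counter()
    for passage in group:
        for term in tokenize(passage):
            if any(ch.isdigit() for ch in term):
                counts[term] += 1
    return counts.most_common()


# Hypothetical sample data: one group of two passages and three roots of interest.
group = [
    "The pump failed after the valve failed to open on unit A-320.",
    "Pump failure was reported; the valve failed to open again on unit A-320.",
]
ROOTS = ["pump", "fail", "valve"]

print(rank_roots(group, ROOTS))
print(rank_repeating_sequences(group))
print(rank_concepts(group, ROOTS))
print(rank_general_identifiers(group))
```

On this sample group, "pump failed" does not produce a concept because the two terms are adjacent, while "pump ... valve" does because other terms sit between them and the gap stays inside the window. Ties in the rankings keep the order in which the items were first seen, which is how `Counter.most_common` behaves.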
