Systems and methods for facilitating the gathering of open source intelligence

  • US 10,235,421 B2
  • Filed: 01/16/2014
  • Issued: 03/19/2019
  • Est. Priority Date: 08/15/2011
  • Status: Active Grant
First Claim

1. A system for use in extracting a textual hierarchy from one or more webpages of a website for use in creating a hierarchical signature for the website, the system comprising:

  • a processor; and

    a memory device logically connected to the processor and comprising a set of computer readable instructions executable by the processor to:

    receive the x most frequently disclosed terms among one or more pages of a website, wherein x is a positive integer;

    first cluster the x most frequently disclosed terms into two or more sets of terms as a function of semantic similarity between the terms in each set;

    second cluster the two or more sets of terms into two or more subsets of terms by removing at least one term from a first of the two or more sets of terms based on contextual information located proximate the terms in the first set, wherein each of the terms in the subsets comprises a lower level term;

    ascertain, for each of the two or more subsets, an upper level term that semantically encompasses each of the lower level terms in the subset, wherein the set of computer readable instructions executable by the processor to ascertain include one of:

    utilizing a centroid of the subset as the upper level term for the subset, where the centroid is the keyword in the center of the subset;

    or utilizing a deepest common root in a general purpose ontology including the lower level terms of the subset as the upper level term for the subset;

    determine a prevalence of each of the lower level terms on the one or more pages of the website to obtain hierarchical signatures of the lower level terms;

    use the hierarchical signatures of the lower level terms to establish hierarchical signatures for each of the upper level terms, wherein a hierarchical signature of the website comprises the hierarchical signatures of the upper level terms; and

    present, on a display, a graphical representation of the hierarchical signature of the website.
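
The term-counting, upper-level-term, and prevalence steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy ontology, the function names, the use of relative frequency as "prevalence", and the summing of lower level prevalences into an upper level value are all assumptions made for illustration. The two clustering steps and the display step are omitted; the subsets are supplied by hand.

```python
from collections import Counter

# Hypothetical toy ontology (child -> parent). A real system would consult a
# general purpose ontology; this mapping is an assumption for illustration.
TOY_ONTOLOGY = {
    "rifle": "firearm",
    "pistol": "firearm",
    "firearm": "weapon",
    "knife": "weapon",
    "weapon": "entity",
    "visa": "document",
    "passport": "document",
    "document": "entity",
}


def term_counts(pages):
    """Count whitespace-separated, lowercased terms across all pages."""
    return Counter(word for page in pages for word in page.lower().split())


def top_terms(pages, x):
    """Return the x most frequent terms across the pages, with their counts."""
    return term_counts(pages).most_common(x)


def path_to_root(term, ontology):
    """List the term followed by its ancestors up to the ontology root."""
    path = [term]
    while path[-1] in ontology:
        path.append(ontology[path[-1]])
    return path


def deepest_common_root(terms, ontology):
    """Return the most specific node shared by every term's path, or None."""
    paths = [path_to_root(t, ontology) for t in terms]
    common = set(paths[0]).intersection(*paths[1:])
    # Walk the first path upward; the first shared node is the deepest one.
    for node in paths[0]:
        if node in common:
            return node
    return None


def website_signature(pages, subsets, ontology):
    """Map each upper level term to the prevalence of its lower level terms.

    Assumption: prevalence is a term's share of all term occurrences, and an
    upper level term's prevalence is the sum over its lower level terms.
    """
    counts = term_counts(pages)
    total = sum(counts.values())
    signature = {}
    for subset in subsets:
        upper = deepest_common_root(subset, ontology)
        lower = {t: counts[t] / total for t in subset}
        signature[upper] = {"prevalence": sum(lower.values()), "lower": lower}
    return signature


pages = ["rifle pistol rifle visa", "passport visa rifle pistol"]
subsets = [["rifle", "pistol"], ["visa", "passport"]]
signature = website_signature(pages, subsets, TOY_ONTOLOGY)
```

In this sketch, `signature` is keyed by the upper level terms found for each subset, and each entry also keeps the per-term prevalences it was built from, so the two levels of the hierarchy stay visible.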
