SYSTEM AND METHOD FOR TEXT CATEGORIZATION BASED ON ONTOLOGIES

  • US 20130212111A1
  • Filed: 04/26/2013
  • Published: 08/15/2013
  • Est. Priority Date: 02/07/2012
  • Status: Active Grant
First Claim

1. A system for text categorization based on ontologies, the system comprising:

    a plurality of data collector software modules stored and operating on a plurality of network-attached computers;

    a categorizer software module stored and operating on a network-attached server computer; and

    a database server comprising an indexed database of documents and their categorizations, and further comprising a plurality of ontologies, each ontology comprising a plurality of hierarchical taxonomies and each hierarchical taxonomy comprising a plurality of taxons;

    wherein the data collector software modules receive documents to be classified and submit them to the categorizer software module; and

    further wherein the categorizer performs the following steps to categorize each received document:

    splitting the document into sentences;

    selecting words or phrases that are present in one or more of the plurality of ontologies stored in the database server;

    selecting a plurality of subtrees from the plurality of ontologies based on the presence of one or more of a set of specific subcategories in the document;

    determining a weight for each subcategory within the set of specific subcategories;

    creating a plurality of modified subtrees by pruning subcategories having a weight below a threshold from each of the selected plurality of subtrees; and

    for each of the plurality of modified subtrees, computing a conditionality coefficient.
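
The categorizer steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claim does not say how weights or the conditionality coefficient are computed, so everything below is an assumption made for illustration. The names Taxon, split_sentences, weigh, prune and categorize are hypothetical, the weight of a subcategory is assumed to be the number of sentences that mention one of its terms (plus the weights of its children), and the coefficient is a placeholder equal to the share of a subtree's weight that survives pruning.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """One node of a hierarchical taxonomy (hypothetical representation).

    Names are assumed to be unique within a subtree.
    """
    name: str
    terms: set = field(default_factory=set)      # words or phrases, lowercase
    children: list = field(default_factory=list)


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def mentions(sentence, term):
    # Whole-word (or whole-phrase) match, case-insensitive.
    return re.search(r"\b" + re.escape(term) + r"\b", sentence.lower()) is not None


def weigh(taxon, sentences, weights):
    # ASSUMPTION: weight = sentences mentioning the taxon's own terms
    # plus the weights of all of its children.
    own = sum(1 for s in sentences if any(mentions(s, t) for t in taxon.terms))
    total = own + sum(weigh(c, sentences, weights) for c in taxon.children)
    weights[taxon.name] = total
    return total


def prune(taxon, weights, threshold):
    # Return a copy of the subtree without subcategories below the threshold.
    kept = [prune(c, weights, threshold)
            for c in taxon.children if weights[c.name] >= threshold]
    return Taxon(taxon.name, taxon.terms, kept)


def categorize(text, roots, threshold):
    sentences = split_sentences(text)
    result = {}
    for root in roots:
        weights = {}
        total = weigh(root, sentences, weights)
        if total == 0:
            continue  # no subcategory of this subtree is present in the document
        pruned = prune(root, weights, threshold)
        # PLACEHOLDER coefficient: share of the subtree's weight carried by
        # the direct subcategories that survived pruning. Not from the claim.
        kept_weight = sum(weights[c.name] for c in pruned.children)
        result[root.name] = (pruned, kept_weight / total)
    return result


# Example with two hypothetical subtrees.
sports = Taxon("sports", children=[
    Taxon("football", {"goal", "penalty kick"}),
    Taxon("tennis", {"racket"}),
])
finance = Taxon("finance", children=[Taxon("banking", {"loan"})])

text = "The striker scored a goal. A penalty kick followed! He dropped his racket."
result = categorize(text, [sports, finance], threshold=2)
```

In this example only the sports subtree is selected, and the tennis subcategory is pruned because a single matching sentence falls below the threshold of 2. A real system would load the ontologies from the database server and would replace both the weighting rule and the placeholder coefficient with whatever the full specification defines.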
