Very-large-scale automatic categorizer for web content

  • US 6,826,576 B2
  • Filed: 09/25/2001
  • Issued: 11/30/2004
  • Est. Priority Date: 05/07/2001
  • Status: Expired due to Term
First Claim

1. A method of training a classifier system by utilizing previously classified data objects comprising one or more electronic documents including at least one of a text document, an image file, an audio sequence, a video sequence, and a hybrid document including a combination of text and images, said previously classified data objects being organized into a subject hierarchy of a plurality of nodes, the method comprising:

  • selecting one node of the plurality of nodes;

  • aggregating those of the previously classified data objects corresponding to the selected node and any associated sub-nodes of the selected node, to form a content class of data objects, said content class of data objects comprising a content class of the one or more electronic documents;

  • aggregating those of the previously classified data objects corresponding to any associated sibling nodes of the selected node and any associated sub-nodes of the sibling nodes to form an anti-content class of data objects, said anti-content class of data objects comprising an anti-content class of the one or more electronic documents; and

  • extracting features from at least one of the content class of data objects and the anti-content class of data objects to facilitate characterization of said previously classified data objects.
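
The steps recited in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Node` structure, the function names, and the use of lowercase word counts as "features" are all assumptions made for illustration, and the sketch handles only text documents.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of the subject hierarchy (hypothetical structure)."""
    name: str
    documents: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def subtree_documents(node: Node) -> List[str]:
    """Collect documents at a node and at all of its sub-nodes."""
    docs = list(node.documents)
    for child in node.children:
        docs.extend(subtree_documents(child))
    return docs


def content_class(node: Node) -> List[str]:
    """Documents of the selected node plus its sub-nodes."""
    return subtree_documents(node)


def anti_content_class(node: Node) -> List[str]:
    """Documents of the selected node's siblings plus their sub-nodes."""
    if node.parent is None:
        return []  # a root node has no siblings
    docs: List[str] = []
    for sibling in node.parent.children:
        if sibling is not node:
            docs.extend(subtree_documents(sibling))
    return docs


def extract_features(docs: List[str]) -> Counter:
    """Assumed feature extractor: lowercase word counts."""
    counts: Counter = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


# Hypothetical two-level hierarchy used only for this example.
root = Node("Top")
sports = root.add_child(Node("Sports", ["sports news roundup"]))
soccer = sports.add_child(Node("Soccer", ["soccer goal scored"]))
arts = root.add_child(Node("Arts", ["museum exhibit opens"]))
music = arts.add_child(Node("Music", ["jazz concert tonight"]))

content = content_class(sports)
anti = anti_content_class(sports)
content_features = extract_features(content)
anti_features = extract_features(anti)
```

With "Sports" as the selected node, `content` holds the documents filed under Sports and Soccer, while `anti` holds those filed under the sibling Arts and its sub-node Music. Features drawn from both sets could then be compared to find terms that distinguish the selected node from its siblings.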
