LARGE SCALE UNSUPERVISED HIERARCHICAL DOCUMENT CATEGORIZATION USING ONTOLOGICAL GUIDANCE

  • US 20120203752A1
  • Filed: 02/08/2011
  • Published: 08/09/2012
  • Est. Priority Date: 02/08/2011
  • Status: Active Grant
First Claim

1. A storage medium storing instructions executable by a processing device to perform a method comprising:

  • generating a hierarchical classifier for a taxonomy of hierarchically organized categories wherein each category is represented by one or more textual category descriptors, the hierarchical classifier being generated by a method including (i) constructing queries from the textual category descriptors representing the categories and querying a documents database using the constructed queries to retrieve pseudo-relevant documents and (ii) extracting language models comprising multinomial distributions over the words of a textual vocabulary for the categories of the taxonomy by inferring a hierarchical topic model representing the taxonomy from at least the pseudo-relevant documents; and

  • classifying an input document using the generated hierarchical classifier.
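
The flow recited in the claim can be illustrated with a minimal sketch. This is not the claimed implementation: the taxonomy, the in-memory document list, and all function names are hypothetical, retrieval is reduced to query-term overlap, and the hierarchical topic model inference of step (ii) is replaced by a much simpler stand-in (an add-alpha smoothed unigram multinomial estimated per category from its pseudo-relevant documents). Classification descends the taxonomy from the root, choosing at each level the child whose language model gives the input the highest log-likelihood.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy: category -> (parent, textual category descriptor).
TAXONOMY = {
    "science": (None, "science research"),
    "physics": ("science", "physics quantum energy"),
    "biology": ("science", "biology cell gene"),
}

# Hypothetical stand-in for the documents database.
DOCUMENTS = [
    "quantum physics studies energy and matter",
    "energy levels in quantum systems",
    "the cell is the basic unit of biology",
    "gene expression in the cell nucleus",
    "science research advances every year",
]


def tokenize(text):
    return text.lower().split()


def build_query(category):
    """Step (i): construct a query from the category's textual descriptor."""
    return set(tokenize(TAXONOMY[category][1]))


def retrieve(query, k=2):
    """Step (i): keep the top-k documents by query-term overlap as pseudo-relevant."""
    scored = [(len(query & set(tokenize(d))), d) for d in DOCUMENTS]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps database order on ties
    return [d for _, d in scored[:k]]


def language_model(docs, vocab, alpha=0.1):
    """Simplified step (ii): smoothed multinomial over the vocabulary.

    Assumption: this replaces hierarchical topic model inference with
    plain add-alpha smoothing of word counts.
    """
    counts = Counter(w for d in docs for w in tokenize(d))
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def train():
    """Build one language model per category of the taxonomy."""
    vocab = sorted({w for d in DOCUMENTS for w in tokenize(d)})
    return {c: language_model(retrieve(build_query(c)), vocab) for c in TAXONOMY}


def classify(text, models):
    """Descend the taxonomy, picking the most likely child at each level."""
    words = tokenize(text)

    def score(category):
        model = models[category]
        # Words outside the vocabulary are ignored.
        return sum(math.log(model[w]) for w in words if w in model)

    path, parent = [], None
    while True:
        children = [c for c, (p, _) in TAXONOMY.items() if p == parent]
        if not children:
            return path
        parent = max(children, key=score)
        path.append(parent)


models = train()
print(classify("quantum energy", models))
print(classify("gene in the cell", models))
```

No labeled training documents are used anywhere in the sketch: the only supervision is the category descriptors themselves, which is the sense in which the approach is unsupervised.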
