Automated topic discovery in documents and content categorization

  • US 9,047,283 B1
  • Filed: 12/07/2012
  • Issued: 06/02/2015
  • Est. Priority Date: 01/29/2010
  • Status: Expired due to Fees
First Claim

1. A method implemented on a computer comprising one or more processors and memory, comprising:

  • receiving a text content;

    tokenizing the text content into a plurality of terms, each term comprising one or more words or phrases;

    identifying a first semantic attribute, wherein the first semantic attribute is selected from the group of attributes consisting of at least an action, a thing, a person, an agent of an action, a recipient of an action or a thing, a state of an object, a mental state of a person, a physical state of a person, a positive or negative opinion, a name of a product, a name of a service, a name of an organization;

    identifying a first term in the text content, wherein the first term is associated with the first semantic attribute;

    identifying a second term in the text content, wherein the second term is not associated with the first semantic attribute;

    assigning an importance value to the first term as bearing more importance than the second term based on the first semantic attribute, wherein the importance value is a measurement for the role of the first term in representing a topic or an information focus in the text content; and

    outputting the first term or the second term to represent the content of the document, when the first term is output, the function of the first term includes being a tag or a label to represent a topic or a summary of the text content, or a category node, when the first term and the second term are output and displayed, the display format includes selecting the font type, size, color, shape, position, or orientation of or distance between the first term and the second term based on the importance value, when the text content containing the first term is made searchable using a query or is associated with a search index to produce a search result, the search result is ranked based at least on the importance value.
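
The steps recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation: the lexicon `ATTRIBUTE_LEXICON`, the weights `ATTRIBUTE_WEIGHT` and `DEFAULT_WEIGHT`, and the functions `tokenize`, `score_terms`, and `top_tags` are all hypothetical names introduced here for illustration, and a simple dictionary lookup stands in for whatever linguistic analysis identifies semantic attributes in practice.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lexicon: maps a term to one of the semantic attributes
# listed in the claim. A real system would derive this from analysis.
ATTRIBUTE_LEXICON = {
    "camera": "name of a product",
    "excellent": "positive or negative opinion",
    "battery": "thing",
}

# Assumed weights: a term carrying the chosen attribute outranks others.
ATTRIBUTE_WEIGHT = 2.0
DEFAULT_WEIGHT = 1.0


@dataclass
class ScoredTerm:
    term: str
    attribute: Optional[str]
    importance: float


def tokenize(text: str) -> List[str]:
    """Split text into lowercase single-word terms (phrases are not handled)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def score_terms(text: str, target_attribute: str) -> List[ScoredTerm]:
    """Assign each distinct term an importance value based on the attribute."""
    scored = {}
    for term in tokenize(text):
        if term in scored:
            continue
        attribute = ATTRIBUTE_LEXICON.get(term)
        importance = (
            ATTRIBUTE_WEIGHT if attribute == target_attribute else DEFAULT_WEIGHT
        )
        scored[term] = ScoredTerm(term, attribute, importance)
    # Stable sort: ties keep their order of first appearance in the text.
    return sorted(scored.values(), key=lambda t: -t.importance)


def top_tags(text: str, target_attribute: str, limit: int = 1) -> List[str]:
    """Return the highest-importance terms, usable as tags or labels."""
    return [t.term for t in score_terms(text, target_attribute)[:limit]]


if __name__ == "__main__":
    sample = "The camera has an excellent battery"
    for scored in score_terms(sample, "name of a product"):
        print(scored.term, scored.importance)
```

The same importance values could drive the other outputs the claim mentions, such as choosing a font size for display or ranking search results, but those are left out of this sketch.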
