
Text processing system and methods for automated topic discovery, content tagging, categorization, and search

  • US 9,483,532 B1
  • Filed: 05/24/2015
  • Issued: 11/01/2016
  • Est. Priority Date: 01/29/2010
  • Status: Expired due to Fees
First Claim

1. A computer system, comprising:

  • a processor operable to receive a text content comprising a plurality of terms, each term comprising one or more words or phrases;

    tokenize the text content into a plurality of terms, each term comprising one or more words or phrases;

    identifying a first semantic attribute or a first part of speech, wherein the first semantic attribute is selected from the group of semantic attributes consisting of at least an action, a thing, a person, an agent of an action, a recipient of an action or a thing, a state of an object, a mental state of a person, a physical state of a person, a positive or negative opinion, a name of a product, a name of a service, a name of an organization, wherein the first part of speech is selected from the group of parts of speech consisting of at least a noun or a pronoun, a transitive or intransitive verb or modal verb or link verb, an adjective, an adverb, a preposition, an article, a conjunction;

    identify a first term in the text content, wherein the first term is associated with the first semantic attribute or the first part of speech;

    identify a second term in the text content, wherein the second term is not associated with the first semantic attribute or the first part of speech;

    associate an importance measure to the first term, based at least on the first semantic attribute or the first part of speech, to mark the first term as bearing more importance than the second term in representing a topic or an information focus in the text content;

    extract the first term based on the importance measure; and

    output the first term;

    when the first term is output, the function of the first term includes being a tag or a label to represent a topic or a summary of the text content, or a category node;

    when the first term is output and displayed, the display format includes the font type, size, color, shape, position, or orientation of the first term based on the importance measure;

    when the text content containing the first term is made searchable using a query or is associated with a search index to produce a search result, the search result is ranked based at least on the importance measure.
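The steps recited in the claim (tokenize, identify a part of speech, associate an importance measure, extract, output, size the display, and rank search results) can be illustrated with a minimal Python sketch. This is not the patented implementation: the toy lexicon `POS_LEXICON`, the weights in `POS_WEIGHT`, and the functions `importance_scores`, `extract_tags`, `font_size`, and `rank_documents` are all hypothetical names and values chosen for illustration, and a real system would presumably use a proper part-of-speech tagger or semantic analyzer.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon mapping words to parts of speech. A real system
# would use a tagger or parser; unlisted words are treated as "other".
POS_LEXICON = {
    "camera": "noun", "battery": "noun", "lens": "noun",
    "takes": "verb", "lasts": "verb",
    "the": "article", "a": "article", "and": "conjunction",
    "of": "preposition", "sharp": "adjective", "long": "adjective",
}

# Assumed, illustrative weights: nouns count most toward the topic,
# while articles, conjunctions, and prepositions contribute nothing.
POS_WEIGHT = {"noun": 1.0, "adjective": 0.5, "verb": 0.4}


def tokenize(text):
    """Split text content into lowercase single-word terms."""
    return re.findall(r"[a-z0-9']+", text.lower())


def importance_scores(text):
    """Accumulate a per-term importance measure based on part of speech."""
    scores = defaultdict(float)
    for term in tokenize(text):
        pos = POS_LEXICON.get(term, "other")
        scores[term] += POS_WEIGHT.get(pos, 0.0)
    return dict(scores)


def extract_tags(text, top_n=3):
    """Return up to top_n (term, score) pairs, highest importance first."""
    ranked = sorted(importance_scores(text).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [(term, score) for term, score in ranked[:top_n] if score > 0]


def font_size(score, base=10, step=4):
    """Map an importance measure to a display font size (tag-cloud style)."""
    return base + step * score


def rank_documents(query, docs):
    """Rank documents by the summed importance of the query terms they contain."""
    query_terms = set(tokenize(query))
    results = []
    for doc_id, text in docs.items():
        scores = importance_scores(text)
        total = sum(scores.get(term, 0.0) for term in query_terms)
        if total > 0:
            results.append((doc_id, total))
    return sorted(results, key=lambda r: (-r[1], r[0]))


if __name__ == "__main__":
    text = "The camera takes sharp pictures and the camera battery lasts long"
    for term, score in extract_tags(text):
        print(term, score, font_size(score))
```

In this sketch, "camera" plays the role of the first term (a noun, weighted up) and "the" plays the role of the second term (an article, given no weight). The same scores drive both the displayed font size and the search ranking, mirroring the last two limitations of the claim.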
