
Automatically generating a topic description for text and searching and sorting text by topic using the same

  • US 5,937,422 A
  • Filed: 04/15/1997
  • Issued: 08/10/1999
  • Est. Priority Date: 04/15/1997
  • Status: Expired due to Term
First Claim

1. A method of automatically generating a topical description of text, comprising the steps of:

    a) receiving the text, where the text consists of one or more input words;

    b) stemming each input word to its root form;

    c) assigning a user-definable part-of-speech score βi to each input word;

    d) assigning a language salience score Si to each input word;

    e) assigning an input-word score to each input word that is a function of the corresponding input word's part-of-speech score βi, language salience score Si, and the number of times the corresponding input word appears in the text;

    f) creating a tree structure under each input word, where each tree structure contains the definition of the corresponding input word, where each definition word may be further defined to a user-definable number of levels;

    g) assigning a definition-word score Ai,t[j] to each definition word in each tree structure based on the definition word's part-of-speech score βj, the language salience score of the word the definition word defines, a relational salience score Rk,j, and a user-definable factor W;

    h) collapsing each tree structure to a corresponding tree-word list, where each tree-word list contains the unique words contained in the corresponding tree structure;

    i) assigning a tree-word-list score to each word in each tree-word list, where each tree-word-list score is a function of the scores of the corresponding word that existed in the corresponding uncollapsed tree structure;

    j) combining the tree-word lists into a final word list, where the final word list contains the unique words contained in the tree-word lists;

    k) assigning a final-word-list score Afi[j] to each word in the final word list, where Afi[j] is a function of the corresponding word's dictionary salience and tree-word-list scores; and

    l) choosing the top N scoring words in the final word list as the topic description of the input text, where the value N may be defined by the user.
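
The flow of steps a) through l) can be illustrated with a small Python sketch. The claim only says each score "is a function of" its inputs, so every formula here is an assumption: scores are combined by simple products and sums, the stemmer is a crude plural stripper, and the dictionary, salience tables, and names such as `DEFINITIONS`, `expand`, and `topic_description` are hypothetical stand-ins rather than anything taken from the patent.

```python
from collections import Counter, defaultdict

# Toy resources; every value is made up for illustration only.
DEFINITIONS = {
    "dog": ["animal", "pet"],
    "cat": ["animal", "pet"],
    "pet": ["animal"],
}
POS_SCORE = defaultdict(lambda: 1.0)      # beta: user-definable part-of-speech score
SALIENCE = defaultdict(lambda: 1.0)       # S: language salience score
DICT_SALIENCE = defaultdict(lambda: 1.0)  # dictionary salience used in step k


def stem(word):
    """Crude stand-in for step b: lowercase and strip a plural 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def relational_salience(parent, child):
    """Placeholder for R(k, j); a constant in this sketch."""
    return 1.0


def expand(word, level, max_levels, weight, scores):
    """Steps f-g: walk definitions depth-first, recording one score per node."""
    if level >= max_levels:
        return
    for child in DEFINITIONS.get(word, []):
        # Assumed form: product of W, beta_j, S of the defined word, and R.
        score = (weight * POS_SCORE[child] * SALIENCE[word]
                 * relational_salience(word, child))
        scores[child].append(score)
        expand(child, level + 1, max_levels, weight, scores)


def topic_description(text, n=3, max_levels=2, weight=0.5):
    stems = [stem(w) for w in text.split()]  # steps a-b
    counts = Counter(stems)
    final = defaultdict(float)
    for word, count in counts.items():
        node_scores = defaultdict(list)
        # Steps c-e: assumed input-word score = beta * S * occurrence count.
        node_scores[word].append(POS_SCORE[word] * SALIENCE[word] * count)
        expand(word, 0, max_levels, weight, node_scores)
        # Steps h-j: collapse each tree to unique words (summing their node
        # scores), then merge the tree-word lists into one final word list.
        for tree_word, values in node_scores.items():
            final[tree_word] += sum(values)
    # Step k: weight by dictionary salience; ties are broken alphabetically.
    ranked = sorted(final, key=lambda w: (-DICT_SALIENCE[w] * final[w], w))
    return ranked[:n]  # step l: top N words


print(topic_description("dogs cats dog"))  # ['animal', 'dog', 'cat']
```

In this toy run "animal" never appears in the input, yet it ranks first because it accumulates score from the definition trees of both "dog" and "cat", which is the effect the tree-building steps are meant to produce.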
