Creation Of A Category Tree With Respect To The Contents Of A Data Stock

  • US 20110113043A1
  • Filed: 11/08/2010
  • Published: 05/12/2011
  • Est. Priority Date: 05/08/2008
  • Status: Active Grant
First Claim

1. A method for the automatic creation of a category tree with respect to the contents of a data stock comprising information objects, wherein the information objects of the data stock are indexed in an index, characterized by the following process steps:

    1. Filtering out stop words for each information object in the index by means of a list;

    2. Creating a list of words in which the stop words which have been filtered out are not contained;

    3. Calculating a significance value for each word in the list of words;

    4. Sorting the list of words according to their significance values;

    5. Reducing the sorted list of words to a maximum number;

    6. Storing the reduced list of words in a table;

    7. Detecting co-occurrences in the stored list of words;

    8. Storing the co-occurrences in a table in a database;

    9. Retrieving words which have the highest significance value but no co-occurrences;

    10. Selecting a first level of the category tree from the retrieved words;

    11. Retrieving words for each selected word of the first level by means of the co-occurrence table, which words are in co-occurrence with the respectively selected word of the first level;

    12. Creating a list of words with the retrieved words;

    13. Retrieving the frequency of each word on the list of words;

    14. Sorting the list of words according to frequency;

    15. Reducing the sorted list of words to a predetermined maximum number, wherein the words which have an above-average frequency remain on the list of words;

    16. Selecting another level of the category tree on the basis of the determined words;

    17. Iteratively repeating process steps 11 through 16 for at least one other level of the category tree, wherein in process step 11, during the retrieval of words by means of the co-occurrence table, the words which are in co-occurrence with the respectively selected word of the first and at least one other level are retrieved for each such selected word, until the number of retrieved/selected words is equal to zero.
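
The seventeen steps above can be illustrated with a minimal Python sketch. It is not the patented implementation, and it fills several gaps with assumptions that the claim does not specify: the stop list `STOP_WORDS` is hypothetical; "significance" and "frequency" are both taken to be the number of information objects containing a word; two words "co-occur" when they appear in the same information object; a first-level word is one that has no co-occurrence with an already selected first-level word; "above average" is relaxed to "at or above average" so that ties survive; words already placed in the tree are not reused; and in-memory dictionaries stand in for the index and the database tables. The names `build_category_tree`, `max_words` and `max_children` are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop list (step 1); a real system would load a full list.
STOP_WORDS = {"the", "a", "of", "and", "in", "for", "with"}


def build_category_tree(documents, max_words=50, max_children=5):
    """Sketch of claim steps 1-17 under the assumptions stated above."""
    # Steps 1-2: per information object, keep only non-stop words.
    token_sets = [
        {w for w in doc.lower().split() if w not in STOP_WORDS}
        for doc in documents
    ]

    # Step 3: significance value (assumed: document frequency).
    significance = Counter(w for tokens in token_sets for w in tokens)

    # Steps 4-6: sort by significance, cut to a maximum number, store.
    ranked = sorted(significance, key=lambda w: (-significance[w], w))
    ranked = ranked[:max_words]
    kept = set(ranked)

    # Steps 7-8: co-occurrence table (a dict stands in for the database).
    cooc = {w: set() for w in ranked}
    for tokens in token_sets:
        for a, b in combinations(sorted(tokens & kept), 2):
            cooc[a].add(b)
            cooc[b].add(a)

    # Steps 9-10: first level = most significant words that do not
    # co-occur with an already selected first-level word (assumption).
    roots = []
    for w in ranked:
        if not cooc[w] & set(roots):
            roots.append(w)

    # Steps 11-17: expand level by level until nothing new is retrieved.
    tree = {root: {} for root in roots}
    used = set(roots)
    level = [(root, tree[root]) for root in roots]
    while level:
        next_level = []
        for word, node in level:
            # Steps 11-14: co-occurring words, sorted by frequency.
            candidates = sorted(
                cooc[word] - used, key=lambda w: (-significance[w], w)
            )
            if not candidates:
                continue
            # Step 15: keep words at or above average frequency, capped.
            mean = sum(significance[w] for w in candidates) / len(candidates)
            children = [w for w in candidates if significance[w] >= mean]
            children = children[:max_children]
            # Step 16: the surviving words form the next level.
            for child in children:
                node[child] = {}
                next_level.append((child, node[child]))
            used.update(children)
        level = next_level  # Step 17: repeat until this list is empty.
    return tree
```

Calling `build_category_tree` with a list of plain-text strings returns a nested dictionary whose top-level keys are the first-level categories. The loop terminates because each pass only adds words that have not yet been used, and the word list is finite.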
