
Creation of a category tree with respect to the contents of a data stock

  • US 8,745,069 B2
  • Filed: 11/08/2010
  • Issued: 06/03/2014
  • Est. Priority Date: 05/08/2008
  • Status: Active Grant
First Claim

1. A system for analyzing data to establish a category tree comprising:

  • a data source;

    an inventory representation of data in communication with the data source;

    a computer unit having a processor in communication with said data source and said inventory representation of data;

    software executing on said processor to:

    1. create a list of words of each element within the inventory representation of data;

    2. filter out stop words in each of said list of words;

    3. calculate a significance value for each word remaining in each said list of words;

    4. sort said list of words in descending order according to the significance values to create a sorted list of words;

    5. reduce said sorted list of words to a maximum number of top elements to create a reduced list of words;

    6. store said reduced list of words in a persistent memory;

    7. detect co-occurrences within the stored reduced list of words;

    8. store said co-occurrences as a table in the persistent memory;

    9. retrieve words from the stored reduced list of words which have the highest significance values but which have no co-occurrences with each other;

    10. establish a first level of the category tree using said retrieved words;

    11. retrieve a list of co-occurrences for each word of said first level from said stored reduced list of words;

    12. create a corresponding list of words for each said list of co-occurrences having no co-occurrences with each other;

    13. calculate a frequency of co-occurrences for each of said corresponding list of words;

    14. sort said corresponding list of words in descending order according to the frequency to create a sorted corresponding list of words;

    15. reduce said sorted corresponding list of words to a predetermined maximum number of top elements to create a reduced corresponding list of words;

    16. establish a subordinate level of the category tree using said reduced corresponding list of words; and

    17. iteratively repeat steps 11 through 16 while no further co-occurrences can be retrieved from said persistent memory for a set of superior categories, wherein in step 11 the retrieved co-occurrences exists for all superior categories in said category tree;

    wherein the category tree is consolidated for display on a display device.
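
The software steps recited above can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the patentee's implementation. Several points are assumptions made only for illustration: the significance value is taken to be a word's relative frequency within its element (the claim does not define the measure), two words are treated as co-occurring when they appear in the same reduced list, in-memory dictionaries stand in for the persistent memory, and candidates are ranked by frequency before the mutually non-co-occurring ones are picked greedily. All function names, the stop-word set, and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical stop-word set; a real system would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "with"}


def reduced_word_lists(documents, max_words=5):
    """Steps 1-6: build, filter, score, sort and truncate one list per element."""
    reduced = []
    for text in documents:
        words = [w for w in text.lower().split() if w not in STOP_WORDS]
        counts = Counter(words)
        total = sum(counts.values()) or 1
        # Assumed significance: relative frequency within the element.
        scored = sorted(((c / total, w) for w, c in counts.items()),
                        key=lambda pair: (-pair[0], pair[1]))
        reduced.append([(w, s) for s, w in scored[:max_words]])
    return reduced


def cooccurrence_table(reduced):
    """Steps 7-8: count how many reduced lists each pair of words shares."""
    table = defaultdict(Counter)
    for words in reduced:
        for a, b in combinations(sorted({w for w, _ in words}), 2):
            table[a][b] += 1
            table[b][a] += 1
    return table


def pick_independent(ranked, table, limit):
    """Greedily keep words that co-occur with none of the words already kept."""
    kept = []
    for word in ranked:
        if all(word not in table[k] for k in kept):
            kept.append(word)
        if len(kept) == limit:
            break
    return kept


def build_tree(reduced, table, max_children=3):
    """Steps 9-17: first level, then subordinate levels by recursion."""
    best = Counter()
    for words in reduced:
        for w, s in words:
            best[w] = max(best[w], s)
    ranked = sorted(best, key=lambda w: (-best[w], w))
    roots = pick_independent(ranked, table, max_children)  # steps 9-10

    def children(path):
        # Step 11: candidates must co-occur with every superior category.
        common = set(table[path[0]])
        for superior in path[1:]:
            common &= set(table[superior])
        common -= set(path)
        # Steps 13-14: rank by co-occurrence frequency with the direct parent.
        freq = {w: table[path[-1]][w] for w in common}
        candidates = sorted(freq, key=lambda w: (-freq[w], w))
        # Steps 12 and 15: keep mutually non-co-occurring words, up to the limit.
        picked = pick_independent(candidates, table, max_children)
        # Steps 16-17: recurse until no common co-occurrences remain.
        return {w: children(path + [w]) for w in picked}

    return {root: children([root]) for root in roots}


documents = [
    "python code python library",
    "python snake reptile",
    "java code java library",
]
reduced = reduced_word_lists(documents)
table = cooccurrence_table(reduced)
tree = build_tree(reduced, table)
print(tree)
```

In this sketch the tree is a nested dictionary, so an empty dictionary marks a category with no subordinate level. The recursion stops on its own because each level requires co-occurrence with every superior category and excludes words already on the path.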
