METHOD AND SYSTEM FOR CATEGORIZING TOPIC DATA WITH CHANGING SUBTOPICS
First Claim
1. A method for categorizing data objects into at least one of relevant categories of topics and sub-topics, said method comprising:
receiving data comprising unstructured data objects;
categorizing said data objects into pre-defined topics;
performing a clustering analysis to identify subtopics of said data objects within said pre-defined topics, wherein said subtopics are more specific than said pre-defined topics;
periodically repeating said clustering analysis to identify at least one of a presence of a new subtopic and an absence of an old subtopic, wherein said new subtopic comprises a group of similar data objects unidentified during a previous clustering analysis and identified during a current clustering analysis, and wherein said old subtopic comprises a group of similar data objects identified during said previous clustering analysis and unidentified during said current clustering analysis;
performing at least one of adding said new subtopic to said subtopics and removing said old subtopic from said subtopics; and
after said adding and said removing, identifying said subtopics and classifying said subtopics into said pre-defined topics.
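The periodic re-clustering step of claim 1 can be illustrated with a minimal sketch. The claim does not say how a subtopic from the current run is matched against one from the previous run, so everything below is an assumption made for illustration: each subtopic is represented as a set of characteristic terms (the hypothetical `Subtopic` type), and two subtopics count as the same group when their Jaccard overlap reaches a hypothetical `match_threshold`. The names `jaccard`, `diff_subtopics`, and `update_subtopics` are likewise illustrative and do not come from the patent.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a subtopic is summarized by the set of terms that characterize it.
Subtopic = FrozenSet[str]


def jaccard(a: Subtopic, b: Subtopic) -> float:
    """Overlap between two term sets, in the range [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def diff_subtopics(
    previous: List[Subtopic],
    current: List[Subtopic],
    match_threshold: float = 0.5,  # hypothetical cut-off for "same subtopic"
) -> Tuple[List[Subtopic], List[Subtopic]]:
    """Compare two clustering runs.

    Returns (new, old): `new` holds subtopics found in the current run with no
    match in the previous run; `old` holds subtopics found in the previous run
    with no match in the current run.
    """
    new = [c for c in current
           if all(jaccard(c, p) < match_threshold for p in previous)]
    old = [p for p in previous
           if all(jaccard(p, c) < match_threshold for c in current)]
    return new, old


def update_subtopics(
    subtopics: List[Subtopic],
    new: List[Subtopic],
    old: List[Subtopic],
) -> List[Subtopic]:
    """Remove subtopics that disappeared, then append the ones that appeared."""
    kept = [s for s in subtopics if s not in old]
    return kept + [n for n in new if n not in kept]
```

In this sketch, `diff_subtopics` would be called once per pre-defined topic after each clustering run, and `update_subtopics` would then produce the subtopic list that is classified back under that topic.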
Abstract
The embodiments of the invention provide a method for the automatic identification of changing subtopics within topics. The method begins by receiving customer satisfaction data having unstructured data objects. Next, the data objects are automatically categorized into pre-defined topics, wherein the pre-defined topics do not change throughout the customer satisfaction analysis. The pre-defined topics can be automatically defined based on a history of customer satisfaction data. Following this, a clustering analysis is automatically performed to identify subtopics of the data objects within the pre-defined topics. The subtopics are more specific than the pre-defined topics, and the subtopics can change. Further, the clustering analysis can include extracting features from the data objects and grouping the features into the subtopics. Each of the subtopics includes features having a predetermined degree of similarity.
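The flow described in the abstract (fixed topics first, then feature extraction and similarity-based grouping inside each topic) can be sketched as follows. This is not the patented implementation; it is a toy illustration under stated assumptions. The topic names and keyword lists in `TOPIC_KEYWORDS`, the stopword list, the use of word sets as "features", Jaccard overlap as the similarity measure, and the `threshold` value standing in for the "predetermined degree of similarity" are all hypothetical choices, and the greedy single-pass grouping is a swapped-in technique chosen only for brevity.

```python
import re
from collections import defaultdict
from typing import Dict, FrozenSet, List, Optional

# Assumption: a tiny stopword list and two fixed topics with trigger keywords.
STOPWORDS = {"the", "a", "is", "my", "to", "and", "after", "of", "it"}
TOPIC_KEYWORDS: Dict[str, set] = {
    "billing": {"invoice", "charge", "refund"},
    "hardware": {"battery", "screen", "charger"},
}


def extract_features(text: str) -> FrozenSet[str]:
    """Features here are simply the lowercase words that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return frozenset(w for w in words if w not in STOPWORDS)


def categorize(features: FrozenSet[str]) -> Optional[str]:
    """Pick the pre-defined topic sharing the most keywords, or None."""
    best = max(TOPIC_KEYWORDS, key=lambda t: len(features & TOPIC_KEYWORDS[t]))
    return best if features & TOPIC_KEYWORDS[best] else None


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(items: List[FrozenSet[str]], threshold: float) -> List[List[FrozenSet[str]]]:
    """Greedy grouping: join the first cluster whose seed is similar enough."""
    clusters: List[List[FrozenSet[str]]] = []
    for features in items:
        for members in clusters:
            if jaccard(features, members[0]) >= threshold:
                members.append(features)
                break
        else:
            clusters.append([features])  # no similar cluster: start a new one
    return clusters


def find_subtopics(comments: List[str], threshold: float = 0.3):
    """Map each pre-defined topic to the subtopic clusters found inside it."""
    by_topic = defaultdict(list)
    for text in comments:
        features = extract_features(text)
        topic = categorize(features)
        if topic is not None:
            by_topic[topic].append(features)
    return {t: cluster(items, threshold) for t, items in by_topic.items()}
```

The topic table stays constant across runs while `find_subtopics` can be re-run on fresh data, which mirrors the abstract's distinction between unchanging topics and changing subtopics.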
15 Citations
20 Claims
1. A method for categorizing data objects into at least one of relevant categories of topics and sub-topics, said method comprising:
receiving data comprising unstructured data objects;
categorizing said data objects into pre-defined topics;
performing a clustering analysis to identify subtopics of said data objects within said pre-defined topics, wherein said subtopics are more specific than said pre-defined topics;
periodically repeating said clustering analysis to identify at least one of a presence of a new subtopic and an absence of an old subtopic, wherein said new subtopic comprises a group of similar data objects unidentified during a previous clustering analysis and identified during a current clustering analysis, and wherein said old subtopic comprises a group of similar data objects identified during said previous clustering analysis and unidentified during said current clustering analysis;
performing at least one of adding said new subtopic to said subtopics and removing said old subtopic from said subtopics; and
after said adding and said removing, identifying said subtopics and classifying said subtopics into said pre-defined topics.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method for categorizing data objects into at least one of relevant categories of topics and sub-topics, said method comprising:
receiving data comprising unstructured data objects;
categorizing said data objects into pre-defined topics, wherein said pre-defined topics do not change;
performing a clustering analysis to identify subtopics of said data objects within said pre-defined topics, wherein said subtopics are more specific than said pre-defined topics;
periodically repeating said clustering analysis to identify at least one of a presence of a new subtopic and an absence of an old subtopic, wherein said new subtopic comprises a group of similar data objects unidentified during a previous clustering analysis and identified during a current clustering analysis, and wherein said old subtopic comprises a group of similar data objects identified during said previous clustering analysis and unidentified during said current clustering analysis;
performing at least one of adding said new subtopic to said subtopics and removing said old subtopic from said subtopics; and
after said adding and said removing, identifying said subtopics and classifying said subtopics into said pre-defined topics.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A program storage device readable by computer, tangibly embodying a program of instructions executable by said computer to perform a method for categorizing data objects into at least one of relevant categories of topics and sub-topics, said method comprising:
receiving data comprising unstructured data objects;
categorizing said data objects into pre-defined topics;
performing a clustering analysis to identify subtopics of said data objects within said pre-defined topics, wherein said subtopics are more specific than said pre-defined topics;
periodically repeating said clustering analysis to identify at least one of a presence of a new subtopic and an absence of an old subtopic, wherein said new subtopic comprises a group of similar data objects unidentified during a previous clustering analysis and identified during a current clustering analysis, and wherein said old subtopic comprises a group of similar data objects identified during said previous clustering analysis and unidentified during said current clustering analysis;
performing at least one of adding said new subtopic to said subtopics and removing said old subtopic from said subtopics; and
after said adding and said removing, identifying said subtopics and classifying said subtopics into said pre-defined topics.
Dependent claims: 16, 17, 18, 19, 20
Specification