Automatic organization of documents through email clustering
Abstract
A system that facilitates organization of emails comprises a clustering component that clusters a plurality of emails and creates topics for emails by assigning key phrases extracted from emails within one or more clusters. An organization component then utilizes the key phrases to organize documents. Furthermore, the organization component can comprise a probability component that determines a probability that a document belongs to a certain topic.
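The abstract does not say how the probability component computes its value. A minimal sketch of one possibility, assuming plain-text document content and scoring each topic by the share of key-phrase matches it accounts for, is shown here. The function name `topic_probabilities` and the sample topics are hypothetical.

```python
def topic_probabilities(document_text, topics):
    """Estimate, per topic, the probability that a document belongs to it.

    Assumption (not from the patent): a topic's score is the number of its
    key phrases found in the document, normalized over all topics.
    """
    text = document_text.lower()
    hits = {
        name: sum(1 for phrase in phrases if phrase.lower() in text)
        for name, phrases in topics.items()
    }
    total = sum(hits.values())
    if total == 0:
        # No key phrase matched, so there is no evidence for any topic.
        return {name: 0.0 for name in topics}
    return {name: count / total for name, count in hits.items()}


# Hypothetical topics, each characterized by key phrases taken from emails.
topics = {
    "budget": ["quarterly budget", "forecast"],
    "offsite": ["team offsite", "hotel booking"],
}
doc = "Draft forecast for the quarterly budget, plus hotel booking notes"
probabilities = topic_probabilities(doc, topics)
```

Here two of the three matched phrases belong to the "budget" topic, so that topic receives the larger share of the probability mass.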
20 Claims
1. A system comprising:
computer-readable storage media having processor-executable instructions embodied therein; and
a processor operatively coupled to the computer-readable storage media to execute the processor-executable instructions for implementing computer-executable components comprising:
a clustering component that receives and clusters a plurality of emails, the clustering component automatically determining and creating topics for the emails by assigning key phrases extracted from the emails within one or more clusters, the topics being cohesive concepts relevant to a user associated with the emails;
an organization component that utilizes the topics created from the emails to organize the emails and to also organize other documents of the user within a graphical user interface, the other documents having been stored in a data store separately from the emails and comprising at least one of word processing documents, spreadsheets, presentation files, video files, audio files or digital images; and
an interface component that displays the topics defined by the clustering component and information pertaining to the organized emails and the other documents associated with the topics.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method comprising:
employing a processor executing computer executable instructions stored on a computer-readable storage medium to implement the following acts:
receiving a plurality of emails;
clustering the plurality of emails into multiple clusters;
performing key phrase extraction upon emails within at least one of the clusters;
characterizing a topic with one or more extracted key phrases, the topic being a cohesive concept that is relevant to a user associated with the plurality of emails, the topic being at least one of: an activity in which the user participates, an event the user organized or attended, a person or group of people within an organization to which the user belongs, or a project; and
automatically organizing non-email documents of the user stored in a first data store and the plurality of emails based upon the topics characterized with the one or more extracted key phrases from the emails, the non-email documents stored in the data store being organized by comparing content of each non-email document with the key phrases extracted from the multiple clusters of the plurality of emails for associating each non-email document with one or more of the topics, the non-email documents comprising at least one of: word processing documents, spreadsheets, presentation files, video files, audio files or digital images.
Dependent claims: 15, 16, 17, 18
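The acts of claim 14 can be illustrated with a small sketch. It assumes the emails are already clustered, uses the most frequent word pair in a cluster as a stand-in for key phrase extraction (the claim does not name an extraction technique), and assumes text content or metadata is available for each non-email document. The stopword list, function names, and sample data are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the patent).
STOPWORDS = {"the", "a", "an", "of", "to", "for", "and", "in", "on", "is", "with"}


def tokens(text):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def key_phrases(cluster_emails, top_n=1):
    """Return the word pairs that occur in the most emails of one cluster."""
    counts = Counter()
    for email in cluster_emails:
        words = tokens(email)
        # Count each pair once per email so one long email cannot dominate.
        counts.update(set(zip(words, words[1:])))
    # Sort by descending count, then alphabetically, for a stable result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [" ".join(pair) for pair, _ in ranked[:top_n]]


def associate(documents, topics):
    """Map each non-email document to the topics whose key phrases it contains."""
    result = {}
    for name, text in documents.items():
        normalized = " " + " ".join(tokens(text)) + " "
        result[name] = [
            topic for topic, phrases in topics.items()
            if any(f" {phrase} " in normalized for phrase in phrases)
        ]
    return result


clusters = {
    "c1": ["Budget review moved to Friday", "Slides for the budget review attached"],
    "c2": ["Ski trip signup closes soon", "Carpool for the ski trip"],
}
topics = {cid: key_phrases(emails) for cid, emails in clusters.items()}
documents = {
    "q3.xlsx": "Numbers for the budget review",
    "packing.docx": "Packing list for ski trip",
}
assignments = associate(documents, topics)
```

Documents are tokenized the same way as the emails so that a phrase extracted from a cluster can be compared directly against document content.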
19. A computer-implemented method comprising:
employing at least one processor that executes computer executable code stored in computer-readable storage media to effect the following:
receiving a first plurality of emails stored in a first storage device in association with a web-based email account;
receiving a second plurality of emails stored in a second storage device associated with an email application installed on a computer including the processor;
clustering the first and second plurality of emails, the clustering employing multistage clustering that runs a first cluster technique on the first and second plurality of emails to form one or more clusters, and then runs a second cluster technique on the one or more clusters initialized from the first technique, the second cluster technique being different from the first cluster technique and further refining the one or more clusters so as to facilitate key phrase extraction from emails within the one or more clusters;
extracting key phrases from the emails within the one or more clusters and labeling each of the one or more clusters with a subset of the extracted key phrases for establishing a topic for each of the one or more clusters;
performing post processing on the topics to remove certain key phrases associated with the topics based on a determination that the certain key phrases occur in multiple emails and are not representative of a topic;
organizing non-email documents and the emails within a graphical user interface based upon the topics generated from the clustered emails, the non-email documents being organized by comparing content of each non-email document with the key phrases extracted from the multiple clusters of the plurality of emails for associating each non-email document with one or more of the topics, the non-email documents comprising at least one of: word processing documents, spreadsheets, presentation files, video files, audio files or digital images; and
rendering information related to the organized non-email documents and emails.
Dependent claims: 20
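Claim 19 does not name the two cluster techniques or the post-processing criterion. The sketch below makes assumptions for all three: the first stage groups emails by normalized subject line, the second stage merges clusters whose vocabularies overlap (Jaccard similarity), and post processing drops any key phrase that labels more than one topic. All function names, the 0.3 threshold, and the sample emails are hypothetical.

```python
import re


def normalize_subject(subject):
    """Strip leading reply/forward prefixes such as "Re:" or "Fwd:"."""
    return re.sub(r"^\s*((re|fw|fwd)\s*:\s*)+", "", subject.lower()).strip()


def first_stage(emails):
    """Stage 1 (assumed technique): group emails by normalized subject."""
    groups = {}
    for email in emails:
        groups.setdefault(normalize_subject(email["subject"]), []).append(email)
    return list(groups.values())


def vocabulary(cluster):
    """Set of lowercase words used in the subjects and bodies of a cluster."""
    text = " ".join(e["subject"] + " " + e["body"] for e in cluster)
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def second_stage(clusters, threshold=0.3):
    """Stage 2 (assumed technique): merge clusters with similar vocabulary."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if jaccard(vocabulary(clusters[i]), vocabulary(clusters[j])) >= threshold:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters


def prune_shared_phrases(topics):
    """Post processing (assumed criterion): drop phrases shared across topics."""
    seen = {}
    for phrases in topics.values():
        for phrase in set(phrases):
            seen[phrase] = seen.get(phrase, 0) + 1
    return {name: [p for p in phrases if seen[p] == 1] for name, phrases in topics.items()}


# First plurality (web-based account) and second plurality (local application).
web = [
    {"subject": "Budget review", "body": "Agenda for the budget review on Friday"},
    {"subject": "Ski trip", "body": "Signup sheet for the ski trip"},
]
local = [
    {"subject": "RE: Budget review", "body": "Friday works for the budget review"},
    {"subject": "Budget review agenda", "body": "Updated agenda for the budget review on Friday"},
]
refined = second_stage(first_stage(web + local))
labels = prune_shared_phrases({
    "budget": ["budget review", "best regards"],
    "trip": ["ski trip", "best regards"],
})
```

In this example the first stage leaves "Budget review agenda" in its own cluster because its subject differs, and the second stage folds it into the budget cluster. The signature phrase "best regards" is pruned because it appears under both topics.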
Specification