Generation and use of an email frequent word list
First Claim
1. A method for generating a mailbox specific frequent word list associated with a mailbox, comprising:
- performing an index scan on catalogs to retrieve search data mapping words to emails containing the words, the search data provided across multiple mailboxes, the search data comprising an inverted index mapping each of the words to one or more email identifiers identifying each of the emails that contain each of the words;
- generating a universal frequent word list of the emails based on the search data, the universal frequent word list comprising the words contained in the search data and a word frequency associated with each of the words across the multiple mailboxes; and
- generating a plurality of mailbox specific frequent word lists based on the universal frequent word list, each of the plurality of mailbox specific frequent word lists corresponding to one of the multiple mailboxes, each of the plurality of mailbox specific frequent word lists comprising words contained in emails of the corresponding one of the multiple mailboxes and a frequency that the words appear in the emails of the corresponding one of the multiple mailboxes, wherein performing the index scan comprises
  - receiving, from an external application, a request for at least one mailbox specific frequent word list of the plurality of mailbox specific frequent word lists,
  - upon receiving the request, determining whether the universal frequent word list has been created,
  - upon determining that the universal frequent word list has not been created, performing the index scan on catalogs to retrieve search data mapping words to emails containing the words,
  - upon determining that the universal frequent word list has been created, determining whether the universal frequent word list is current,
  - upon determining that the universal frequent word list is not current, performing the index scan on catalogs to retrieve search data mapping words to emails containing the words, and
  - upon determining that the universal frequent word list is current, proceeding directly to generating the plurality of mailbox specific frequent word lists based on the universal frequent word list by filtering the words and the corresponding word frequencies associated with the mailbox.
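The claim relies on an inverted index that maps each word to the identifiers of the emails containing it. As a rough illustration of that data structure only, the Python sketch below builds one from a handful of sample emails. The sample data, the tokenizing regular expression, and the name `build_inverted_index` are assumptions made for the example; the claim itself retrieves the index by scanning existing catalogs rather than building it this way.

```python
import re
from collections import defaultdict

# Hypothetical sample data: email identifier -> (mailbox, body text).
emails = {
    "e1": ("alice", "Budget review on Monday"),
    "e2": ("alice", "Budget approved"),
    "e3": ("bob", "Lunch on Monday?"),
}


def build_inverted_index(emails):
    """Map each lowercase word to the sorted list of email ids containing it."""
    index = defaultdict(set)
    for email_id, (_mailbox, body) in emails.items():
        # Assumed tokenizer: runs of letters, digits, and apostrophes.
        for word in re.findall(r"[a-z0-9']+", body.lower()):
            index[word].add(email_id)  # a set keeps each email id once per word
    return {word: sorted(ids) for word, ids in index.items()}


index = build_inverted_index(emails)
print(index["budget"])  # ['e1', 'e2']
print(index["monday"])  # ['e1', 'e3']
```

Each entry lists an email at most once, so the length of an entry is the number of emails that contain the word.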
2 Assignments
0 Petitions
Abstract
Technologies are described herein for generating a mailbox specific frequent word list associated with a mailbox. In one method, an index scan is performed on catalogs to retrieve search data mapping words to emails containing the words. The search data is provided across multiple mailboxes. A universal frequent word list is generated based on the search data. The mailbox specific frequent word list is generated based on the universal frequent word list.
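The two generation steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: email identifiers are assumed to be `(mailbox_id, email_id)` pairs, the universal list is assumed to keep a per-mailbox count for each word so that it can later be filtered, and a word's frequency is taken to be the number of emails containing it. All function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical inverted index: word -> list of (mailbox_id, email_id) pairs.
inverted_index = {
    "budget": [("alice", 1), ("alice", 2), ("bob", 7)],
    "lunch": [("bob", 7), ("bob", 8)],
    "review": [("alice", 2)],
}


def build_universal_list(index):
    """Map each word to per-mailbox email counts (assumed layout)."""
    universal = defaultdict(Counter)
    for word, postings in index.items():
        for mailbox_id, _email_id in set(postings):  # count each email once
            universal[word][mailbox_id] += 1
    return dict(universal)


def total_frequency(universal, word):
    """Frequency of a word across all mailboxes."""
    return sum(universal[word].values())


def mailbox_list(universal, mailbox_id):
    """Filter the universal list down to the words of a single mailbox."""
    return {
        word: counts[mailbox_id]
        for word, counts in universal.items()
        if counts[mailbox_id] > 0
    }


universal = build_universal_list(inverted_index)
print(total_frequency(universal, "budget"))  # 3
print(mailbox_list(universal, "alice"))      # {'budget': 2, 'review': 1}
```

Because the filtering step reads only the universal list, many mailbox specific lists can be produced from a single index scan.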
22 Citations
16 Claims
1. A method for generating a mailbox specific frequent word list associated with a mailbox, as recited in full under First Claim above.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
9. A method for generating a mailbox specific frequent word list associated with a mailbox, comprising:
- receiving, from an external application, a request for the mailbox specific frequent word list from a plurality of mailbox specific frequent word lists;
- upon receiving the request, determining whether a universal frequent word list has been created, the universal frequent word list comprising a mapping of words to corresponding word frequencies across multiple mailboxes;
- upon determining that the universal frequent word list has been created, determining whether the universal frequent word list is current;
- upon determining that the universal frequent word list is not current, performing an index scan on a global catalog to retrieve an inverted index mapping words to email identifiers corresponding to the emails containing the words; and
- upon determining that the universal frequent word list is current, proceeding directly to generating the plurality of mailbox specific frequent word lists based on the universal frequent word list by filtering the words contained in emails associated with the mailbox and the corresponding word frequencies associated with the mailbox;
- upon determining that the universal frequent word list has not been created, performing the index scan on the global catalog to retrieve the inverted index mapping each of the words to the email identifiers corresponding to the emails containing the words;
- generating the universal frequent word list of the emails based on the inverted index; and
- generating the plurality of mailbox specific frequent word lists based on the universal frequent word list by filtering the words and the corresponding word frequencies associated with the mailbox, each of the plurality of mailbox specific frequent word lists corresponding to one of the multiple mailboxes, each of the plurality of mailbox specific frequent word lists comprising words contained in the corresponding one of the multiple mailboxes and the word frequencies that the words appear in the corresponding one of the multiple mailboxes.
- View Dependent Claims (10, 11, 12, 13)
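The request handling recited in claim 9 (scan if the universal list has not been created, scan if it is not current, otherwise go straight to filtering) can be sketched as a small caching service. This is only an illustration under assumptions: the claim does not say how "current" is decided, so the sketch compares a hypothetical catalog version number, and the class name, callables, and `scan_count` counter are invented for the example.

```python
class FrequentWordService:
    """Sketch of the created / current checks; not the patented implementation."""

    def __init__(self, scan_index, index_version):
        self._scan_index = scan_index        # callable: word -> [(mailbox_id, email_id)]
        self._index_version = index_version  # callable: current catalog version (assumed)
        self._universal = None               # word -> {mailbox_id: email count}
        self._built_version = None
        self.scan_count = 0                  # how many index scans were performed

    def _rebuild(self):
        """Scan the index and regenerate the universal frequent word list."""
        self.scan_count += 1
        universal = {}
        for word, postings in self._scan_index().items():
            per_mailbox = {}
            for mailbox_id, _email_id in set(postings):  # count each email once
                per_mailbox[mailbox_id] = per_mailbox.get(mailbox_id, 0) + 1
            universal[word] = per_mailbox
        self._universal = universal
        self._built_version = self._index_version()

    def get_mailbox_list(self, mailbox_id):
        if self._universal is None:                          # not yet created
            self._rebuild()
        elif self._built_version != self._index_version():   # created but not current
            self._rebuild()
        # Created and current: filter the universal list for this mailbox.
        return {
            word: counts[mailbox_id]
            for word, counts in self._universal.items()
            if mailbox_id in counts
        }


# Hypothetical catalog with a version number that changes when emails arrive.
catalog = {"version": 1, "index": {"budget": [("alice", 1), ("bob", 7)]}}
service = FrequentWordService(lambda: catalog["index"], lambda: catalog["version"])
print(service.get_mailbox_list("alice"))  # {'budget': 1}
```

In this sketch the first request triggers a scan, later requests for any mailbox reuse the cached universal list, and a scan runs again only after the catalog version changes.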
14. An apparatus comprising:
- a processor; and
- a computer-readable storage medium having instructions executable by the processor stored thereon, which, when executed by the processor, cause the processor to provide an application programming interface (“API”) for generating a mailbox specific frequent word list associated with a mailbox, the API comprising;
  - a first object adapted to receive, from an external caller application, a request for the mailbox specific frequent word list from a plurality of mailbox specific frequent word lists, upon receiving the request, determine whether the universal frequent word list has been created;
  - a second object adapted to upon determining that the universal frequent word list has not been created, determine whether the universal frequent word list is current, and upon determining that the universal frequent word list is not current, perform an index scan on a global catalog to retrieve an inverted index mapping words to one or more email identifiers identifying each of the emails that contain each of the words;
  - a third object adapted to generate a universal frequent word list of the emails based on the inverted index, the universal frequent word list of the emails comprising a mapping of the words to corresponding word frequencies across multiple mailboxes, each of the word frequencies specifying a number of emails that contain one of the words;
  - a fourth object adapted to, upon determining that the universal frequent word list is current, generate the plurality of mailbox specific frequent word lists based on the universal frequent word list by filtering the words contained in emails associated with the mailbox and the corresponding word frequencies associated with the mailbox, each of the plurality of mailbox specific frequent word lists corresponding to one of the multiple mailboxes, each of the plurality of mailbox specific frequent word lists comprising words contained in the corresponding one of the multiple mailboxes and the word frequencies that the words appear in the emails of the corresponding one of the multiple mailboxes; and
  - a fifth object adapted to transmit the mailbox specific frequent word list to the external caller application in response to the first object receiving the request.
- View Dependent Claims (15, 16)
Specification