Management of broadcast-distributed data entities
First Claim
1. A method in a data processing system having a local storage containing at least one electronic mailbox having one or more electronic mail messages, the method comprising:
initiating a storage compaction operation to reduce the size of the at least one electronic mailbox, wherein the storage compaction operation replaces the one or more electronic mail messages of the at least one electronic mailbox with identifying information for locating and retrieving the one or more electronic mail messages;
determining, for a particular message of the one or more electronic mail messages in a local mailbox of the at least one electronic mailbox maintained at the local storage, whether at least one copy of the particular message exists in a remote location outside of the local mailbox, wherein the remote location is not maintained by a mail server, wherein the determining further comprises;
the data processing system invoking an internet search engine to locate the at least one copy of the particular message on an Internet, using, as a search query in the internet search engine, one or more statistically improbable words from the particular message;
wherein the determining whether at least one copy of the particular message exists in a remote location further comprises;
invoking the internet search engine to locate the at least one copy of the particular message on the Internet, using, as a search query in the search engine, one or more phrases from the particular message;
identifying a potential copy of the particular message;
determining whether the potential copy is formatted differently than the particular message; and
in response to determining the potential copy is formatted differently than the particular message;
determining whether a human-intended portion of content within the potential copy is the same as a human-intended portion of content within the particular message; and
in response to determining the human-intended portion of content within the potential copy is the same as the human-intended portion of content within the particular message, identifying the potential copy as a copy of the particular message;
in response to a determination that the at least one copy of the particular message exists in the remote location, a processor of the data processing system replacing the particular message in the local mailbox with identifying information for locating and retrieving the at least one copy of the particular message from the remote location, wherein the identifying information comprises at least an address information of the remote location;
receiving a request for the particular message from a client system accessing the local mailbox;
in response to receiving the request for the particular message;
determining whether the particular message has been replaced in the local mailbox with the identifying information;
in response to determining that the particular message has been replaced in the local mailbox with the identifying information, retrieving a copy of the particular message by utilizing the identifying information to identify the remote location and retrieve the copy of the particular message from the remote location;
comparing the located at least one copy with the particular message to determine if the at least one copy is a match to the particular message by applying an algorithm on the at least one copy that enables the processor to verify an integrity of the at least one copy of the particular message during subsequent retrieval of the at least one copy of the particular message from the remote location; and
in response to receiving a request for the particular message and the identifying information comprising a Uniform Resource Locator (URL), accessing the URL on the Internet to retrieve the message content from a web page identified as located at the URL;
converting the copy of the particular message into an electronic mail message format; and
returning the retrieved copy of the particular message directly to the client system in fulfillment of the request.
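The two matching steps in the claim above (building a search query from statistically improbable words, then comparing the human-intended content of a differently formatted copy) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the claim does not say how improbability is measured or how human-intended content is isolated, so `BACKGROUND_FREQ`, `DEFAULT_FREQ`, `improbable_words`, `human_intended_text`, and `same_human_content` are hypothetical, and the regex-based tag stripping is deliberately crude.

```python
import re
from html import unescape
from typing import List

# Hypothetical background frequencies (occurrences per million words).
# A real system would draw these from a large corpus; values are illustrative.
BACKGROUND_FREQ = {
    "the": 50000.0, "a": 25000.0, "in": 20000.0,
    "patch": 40.0, "kernel": 15.0, "scheduler": 5.0,
}
DEFAULT_FREQ = 0.5  # assumed frequency for words missing from the table


def improbable_words(text: str, k: int = 3) -> List[str]:
    """Return the k distinct words of `text` that are rarest in the background table."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    # Sort by background frequency, breaking ties alphabetically for determinism.
    ranked = sorted(words, key=lambda w: (BACKGROUND_FREQ.get(w, DEFAULT_FREQ), w))
    return ranked[:k]


def human_intended_text(raw: str) -> str:
    """Crudely reduce plain text or HTML to the text a human reader would see."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)   # drop anything that looks like markup
    text = unescape(no_tags)                 # decode entities such as &amp;
    return " ".join(text.split())            # collapse whitespace and line breaks


def same_human_content(local_message: str, potential_copy: str) -> bool:
    """Treat two differently formatted texts as copies if their visible text matches."""
    return human_intended_text(local_message) == human_intended_text(potential_copy)
```

The words returned by `improbable_words` would be passed to whatever search interface is available; the sketch makes no network calls. A production comparison would likely also need to handle quoting markers, signatures, and archive-added headers, which this sketch ignores.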
Abstract
A method, computer program product, and data processing system for reducing the storage needed for broadcast-distributed data entities, such as electronic mail messages from a mailing list, are disclosed. Locally stored data entities that are determined to have corresponding copies elsewhere are replaced with identifying information that allows the corresponding copies to be retrieved. In a preferred embodiment, locally stored electronic mail messages in an electronic mail server that are determined to come from archived mailing lists are periodically replaced with one or more Uniform Resource Locators (URLs) of archived copies of those messages. When a request from a mail client to download such an electronic mail message is received, the message is reconstructed from the archived copy and returned to the client, rather than being retrieved from local storage.
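The replace-then-reconstruct cycle described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Stub` record, the in-memory `Mailbox` dictionary, and the injected callables `find_archive_url` and `fetch_text` are all hypothetical stand-ins for the archive lookup and the retrieval step.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import Callable, Dict, Optional, Union


@dataclass
class Stub:
    """Identifying information kept in place of a full message (hypothetical layout)."""
    url: str
    subject: str
    sender: str


Mailbox = Dict[str, Union[EmailMessage, Stub]]


def compact(mailbox: Mailbox,
            find_archive_url: Callable[[EmailMessage], Optional[str]]) -> int:
    """Replace every message that has an archived copy with a Stub; return the count."""
    replaced = 0
    for msg_id, item in list(mailbox.items()):
        if isinstance(item, Stub):
            continue  # already compacted on an earlier pass
        url = find_archive_url(item)
        if url is not None:
            mailbox[msg_id] = Stub(url=url,
                                   subject=str(item["Subject"] or ""),
                                   sender=str(item["From"] or ""))
            replaced += 1
    return replaced


def fetch_message(mailbox: Mailbox, msg_id: str,
                  fetch_text: Callable[[str], str]) -> EmailMessage:
    """Return the stored message, or rebuild it from the archived copy if it was replaced."""
    item = mailbox[msg_id]
    if isinstance(item, EmailMessage):
        return item
    rebuilt = EmailMessage()
    if item.subject:
        rebuilt["Subject"] = item.subject
    if item.sender:
        rebuilt["From"] = item.sender
    rebuilt.set_content(fetch_text(item.url))  # body comes from the remote copy
    return rebuilt
```

Passing the lookup and retrieval steps in as callables keeps the sketch free of network access; in a server they might wrap a search query and an HTTP fetch, and `compact` might be run on a schedule to match the periodic replacement the abstract mentions.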
13 Citations
19 Claims
1. A method in a data processing system having a local storage containing at least one electronic mailbox having one or more electronic mail messages (full text reproduced above under First Claim).
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product comprising a computer readable storage device having functional descriptive material stored thereon that, when executed by a computer having at least one electronic mailbox containing a plurality of electronic mail messages, causes the computer to perform actions that include:
initiating a storage compaction operation to reduce the size of the at least one electronic mailbox, wherein the storage compaction operation replaces the one or more electronic mail messages of the at least one electronic mailbox with identifying information for locating and retrieving the one or more electronic mail messages;
determining, for a particular message of the one or more electronic mail messages stored in a local mailbox of the at least one electronic mailbox, whether at least one copy of the particular message exists in a remote location outside of the local mailbox, wherein the remote location is not maintained by a mail server, wherein the determining further comprises;
invoking an internet search engine to locate the at least one copy of the particular message on an Internet, using, as a search query in the internet search engine, one or more statistically improbable words from the particular message;
identifying a potential copy of the particular message;
determining whether the potential copy is formatted differently than the particular message; and
in response to determining the potential copy is formatted differently than the particular message;
determining whether a human-intended portion of content within the potential copy is the same as a human-intended portion of content within the particular message; and
in response to determining the human-intended portion of content within the potential copy is the same as the human-intended portion of content within the particular message, identifying the potential copy as a copy of the particular message;
in response to a determination that the at least one copy of the particular message exists in the remote location, replacing the particular message within the local mailbox with identifying information for locating, confirming and retrieving the at least one copy of the particular message from the remote location, wherein the identifying information comprises at least address information of the remote location;
receiving a request for the particular message from a client system accessing the local mailbox;
in response to receiving the request for the particular message;
determining whether the particular message has been replaced in the local mailbox with the identifying information;
in response to determining that the particular message has been replaced in the local mailbox with the identifying information, retrieving a copy of the particular message by utilizing the identifying information to identify the remote location and retrieve the copy of the particular message from the remote location;
converting the copy of the particular message into an electronic mail message format;
returning the retrieved copy of the particular message directly to the client system in fulfillment of the request;
comparing the located at least one copy with the particular message to determine if the at least one copy is a match to the particular message by applying an algorithm on the at least one copy that enables the processor to verify an integrity of the at least one copy of the particular message during subsequent retrieval of the at least one copy of the particular message from the remote location; and
in response to receiving a request for the particular message and the identifying information comprising a Uniform Resource Locator (URL), accessing the URL on the Internet to retrieve the message content from a web page identified as located at the URL.
Dependent claims: 10, 11, 12, 13, 14, 15
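Claim 9 recites identifying information for "locating, confirming and retrieving" the copy, together with an algorithm that lets the processor verify the copy's integrity on later retrieval. One plausible reading, sketched below, is to store a content digest next to the address at replacement time and recompute it on retrieval. The claim does not name an algorithm; SHA-256, the whitespace normalization, and the names `IdentifyingInfo`, `content_digest`, `make_identifying_info`, and `verify_retrieved` are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


def content_digest(text: str) -> str:
    """Digest of the text after collapsing whitespace (an assumed normalization)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class IdentifyingInfo:
    """Address of the remote copy plus a digest used to confirm it later."""
    url: str
    digest: str


def make_identifying_info(url: str, verified_copy_text: str) -> IdentifyingInfo:
    """Record where the copy lives and what it looked like when it was matched."""
    return IdentifyingInfo(url=url, digest=content_digest(verified_copy_text))


def verify_retrieved(info: IdentifyingInfo, retrieved_text: str) -> bool:
    """True if the text fetched later still matches the copy that was matched."""
    return content_digest(retrieved_text) == info.digest
```

Hashing the normalized text rather than the raw bytes tolerates re-wrapped lines at the remote site, at the cost of not detecting whitespace-only changes. A failed check would signal that the remote copy has changed since the local message was replaced.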
16. A data processing system comprising:
at least one processor;
a memory coupled to the at least one processor;
data storage accessible to the at least one processor;
at least one electronic mailbox in the data storage, wherein the at least one electronic mailbox stores multiple electronic mail messages; and
a set of instructions in the data storage executing on the at least one processor that;
initiates a storage compaction operation to reduce the size of the at least one electronic mailbox, wherein the storage compaction operation replaces the one or more electronic mail messages of the at least one electronic mailbox with identifying information for locating and retrieving the one or more electronic mail messages;
determines, for a particular message of the one or more electronic mail messages contained within a local mailbox of the at least one electronic mailbox, whether at least one copy of the particular message exists at the remote location that is outside of the local mailbox, wherein the remote location is not maintained by a mail server; and
invokes an internet search engine to locate the at least one copy of the particular message on an Internet, using, as a search query in the internet search engine, one or more statistically improbable words from the particular message;
identifies a potential copy of a particular message;
determines whether the potential copy is formatted differently than the particular message;
in response to determining the potential copy is formatted differently than the particular message;
determines whether a human-intended portion of content within the potential copy is the same as a human-intended portion of content within the particular message; and
in response to determining the human-intended portion of content within the potential copy is the same as the human-intended portion of content within the particular message, identifies the potential copy as a copy of the particular message; and
in response to a determination that at least one copy of the particular message exists in the remote location, replaces the particular message within the local mailbox with identifying information for locating and retrieving the at least one copy of the particular message from the remote location, wherein the identifying information comprises at least an address information of the remote location;
receives a request for the particular message from a client system accessing the local mailbox;
in response to receiving the request for the particular message;
determines whether the particular message has been replaced in the local mailbox with the identifying information;
in response to determining that the particular message has been replaced in the local mailbox with the identifying information, retrieves a copy of the particular message by utilizing the identifying information to identify the remote location and retrieve the copy of the particular message from the remote location;
compares the located at least one copy with the particular message to determine if the at least one copy is a match to the particular message by applying an algorithm on the at least one copy that enables the processor to verify an integrity of the at least one copy of the particular message during subsequent retrieval of the at least one copy of the particular message from the remote location; and
in response to receiving a request for the particular message and the identifying information comprising a Uniform Resource Locator (URL), accesses the URL on the Internet to retrieve the message content from a web page identified as located at the URL.
Dependent claims: 17, 18, 19
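The final limitation, retrieving message content from a web page at a URL, implies turning archive HTML back into something a mail client can accept. A minimal sketch using only the Python standard library is shown below. The claim does not specify how the page is parsed, so the `_TextExtractor` class and `page_to_email` function are hypothetical, and the sketch assumes the page text has already been downloaded and that the subject and sender were retained locally.

```python
from email.message import EmailMessage
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect visible text from a page, ignoring script and style contents."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def page_to_email(html_page: str, subject: str, sender: str) -> EmailMessage:
    """Build an email message whose body is the visible text of an archive page."""
    parser = _TextExtractor()
    parser.feed(html_page)
    parser.close()
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg.set_content("\n".join(parser.chunks))
    return msg
```

A real archive page would also carry navigation links, thread indexes, and footers, so an actual converter would need site-specific rules to isolate the message body; this sketch keeps all visible text.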
Specification