System and method for data extraction from email files
First Claim
1. A method of obtaining data from a messaging file that was generated from a particular messaging environment, the method operated by a system comprising one or more computers, the method comprising the steps of:
identifying a messaging file on an electronic storage media, the messaging file comprising data, the data organized into a plurality of entries;
identifying a plurality of entries within the messaging file by examining the data of the messaging file, the identifying comprising identifying the type of data within at least two entries of the plurality of entries based at least in part on an identifying signature, wherein the type of data of one of the two entries is a calendar item entry and wherein the type of data of the remaining entries is selected from the group comprised of: an email message entry and an address book entry, wherein the identifying operation is performed by the system without the particular messaging environment used to create the messaging file,
wherein when the type of data of at least one entry is an email message entry, the email message entry has at least one attribute selected from the group of: addressee, addressor, copies, title, content, and time sent;
wherein when the type of data of at least one entry is an address book entry, the address book entry has at least one attribute selected from the group of: name, address, telephone number, fax number, mobile (cellular) number, and email address;
wherein when the type of data of at least one entry is a calendar item, the calendar item entry has at least one attribute selected from the group of: start time, end time, title, location or reminder;
accessing information to determine a logical format for the data corresponding to the type of data of the entry and to determine a location within the messaging file of an attribute of the entry;
extracting the data corresponding to the plurality of entries identified within the messaging file, wherein the extracting operation is performed by the system without the particular messaging environment for the messaging file; and
storing the data extracted in a different format, wherein the storing operation is performed by the system without the particular messaging environment for the messaging file.
Abstract
A method and system can be used to read and obtain data from messaging files regardless of the messaging environment used to generate the messaging files. The method and system can read part of a messaging file to identify the type of entry (e.g., email message, calendar item, address book entry, etc.) and access information on where information within the entry is located within the messaging file based on an identifying signature. The method and system can be used to obtain data from messaging files without having to recreate the messaging environment, including individual email accounts. The data can be stored in a target storage medium in a format that is more usable and more easily searched.
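The signature-driven flow described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the container layout (a 4-byte signature, a 4-byte big-endian payload length, then length-prefixed UTF-8 fields), the signature values `MSG1`, `CAL1` and `ADR1`, and the names `LAYOUTS`, `pack_entry` and `read_entries` are all hypothetical and do not come from the patent or from any real messaging format.

```python
import struct

# Hypothetical lookup table: signature -> (entry type, attribute order).
# None of these byte values or layouts come from the patent.
LAYOUTS = {
    b"MSG1": ("email", ["addressor", "addressee", "title", "content"]),
    b"CAL1": ("calendar", ["title", "start_time", "end_time", "location"]),
    b"ADR1": ("address_book", ["name", "email_address", "telephone_number"]),
}


def pack_entry(signature, values):
    """Build one entry of the toy format (used only to create sample data)."""
    payload = b""
    for value in values:
        raw = value.encode("utf-8")
        payload += struct.pack(">H", len(raw)) + raw
    return signature + struct.pack(">I", len(payload)) + payload


def read_entries(blob):
    """Walk the raw bytes, classify each entry by signature, extract attributes."""
    entries = []
    offset = 0
    while offset + 8 <= len(blob):
        signature = blob[offset:offset + 4]
        (size,) = struct.unpack_from(">I", blob, offset + 4)
        payload = blob[offset + 8:offset + 8 + size]
        offset += 8 + size

        layout = LAYOUTS.get(signature)
        if layout is None:
            continue  # unknown signature: skip the entry

        entry_type, field_names = layout
        record = {"type": entry_type}
        pos = 0
        for name in field_names:
            # The layout table says which attribute sits at this position.
            (length,) = struct.unpack_from(">H", payload, pos)
            pos += 2
            record[name] = payload[pos:pos + length].decode("utf-8")
            pos += length
        entries.append(record)
    return entries
```

Calling `read_entries` on bytes produced by `pack_entry` returns plain dictionaries, so nothing from the originating mail client is needed to interpret the result. A real messaging file would need its own signature table and offset rules.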
18 Claims
1. A method of obtaining data from a messaging file that was generated from a particular messaging environment, the method operated by a system comprising one or more computers, the method comprising the steps of:
identifying a messaging file on an electronic storage media, the messaging file comprising data, the data organized into a plurality of entries;
identifying a plurality of entries within the messaging file by examining the data of the messaging file, the identifying comprising identifying the type of data within at least two entries of the plurality of entries based at least in part on an identifying signature, wherein the type of data of one of the two entries is a calendar item entry and wherein the type of data of the remaining entries is selected from the group comprised of: an email message entry and an address book entry, wherein the identifying operation is performed by the system without the particular messaging environment used to create the messaging file,
wherein when the type of data of at least one entry is an email message entry, the email message entry has at least one attribute selected from the group of: addressee, addressor, copies, title, content, and time sent;
wherein when the type of data of at least one entry is an address book entry, the address book entry has at least one attribute selected from the group of: name, address, telephone number, fax number, mobile (cellular) number, and email address;
wherein when the type of data of at least one entry is a calendar item, the calendar item entry has at least one attribute selected from the group of: start time, end time, title, location or reminder;
accessing information to determine a logical format for the data corresponding to the type of data of the entry and to determine a location within the messaging file of an attribute of the entry;
extracting the data corresponding to the plurality of entries identified within the messaging file, wherein the extracting operation is performed by the system without the particular messaging environment for the messaging file; and
storing the data extracted in a different format, wherein the storing operation is performed by the system without the particular messaging environment for the messaging file.
Dependent claims: 2, 3, 4, 5, 6
7. A data processing system having code embodied therein for obtaining data from a messaging file that was generated from a particular messaging environment of incompatible hardware and software configuration to that of the data processing system obtaining the data, the data processing system comprising:
a memory configured to store the code;
a processor for performing operations to identify a messaging file in accordance with the code, wherein the code further comprises:
at least one instruction for identifying a messaging file on an electronic storage media, the messaging file comprising data, the data organized into a plurality of entries;
at least one instruction for identifying a plurality of entries within the messaging file by examining the data of the messaging file, the identifying comprising identifying the type of data within at least two entries of the plurality of entries based at least in part on an identifying signature, wherein the type of data of one of the two entries is a calendar item entry and wherein the type of data of the remaining entries is selected from the group comprised of: an email message entry and an address book entry, wherein the identifying operation is performed by the processor without the particular messaging environment used to create the messaging file,
wherein when the type of data of at least one entry is an email message entry, the email message entry has at least one attribute selected from the group of: addressee, addressor, copies, title, content, and time sent;
wherein when the type of data of at least one entry is an address book entry, the address book entry has at least one attribute selected from the group of: name, address, telephone number, fax number, mobile (cellular) number, and email address;
wherein when the type of data of at least one entry is a calendar item, the calendar item entry has at least one attribute selected from the group of: start time, end time, title, location or reminder;
at least one instruction for accessing information to determine a logical format for the data corresponding to the type of data of the entry and to determine a location within the messaging file of an attribute of the entry;
at least one instruction for extracting the data corresponding to the entry, wherein the extraction is performed by the system that lacks the particular messaging environment for the messaging file; and
at least one instruction for storing the data in a different format, wherein the storage is performed by the system that lacks the particular messaging environment for the messaging file.
Dependent claims: 8, 9, 10, 11, 12
13. A system for reading and obtaining data from a plurality of hardware and software systems used to operate communications generated in heterogeneous environments, the system comprising:
one or more computers with processing capability, the processing capability identifying a communication file on an electronic storage medium, the communication file comprising data, the data organized into a plurality of entries;
the processing capability further configured to identify a plurality of entries within the communication file, the identifying comprising identifying the type of data within at least two entries of the plurality of entries based at least in part on an identifying signature, wherein the type of data of one of the two entries is a calendar item entry and wherein the type of data of the remaining entries is selected from the group comprised of: an email message entry and an address book entry, wherein the identifying operation is performed by the processing capability without the particular environment used to create the communication file,
wherein when the type of data of at least one entry is an email message entry, the email message entry has at least one attribute selected from the group of: addressee, addressor, copies, title, content, and time sent;
wherein when the type of data of at least one entry is an address book entry, the address book entry has at least one attribute selected from the group of: name, address, telephone number, fax number, mobile (cellular) number, and email address;
wherein when the type of data of at least one entry is a calendar item, the calendar item entry has at least one attribute selected from the group of: start time, end time, title, location or reminder;
the processing capability further configured to access information to determine the logical format for the data corresponding to the type of data of the entry and to determine a location within the communication file of an attribute of the entry;
an extraction engine configured to extract data corresponding to the entry identified within the communication file, from any environment; and
a target storage medium for storing the data extracted in a format, different from the format in which the communication was generated, to enable users to interpret the data without reliance on the communication environment used to generate the communication file.
Dependent claims: 14, 15, 16, 17, 18
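As a rough illustration of the target storage medium in claim 13, extracted entries could be written to a searchable store that has no dependency on the originating environment. The sketch below uses SQLite from the Python standard library, which is an assumption made for the example; the claims do not name a storage format. The table name `attributes` and the functions `store_entries` and `search` are hypothetical.

```python
import sqlite3


def store_entries(conn, entries):
    """Write extracted entries as (entry_id, entry_type, name, value) rows.

    `entries` is a list of dicts with a "type" key plus attribute keys.
    The schema is an assumption chosen for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attributes ("
        "entry_id INTEGER, entry_type TEXT, name TEXT, value TEXT)"
    )
    for entry_id, entry in enumerate(entries):
        for name, value in entry.items():
            if name == "type":
                continue  # the type is stored in its own column
            conn.execute(
                "INSERT INTO attributes VALUES (?, ?, ?, ?)",
                (entry_id, entry["type"], name, value),
            )
    conn.commit()


def search(conn, text):
    """Return (entry_id, entry_type) for entries with an attribute containing text."""
    rows = conn.execute(
        "SELECT DISTINCT entry_id, entry_type FROM attributes "
        "WHERE value LIKE ? ORDER BY entry_id",
        (f"%{text}%",),
    )
    return rows.fetchall()
```

Once the attributes are in ordinary rows, a search such as `search(conn, "budget")` runs against the database alone. Any other neutral format (JSON, CSV, a full-text index) would serve the same purpose.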
Specification