Enterprise web mining system and method
Abstract
An enterprise-wide web data mining system, computer program product, and method of operation thereof that uses Internet-based data sources and operates in an automated and cost-effective manner. The enterprise web mining system comprises: a database coupled to a plurality of data sources, the database operable to store data collected from the data sources; a data mining engine coupled to the web server and the database, the data mining engine operable to generate a plurality of data mining models using the collected data; and a server coupled to a network, the server operable to: receive a request for a prediction or recommendation over the network, generate a prediction or recommendation using the data mining models, and transmit the generated prediction or recommendation.
26 Claims
1. A computer-implemented method of enterprise web mining comprising the steps of:
collecting data from a plurality of data sources, including proprietary corporate data comprising at least one of proprietary account or user-based data, external data comprising data acquired from sources external to the system, Web data comprising at least one of Web traffic data, web server application program interface data or Web server log data, and Web transaction data comprising data relating to transactions completed over the Web;
selecting data that is relevant to a desired output from among the collected data by mapping between general attributes and particular features, the selected data having reduced dimensionality relative to the collected data;
pre-processing the selected data by performing at least one of:
removing redundant or irrelevant information from Web server log data, identifying a visitor to a web site from the Web traffic data, reconstructing a session from the Web traffic data, reconstructing a path followed by a visitor in a session from the Web server log data, analyzing a path a whole Website from the Web server log data, converting to filenames from the Web server log data to page titles, and converting IP addresses from the Web traffic data to domain names;
building a plurality of database tables from the pre-processed selected data, wherein the acquired data comprises a plurality of different types of data;
integrating the collected data by forming an integrated database comprising collected data in a coherent format using generated taxonomies to group attributes of the data and using generated profiles of the data;
generating a plurality of data mining models using the collected data; and
generating a prediction or recommendation using at least one of the plurality of generated data mining models, in response to a received request for a recommendation or prediction.
Dependent claims: 2, 3, 4, 5
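The pre-processing alternatives recited in claim 1 (removing irrelevant log entries, identifying a visitor, reconstructing a session and the path followed in it) can be illustrated with a minimal sketch. The claim does not prescribe any particular technique, so everything here is an assumption made for illustration: the log is taken to be in Common Log Format, a visitor is identified by host address alone, a session ends after 30 minutes of inactivity, and image, stylesheet and script requests are treated as irrelevant. The names `reconstruct_sessions`, `SESSION_TIMEOUT` and `IRRELEVANT_SUFFIXES` are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed threshold, not from the claims
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")  # assumed filter


def parse_line(line):
    """Return (host, timestamp, path) for a log line, or None if it is malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), when, match.group("path")


def reconstruct_sessions(lines):
    """Group page requests into (visitor, [path, ...]) sessions."""
    hits_by_visitor = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # drop malformed entries
        host, when, path = parsed
        if path.lower().endswith(IRRELEVANT_SUFFIXES):
            continue  # drop embedded resources, keep page views
        hits_by_visitor[host].append((when, path))

    sessions = []
    for host, hits in hits_by_visitor.items():
        hits.sort()  # chronological order per visitor
        current, last_time = [], None
        for when, path in hits:
            # A long gap between requests starts a new session.
            if last_time is not None and when - last_time > SESSION_TIMEOUT:
                sessions.append((host, current))
                current = []
            current.append(path)
            last_time = when
        if current:
            sessions.append((host, current))
    return sessions
```

Each returned session is the ordered list of pages requested, which is one simple representation of the path followed by a visitor. A production system would likely combine cookies, user agents or login data to identify visitors rather than relying on the host address.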
6. A computer program product for performing an enterprise web mining process in an electronic data processing system, comprising:
a computer readable medium;
computer program instructions, recorded on the computer readable medium, executable by a processor, for performing the steps of:
collecting data from a plurality of data sources, including proprietary corporate data comprising at least one of proprietary account or user-based data, external data comprising data acquired from sources external to the system, Web data comprising at least one of Web traffic data, web server application program interface data or Web server log data, and Web transaction data comprising data relating to transactions completed over the Web;
selecting data that is relevant to a desired output from among the collected data by mapping between general attributes and particular features, the selected data having reduced dimensionality relative to the collected data;
pre-processing the selected data by performing at least one of:
removing redundant or irrelevant information from Web server log data, identifying a visitor to a web site from the Web traffic data, reconstructing a session from the Web traffic data, reconstructing a path followed by a visitor in a session from the Web server log data, analyzing a path a whole Website from the Web server log data, converting to filenames from the Web server log data to page titles, and converting IP addresses from the Web traffic data to domain names;
building a plurality of database tables from the pre-processed selected data, wherein the acquired data comprises a plurality of different types of data;
integrating the collected data by forming an integrated database comprising collected data in a coherent format using generated taxonomies to group attributes of the data and using generated profiles of the data;
generating a plurality of data mining models using the collected data; and
generating a prediction or recommendation using at least one of the plurality of generated data mining models, in response to a received request for a recommendation or prediction.
Dependent claims: 7, 8, 9, 10
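The selecting step recited above (mapping between general attributes and particular features so that the selected data has reduced dimensionality) might be sketched as a pair of lookup tables. This is only an illustration under stated assumptions: the claim does not define the mapping, and the attribute names, feature names, output names and the function `select_features` are all hypothetical.

```python
# Hypothetical mapping from general attributes to particular feature columns.
ATTRIBUTE_TO_FEATURES = {
    "demographics": ["age_band", "region"],
    "purchase_history": ["orders_last_90d", "avg_order_value"],
    "browsing": ["pages_per_session", "last_category_viewed"],
}

# Hypothetical mapping from a desired output to the general attributes
# assumed to be relevant to it.
OUTPUT_TO_ATTRIBUTES = {
    "product_recommendation": ["purchase_history", "browsing"],
    "churn_prediction": ["demographics", "purchase_history"],
}


def select_features(records, desired_output):
    """Keep only the feature columns mapped to the desired output."""
    wanted = [
        feature
        for attribute in OUTPUT_TO_ATTRIBUTES[desired_output]
        for feature in ATTRIBUTE_TO_FEATURES[attribute]
    ]
    # Every other column is dropped, so each selected record has fewer
    # dimensions than the collected record it came from.
    return [{name: record.get(name) for name in wanted} for record in records]
```

For example, calling `select_features(records, "churn_prediction")` on records that carry seven collected columns would return records holding only the four columns mapped to demographics and purchase history.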
11. A system for performing an enterprise web mining process, comprising:
a processor operable to execute computer program instructions; and
a memory operable to store computer program instructions executable by the processor, for performing the steps of:
collecting data from a plurality of data sources, including proprietary corporate data comprising at least one of proprietary account or user-based data, external data comprising data acquired from sources external to the system, Web data comprising at least one of Web traffic data, web server application program interface data or Web server log data, and Web transaction data comprising data relating to transactions completed over the Web;
selecting data that is relevant to a desired output from among the collected data by mapping between general attributes and particular features, the selected data having reduced dimensionality relative to the collected data;
pre-processing the selected data by performing at least one of:
removing redundant or irrelevant information from Web server log data, identifying a visitor to a web site from the Web traffic data, reconstructing a session from the Web traffic data, reconstructing a path followed by a visitor in a session from the Web server log data, analyzing a path a whole Website from the Web server log data, converting to filenames from the Web server log data to page titles, and converting IP addresses from the Web traffic data to domain names;
building a plurality of database tables from the pre-processed selected data, wherein the acquired data comprises a plurality of different types of data;
integrating the collected data by forming an integrated database comprising collected data in a coherent format using generated taxonomies to group attributes of the data and using generated profiles of the data;
generating a plurality of data mining models using the collected data; and
generating a prediction or recommendation using at least one of the plurality of generated data mining models, in response to a received request for a recommendation or prediction.
Dependent claims: 12, 13, 14, 15
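The integrating step recited above refers to generated taxonomies that group attributes of the data and to generated profiles of the data. One possible reading, offered purely as an assumption, is that a taxonomy maps individual pages to broader categories and a profile summarizes how often each visitor touched each category. The taxonomy contents and the names `PAGE_TAXONOMY` and `build_profiles` are hypothetical, and the sketch hard-codes a taxonomy rather than generating one.

```python
from collections import Counter, defaultdict

# Hypothetical taxonomy grouping individual pages under broader categories.
PAGE_TAXONOMY = {
    "/shoes/running.html": "footwear",
    "/shoes/hiking.html": "footwear",
    "/books/fiction.html": "books",
}


def build_profiles(page_views, taxonomy, default="other"):
    """Summarize (visitor_id, page) pairs as per-visitor category counts."""
    profiles = defaultdict(Counter)
    for visitor_id, page in page_views:
        # Pages missing from the taxonomy fall into a catch-all category.
        category = taxonomy.get(page, default)
        profiles[visitor_id][category] += 1
    return {visitor: dict(counts) for visitor, counts in profiles.items()}
```

Rows of this shape could then be stored in one table of the integrated database, keyed by visitor, alongside tables built from the other data sources.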
16. An enterprise web mining system comprising:
a database system coupled to a plurality of data sources, the database system operable to store data collected from the data sources, the data sources including proprietary corporate data comprising at least one of proprietary account or user-based data, external data comprising data acquired from sources external to the system, Web data comprising at least one of Web traffic data, web server application program interface data and Web server log data, and Web transaction data comprising data relating to transactions completed over the Web, the database further operable to select data that is relevant to a desired output from among the collected data by mapping between general attributes and particular features, the selected data having reduced dimensionality relative to the collected data, the database further operable to pre-process the selected data by performing at least one of removing redundant or irrelevant information from Web server log data, identifying a visitor to a web site from the Web traffic data, reconstructing a session from the Web traffic data, reconstructing a path followed by a visitor in a session from the Web server log data, analyzing a path a whole Website from the Web server log data, converting to filenames from the Web server log data to page titles, and converting IP addresses from the Web traffic data to domain names, the database further operable to build a plurality of database tables from the pre-processed selected data, wherein the acquired data comprises a plurality of different types of data, and the database further operable to integrate the collected data by forming an integrated database comprising collected data in a coherent format using generated taxonomies to group attributes of the data and using generated profiles of the data;
a data mining engine coupled to the database, the data mining engine operable to generate a plurality of data mining models using the integrated database;
a server coupled to a network, the server operable to receive a request for a prediction or recommendation over the network, generate a prediction or recommendation using at least one of the data mining models, and transmit the generated prediction or recommendation.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
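The division of labor in claim 16 between a data mining engine that generates models and a server that answers requests can be sketched without any networking code. The claim does not name a model type, so a simple page co-occurrence count stands in for a data mining model here. The JSON request shape and the names `build_cooccurrence_model`, `recommend` and `handle_request` are hypothetical.

```python
import json
from collections import Counter, defaultdict


def build_cooccurrence_model(sessions):
    """Count how often two distinct pages appear in the same session."""
    model = defaultdict(Counter)
    for pages in sessions:
        unique_pages = set(pages)
        for page in unique_pages:
            for other in unique_pages:
                if page != other:
                    model[page][other] += 1
    return model


def recommend(model, page, k=3):
    """Return up to k pages most often seen together with the given page."""
    counts = model.get(page, Counter())
    return [other for other, _ in counts.most_common(k)]


def handle_request(request_body, models):
    """Answer a JSON request using the model it names (assumed request shape)."""
    request = json.loads(request_body)
    model = models[request["model"]]  # pick one of the generated models
    items = recommend(model, request["page"], k=request.get("k", 3))
    return json.dumps({"page": request["page"], "recommendations": items})
```

In this sketch `models` is a dictionary of named models, reflecting the plurality of data mining models in the claim. The string returned by `handle_request` is what a real server would transmit back over the network.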
Specification