Text mining system for web-based business intelligence applied to web site server logs
First Claim
1. A text mining system for providing data representing Internet activities of a visitor to a web site of a business enterprise, comprising:
- a data acquisition process, operable to:
  - extract visitor identification data from a server log of the web site, wherein the visitor identification data identifies a visitor to the web site at a known time;
  - aggregate the visitor identification data with visitor purchase data to provide aggregated visitor data that represents whether a purchase was made from the website by the visitor at or near the known time;
  - extract text documents from Internet-wide text sources, the Internet-wide text sources selected from the group of: newsgroups, discussion forums, and mailing lists to provide visitor related documents; and
  - extract predictive statistics from the aggregated visitor data to provide extracted predictive statistics;
- a server, operable to:
  - receive one or more queries, wherein each query of the one or more queries represents a request for information about the visitor and the visitor related documents; and
  - provide responses to the one or more queries based on the received one or more queries and the aggregated visitor data, the extracted predictive statistics, and the visitor related documents;
- wherein the server is accessible via a web browser over the Internet.
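The first two operations of the data acquisition process can be illustrated with a minimal sketch. The claim does not specify a log layout, a visitor identifier, or how close "at or near the known time" must be, so everything below is an assumption made for illustration: the log is taken to be in Common Log Format, the client host address stands in for the visitor identification data, and a 30-minute window decides whether a purchase counts. The names `extract_visits` and `aggregate_with_purchases` are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed Common Log Format; the claim itself does not name a log layout.
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def extract_visits(log_lines):
    """Return (visitor_id, timestamp, path) tuples from parseable log lines."""
    visits = []
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        host, stamp, _method, path, _status = match.groups()
        visits.append((host, datetime.strptime(stamp, TIME_FORMAT), path))
    return visits


def aggregate_with_purchases(visits, purchases, window=timedelta(minutes=30)):
    """Flag each visit with whether the same visitor purchased within `window`."""
    aggregated = []
    for visitor_id, when, path in visits:
        purchased = any(
            buyer == visitor_id and abs(bought_at - when) <= window
            for buyer, bought_at in purchases
        )
        aggregated.append(
            {"visitor": visitor_id, "time": when, "path": path, "purchased": purchased}
        )
    return aggregated


# Hypothetical sample data using documentation-reserved addresses.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2000:13:55:36 +0000] "GET /products/widget HTTP/1.0" 200 2326',
    '198.51.100.4 - - [10/Oct/2000:14:02:10 +0000] "GET /index.html HTTP/1.0" 200 1043',
    "malformed line",
]
SAMPLE_PURCHASES = [
    ("203.0.113.7", datetime.strptime("10/Oct/2000:14:05:00 +0000", TIME_FORMAT)),
]

visits = extract_visits(SAMPLE_LOG)
aggregated = aggregate_with_purchases(visits, SAMPLE_PURCHASES)
```

In this sketch the window is a parameter because the claim language leaves the tolerance open; a real deployment would likely also need a more robust visitor identifier than a host address.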
Abstract
A text mining system for collecting business intelligence about a client, as well as for identifying prospective customers of the client, for use in a lead generation system accessible by the client via the Internet. The text mining system has various components, including a data acquisition process that extracts textual data from Internet web sites, including their logs, content, processes, and transactions. The system compares log data to content and process data, and relates the results of the comparison to transaction data. This permits the system to provide aggregate cluster data representing statistics useful for customer lead generation.
15 Claims
3. A text mining method for providing data representing Internet activities of a visitor to a website of a business enterprise, comprising:
- extracting visitor identification data from a server log of the web site, the data identifying a visitor to the website at a known time;
- aggregating the visitor identification data with visitor purchase data to provide aggregated visitor data that represents whether a purchase was made from the website by the visitor at or near the same time; and
- extracting text documents from Internet-wide text sources other than the website, the Internet-wide text sources selected from the group of: newsgroups, discussion forums, and mailing lists to provide visitor related documents;
- extracting predictive statistics based on said extracting visitor identification data, said aggregating, and said extracting the text documents to provide extracted predictive statistics;
- receiving one or more queries, wherein each query of the one or more queries represents a request for information about the visitor and the visitor related documents;
- generating results based on the one or more queries and the aggregated visitor data, the extracted predictive statistics, and the visitor related documents; and
- storing the generated results.
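The receiving, generating, and storing steps of this method can be sketched as follows. This is only one possible reading: the claim does not say how a query is expressed or how a document is linked to a visitor, so the sketch assumes a query is simply a visitor identifier and that each record and document already carries a `visitor` field. The function `answer_query` and all data shown are hypothetical.

```python
def answer_query(visitor_id, aggregated, documents, statistics, results_store):
    """Build a response for one visitor and keep a copy in results_store."""
    result = {
        "visitor": visitor_id,
        # Aggregated visitor data for the requested visitor.
        "visits": [row for row in aggregated if row["visitor"] == visitor_id],
        # Visitor related documents (assumed to be pre-linked to a visitor).
        "documents": [doc for doc in documents if doc["visitor"] == visitor_id],
        # Extracted predictive statistics, passed through unchanged.
        "statistics": statistics,
    }
    results_store.append(result)  # the "storing the generated results" step
    return result


# Hypothetical inputs standing in for the outputs of the earlier steps.
AGGREGATED = [
    {"visitor": "v1", "page": "/widget", "purchased": True},
    {"visitor": "v2", "page": "/index", "purchased": False},
]
DOCUMENTS = [
    {"visitor": "v1", "source": "newsgroup", "text": "Looking for widget suppliers."},
    {"visitor": "v2", "source": "mailing list", "text": "Any reviews of this site?"},
]
STATISTICS = {"purchase_rate": 0.5}

stored_results = []
response = answer_query("v1", AGGREGATED, DOCUMENTS, STATISTICS, stored_results)
```

An in-memory list is used for storage purely to keep the sketch self-contained; the claim does not indicate what kind of store is intended.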
5. A method, comprising:
- extracting visitor identification data from a server log of a website of an e-commerce client, wherein the visitor identification data identifies a visitor to the website at a known time;
- aggregating the visitor identification data with information related to at least one of web data or processes occurring at or near the known time to generate aggregated visitor data;
- determining whether the visitor purchased a product at or near the known time;
- storing information regarding the visitor and activity of the visitor based on said determining, operable to be provided to the e-commerce client; and
- extracting predictive statistics from the information regarding the visitor and the activity of the visitor to provide extracted predictive statistics, wherein the extracted predictive statistics are operable to be provided to the e-commerce client.
8. A method, comprising:
- extracting first information comprising visitor identification data from a server log of a website, wherein the visitor identification data identifies a visitor to the website at a known time;
- determining second information related to at least one of web data or processes corresponding to the visitor identification data and occurring at or near the known time, wherein the second information corresponds to activity of the visitor;
- determining third information regarding whether the visitor purchased a product at or near the known time;
- extracting predictive statistics based on the first information, the second information, and the third information to provide extracted predictive statistics;
- storing the first information comprising the visitor identification data, the second information comprising the activity of the visitor, the third information regarding visitor purchase in a memory, and the extracted predictive statistics;
- wherein the first, second, third information, and extracted predictive statistics are useable to evaluate the website.
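The claims do not define what form the "extracted predictive statistics" take. As a hedged illustration only, one simple statistic that combines the first information (visitor), second information (activity, here the page viewed), and third information (purchase or not) is the fraction of visits to each page that were accompanied by a purchase. The function `purchase_rate_by_page` and the record layout are assumptions for this sketch, not something the source specifies.

```python
from collections import defaultdict


def purchase_rate_by_page(records):
    """Map each page to the fraction of its visits accompanied by a purchase."""
    visits = defaultdict(int)
    purchases = defaultdict(int)
    for record in records:
        visits[record["page"]] += 1
        if record["purchased"]:
            purchases[record["page"]] += 1
    # Every page in `visits` has at least one visit, so no division by zero.
    return {page: purchases[page] / visits[page] for page in visits}


# Hypothetical records: visitor (first), page (second), purchased (third).
RECORDS = [
    {"visitor": "v1", "page": "/widget", "purchased": True},
    {"visitor": "v2", "page": "/widget", "purchased": False},
    {"visitor": "v3", "page": "/index", "purchased": False},
    {"visitor": "v4", "page": "/widget", "purchased": True},
]

rates = purchase_rate_by_page(RECORDS)
```

A per-page rate like this is one way such statistics could be "useable to evaluate the website", since it shows which pages tend to coincide with purchases; other statistics would fit the claim language equally well.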
10. A system, comprising:
- a server log that stores information regarding visitors to a website;
- at least one first server operable to:
  - extract visitor identification data from the server log of the website, wherein the visitor identification data identifies a visitor to the website at a known time;
  - determine first information related to at least one of web data or processes corresponding to the visitor identification data and occurring at or near the known time;
  - determine whether the visitor purchased a product at or near the known time based on the visitor identification data and the first information; and
  - extract predictive statistics based on information regarding the visitor and activity of the visitor to provide extracted predictive statistics;
- wherein the at least one first server comprises a memory operable to store the visitor identification data, the first information, the extracted predictive statistics, and information regarding whether the visitor purchased a product at the known time;
- wherein the at least one first server comprises a web server interface accessible by a client web browser to provide statistics regarding visitors to the website.
11. A computer readable memory medium storing program instructions executable by a processor to:
- extract first information comprising visitor identification data from a server log of a website, wherein the visitor identification data identifies a visitor to the website at a known time;
- determine second information related to at least one of web data or processes corresponding to the visitor identification data and occurring at or near the known time, wherein the second information corresponds to activity of the visitor;
- determine third information regarding whether the visitor purchased a product at or near the known time;
- store the first information comprising the visitor identification data, the second information comprising the activity of the visitor, and the third information regarding visitor purchase in a memory;
- wherein the first, second, and third information are useable to evaluate the website; and
- extract predictive statistics from at least one of the first, second, or third information to provide extracted predictive statistics, wherein the extracted predictive statistics are usable to evaluate the website.
Specification