PERSONALIZATION ENGINE FOR CLASSIFYING UNSTRUCTURED DOCUMENTS
Abstract
Unstructured electronic documents are classified for profiling and targeting users for additional relevant content. Behavioral data is gathered from user activity, and user documents and actions are categorized. Profile information is combined with collaborative and editorial data to provide users with credible information regarding products. Author-generated document classification information is analyzed and assigned a first taxonomic noun to characterize the document. User-generated tags characterizing a portion of the document are assigned a second taxonomic noun. Search terms that resulted in the user accessing the document are identified and assigned a third taxonomic noun. Attributes related to how the document was accessed are evaluated and assigned a fourth taxonomic noun. The document is processed using pattern rules to extract a fifth taxonomic noun. The taxonomic nouns are aggregated to determine term vectors representing the document, and the document is categorized using the term vectors, the taxonomic nouns, or the author-generated classification.
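The aggregation and categorization steps summarized in the abstract can be illustrated with a minimal sketch. It assumes that a term vector is a weighted count of taxonomic nouns and that categorization picks the category whose reference vector is closest by cosine similarity. Neither choice is stated in the abstract, and the names `SOURCE_WEIGHTS`, `aggregate_term_vector`, `cosine`, and `categorize` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical per-source weights; the abstract does not specify any weighting.
SOURCE_WEIGHTS = {
    "author": 1.0,        # first taxonomic nouns (author-generated classification)
    "user_tag": 1.0,      # second taxonomic nouns (user-generated tags)
    "search_term": 1.0,   # third taxonomic nouns (search terms)
    "access": 1.0,        # fourth taxonomic nouns (access attributes)
    "pattern_rule": 1.0,  # fifth taxonomic nouns (pattern-rule extraction)
}


def aggregate_term_vector(nouns_by_source):
    """Combine the taxonomic nouns from every source into one weighted term vector."""
    vector = Counter()
    for source, nouns in nouns_by_source.items():
        weight = SOURCE_WEIGHTS.get(source, 1.0)
        for noun in nouns:
            vector[noun.lower()] += weight
    return dict(vector)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(term_vector, category_vectors):
    """Return the name of the category whose reference vector is most similar."""
    return max(category_vectors, key=lambda name: cosine(term_vector, category_vectors[name]))


# Illustrative data only.
nouns = {
    "author": ["camera", "photography"],
    "user_tag": ["camera"],
    "search_term": ["lens"],
    "access": ["mobile"],
    "pattern_rule": ["camera", "lens"],
}
categories = {
    "photography": {"camera": 1.0, "lens": 1.0},
    "phones": {"mobile": 1.0, "battery": 1.0},
}
vector = aggregate_term_vector(nouns)
best = categorize(vector, categories)
```

With equal weights, a noun reported by several sources simply counts more. A real deployment would presumably tune the weights per source, but the abstract gives no guidance on that.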
20 Claims
1. A computer-implemented method for classifying an electronic document, the method comprising:
analyzing author-generated classification information regarding the document with a document classification server and assigning a set of first taxonomic nouns to characterize the document based upon the author-generated classification information;
examining a user-generated tag from a client computer characterizing a portion of the document and assigning a set of second taxonomic nouns to characterize the document based upon the user-generated tag characterization;
identifying a search term that resulted in the user accessing the document from a content provider site server and assigning a set of third taxonomic nouns to characterize the document based upon the search term result;
evaluating attributes related to the manner in which a user accesses the document from the content provider site server and assigning a set of fourth taxonomic nouns to characterize the document based upon the attributes related to the manner in which the document was accessed;
processing the document with a content analysis server to extract a set of fifth taxonomic nouns to characterize the document based upon a predetermined pattern rule;
aggregating the taxonomic nouns with a personalization engine server to determine term vectors that represent the document; and
categorizing the document based upon at least one of the term vectors, the taxonomic nouns, and the author-generated classification scheme with a targeting server.
Dependent claims: 2-17.
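The step of extracting the fifth taxonomic nouns "based upon a predetermined pattern rule" can be sketched as follows. The claim does not say what form a pattern rule takes. This sketch assumes each rule is a regular expression paired with the taxonomic noun it yields. The rules in `PATTERN_RULES` and the function `extract_taxonomic_nouns` are hypothetical.

```python
import re

# Hypothetical pattern rules: each pairs a regular expression with the
# taxonomic noun assigned when the expression matches the document text.
PATTERN_RULES = [
    (re.compile(r"\b\d+(\.\d+)?\s*megapixels?\b", re.IGNORECASE), "camera"),
    (re.compile(r"\b(laptop|notebook)s?\b", re.IGNORECASE), "computer"),
    (re.compile(r"\bbattery life\b", re.IGNORECASE), "battery"),
]


def extract_taxonomic_nouns(text, rules=PATTERN_RULES):
    """Return the nouns whose rule matches the text, in rule order, without duplicates."""
    nouns = []
    for pattern, noun in rules:
        if pattern.search(text) and noun not in nouns:
            nouns.append(noun)
    return nouns


# Illustrative input only.
fifth_nouns = extract_taxonomic_nouns("This 12 megapixel model has great battery life.")
```

Keeping the rules in a plain list means they are fixed before any document is processed, which is one reading of "predetermined".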
18. A computer-implemented method for classifying an electronic document, the method comprising:
determining term vectors that represent the document based upon taxonomic nouns, the taxonomic nouns being prepared by one or more of the following steps;
analyzing author-generated classification information regarding the document with a document classification server and assigning a set of first taxonomic nouns to characterize the document based upon the author-generated classification information;
examining a user-generated tag from a client computer characterizing a portion of the document and assigning a set of second taxonomic nouns to characterize the document based upon the user-generated tag characterization;
identifying a search term that resulted in the user accessing the document from a content provider site server and assigning a set of third taxonomic nouns to characterize the document based upon the search term result;
evaluating attributes related to the manner in which a user accesses the document from the content provider site server and assigning a set of fourth taxonomic nouns to characterize the document based upon the attributes related to the manner in which the document was accessed; and
processing the document with a content analysis server to extract a set of fifth taxonomic nouns to characterize the document based upon a predetermined pattern rule; and
categorizing the document based upon at least one of the term vectors, the taxonomic nouns, and the author-generated classification information.
19. A system for classifying an electronic document, the system comprising:
a document classification server configured for analyzing author-generated classification information regarding the document and assigning a set of first taxonomic nouns to characterize the document based upon the author-generated classification information;
a content analysis server configured for examining a user-generated tag characterizing a portion of the document and assigning a set of second taxonomic nouns to characterize the document based upon the user-generated tag characterization;
a profiling server configured for identifying a search term that resulted in the user accessing the document and assigning a set of third taxonomic nouns to characterize the document based upon the search term result and further configured for evaluating attributes related to the manner in which a user accesses the document and assigning a set of fourth taxonomic nouns to characterize the document based upon the attributes related to the manner in which the document was accessed;
wherein the content analysis server is further configured for processing the document to extract a set of fifth taxonomic nouns to characterize the document based upon a predetermined pattern rule; and
a personalization server configured for aggregating the taxonomic nouns to determine term vectors that represent the document and for categorizing the document based upon at least one of the term vectors, the taxonomic nouns, and the author-generated classification scheme.
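The division of work among the four servers recited in claim 19 can be sketched as plain classes, one per server. This is only an illustration of which component produces which set of taxonomic nouns. All class and method names are hypothetical. The noun mappings are simple lowercasing, a keyword list stands in for the pattern rule, and categorization by the most frequent noun is an assumed simplification. None of this is the claimed implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    author_categories: list = field(default_factory=list)
    user_tags: list = field(default_factory=list)
    search_terms: list = field(default_factory=list)
    access_attributes: dict = field(default_factory=dict)


class DocumentClassificationServer:
    def first_nouns(self, doc):
        # First taxonomic nouns: from author-generated classification information.
        return [c.lower() for c in doc.author_categories]


class ContentAnalysisServer:
    def __init__(self, keywords):
        # Hypothetical stand-in for the predetermined pattern rule.
        self.keywords = keywords

    def second_nouns(self, doc):
        # Second taxonomic nouns: from user-generated tags.
        return [t.lower() for t in doc.user_tags]

    def fifth_nouns(self, doc):
        # Fifth taxonomic nouns: keywords found among the document's words.
        words = doc.text.lower().split()
        return [k for k in self.keywords if k in words]


class ProfilingServer:
    def third_nouns(self, doc):
        # Third taxonomic nouns: from search terms that led to the document.
        return [s.lower() for s in doc.search_terms]

    def fourth_nouns(self, doc):
        # Fourth taxonomic nouns: from attributes of how the document was accessed.
        return [f"{k}:{v}".lower() for k, v in sorted(doc.access_attributes.items())]


class PersonalizationServer:
    def term_vector(self, noun_sets):
        vector = Counter()
        for nouns in noun_sets:
            vector.update(nouns)
        return vector

    def categorize(self, vector):
        # Assumed simplification: the most frequent noun names the category.
        return vector.most_common(1)[0][0] if vector else None


def classify(doc, keywords):
    classification = DocumentClassificationServer()
    content = ContentAnalysisServer(keywords)
    profiling = ProfilingServer()
    personalization = PersonalizationServer()
    noun_sets = [
        classification.first_nouns(doc),
        content.second_nouns(doc),
        profiling.third_nouns(doc),
        profiling.fourth_nouns(doc),
        content.fifth_nouns(doc),
    ]
    vector = personalization.term_vector(noun_sets)
    return vector, personalization.categorize(vector)


# Illustrative data only.
doc = Document(
    text="A compact camera with a fast lens",
    author_categories=["Camera"],
    user_tags=["camera", "travel"],
    search_terms=["lens"],
    access_attributes={"device": "mobile"},
)
vector, category = classify(doc, keywords=["camera", "lens"])
```

In this sketch the content analysis server handles both the user tags and the pattern-rule extraction, and the profiling server handles both the search terms and the access attributes, mirroring how claim 19 assigns those duties.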
20. A computer program product with instructions recorded thereon for classifying an electronic document, the computer program product comprising:
instructions for analyzing author-generated classification information regarding the document with a document classification server and assigning a set of first taxonomic nouns to characterize the document based upon the author-generated classification information;
instructions for examining a user-generated tag from a client computer characterizing a portion of the document and assigning a set of second taxonomic nouns to characterize the document based upon the user-generated tag characterization;
instructions for identifying a search term that resulted in the user accessing the document from a content provider site server and assigning a set of third taxonomic nouns to characterize the document based upon the search term result;
instructions for evaluating attributes related to the manner in which a user accesses the document from the content provider site server and assigning a set of fourth taxonomic nouns to characterize the document based upon the attributes related to the manner in which the document was accessed;
instructions for processing the document with a content analysis server to extract a set of fifth taxonomic nouns to characterize the document based upon a predetermined pattern rule;
instructions for aggregating the taxonomic nouns with a personalization engine server to determine term vectors that represent the document; and
instructions for categorizing the document based upon at least one of the term vectors, the taxonomic nouns, and the author-generated classification scheme with a targeting server.
Specification