Fast adaptive document filtering
First Claim
1. A system comprising:
- a first computer-readable storage media having stored thereon a reference dictionary file data structure, the reference dictionary file data structure including terms parsed from a plurality of documents stored in a document repository, and terms parsed from a new document received by the system but not stored in the document repository, the plurality of documents and the new document parsed without regard to any user profile, a user profile updated based at least in part on terms from the new document included in the reference dictionary file data structure and feedback from a user regarding relevance of a document received by the user before the system determines whether the new document is relevant to any user, the user profile specifying one or more areas of interest of the user;
the first computer-readable storage media having stored thereon a parsed term data structure, the parsed term data structure including one or more parsed terms and a term selection value associated with each of the one or more parsed terms, each of the parsed terms either:
present in the reference dictionary file data structure indicating at least one document indicated as relevant to the user contains the term, or present in an original user profile,
at least one of the parsed terms present in an original user profile and not present in the reference dictionary file data, the term selection value used for determining whether the associated term is to be included in the updated user profile; and
a second computer-readable storage media having stored thereon a document dictionary index data structure and the document repository, the document dictionary index data structure including only terms located in the plurality of documents stored in the document repository.
2 Assignments
0 Petitions
Abstract
Data structures, stored on various types of computer-readable media, include information related to user profiles and/or to various documents. The information in these data structures is arranged and stored in a manner that allows user profiles to be updated rapidly as new or changed documents are processed in a document filtering system.
59 Citations
25 Claims
1. A system comprising:
- a first computer-readable storage media having stored thereon a reference dictionary file data structure, the reference dictionary file data structure including terms parsed from a plurality of documents stored in a document repository, and terms parsed from a new document received by the system but not stored in the document repository, the plurality of documents and the new document parsed without regard to any user profile, a user profile updated based at least in part on terms from the new document included in the reference dictionary file data structure and feedback from a user regarding relevance of a document received by the user before the system determines whether the new document is relevant to any user, the user profile specifying one or more areas of interest of the user;
the first computer-readable storage media having stored thereon a parsed term data structure, the parsed term data structure including one or more parsed terms and a term selection value associated with each of the one or more parsed terms, each of the parsed terms either:
present in the reference dictionary file data structure indicating at least one document indicated as relevant to the user contains the term, or present in an original user profile,
at least one of the parsed terms present in an original user profile and not present in the reference dictionary file data, the term selection value used for determining whether the associated term is to be included in the updated user profile; and
a second computer-readable storage media having stored thereon a document dictionary index data structure and the document repository, the document dictionary index data structure including only terms located in the plurality of documents stored in the document repository.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
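As a rough illustration of the parsed term data structure recited in claim 1, the sketch below pairs each candidate term with a term selection value and records whether the term comes from the reference dictionary (a document indicated as relevant contains it) or from the original user profile. The claim does not state how the selection value is computed or how it is applied, so the `ParsedTerm` class, the `select_profile_terms` function, the example values, and the threshold rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedTerm:
    """One entry of a hypothetical parsed term data structure."""
    term: str
    selection_value: float          # how this is computed is not stated in the claim
    in_reference_dictionary: bool   # a relevant document contains the term
    in_original_profile: bool       # the term was in the original user profile


def select_profile_terms(parsed_terms, threshold):
    """Return the terms kept for the updated profile.

    Assumption: a term is kept when its selection value meets a threshold.
    The claim only says the value is "used for determining" inclusion.
    """
    selected = []
    for pt in parsed_terms:
        # Claim 1 requires every parsed term to come from one of the two sources.
        if not (pt.in_reference_dictionary or pt.in_original_profile):
            raise ValueError(f"term {pt.term!r} has no valid source")
        if pt.selection_value >= threshold:
            selected.append(pt.term)
    return sorted(selected)


parsed_terms = [
    ParsedTerm("filtering", 0.9, True, True),
    ParsedTerm("adaptive", 0.4, True, False),
    # Present only in the original profile, as claim 1 requires of at least one term.
    ParsedTerm("retrieval", 0.7, False, True),
]
updated_profile = select_profile_terms(parsed_terms, threshold=0.5)
print(updated_profile)
```

With the example values above, "filtering" and "retrieval" are kept and "adaptive" is dropped. A top-k cutoff would be an equally plausible reading of the claim language.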
21. A system comprising:
- a main memory having stored a reference dictionary file data structure, the reference dictionary file data structure including terms parsed from a plurality of documents stored in a document repository and terms parsed from a new document received by the system but not stored in the document repository, the terms parsed from the plurality of documents and the new document without regard to any user profile, the reference dictionary file data structure including a reference count value indicating a total number of documents that include a term, the total number of documents including documents stored in the document repository and the new document, the reference count value accessible by a user profile update operation for updating a user profile before the system determines whether the new document is relevant to any user, the terms parsed from the new document added to the reference dictionary file data structure as follows:
when a term is included in the document statistics file data structure but not in the reference dictionary file data structure, the term is added to the reference dictionary file data structure and the reference count value in the reference dictionary file data structure associated with the term is initialized to one;
when a term is included in both the document statistics file data structure and the reference dictionary file data structure, the reference count value in the reference dictionary file data structure associated with the term is incremented; and
a mass storage having stored the document repository and a document index that indexes terms from documents stored in the document repository.
Dependent claims: 22, 23, 24, 25
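The two update rules recited in claim 21 can be sketched as follows. This is a minimal sketch, assuming the reference dictionary is an in-memory mapping from term to reference count and the document statistics file is simply the list of terms parsed from one new document. The function name `update_reference_dictionary` and both representations are hypothetical and do not come from the patent.

```python
def update_reference_dictionary(reference_dictionary, document_statistics):
    """Fold one new document's terms into the reference dictionary in place.

    reference_dictionary: dict mapping term -> number of documents containing it.
    document_statistics: iterable of terms parsed from the new document
        (an assumed stand-in for the document statistics file).
    """
    # The count tracks documents, not occurrences, so each distinct term
    # in the new document contributes at most one to its count.
    for term in set(document_statistics):
        if term not in reference_dictionary:
            # Rule 1: term is new to the dictionary, so initialize its count to one.
            reference_dictionary[term] = 1
        else:
            # Rule 2: term already present, so increment its count.
            reference_dictionary[term] += 1
    return reference_dictionary


reference_dictionary = {"filter": 3, "profile": 1}
update_reference_dictionary(reference_dictionary, ["filter", "adaptive", "filter"])
print(reference_dictionary)
```

In this example "filter" goes from 3 to 4 even though it appears twice in the new document, "adaptive" enters with a count of one, and "profile" is unchanged. Keeping the structure in main memory, as the claim recites, is what would let a profile update read these counts without touching the document index on mass storage.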
Specification