Method and apparatus for document filtering using ensemble filters
Abstract
A technique for representing an information need and employing one or more filters to select documents that satisfy the represented information need, including a technique of creating filters that involves (a) dividing a set of documents into one or more subsets such that each subset can be used as the source of features for creating a filtering profile or used to set or validate the score threshold for the profile and (b) determining whether multiple profiles are required and how to combine them to create an effective filter. Multiple profiles can be incorporated into an individual filter and the individual filters combined to create an ensemble filter. Ensemble filters can then be further combined to create meta filters.
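The split described in part (a) of the abstract can be illustrated with a small sketch: one subset of the training documents supplies the profile's features, and a separate labeled subset is used to set the score threshold. Everything below is an assumption made for illustration and is not taken from the specification: the function names, the term-overlap score, the `min_df` feature cutoff, and the simple utility (true positives minus false positives) used to choose the threshold.

```python
from collections import Counter

def build_profile(feature_docs, min_df=2):
    """Keep terms that occur in at least min_df feature documents (assumed rule)."""
    counts = Counter()
    for doc in feature_docs:
        counts.update(set(doc.lower().split()))
    return {term for term, count in counts.items() if count >= min_df}

def score(profile, doc):
    """Number of profile terms present in the document (assumed scoring)."""
    return len(profile & set(doc.lower().split()))

def set_threshold(profile, labeled_docs):
    """Choose the score cutoff with the best utility on the held-out subset.

    Utility here is true positives minus false positives, an assumed measure.
    Ties go to the lowest cutoff because candidates are tried in ascending order.
    """
    scored = [(score(profile, text), relevant) for text, relevant in labeled_docs]
    best_cutoff, best_utility = None, None
    for cutoff in sorted({s for s, _ in scored}):
        true_pos = sum(1 for s, rel in scored if s >= cutoff and rel)
        false_pos = sum(1 for s, rel in scored if s >= cutoff and not rel)
        utility = true_pos - false_pos
        if best_utility is None or utility > best_utility:
            best_cutoff, best_utility = cutoff, utility
    return best_cutoff

# Subset 1: source of features for the profile.
feature_docs = [
    "stock market prices rise",
    "stock market prices fall",
    "market prices steady",
]
# Subset 2: labeled documents used only to set the threshold.
threshold_docs = [
    ("stock market prices surge", True),
    ("market prices dip", True),
    ("fish market opens", False),
    ("football match goal", False),
]

profile = build_profile(feature_docs)
threshold = set_threshold(profile, threshold_docs)
```

Keeping the threshold-setting documents separate from the feature documents means the cutoff is chosen on text the profile was not built from, which is the role the abstract assigns to the second kind of subset.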
27 Claims
1. A process for creating an ensemble filter for selecting documents, comprising:
identifying a set of documents for training;
identifying a first coherent set of documents from said training set of documents;
identifying a first profile corresponding to said first coherent set of documents;
identifying a second coherent set of documents and a remainder set of documents from said training set of documents using said first profile;
identifying at least one coherent set of documents from said remainder set of documents;
identifying at least one remainder profile corresponding to each of said identified coherent sets of documents from said remainder set of documents;
creating a first sub-filter using said first profile;
creating at least one remainder sub-filter using at least one of said remainder profiles; and
combining said first sub-filter with at least one remainder sub-filter to create an ensemble filter.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
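The steps recited in claim 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the bag-of-words vectors, the cosine similarity test, the seed-based notion of a "coherent set", the 0.3 threshold, and all function names are hypothetical choices made for the example.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (an assumed, simplistic representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def coherent_set(docs, threshold):
    """Assumed heuristic: the first document plus every document similar to it."""
    seed = vectorize(docs[0])
    return [docs[0]] + [d for d in docs[1:] if cosine(seed, vectorize(d)) >= threshold]

def make_profile(docs):
    """Profile = summed term counts of a coherent set (assumed form)."""
    profile = Counter()
    for doc in docs:
        profile.update(vectorize(doc))
    return profile

def build_profiles(training_docs, threshold=0.3):
    """Derive one profile per coherent set, working through the remainder."""
    profiles = []
    remainder = list(training_docs)
    while remainder:
        profile = make_profile(coherent_set(remainder, threshold))
        profiles.append(profile)
        # Documents the profile accepts form the next coherent set; the rest
        # become the new remainder. The seed is dropped so the loop always ends.
        remainder = [d for d in remainder[1:]
                     if cosine(profile, vectorize(d)) < threshold]
    return profiles

def make_sub_filter(profile, threshold=0.3):
    """Sub-filter: accept a document whose similarity to the profile is high enough."""
    return lambda doc: cosine(profile, vectorize(doc)) >= threshold

def make_ensemble(sub_filters):
    """Ensemble: accept a document if any sub-filter accepts it (assumed OR rule)."""
    return lambda doc: any(accepts(doc) for accepts in sub_filters)

training_docs = [
    "stock market prices rise",
    "stock market prices fall",
    "football match score goal",
    "football match final goal",
]
profiles = build_profiles(training_docs)
ensemble = make_ensemble([make_sub_filter(p) for p in profiles])
```

With this toy training set the loop yields two profiles, one per topic, and the ensemble accepts a document when either sub-filter does. Combining by logical OR is only one possible combination rule; the claim itself does not fix how the sub-filters are combined.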
10. A process for selecting documents from a stream of documents, comprising:
identifying a set of documents for training;
identifying a first coherent set of documents from said training set of documents;
identifying a first profile corresponding to said first coherent set of documents;
identifying a second coherent set of documents and a remainder set of documents from said training set of documents using said first profile;
identifying at least one coherent set of documents from said remainder set of documents;
identifying at least one remainder profile corresponding to each of said identified coherent sets of documents from said remainder set of documents;
creating a first sub-filter using said first profile;
creating at least one remainder sub-filter using at least one of said remainder profiles;
combining said first sub-filter with at least one remainder sub-filter to create an ensemble filter; and
passing said stream of documents through said ensemble filter.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
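The final step of claim 10, passing a stream through the ensemble, might look like the generator below. It is a sketch only: the keyword-overlap sub-filters stand in for sub-filters built from trained profiles, and `keyword_sub_filter`, `filter_stream`, and the `min_hits` parameter are hypothetical names introduced for the example.

```python
from typing import Callable, Iterable, Iterator, List

SubFilter = Callable[[str], bool]

def keyword_sub_filter(keywords: Iterable[str], min_hits: int) -> SubFilter:
    """Stand-in sub-filter: accept when enough profile keywords appear."""
    terms = {k.lower() for k in keywords}

    def accepts(doc: str) -> bool:
        return len(terms & set(doc.lower().split())) >= min_hits

    return accepts

def filter_stream(stream: Iterable[str], sub_filters: List[SubFilter]) -> Iterator[str]:
    """Yield each incoming document that at least one sub-filter accepts."""
    for doc in stream:
        if any(accepts(doc) for accepts in sub_filters):
            yield doc

sub_filters = [
    keyword_sub_filter(["stock", "market", "prices"], min_hits=2),
    keyword_sub_filter(["football", "match", "goal"], min_hits=2),
]
incoming = iter([
    "stock prices climb",
    "weather stays mild",
    "late goal wins football match",
    "market gossip",
])
selected = list(filter_stream(incoming, sub_filters))
```

Because `filter_stream` is a generator, documents are examined one at a time as they arrive, so the stream never has to be held in memory as a whole.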
19. A process for selecting documents from a database of documents, comprising:
identifying a set of documents for training;
identifying a first coherent set of documents from said training set of documents;
identifying a first profile corresponding to said first coherent set of documents;
identifying a second coherent set of documents and a remainder set of documents from said training set of documents using said first profile;
identifying at least one coherent set of documents from said remainder set of documents;
identifying at least one remainder profile corresponding to each of said identified coherent sets of documents from said remainder set of documents;
creating a first sub-filter using said first profile;
creating at least one remainder sub-filter using at least one of said remainder profiles;
combining said first sub-filter with at least one remainder sub-filter to create an ensemble filter; and
applying said ensemble filter to said database to select documents.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
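For the database case in claim 19, the ensemble can be applied in a batch pass over stored documents. The sketch below uses an in-memory SQLite database purely for illustration. The `docs` table, its columns, the term-overlap score, the 0.5 threshold, and the ranking by best sub-filter score are all assumptions that do not come from the claims.

```python
import sqlite3

def overlap_score(profile_terms, text):
    """Fraction of a profile's terms found in the text (assumed scoring)."""
    words = set(text.lower().split())
    return len(profile_terms & words) / len(profile_terms)

def select_from_database(conn, profiles, threshold):
    """Return ids of stored documents accepted by any profile, best score first."""
    selected = []
    for doc_id, body in conn.execute("SELECT id, body FROM docs ORDER BY id"):
        # The ensemble score is the best score over all sub-filters (assumed rule).
        best = max(overlap_score(p, body) for p in profiles)
        if best >= threshold:
            selected.append((doc_id, best))
    selected.sort(key=lambda pair: -pair[1])  # stable sort keeps id order on ties
    return [doc_id for doc_id, _ in selected]

# Hypothetical document store and profiles for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (id, body) VALUES (?, ?)",
    [
        (1, "stock market prices rise"),
        (2, "recipe for chocolate cake"),
        (3, "football goal replay"),
        (4, "market update"),
    ],
)
profiles = [
    frozenset({"stock", "market", "prices"}),
    frozenset({"football", "match", "goal"}),
]
selected_ids = select_from_database(conn, profiles, threshold=0.5)
```

Unlike the streaming case, the whole collection is available up front, so the selected documents can be ranked before they are returned.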
Specification