Retrofitting recommender system for achieving predetermined performance requirements
Abstract
Retrofitting recommender systems, so that they can scale to large data, is disclosed. The principal notion is to reduce data requirements of existing recommender engines by performing a type of data reduction that minimizes the loss of information given the engine. The reductions covered in this invention are designed to be easily implemented on a database system, and are intended to have minimal impact on an existing implementation of a recommender system. In one embodiment, a method repeats reducing the data by a number of records, until an accuracy threshold or a performance requirement is met. If the accuracy threshold is met first, the method repeats removing a highest-frequency dimension from the data, until the performance requirement is also met. The reduced data is provided to the recommender system, which generates predictions based on the reduced data, and a query. Any dimension previously removed from the data is subsequently added back to the predictions, if the dimension is not already part of the query. In other embodiments, clustering of the data and/or the query is also performed as an alternative mechanism for reduction. An associated method for post-processing recommendations to adjust for clustering is also presented.
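The two-phase reduction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: records are modeled as sets of dimension names, `accuracy` and `meets_performance` are hypothetical caller-supplied callbacks, records are dropped from the end of the list rather than sampled, and "meeting the accuracy threshold" is read as "one more record reduction would push accuracy below the floor".

```python
from collections import Counter
from typing import Callable, FrozenSet, List, Tuple

Record = FrozenSet[str]  # assumption: a record is the set of dimensions it contains


def reduce_data(
    records: List[Record],
    accuracy: Callable[[List[Record]], float],         # hypothetical callback
    meets_performance: Callable[[List[Record]], bool],  # hypothetical callback
    accuracy_floor: float,
    step: int = 1,
) -> Tuple[List[Record], List[str]]:
    """Return the reduced data and the dimensions removed along the way."""
    data = list(records)
    removed_dims: List[str] = []

    # Phase 1: drop `step` records at a time until performance is met
    # or another drop would push accuracy below the floor.
    while len(data) > step and not meets_performance(data):
        candidate = data[:-step]
        if accuracy(candidate) < accuracy_floor:
            break  # accuracy threshold reached first
        data = candidate

    # Phase 2: remove the most frequent dimension until performance is met.
    while not meets_performance(data):
        counts = Counter(dim for rec in data for dim in rec)
        if not counts:
            break  # no dimensions left to remove
        top = min(counts, key=lambda d: (-counts[d], d))  # ties broken by name
        data = [rec - {top} for rec in data]
        removed_dims.append(top)

    return data, removed_dims


# Toy run: accuracy is the fraction of records kept, and the performance
# requirement is "at most 2 record/dimension entries in total".
toy = [frozenset("ab"), frozenset("ac"), frozenset("a"), frozenset("b")]
reduced, removed = reduce_data(
    toy,
    accuracy=lambda d: len(d) / len(toy),
    meets_performance=lambda d: sum(len(r) for r in d) <= 2,
    accuracy_floor=0.75,
)
```

In the toy run, one record is dropped before the accuracy floor blocks further record reduction, after which the most frequent dimension is stripped to satisfy the performance requirement. The list of removed dimensions is returned so that it can be added back to the predictions later.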
20 Claims
1. A computer-implemented method for retrofitting a recommender system to accommodate voluminous data organized into records having dimensions comprising:
pre-processing the data by reducing the number of records to provide pre-processed data such that both a predetermined accuracy threshold and a predetermined performance requirement are met;
wherein reducing the data by the number of records is stopped specifically as a result of meeting a predetermined accuracy threshold, and pre-processing further comprises repeatedly removing a highest-frequency dimension from the data until the predetermined performance requirement is met; and
providing the pre-processed data to the recommender system to generate predictions based on a query and the pre-processed data as reduced in pre-processing.
(Dependent claims: 2, 3, 4)
5. A computer-implemented method for retrofitting a recommender system to accommodate voluminous data that is organized into records and dimensions comprising:
a) pre-processing the data by
i) adding dimensions to the data corresponding to a plurality of clusters over the data, where each record resides in one of the clusters;
ii) reducing the data and evaluating both a predetermined accuracy threshold and a predetermined performance requirement;
iii) continuing to reduce the data by a number of records until the predetermined accuracy threshold is met; and
iv) repeatedly removing a highest-frequency non-cluster-corresponding dimension from the data until the predetermined performance requirement is met; and
b) providing the data as reduced in pre-processing to the recommender system to generate predictions based on a query and the data as reduced in pre-processing.
(Dependent claims: 6)
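As a rough illustration of steps i) and iv) of claim 5, the sketch below appends one cluster dimension to each record and then finds the most frequent dimension that is not a cluster dimension. The `cluster:` prefix, the `assign_cluster` callback, and the set-of-dimensions record model are assumptions made for this example; the claim does not specify a clustering method or a naming scheme.

```python
from collections import Counter
from typing import Callable, FrozenSet, List, Optional

Record = FrozenSet[str]       # assumption: a record is the set of its dimensions
CLUSTER_PREFIX = "cluster:"   # hypothetical naming convention for cluster dimensions


def add_cluster_dimensions(
    records: List[Record],
    assign_cluster: Callable[[Record], int],  # hypothetical clustering callback
) -> List[Record]:
    """Add exactly one cluster dimension to every record."""
    return [rec | {f"{CLUSTER_PREFIX}{assign_cluster(rec)}"} for rec in records]


def top_non_cluster_dimension(records: List[Record]) -> Optional[str]:
    """Most frequent dimension, ignoring the added cluster dimensions."""
    counts = Counter(
        dim
        for rec in records
        for dim in rec
        if not dim.startswith(CLUSTER_PREFIX)
    )
    if not counts:
        return None
    return min(counts, key=lambda d: (-counts[d], d))  # ties broken by name


# Toy clustering rule: records containing "a" go to cluster 0, others to cluster 1.
records = [frozenset("ab"), frozenset("a"), frozenset("c")]
clustered = add_cluster_dimensions(records, lambda rec: 0 if "a" in rec else 1)
candidate = top_non_cluster_dimension(clustered)
```

Skipping the cluster dimensions during removal keeps a coarse summary of each record in the data even after its most frequent individual dimensions have been stripped.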
7. A computer-implemented method for retrofitting a recommender system to accommodate voluminous data that is organized into records and dimensions comprising:
a) pre-processing the data by
i) adding dimensions to the data and a query to be processed, said dimensions corresponding to a plurality of clusters over the data, where each record and said query resides in one of the clusters;
ii) reducing the data and evaluating both a predetermined accuracy threshold and a predetermined performance requirement;
iii) repeatedly reducing the data by a number of records until one of the predetermined accuracy threshold and the predetermined performance requirement is first met; and
iv) adding as a dimension to the query a cluster to which the query belongs; and
b) providing the data as reduced in pre-processing to the recommender system to generate predictions based on a query and the data as reduced in pre-processing.
(Dependent claims: 8)
9. A computer-implemented method for retrofitting a recommender system to accommodate voluminous data organized into records having dimensions comprising:
pre-processing the data by reducing the number of records to provide pre-processed data such that both a predetermined accuracy threshold and a predetermined performance requirement are met;
wherein pre-processing the data comprises repeatedly reducing the data by a number of records until one of the predetermined accuracy threshold and the predetermined performance requirement is first met;
wherein pre-processing the data further comprises adding dimensions to the data corresponding to a plurality of clusters over the data, where each record resides in one of the clusters;
providing the pre-processed data to the recommender system to generate predictions based on a query and the pre-processed data as reduced in pre-processing;
and post-processing the predictions by expanding each dimension of the predictions corresponding to a cluster to a plurality of dimensions distributed in the cluster, and removing each dimension from the predictions that is also found in the query.
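The post-processing step of claim 9 might look roughly like the sketch below. The `cluster_members` mapping (from a cluster dimension to the dimensions found in that cluster) and the `cluster:` naming are hypothetical; how the expansion set for a cluster is chosen is not specified here.

```python
from typing import Dict, List, Set


def post_process(
    predictions: List[str],
    cluster_members: Dict[str, List[str]],  # hypothetical: cluster dim -> its dimensions
    query: Set[str],
) -> List[str]:
    """Expand cluster dimensions and drop anything already in the query."""
    result: List[str] = []
    seen: Set[str] = set()
    for dim in predictions:
        # A cluster dimension expands to its members; any other dimension stays as-is.
        for expanded in cluster_members.get(dim, [dim]):
            if expanded in query or expanded in seen:
                continue  # already known to the user, or already recommended
            seen.add(expanded)
            result.append(expanded)
    return result


final = post_process(
    predictions=["x", "cluster:0"],
    cluster_members={"cluster:0": ["a", "b", "x"]},
    query={"a"},
)
```

Here the cluster prediction expands to `a`, `b`, and `x`, of which `a` is dropped because it is in the query and `x` is dropped as a duplicate.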
10. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method for retrofitting a recommender system comprising:
repeatedly reducing data organized into records and dimensions by a number of records until one of a predetermined accuracy threshold and a predetermined performance requirement is first met;
upon first meeting the predetermined accuracy threshold, repeating removing a highest-frequency dimension from the data until the predetermined performance requirement is also met;
providing the data wherein at least one of records and dimensions have been removed therefrom to the recommender system to generate predictions also based on a query; and
adding any dimension previously removed from the data to also meet the predetermined performance requirement to the predictions upon determining that the dimension is mutually exclusive with dimensions of the query.
(Dependent claims: 11, 12, 13, 14, 15)
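The final step of claim 10 can be sketched as below. Following the abstract, "mutually exclusive with dimensions of the query" is read as "not already part of the query". Appending restored dimensions after the engine's own predictions is an assumption of this example; the claim does not fix an ordering.

```python
from typing import List, Set


def restore_removed_dimensions(
    predictions: List[str],
    removed_dims: List[str],
    query: Set[str],
) -> List[str]:
    """Add back dimensions removed in pre-processing, unless the query has them."""
    result = list(predictions)
    for dim in removed_dims:
        if dim not in query and dim not in result:
            result.append(dim)  # assumption: restored dimensions go last
    return result


restored = restore_removed_dimensions(
    predictions=["b"],
    removed_dims=["a", "c"],
    query={"c"},
)
```

In this run `a` is restored because it was removed during pre-processing and is absent from the query, while `c` is skipped because the query already contains it.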
16. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method for retrofitting a recommender system comprising:
repeatedly reducing data organized into records and dimensions by a number of records until one of a predetermined accuracy threshold and a predetermined performance requirement is first met;
adding dimensions to the data corresponding to a plurality of clusters over the data, where each record resides in one of the clusters;
upon first meeting the predetermined accuracy threshold, repeatedly removing a highest-frequency non-cluster-corresponding dimension from the data until the predetermined performance requirement is also met;
providing the data as at least one of records and dimensions have been removed to the recommender system to generate predictions also based on a query;
adding any dimension previously removed from the data to also meet the predetermined performance requirement to the predictions upon determining that the dimension is mutually exclusive with dimensions of the query; and
expanding any dimension of the predictions corresponding to a cluster to a plurality of dimensions distributed in the cluster, and removing any dimension from the predictions that is also found in the query.
(Dependent claims: 17, 18)
19. A machine-readable medium having instructions stored thereon for execution by a processor to perform a method for retrofitting a recommender system comprising:
a) pre-processing the data by
i) adding dimensions to the data corresponding to a plurality of clusters over the data, where each record resides in one of the clusters;
ii) reducing the data and evaluating both a predetermined accuracy threshold and a predetermined performance requirement;
iii) continuing to reduce the data by a number of records until the predetermined accuracy threshold is met; and
iv) repeatedly removing a highest-frequency non-cluster-corresponding dimension from the data until the predetermined performance requirement is met; and
b) providing the data as reduced in pre-processing to the recommender system to generate predictions based on a query and the data as reduced in pre-processing.
(Dependent claims: 20)
Specification