Generating item clusters based on aggregated search history data
Abstract
The present disclosure provides computer-implemented systems and processes for clustering items and improving the utility of item recommendations. One process involves applying a clustering algorithm to users' search session queries over periods of time to generate query clusters comprising correlated query terms. Correlations may be based on, among other things, the frequency with which query term pairs appear together in a single search session. The generated query clusters may be used to generate item descriptor clusters indicative of items and/or types of items that may be complementary. Other criteria may be applied to the query and item clusters to generate variant clusters. For example, information such as related brands, market segments, and other data may be applied to item descriptor clusters to generate item clusters that include complementary items associated with or targeted for particular demographics. Item descriptor clusters and item clusters can be used to improve item recommendations.
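The co-occurrence idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `count_query_pairs`, the in-memory list-of-lists session format, and the choice to count each pair at most once per session are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def count_query_pairs(sessions):
    """Count, per unordered pair of query terms, the sessions in which both appear."""
    pair_counts = Counter()
    for queries in sessions:
        # Normalize, deduplicate, and sort so that ("tent", "stove") and
        # ("stove", "tent") map to the same key (an illustrative assumption).
        unique_terms = sorted({q.strip().lower() for q in queries if q.strip()})
        pair_counts.update(combinations(unique_terms, 2))
    return pair_counts


# Hypothetical search sessions, one list of query terms per session.
sessions = [
    ["tent", "sleeping bag", "camp stove"],
    ["Tent", "sleeping bag"],
    ["laptop", "laptop bag"],
]
pair_counts = count_query_pairs(sessions)
print(pair_counts.most_common(1))
```

Pairs that recur across many sessions, such as the tent and sleeping bag pair here, are the candidates a clustering step would treat as correlated.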
21 Claims
1. A computer-implemented method comprising:
under control of a computing device configured with specific computer-executable instructions:
accessing, from an electronic data store, search history data for a plurality of users of an electronic catalog, said search history data comprising session query logs, each session query log including one or more query terms associated with a search request submitted by a respective user;
analyzing the session query logs to generate one or more search session segments, each search session segment corresponding to a distinct search session and comprising one or more search requests;
for each respective search session segment, generating one or more query pairs, each query pair comprising at least two query terms that are associated with the one or more search requests comprised in the respective search session segment;
determining, for each respective query pair across all of the one or more search session segments, a query pair frequency indicative of how many times the respective query pair appears across all of the one or more search session segments;
clustering the query pairs, based at least in part on the determined query pair frequencies, into a query cluster, said query cluster comprising two or more query terms determined to be highly correlated based at least in part on the clustering;
accessing correlated attribute data, said correlated attribute data including two or more correlated attributes that tend to be correlated to each other;
generating an item descriptor cluster comprising combinations of the two or more query terms and the two or more correlated attributes;
generating, by executing a search request on the electronic catalog using the item descriptor cluster, an item cluster comprising items having at least one of the two or more correlated attributes; and
providing an item recommendation for at least one item contained in the item cluster.
Dependent claims: 2, 3, 4, 5
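The claim does not say how session query logs are divided into search session segments. One possible reading, sketched below, splits each user's timestamped queries wherever the gap between consecutive requests exceeds a threshold. The record shape `(user_id, timestamp, query)`, the function name `segment_sessions`, and the 30-minute gap are hypothetical choices, not taken from the claim.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Hypothetical inactivity threshold; the claim does not specify one.
SESSION_GAP = timedelta(minutes=30)


def segment_sessions(log_records, gap=SESSION_GAP):
    """Split (user_id, timestamp, query) records into per-user session segments."""
    segments = []
    # Sort by user, then by time, so each user's requests are contiguous and ordered.
    ordered = sorted(log_records, key=itemgetter(0, 1))
    for _, records in groupby(ordered, key=itemgetter(0)):
        current, last_seen = [], None
        for _, timestamp, query in records:
            # Start a new segment when the user was inactive for longer than the gap.
            if last_seen is not None and timestamp - last_seen > gap:
                segments.append(current)
                current = []
            current.append(query)
            last_seen = timestamp
        if current:
            segments.append(current)
    return segments


# Made-up log records for illustration only.
log_records = [
    ("u1", datetime(2024, 1, 1, 9, 0), "tent"),
    ("u1", datetime(2024, 1, 1, 9, 5), "sleeping bag"),
    ("u1", datetime(2024, 1, 1, 14, 0), "laptop"),
    ("u2", datetime(2024, 1, 1, 9, 2), "camp stove"),
]
segments = segment_sessions(log_records)
print(segments)
```

Each resulting segment is a list of query terms from one presumed search session, which is the input shape the query pair generation step of the claim operates on.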
6. A system comprising:
an electronic data store configured to at least store search history data associated with search requests submitted to an electronic catalog; and
a computing system comprising one or more hardware computing devices, said computing system in communication with the electronic data store and configured to at least:
analyze a plurality of submitted query terms associated with search session query logs to generate a plurality of query term pairs, each query term pair comprising at least two query terms that are associated with respective search requests associated with a single search session;
determine, for each respective query pair, a query pair frequency indicative of how often the query pair appears in the search session query logs;
cluster, based at least in part on the determined query pair frequencies, the query pairs into a query cluster, said query cluster comprising two or more query terms determined to be highly correlated based at least in part on said clustering;
access related item attribute data, said related item attribute data including two or more correlated item descriptors for related items, wherein the two or more correlated item descriptors tend to be correlated to each other;
generate an item descriptor cluster comprising combinations of the two or more query terms comprised in the query cluster and the two or more correlated item descriptors;
generate, by executing a search request on the electronic catalog using the item descriptor cluster, an item cluster comprising two or more items, each item of the two or more items having an attribute corresponding to at least one of the correlated item descriptors; and
providing an item recommendation comprising at least one of the two or more items.
Dependent claims: 7, 8, 9, 10
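The claim describes an item descriptor cluster as "combinations" of the query terms and the correlated item descriptors without fixing how they are combined. A minimal sketch, assuming the simplest reading (a cross product that prefixes every query term with every correlated descriptor), follows. The function name and the sample descriptors are hypothetical.

```python
from itertools import product


def build_item_descriptor_cluster(query_cluster, correlated_descriptors):
    """Combine each correlated descriptor with each query term in the cluster."""
    # Cross product is an assumed interpretation of "combinations" in the claim.
    return [
        f"{descriptor} {term}"
        for descriptor, term in product(correlated_descriptors, query_cluster)
    ]


# Hypothetical inputs: a query cluster and two descriptors assumed to be correlated.
query_cluster = ["tent", "sleeping bag"]
correlated_descriptors = ["ultralight", "four-season"]
descriptor_cluster = build_item_descriptor_cluster(query_cluster, correlated_descriptors)
print(descriptor_cluster)
```

Each resulting string could then serve as a search request against the catalog, with the matching items forming an item cluster.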
11. A computer-implemented method comprising:
generating, based at least in part on a co-occurrence analysis of search histories associated with an electronic catalog, a cluster of item descriptors describing types of items in the electronic catalog that tend to be searched for in the electronic catalog in combination, the cluster of item descriptors including at least a first item descriptor representing a first item type and a second item descriptor representing a second item type;
generating, based at least in part on the cluster of item descriptors, a plurality of item clusters, each item cluster including a plurality of items in the electronic catalog and including at least a first item of the first item type and a second item of the second item type; and
providing a catalog item recommendation based at least in part on the plurality of item clusters,
said method performed by an electronic catalog system comprising one or more computing devices.
Dependent claims: 12, 13, 14, 15
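The second step of this claim, turning a cluster of item descriptors into several item clusters that each contain one item per item type, can be sketched as follows. The in-memory catalog, its `id` and `type` fields, the function name `build_item_clusters`, and the `limit` parameter are illustrative assumptions; the claim does not describe a catalog schema or a selection strategy.

```python
from itertools import product


def build_item_clusters(catalog, descriptor_cluster, limit=3):
    """Form item clusters holding one catalog item for each item type in the cluster."""
    # For each item type named by a descriptor, collect the matching catalog item ids.
    items_by_type = [
        [item["id"] for item in catalog if item["type"] == item_type]
        for item_type in descriptor_cluster
    ]
    # Every combination that picks one item per type is a candidate item cluster.
    clusters = [list(combo) for combo in product(*items_by_type)]
    return clusters[:limit]


# Hypothetical catalog for illustration only.
catalog = [
    {"id": "T1", "type": "tent"},
    {"id": "T2", "type": "tent"},
    {"id": "S1", "type": "sleeping bag"},
    {"id": "L1", "type": "laptop"},
]
item_clusters = build_item_clusters(catalog, ["tent", "sleeping bag"])
print(item_clusters)
```

If any item type has no matching catalog items, the product is empty and no cluster is produced, which keeps every returned cluster consistent with the requirement of one item per type.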
16. Non-transitory physical computer storage comprising computer-executable instructions stored thereon that, when executed by a hardware processor, are configured to perform operations comprising:
accessing, from an electronic data store that stores search session query log data, query pair frequency data indicative of how often a plurality of query pairs appear in search session logs associated with an electronic catalog;
generating a query cluster based at least in part on the accessed query pair frequency data by clustering the query pairs, said query cluster comprising at least two query terms determined to be highly correlated based at least in part on said clustering;
generating an item descriptor cluster comprising two or more query terms associated with the query cluster;
executing a search request on the electronic catalog using the item descriptor cluster to generate an item cluster comprising at least one item having an attribute corresponding to at least one of the two or more query terms; and
providing an item recommendation comprising the at least one item.
Dependent claims: 17, 18, 19, 20, 21
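The claim does not name a clustering algorithm. As a stand-in, the sketch below keeps query pairs whose frequency meets a threshold and merges them into connected components with a small union-find structure. The threshold value, the function name `cluster_query_terms`, and the sample frequencies are assumptions for illustration; a real system might use a different technique entirely.

```python
def cluster_query_terms(pair_frequencies, min_frequency=2):
    """Group query terms linked by sufficiently frequent pairs (connected components)."""
    parent = {}

    def find(term):
        # Return the representative of the term's group, registering new terms on first use.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving keeps lookups short
            term = parent[term]
        return term

    for (first, second), frequency in pair_frequencies.items():
        # Only pairs at or above the (hypothetical) threshold link their terms.
        if frequency >= min_frequency:
            parent[find(first)] = find(second)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Made-up query pair frequency data.
pair_frequencies = {
    ("sleeping bag", "tent"): 5,
    ("camp stove", "tent"): 3,
    ("laptop", "laptop bag"): 4,
    ("laptop", "tent"): 1,
}
query_clusters = cluster_query_terms(pair_frequencies)
print(query_clusters)
```

The low-frequency pair linking "laptop" and "tent" falls below the threshold, so the camping terms and the laptop terms stay in separate query clusters.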
Specification