Constructing a classifier for classifying queries
First Claim
1. A method of constructing a classifier, comprising:
- receiving a click graph correlating queries to items identified by the queries, wherein the click graph contains initial labeled queries that have been labeled with respect to predetermined query-intent classes that represent whether or not corresponding queries are associated with particular intents of users to search for particular information from particular categories of information that a user wishes to perform a search in, and unlabeled queries that have not been labeled with respect to the predetermined query-intent classes, wherein the click graph comprises query nodes representing respective queries that are assigned a positive query-intent class when a query node corresponds to a positive query-intent class, or a negative query intent class when the query node does not correspond to a query-intent class, uniform resource locator (URL) nodes representing respective URLs, and edges between at least some of the query nodes and URL nodes when a user has navigated to a URL node for a given query;
- using the click graph to label at least some of the unlabeled queries with respect to the predetermined query-intent classes by inferring the query-intent classes of the unlabeled nodes from class memberships of the labeled query nodes based on similarity of click patterns from query nodes to correlated URL nodes; and
- using queries in the click graph that have been labeled with respect to the predetermined query-intent classes as training data to train the classifier.
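The click graph recited in the claim can be pictured as a bipartite structure of query nodes and URL nodes. The following is a minimal sketch only, not the patented implementation: the function name `build_click_graph`, the sample queries, the `example` URLs, and the 1/0 encoding of positive and negative query-intent classes are all assumptions made for illustration.

```python
from collections import defaultdict

def build_click_graph(click_log):
    """Aggregate (query, url) click events into a bipartite click graph.

    Returns {query: {url: click_count}}. An edge exists only when a user
    navigated to that URL for that query (hypothetical representation).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for query, url in click_log:
        graph[query][url] += 1
    return {query: dict(urls) for query, urls in graph.items()}

# Hypothetical click log: each tuple is one click from a query to a URL.
click_log = [
    ("cheap flights", "travel.example/flights"),
    ("cheap flights", "travel.example/flights"),
    ("plane tickets", "travel.example/flights"),
    ("pizza recipe", "food.example/pizza"),
]

# Initial labeled queries: 1 = positive query-intent class (assumed here to
# be a travel intent), 0 = negative query-intent class.
seed_labels = {"cheap flights": 1, "pizza recipe": 0}

graph = build_click_graph(click_log)
unlabeled = [query for query in graph if query not in seed_labels]
```

In this sketch, `unlabeled` holds the query nodes whose classes would still have to be inferred from the labeled nodes that share click patterns with them.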
2 Assignments
0 Petitions
Abstract
To construct a classifier, a data structure correlating queries to items identified by the queries is received, where the data structure contains initial labeled queries that have been labeled with respect to predetermined classes, and unlabeled queries that have not been labeled with respect to the predetermined classes. The data structure is used to label at least some of the unlabeled queries with respect to the predetermined classes. Queries in the data structure that have been labeled with respect to the predetermined classes are used as training data to train the classifier.
48 Citations
18 Claims
10. An article comprising:
- at least one computer storage device containing instructions that when executed cause a computer to:
  - receive an initial set of queries that have been labeled with respect to query intent classes;
  - receive a collection of data in the form of a click graph that includes correlations between unlabeled queries and items identified by the queries, wherein the click graph has query nodes representing respective queries that are assigned a positive query-intent class that represents that corresponding queries are associated with particular intents of users to search for particular information from particular categories of information that a user wishes to perform a search in when a query node corresponds to a positive query-intent class, or a negative query-intent class when the query node does not correspond to a query-intent class, uniform resource locator (URL) nodes representing respective URLs, and edges between at least some of the query nodes and URL nodes when a user has navigated to a URL node for a given query and wherein the respective edges between queries and URLs have weights assigned to them based on counts of clicks from queries to URLs which are used to define click patterns;
  - apply a graph-based learning algorithm to enable labeling of at least some of the unlabeled queries with respect to the query intent classes according to labeling in the initial set of queries by inferring the query-intent classes of the unlabeled nodes from class memberships of the labeled query nodes based on similarity of click patterns from query nodes to corresponding URL nodes; and
  - create training data based on the initial set of labeled queries and the queries labeled by the graph-based learning algorithm to train a query intent classifier.
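The claim does not name a specific graph-based learning algorithm. As one hedged illustration, the sketch below uses a simple iterative label propagation over the click-count-weighted bipartite graph: scores flow from queries to URLs and back, while the initially labeled queries stay clamped. The function `propagate_labels`, the 0.5 starting score, the iteration count, and the sample data are assumptions, not details taken from the patent.

```python
def propagate_labels(graph, seeds, iterations=20):
    """Spread seed labels through a click graph (illustrative sketch).

    graph: {query: {url: click_count}}, every query having at least one click.
    seeds: {query: 1 or 0} for the initially labeled queries.
    Returns {query: score in [0, 1]}, read as closeness to the positive class.
    """
    # Unlabeled queries start at a neutral 0.5 (an assumption).
    scores = {q: float(seeds.get(q, 0.5)) for q in graph}

    # Invert the graph so each URL knows which queries clicked on it.
    url_to_queries = {}
    for query, urls in graph.items():
        for url, clicks in urls.items():
            url_to_queries.setdefault(url, {})[query] = clicks

    for _ in range(iterations):
        # A URL's score is the click-weighted average of its queries' scores.
        url_scores = {
            url: sum(c * scores[q] for q, c in qs.items()) / sum(qs.values())
            for url, qs in url_to_queries.items()
        }
        new_scores = {}
        for query, urls in graph.items():
            if query in seeds:
                # Clamp the initial labels so they are never overwritten.
                new_scores[query] = float(seeds[query])
            else:
                # A query's score is the click-weighted average of its URLs.
                total = sum(urls.values())
                new_scores[query] = sum(
                    c * url_scores[u] for u, c in urls.items()
                ) / total
        scores = new_scores
    return scores

# Hypothetical click graph and seed labels.
example_graph = {
    "cheap flights": {"travel.example/flights": 2},
    "plane tickets": {"travel.example/flights": 1},
    "pizza recipe": {"food.example/pizza": 1},
    "easy pizza dough": {"food.example/pizza": 3},
}
example_seeds = {"cheap flights": 1, "pizza recipe": 0}
propagated = propagate_labels(example_graph, example_seeds)
```

Here "plane tickets" shares its only clicked URL with a positive seed, so its score drifts toward 1, while "easy pizza dough" drifts toward 0. Queries with similar click patterns end up with similar scores, which is the intuition the claim language describes.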
17. A system for constructing a classifier, comprising:
- a general purpose computing device;
- a computer program comprising program modules executable by the general purpose computing device, wherein the computing device is directed by the program modules of the computer program to:
  - receive a click graph that contains query nodes representing respective queries and second nodes representing items identified by the queries, wherein a subset of the query nodes have been initially labeled with respect to predetermined query-intent classes that represent whether or not corresponding queries are associated with particular intents of users to search for particular information from particular categories of information that a user wishes to perform a search in and that are assigned a positive query-intent class when a query node corresponds to a positive query-intent class, or a negative query intent class when the query node does not correspond to a query-intent class, and wherein a remainder of the query nodes initially are not labeled with respect to the predetermined query-intent classes;
  - compute values representing likelihoods that the queries belong to the predetermined query-intent classes by inferring the query-intent classes of the unlabeled nodes from class memberships of the labeled query nodes based on similarity of user click patterns from query nodes to corresponding URL nodes;
  - assign labels to at least some of the unlabeled query nodes to label the unlabeled query nodes with respect to the predetermined query-intent classes based on the computed values; and
  - use the subset of the query nodes and the query nodes labeled based on the computed values as training data to train a classifier.
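The last two steps of this claim (assigning labels based on computed likelihood values, then training a classifier) can be sketched as follows. This is an illustration under stated assumptions: the 0.8 and 0.2 thresholds, the helper names, the sample scores, and the choice of a small naive Bayes model over query words are all hypothetical stand-ins, since the claim does not fix a thresholding rule or a classifier type.

```python
import math
from collections import Counter

def select_training_data(scores, pos_threshold=0.8, neg_threshold=0.2):
    """Keep only queries whose likelihood value is confidently high or low."""
    data = []
    for query, score in scores.items():
        if score >= pos_threshold:
            data.append((query, 1))
        elif score <= neg_threshold:
            data.append((query, 0))
    return data

def train_naive_bayes(training_data):
    """Count words per class; a stand-in for whatever classifier is trained."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for query, label in training_data:
        class_counts[label] += 1
        word_counts[label].update(query.split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def classify(model, query):
    """Return the more probable class (1 or 0) with add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label in (0, 1):
        logp = math.log((class_counts[label] + 1) / (total + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in query.split():
            if word in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][word] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Hypothetical likelihood values for labeled and newly inferred queries.
likelihoods = {
    "cheap flights": 1.0,
    "plane tickets": 0.94,
    "pizza recipe": 0.0,
    "easy pizza dough": 0.05,
    "weather today": 0.5,  # too uncertain, so it is left out of training
}
training_data = select_training_data(likelihoods)
model = train_naive_bayes(training_data)
predicted = classify(model, "cheap plane tickets")
```

With these sample values, "weather today" is excluded from the training data because its value sits between the two thresholds, and the trained model assigns "cheap plane tickets" to the positive class.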
Specification