Techniques for navigational query identification
First Claim
1. A method comprising performing a machine-executed operation involving instructions for identifying a navigational query, wherein the machine-executed operation is at least one of:
A) sending said instructions over transmission media;
B) receiving said instructions over transmission media;
C) storing said instructions onto a machine-readable storage medium; and
D) executing the instructions;
wherein said instructions are instructions which, when executed by one or more processors, cause performance of:
determining whether a query is a navigational query by receiving a set of query-URL pair-wise features based at least in part on said query in conjunction with an associated query result set;
integrating subsets of said set of query-URL pair-wise features to generate a set of query-based features;
automatically selecting, from said set of query-based features, a subset of most effective features for identifying navigational queries, wherein said selecting is based on a machine learning feature selection method;
based on said subset of most effective features, using a machine learning classification method to determine whether said query is a navigational query.
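The integration step recited in the claim, which turns per-result query-URL features into query-level features, can be illustrated with a minimal Python sketch. The claim does not say which aggregation operators are used, so the statistics below (max, mean, standard deviation, and the top result's share of the total) are assumptions chosen for illustration, as are the function name `integrate_features` and the feature names `click_count` and `url_match`.

```python
from statistics import mean, pstdev


def integrate_features(pairwise_rows, top_n=None):
    """Collapse per-URL feature dicts for one query into query-level features.

    pairwise_rows: one dict of feature values per result URL, in rank order.
    top_n: optionally restrict the aggregation to the first top_n results.
    The aggregation operators are illustrative assumptions, not the patent's.
    """
    rows = pairwise_rows[:top_n] if top_n else pairwise_rows
    names = sorted({name for row in rows for name in row})
    query_features = {}
    for name in names:
        values = [row.get(name, 0.0) for row in rows]
        total = sum(values)
        query_features[f"{name}_max"] = max(values)
        query_features[f"{name}_mean"] = mean(values)
        query_features[f"{name}_std"] = pstdev(values)
        # Share of the feature mass held by the top-ranked result: a query
        # whose clicks concentrate on one URL looks more navigational.
        query_features[f"{name}_top1_share"] = values[0] / total if total else 0.0
    return query_features


# Hypothetical result set for one query: three URLs, two pair-wise features.
rows = [
    {"click_count": 90.0, "url_match": 1.0},
    {"click_count": 8.0, "url_match": 0.0},
    {"click_count": 2.0, "url_match": 0.0},
]
query_features = integrate_features(rows)
```

Each pair-wise feature yields several query-level features, which is one way the candidate feature set can grow large enough to make the later selection step worthwhile.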
Abstract
To accurately classify a query as navigational, thousands of available features are explored, extracted from major commercial search engine results, user Web search click data, query logs, and the whole Web's relational content. To obtain the most useful features for navigational query identification, a three-level system is used that integrates feature generation, feature integration, and feature selection in a pipeline. Because feature selection plays a key role in classification methodologies, the best feature selection method is coupled with the best classification approach to achieve the best performance for identifying navigational queries. According to one embodiment, a linear Support Vector Machine (SVM) is used to rank features, and the top-ranked features are fed into a Stochastic Gradient Boosting Tree (SGBT) classification method that identifies whether or not a particular query is a navigational query.
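The embodiment described in the abstract (rank features with a linear SVM, then classify with stochastic gradient boosting on the top-ranked features) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the patented implementation: the SVM is trained with a Pegasos-style subgradient method, the boosted trees are depth-1 stumps fitted with squared-error loss, the data is synthetic, and all function names and hyperparameters are hypothetical.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=30, seed=0):
    """Pegasos-style subgradient descent on the hinge loss; labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def top_features(w, k):
    """Indices of the k features with the largest absolute SVM weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)[:k]


def fit_stump(X, residual, rows):
    """Least-squares regression stump fitted on the sampled rows only."""
    best = None
    for j in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for thr in sorted({X[i][j] for i in rows})[:-1]:
            left = [residual[i] for i in rows if X[i][j] <= thr]
            right = [residual[i] for i in rows if X[i][j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    return None if best is None else best[1:]


def train_sgbt(X, y, rounds=30, shrink=0.3, frac=0.5, seed=0):
    """Gradient boosting where each stump sees a random subsample of the rows."""
    rng = random.Random(seed)
    n = len(X)
    score = [0.0] * n
    stumps = []
    for _ in range(rounds):
        rows = rng.sample(range(n), max(2, int(frac * n)))
        # For squared-error loss the negative gradient is the residual.
        residual = [y[i] - score[i] for i in range(n)]
        stump = fit_stump(X, residual, rows)
        if stump is None:
            continue
        j, thr, ml, mr = stump
        stumps.append((j, thr, shrink * ml, shrink * mr))
        for i in range(n):
            score[i] += shrink * (ml if X[i][j] <= thr else mr)
    return stumps


def predict(stumps, x):
    """Return +1 (navigational) or -1 from the sign of the summed stump outputs."""
    total = sum(lv if x[j] <= thr else rv for j, thr, lv, rv in stumps)
    return 1 if total > 0 else -1


# Synthetic data (an assumption): feature 0 carries the label, the rest is noise.
data_rng = random.Random(42)
y = [data_rng.choice([-1, 1]) for _ in range(200)]
X = [[label + data_rng.gauss(0, 0.5)] + [data_rng.gauss(0, 1) for _ in range(4)]
     for label in y]

weights = train_linear_svm(X, y)
selected = top_features(weights, k=2)
X_selected = [[row[j] for j in selected] for row in X]
model = train_sgbt(X_selected, y)
accuracy = sum(predict(model, x) == t for x, t in zip(X_selected, y)) / len(y)
```

Ranking by absolute weight works here because the synthetic features share a similar scale; with real features, some normalization before SVM training would likely be needed for the weights to be comparable. The accuracy is measured on the training data, so it only shows that the pipeline runs end to end.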
18 Claims
Specification