Apparatus and method for discovering context groups and document categories by mining usage logs
Abstract
An apparatus is provided for relating user queries and documents. The apparatus includes a client, a server, and a database being mutually coupled to a communications pathway. The client is configured to enable a user to submit user queries to locate documents. The server has a data mining mechanism configured to receive the user queries and generate information retrieval sessions. The database stores data in the form of usage logs generated from the information retrieval sessions. The data mining mechanism includes a clustering algorithm operative to identify context groups and usage categories. The data mining mechanism is operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs. A method is provided for associating user queries and documents in accordance with the apparatus.
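As a rough illustration of what "multiple context groups associated with specific query keywords" could look like in practice, the sketch below builds an index from each keyword to every context group whose queries contain it. The group labels, the sample queries, and the function name `index_keywords` are hypothetical, and the groups are assumed to have been produced already by the clustering step; this is not the patented implementation.

```python
from collections import defaultdict

def index_keywords(context_groups):
    """Map each keyword to the sorted IDs of the context groups whose queries use it."""
    keyword_to_groups = defaultdict(set)
    for group_id, queries in context_groups.items():
        for query in queries:
            # Assumption: keywords are whitespace-separated, case-insensitive tokens.
            for keyword in query.lower().split():
                keyword_to_groups[keyword].add(group_id)
    return {kw: sorted(ids) for kw, ids in keyword_to_groups.items()}

# Hypothetical context groups, as if already produced by clustering.
groups = {
    "programming": ["java compiler", "java applet tutorial"],
    "travel": ["java island hotels"],
}
index = index_keywords(groups)
print(index["java"])  # the keyword "java" belongs to both context groups
```

A keyword that maps to more than one group is an ambiguous keyword in this sketch: the same word is used in queries with different contexts.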
31 Claims
-
1. An apparatus for relating user queries and documents, comprising:
-
a client configured to enable a user to submit user queries to locate documents;
a server having a data mining mechanism configured to receive the user queries and generate information retrieval sessions;
a communications pathway extending between the client and the server; and
a database provided in communication with the client and the server, the database storing data in the form of usage logs generated from the information retrieval sessions generated by a user at the client;
wherein the data mining mechanism includes a clustering algorithm identifying context groups and usage categories, and operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
-
-
10. A method for relating user queries and documents using usage logs from retrieval sessions from a text retrieval system, comprising:
-
identifying contexts associated with a user query comprising at least one specific query keyword;
identifying user queries having similar query contexts;
partitioning user queries into groups based upon similarity of the query contexts;
merging the groups to compute multiple contexts associated with specific query keywords; and
applying a clustering algorithm to identify similar query contexts based upon the query keywords to generate context groups that associate keywords with documents accessed by users.
Dependent claims: 11, 12, 13, 14, 15
-
-
16. A method for associating user queries in the form of keywords and documents accessed from usage logs in response to submission of the user queries during user retrieval sessions from a text retrieval system, comprising:
-
identifying contexts associated with a user query comprising at least one specific query keyword;
identifying similar query contexts from individual user queries;
partitioning the user queries into groups based upon the identified similar query contexts;
associating the groups to identify at least one query context associated with each specific query keyword;
clustering similar query contexts based upon the query keywords to generate context groups that associate the keywords with documents accessed by users; and
graphically depicting the contexts associated with documents from most general to most specific.
Dependent claims: 17, 18, 19, 20
-
-
21. An apparatus for relating user queries and documents, comprising:
-
a client configured to enable a user to submit user queries to locate documents;
a server having a data mining mechanism configured to receive the user queries and generate information retrieval sessions;
a communications pathway extending between the client and the server; and
a database provided in communication with the client and the server, the database storing data in the form of usage logs generated from the information retrieval sessions generated by a user at the client, wherein the data mining mechanism includes a clustering algorithm identifying context groups and usage categories, and operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs, and wherein each query context comprises a set of all queries that belong to a connected component of G, wherein G is an undirected query graph with a vertex for each query in a usage log.
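Claim 21 defines a query context as the set of queries in one connected component of an undirected query graph G, but the claim text here does not say when two queries are joined by an edge. The sketch below assumes, purely for illustration, that two queries are adjacent when they share at least one opened document, and computes the components with a small union-find. The function name `query_contexts` and the sample log are hypothetical.

```python
from collections import defaultdict

def query_contexts(opened_docs):
    """Return the connected components of an undirected query graph.

    opened_docs maps each query to the set of document IDs opened for it.
    ASSUMPTION (not stated in the claim): two queries are adjacent when
    they share at least one opened document.
    """
    parent = {query: query for query in opened_docs}

    def find(query):
        # Follow parent links to the component root, halving paths as we go.
        while parent[query] != query:
            parent[query] = parent[parent[query]]
            query = parent[query]
        return query

    first_query_for_doc = {}
    for query, docs in opened_docs.items():
        for doc in docs:
            if doc in first_query_for_doc:
                # Shared document: merge the two queries' components.
                parent[find(query)] = find(first_query_for_doc[doc])
            else:
                first_query_for_doc[doc] = query

    components = defaultdict(set)
    for query in opened_docs:
        components[find(query)].add(query)
    return sorted(sorted(members) for members in components.values())

# Hypothetical usage-log extract: query -> opened document IDs.
log = {
    "java compiler": {"d1", "d2"},
    "jdk download": {"d2"},
    "bali hotels": {"d7"},
    "java island": {"d7", "d8"},
}
contexts = query_contexts(log)
```

Under the shared-document assumption, each inner list of `contexts` is one query context: every query in it is reachable from every other through a chain of shared documents.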
-
-
22. An apparatus for relating user queries and documents, comprising:
-
a client configured to enable a user to submit user queries to locate documents;
a server having a data mining mechanism configured to receive the user queries and generate information retrieval sessions;
a communications pathway extending between the client and the server; and
a database provided in communication with the client and the server, the database storing data in the form of usage logs generated from the information retrieval sessions generated by a user at the client, wherein the data mining mechanism includes a clustering algorithm identifying context groups and usage categories, and operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs, and wherein individual context groups comprise one or more query contexts, wherein queries are grouped solely based on corresponding sets of opened document IDs, with a query comprising at least one keyword.
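Grouping queries "solely based on corresponding sets of opened document IDs" can be read in more than one way. The minimal sketch below takes the strictest reading, as an assumption: two queries fall in the same group only when their sets of opened document IDs are identical, and the query text itself is never compared. The record layout and the name `group_by_opened_docs` are hypothetical.

```python
from collections import defaultdict

def group_by_opened_docs(log):
    """Group queries whose sets of opened document IDs are identical.

    log is an iterable of (query, opened_doc_ids) records.
    ASSUMPTION: "same group" means exactly equal document-ID sets.
    """
    groups = defaultdict(list)
    for query, doc_ids in log:
        # frozenset ignores order and duplicates, and is hashable.
        groups[frozenset(doc_ids)].append(query)
    return dict(groups)

# Hypothetical usage-log records.
records = [
    ("java compiler", ["d1", "d2"]),
    ("javac", ["d2", "d1"]),
    ("bali hotels", ["d7"]),
]
by_docs = group_by_opened_docs(records)
```

Here "java compiler" and "javac" share no keyword, yet they land in the same group because users opened the same documents for both.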
-
-
23. An apparatus for relating user queries and documents, comprising:
-
a client configured to enable a user to submit user queries to locate documents;
a server having a data mining mechanism configured to receive the user queries and generate information retrieval sessions;
a communications pathway extending between the client and the server; and
a database provided in communication with the client and the server, the database storing data in the form of usage logs generated from the information retrieval sessions generated by a user at the client, wherein the data mining mechanism includes a clustering algorithm identifying context groups and usage categories, and operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs, and wherein each context group is represented as a directed acyclic graph (DAG) comprising a multi-level context DAG having at least one query context node and at least one general context node.
Dependent claims: 24, 25
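One way to picture a multi-level context DAG with general context nodes and query context nodes is as a mapping from each node to its more general parents. The sketch below assumes that edge direction and uses hypothetical node labels; it relies on `graphlib.TopologicalSorter` from the Python standard library (3.9 or later), which emits every node after all of its predecessors and raises `CycleError` if the graph is not acyclic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node maps to the set of its more general parents.
# ASSUMPTION: edges run from general context nodes down to query context nodes.
parents = {
    "java": set(),                                    # general context node
    "java programming": {"java"},                     # general context node
    "q: java compiler flags": {"java programming"},   # query context node
    "q: java applet tutorial": {"java programming"},  # query context node
}

# static_order() yields parents before children, i.e. general before specific.
order = list(TopologicalSorter(parents).static_order())
for node in order:
    print(node)
```

Listing nodes in this order is one simple way to present contexts from most general to most specific; sibling nodes at the same level have no required relative order.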
-
-
26. An apparatus for relating user queries and documents, comprising:
-
a client configured to enable a user to submit user queries to locate documents;
a server having a data mining mechanism configured to receive the user queries and generate information retrieval sessions;
a communications pathway extending between the client and the server; and
a database provided in communication with the client and the server, the database storing data in the form of usage logs generated from the information retrieval sessions generated by a user at the client, wherein the data mining mechanism includes a clustering algorithm identifying context groups and usage categories, and operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs, and wherein individual queries are grouped based on corresponding sets of opened document IDs from the usage logs.
-
-
27. A method for relating user queries and documents using usage logs from retrieval sessions from a text retrieval system, comprising:
-
identifying contexts associated with a user query comprising at least one specific query keyword;
identifying user queries having similar query contexts;
partitioning user queries into groups based upon similarity of the query contexts;
merging the groups to compute multiple contexts associated with specific query keywords; and
applying a clustering algorithm to identify similar query contexts based upon the query keywords to generate context groups that associate keywords with documents accessed by users, wherein each query context is identified as a vector.
Dependent claims: 28, 29
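Claim 27 states that each query context is identified as a vector but, in the text shown here, does not fix the vector's components or the similarity measure. As a hedged illustration, the sketch below assumes a sparse vector keyed by document ID, with the number of times each document was opened as the weight, and compares two contexts by cosine similarity. Both choices, and all sample values, are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty context is treated as similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical query contexts: document ID -> number of times opened.
context_a = {"d1": 2, "d2": 1}
context_b = {"d1": 1, "d2": 1}
context_c = {"d7": 3}

print(cosine(context_a, context_b))  # high: the contexts overlap on d1 and d2
print(cosine(context_a, context_c))  # no shared documents
```

A clustering step could then merge contexts whose similarity exceeds a chosen threshold; the threshold value would be a tuning parameter and is not specified in the claim.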
-
-
30. A method for relating user queries and documents using usage logs from retrieval sessions from a text retrieval system, comprising:
-
identifying contexts associated with a user query comprising at least one specific query keyword;
identifying user queries having similar query contexts;
partitioning user queries into groups based upon similarity of the query contexts;
merging the groups to compute multiple contexts associated with specific query keywords; and
applying a clustering algorithm to identify similar query contexts based upon the query keywords to generate context groups that associate keywords with documents accessed by users, wherein the step of partitioning comprises representing each context group as a directed acyclic graph, with each context group containing query contexts.
Dependent claims: 31
-
Specification