SYNOPSIS OF A SEARCH LOG THAT RESPECTS USER PRIVACY
First Claim
1. In a computing environment, a method comprising, processing a search log, including determining which queries from the search log correspond to information that is safe to publish, and for each of the queries having information that is safe to publish, publishing the information.
Abstract
Described is the release of output data representing a search log, in which the data is suitable for most data mining and analysis applications yet safe to publish because user privacy is preserved. The search log is processed such that a query is included only if it occurs a sufficient number of times; noise may be added to the counts. The contributions considered from each user may be limited to a maximum number of queries. The output may indicate how often each query appeared (possibly plus noise). Other output may comprise a query-action graph, a query-inaction graph and/or a query-reformulation graph, with nodes representing queries and nodes representing actions, inactions or reformulations (e.g., clicked URLs, skipped URLs, or selected related queries), and edges between nodes representing action, skip or selection counts (possibly plus noise). The output may correspond to the top results/related queries returned from a search.
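The counting, noise and threshold steps summarized in the abstract can be illustrated with a minimal sketch. The function and parameter names (`releasable_counts`, `max_per_user`, `threshold`, `noise_scale`) are hypothetical, and the use of Laplace-distributed noise is an assumption; the abstract only states that noise may be added and does not name a distribution or specific parameter values.

```python
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def releasable_counts(log, max_per_user, threshold, noise_scale=0.0, seed=None):
    """Return {query: noisy_count} for queries whose noisy count meets the threshold.

    `log` is an iterable of (user_id, query) pairs. All names and the
    Laplace noise choice are illustrative assumptions, not the patent's text.
    """
    rng = random.Random(seed)

    # Limit each user's contribution to at most `max_per_user` queries.
    used = defaultdict(int)
    counts = Counter()
    for user_id, query in log:
        if used[user_id] < max_per_user:
            used[user_id] += 1
            counts[query] += 1

    # Add noise to each count and keep only queries that clear the threshold.
    released = {}
    for query, count in counts.items():
        noise = laplace_noise(noise_scale, rng) if noise_scale > 0 else 0.0
        noisy = count + noise
        if noisy >= threshold:
            released[query] = noisy
    return released
```

With `noise_scale=0.0` the function reduces to a plain frequency cutoff, which is convenient for checking the per-user cap; a positive scale makes the result random, so fix `seed` when reproducibility matters.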
32 Citations
20 Claims
- 1. In a computing environment, a method comprising, processing a search log, including determining which queries from the search log correspond to information that is safe to publish, and for each of the queries having information that is safe to publish, publishing the information.
- 11. In a computing environment, a system comprising, a transformation mechanism that processes a search log into output data, including determining which queries of the search log are sufficiently frequent to publish so that each user's privacy is preserved, in which at least some of the output data comprises a graph having query nodes representing queries, and other nodes representing an action, inaction or reformulation, with an edge between each query node and other node representing a corresponding count as to how many times the action, inaction or reformulation followed the query, respectively.
- 17. One or more computer-readable media having computer-executable instructions, which when executed perform steps, comprising: counting queries in a search log or subset of the search log to obtain a count value for each query; adding zero, positive or negative noise to each count value to obtain a noisy count for each query; determining whether each noisy count meets a threshold, and if so, including the query corresponding to that noisy count in an output set cleared for release; and processing the output set into output data representative of the search log.
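The graph output recited in claim 11 (query nodes, action nodes, and edges carrying counts) can be sketched as follows for the click case. This is an illustration under stated assumptions: the name `query_action_graph`, the `(query, url)` event format, the restriction of edges to an already-cleared set of queries, and the Laplace noise are all hypothetical choices, not details taken from the claims.

```python
import random
from collections import Counter


def query_action_graph(events, safe_queries, noise_scale=0.0, seed=None):
    """Build edge weights {(query, url): count} for a query-action (click) graph.

    `events` is an iterable of (query, clicked_url) pairs and `safe_queries`
    is the set of queries already cleared for release. Names, input format
    and the noise distribution are assumptions made for this sketch.
    """
    rng = random.Random(seed)

    # Count how many times each action (clicked URL) followed each safe query.
    edges = Counter((q, url) for q, url in events if q in safe_queries)

    graph = {}
    for edge, count in edges.items():
        noise = 0.0
        if noise_scale > 0:
            # Difference of two exponentials gives Laplace noise (assumed choice).
            noise = (rng.expovariate(1.0 / noise_scale)
                     - rng.expovariate(1.0 / noise_scale))
        graph[edge] = count + noise
    return graph
```

A query-inaction or query-reformulation graph would have the same shape, with skipped URLs or selected related queries in place of clicked URLs.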
Specification