GENERATING SYNONYMS BASED ON QUERY LOG DATA
First Claim
1. A method for generating synonyms using data processing functionality, comprising:
receiving a set of input information items;
for each information item in the set of input information items:
generating a set of initial synonym candidates based on query log data;
removing noise from the set of initial synonym candidates, if said noise is present and can be identified, an output of said generating and removing comprising a set of filtered synonym candidates; and
reducing the set of filtered synonym candidates to a set of selected synonyms; and
outputting synonym-expanded data that is formed based on said reducing performed with respect to each information item in the set of input information items.
Abstract
An approach is described for generating synonyms to supplement at least one information item, such as, in one case, a set of related items. The approach can involve an expansion phase, a clean-up phase, and a reduction phase. In the expansion phase, the approach identifies, for each related item, a set of initial synonym candidates. In the clean-up phase, the approach removes noise from the set of initial synonym candidates (if such noise exists), to provide a set of filtered synonym candidate items. In the reduction phase, the approach ranks and applies a threshold (or thresholds) to the set of filtered synonym candidate items, to generate, for each information item, a set of selected synonyms. The approach uses query log data at various points in its operation. The selected synonyms can be used to improve the effectiveness of user searches.
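The three phases can be illustrated with a minimal sketch. The abstract does not specify how each phase is computed, so everything below is an assumption made for illustration: the query log is modeled as hypothetical (query, clicked page, click count) tuples, expansion collects other queries that led to clicks on the same pages as the item, clean-up drops entries from a hypothetical noise list, and reduction ranks candidates by the fraction of their clicks that land on the item's pages and applies a single threshold.

```python
from collections import defaultdict

# Hypothetical query log: (query, clicked_page, click_count) tuples.
QUERY_LOG = [
    ("digital rebel xt", "pageA", 40),
    ("canon eos 350d", "pageA", 50),
    ("canon eos 350d", "pageB", 10),
    ("eos 350d", "pageA", 30),
    ("eos 350d", "pageB", 5),
    ("camera", "pageA", 2),
    ("camera", "pageC", 200),
    ("www", "pageB", 1),
]

def build_indexes(log):
    """Index click counts by query and by page."""
    pages_by_query = defaultdict(dict)
    queries_by_page = defaultdict(dict)
    for query, page, clicks in log:
        pages_by_query[query][page] = pages_by_query[query].get(page, 0) + clicks
        queries_by_page[page][query] = queries_by_page[page].get(query, 0) + clicks
    return pages_by_query, queries_by_page

def expand(item, pages_by_query, queries_by_page):
    """Expansion phase (assumed heuristic): queries sharing a clicked page with the item."""
    candidates = set()
    for page in pages_by_query.get(item, {}):
        candidates.update(queries_by_page[page])
    candidates.discard(item)
    return candidates

def clean_up(candidates, noise_terms=frozenset({"www", "http", "com"})):
    """Clean-up phase (assumed heuristic): drop known noise terms and very short strings."""
    return {c for c in candidates if c not in noise_terms and len(c) > 2}

def reduce_candidates(item, candidates, pages_by_query, threshold=0.5):
    """Reduction phase (assumed heuristic): rank by click overlap, keep scores >= threshold."""
    item_pages = set(pages_by_query.get(item, {}))
    scored = []
    for cand in candidates:
        clicks = pages_by_query[cand]
        total = sum(clicks.values())
        shared = sum(n for page, n in clicks.items() if page in item_pages)
        score = shared / total if total else 0.0
        if score >= threshold:
            scored.append((score, cand))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cand for _, cand in scored]

def generate_synonyms(items, log, threshold=0.5):
    """Run expansion, clean-up, and reduction for each input information item."""
    pages_by_query, queries_by_page = build_indexes(log)
    result = {}
    for item in items:
        initial = expand(item, pages_by_query, queries_by_page)
        filtered = clean_up(initial)
        result[item] = reduce_candidates(item, filtered, pages_by_query, threshold)
    return result
```

With the sample log, calling `generate_synonyms(["canon eos 350d"], QUERY_LOG)` keeps "digital rebel xt" and "eos 350d", while "www" is removed in clean-up and "camera" falls below the threshold because most of its clicks go to an unrelated page.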
20 Claims
11. A synonym-generating module, comprising:
an input module configured to receive a set of input information items;
an expansion module configured to expand each information item in the set of input information items into a set of initial synonym candidates based on query log data, the query log data reflecting associations between prior queries submitted by users and page items accessed by the users in response to the submitted queries;
a clean-up module configured to remove, for each information item, noise from the set of initial synonym candidates, if said noise is present and can be identified, an output of the expansion module and the clean-up module comprising a set of filtered synonym candidates; and
a reduction module configured to select, for each information item, a set of synonyms from the set of filtered synonym candidates based on the query log data, to provide a set of selected synonyms; and
an output module configured to output synonym-expanded data that is formed based on the selecting performed by the reduction module with respect to each information item in the set of input information items.
20. A computer-readable medium for storing computer-readable instructions, the computer-readable instructions providing a synonym-generating module when executed by one or more processing devices, the computer-readable instructions comprising:
logic configured to receive at least partially structured data, said at least partially structured data including a set of input information items, the set of input information items including one or more information items; and
logic configured to generate, for each information item in the set of input information items, a set of selected synonyms based on query log data, the query log data reflecting associations between prior queries submitted by users and page items accessed by the users in response to the submitted queries.
Specification