Acquiring ontological knowledge from query logs
First Claim
1. A method of using a computer to modify a list of terms assigned to a semantic category, the method comprising:
receiving an indication of a seed term;
using a computer to identify from a log of queries, a query having a set of query terms that includes at least one instance of the seed term in combination with a context word;
utilizing the computer to automatically identify, from the log of queries, a second query having a set of query terms that does not include the seed term but does include the context word in combination with a different term, the different term being different than the seed term and the context term; and
utilizing the computer to remove the different term from the list of terms assigned to the semantic category, the different term being removed based on a determination that a significant enough context does not exist between the different term and the seed term, the significance of the context being determined utilizing the formula,
Score(c) = F_type{c} * log(g(c)/C),
where g(c) = F_type{c}/F_inst{c},
where C = F_type{ctopx}/F_inst{ctopx},
where F_type is a frequency of context c in the semantic category,
where F_inst is a frequency of context in an entire data, and
where ctopx is x number of the most frequent contexts.
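The scoring formula in the claim can be illustrated with a minimal Python sketch. The claim does not say how F_type{ctopx} and F_inst{ctopx} are aggregated over the x most frequent contexts, so this sketch assumes they are sums over the x contexts with the highest F_type. The function name, the parameter names, and the frequency values are hypothetical and chosen only for illustration.

```python
import math

def context_scores(f_type, f_inst, x=2):
    """Return Score(c) = F_type(c) * log(g(c) / C) for every context c.

    f_type: context -> frequency within the semantic category (must be > 0)
    f_inst: context -> frequency within the entire data (must be > 0)
    """
    # Assumption: "ctopx" is read as the x contexts with the highest F_type,
    # and C is the ratio of their summed frequencies.
    top = sorted(f_type, key=f_type.get, reverse=True)[:x]
    big_c = sum(f_type[c] for c in top) / sum(f_inst[c] for c in top)

    scores = {}
    for c, ft in f_type.items():
        g = ft / f_inst[c]  # g(c) = F_type(c) / F_inst(c)
        scores[c] = ft * math.log(g / big_c)
    return scores

# Hypothetical frequencies, not taken from the patent.
f_type = {"lyrics": 40, "pictures": 30, "tickets": 10}
f_inst = {"lyrics": 50, "pictures": 300, "tickets": 20}
scores = context_scores(f_type, f_inst, x=2)
```

Under these assumed numbers, C = 70/350 = 0.2. The context "lyrics" has g = 0.8, which is above C, so its score is positive. The context "pictures" has g = 0.1, which is below C, so its score is negative, marking it as a context that is common in the data overall but not distinctive of the category.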
Abstract
Methods are disclosed for acquiring ontological knowledge using query logs. In one embodiment, query logs are first utilized as a basis for identifying important contexts associated with terms belonging to a semantic category. Then, those contexts are used as a basis for identifying new terms belonging to the same category or, in another embodiment, as a basis for removing extraneous or obsolete terms identified as being in the same category.
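The two stages described in the abstract might be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes single-word terms, whitespace tokenization, and a tiny made-up query log. The function names contexts_for_seeds and candidate_terms are hypothetical.

```python
from collections import Counter

def contexts_for_seeds(queries, seeds):
    """Stage 1: count words that co-occur with any seed term in a query."""
    counts = Counter()
    for q in queries:
        words = q.lower().split()
        if any(w in seeds for w in words):
            counts.update(w for w in words if w not in seeds)
    return counts

def candidate_terms(queries, seeds, contexts):
    """Stage 2: count terms that appear with a known context word
    in queries that contain no seed term."""
    counts = Counter()
    for q in queries:
        words = q.lower().split()
        if any(w in seeds for w in words):
            continue  # the second query must not include the seed term
        if any(w in contexts for w in words):
            counts.update(w for w in words if w not in contexts)
    return counts

# Hypothetical query log, for illustration only.
log = ["madonna lyrics", "madonna tour dates", "beyonce lyrics",
       "cheap flights", "prince tour"]
seeds = {"madonna"}
contexts = contexts_for_seeds(log, seeds)
candidates = candidate_terms(log, seeds, set(contexts))
```

With this assumed log, the seed "madonna" yields the contexts "lyrics", "tour", and "dates", and those contexts in turn surface "beyonce" and "prince" as candidate terms. A real system would additionally score the contexts before deciding whether to add or remove a term.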
23 Citations
16 Claims
1. A method of using a computer to modify a list of terms assigned to a semantic category, the method comprising:
receiving an indication of a seed term;
using a computer to identify from a log of queries, a query having a set of query terms that includes at least one instance of the seed term in combination with a context word;
utilizing the computer to automatically identify, from the log of queries, a second query having a set of query terms that does not include the seed term but does include the context word in combination with a different term, the different term being different than the seed term and the context term; and
utilizing the computer to remove the different term from the list of terms assigned to the semantic category, the different term being removed based on a determination that a significant enough context does not exist between the different term and the seed term, the significance of the context being determined utilizing the formula,
Score(c) = F_type{c} * log(g(c)/C),
where g(c) = F_type{c}/F_inst{c},
where C = F_type{ctopx}/F_inst{ctopx},
where F_type is a frequency of context c in the semantic category,
where F_inst is a frequency of context in an entire data, and
where ctopx is x number of the most frequent contexts.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
Specification