Acquisition of semantic class lexicons for query tagging
First Claim
1. A method performed by a computing device, the method comprising:
- creating seed distribution data comprising a seed phrase and associated seed probabilities that the seed phrase corresponds to a plurality of different lexicons having associated meanings, the seed probabilities including at least:
  - a first seed probability assigned to the seed phrase, the first seed probability indicative of a first likelihood that the seed phrase corresponds to a first lexicon associated with a first meaning of the seed phrase, and
  - a second seed probability assigned to the seed phrase, the second seed probability indicative of a second likelihood that the seed phrase corresponds to a second lexicon associated with a second meaning of the seed phrase, the first seed probability different than the second seed probability, the first lexicon different than the second lexicon; and
- identifying, from a plurality of web documents, a set of lists of phrases that include the seed phrase as well as other phrases; and
- based at least on the seed probabilities, determining other probabilities that the other phrases correspond to the plurality of different lexicons having the associated meanings.
Abstract
A user's search experience may be enhanced by providing additional content based upon an understanding of the user's intent. Query tagging, the assigning of semantic labels to terms within a query, is one technique that may be utilized to determine the context of a user's search query. Accordingly, as provided herein, a query tagging model may be updated using one or more stratified lexicons. A list data structure (e.g., lists of phrases obtained from web pages) and seed distribution data (e.g., pre-labeled probability data) may be used by a graph learning technique to obtain an expanded set of phrases and their respective probabilities of corresponding with particular lexicons (e.g., semantic class lexicons). The expanded set of phrases may be used to group phrases into stratified lexicons. The stratified lexicons may be used as features for updating and/or executing the query tagging model.
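The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the graph learning technique, so the sketch assumes a simple clamped label propagation over a phrase/list bipartite graph, and it assumes that stratification means binning phrases by probability thresholds. The names `propagate`, `stratify`, `bounds`, and all example phrases, lists, and probabilities are hypothetical.

```python
from collections import defaultdict

# Hypothetical example data: two semantic classes, two pre-labeled seed
# phrases, and three phrase lists standing in for lists found on web pages.
CLASSES = ["movie", "book"]
SEEDS = {
    "harry potter": {"movie": 0.6, "book": 0.4},
    "inception": {"movie": 1.0, "book": 0.0},
}
LISTS = [
    ["harry potter", "twilight"],
    ["inception", "avatar"],
    ["harry potter", "the hobbit", "twilight"],
]


def propagate(lists, seeds, classes, iterations=20):
    """Spread seed class distributions to other phrases that share lists.

    Assumption: clamped label propagation; every list must be non-empty.
    """
    uniform = {c: 1.0 / len(classes) for c in classes}
    phrases = {p for members in lists for p in members}
    # Seeds keep their given distribution; other phrases start uniform.
    dist = {p: dict(seeds.get(p, uniform)) for p in phrases}
    # Map each phrase to the indices of the lists that contain it.
    containing = defaultdict(list)
    for i, members in enumerate(lists):
        for p in members:
            containing[p].append(i)
    for _ in range(iterations):
        # A list's distribution is the mean over its member phrases.
        list_dist = [
            {c: sum(dist[p][c] for p in members) / len(members) for c in classes}
            for members in lists
        ]
        # A non-seed phrase takes the mean over the lists containing it.
        for p in phrases:
            if p in seeds:
                continue  # seed probabilities stay clamped
            idx = containing[p]
            dist[p] = {
                c: sum(list_dist[i][c] for i in idx) / len(idx) for c in classes
            }
    return dist


def stratify(dist, cls, bounds=(0.75, 0.5)):
    """Group phrases into tiers for one class; tier 0 is the most confident.

    Assumption: tiers are defined by descending probability thresholds.
    """
    strata = defaultdict(list)
    for phrase, probs in dist.items():
        tier = sum(probs[cls] < b for b in bounds)
        strata[tier].append(phrase)
    return {tier: sorted(members) for tier, members in strata.items()}


if __name__ == "__main__":
    expanded = propagate(LISTS, SEEDS, CLASSES)
    print(stratify(expanded, "movie"))
```

In this sketch, a phrase that only co-occurs with a seed drifts toward that seed's distribution, and the resulting tiers could then serve as separate lexicon features for a query tagger.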
16 Citations
21 Claims
1. A method performed by a computing device, the method comprising:
- creating seed distribution data comprising a seed phrase and associated seed probabilities that the seed phrase corresponds to a plurality of different lexicons having associated meanings, the seed probabilities including at least:
  - a first seed probability assigned to the seed phrase, the first seed probability indicative of a first likelihood that the seed phrase corresponds to a first lexicon associated with a first meaning of the seed phrase, and
  - a second seed probability assigned to the seed phrase, the second seed probability indicative of a second likelihood that the seed phrase corresponds to a second lexicon associated with a second meaning of the seed phrase, the first seed probability different than the second seed probability, the first lexicon different than the second lexicon; and
- identifying, from a plurality of web documents, a set of lists of phrases that include the seed phrase as well as other phrases; and
- based at least on the seed probabilities, determining other probabilities that the other phrases correspond to the plurality of different lexicons having the associated meanings.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computing device comprising:
- at least one processing unit; and
- at least one memory storing instructions which, when executed by the at least one processing unit, cause the at least one processing unit to:
  - identify a first seed phrase and associated first seed probabilities that the first seed phrase corresponds to a plurality of different semantic meanings;
  - identify a second seed phrase and associated second seed probabilities that the second seed phrase corresponds to the plurality of different semantic meanings;
  - identify, from a plurality of web documents, first lists of phrases that include the first seed phrase as well as first other phrases;
  - identify, from the plurality of web documents, second lists of phrases that include the second seed phrase as well as second other phrases;
  - based at least on the first seed probabilities, determine first other probabilities that the first other phrases correspond to the plurality of different semantic meanings; and
  - based at least on the second seed probabilities, determine second other probabilities that the second other phrases correspond to the plurality of different semantic meanings.
Dependent claims: 9, 10, 11, 12, 13
14. A computing device comprising:
- at least one processing unit; and
- at least one memory storing instructions which, when executed by the at least one processing unit, cause the at least one processing unit to:
  - identify seed probabilities that a seed phrase belongs to different semantic classes, the seed probabilities including at least:
    - a first seed probability that the seed phrase belongs to a first semantic class, and
    - a second seed probability that the seed phrase belongs to a second semantic class that is different than the first semantic class;
  - identify, from a plurality of web documents, phrase lists that include the seed phrase as well as other phrases; and
  - based at least on the seed probabilities, determine other probabilities that the other phrases included in the phrase lists belong to the different semantic classes, including at least first other probabilities that the other phrases belong to the first semantic class and second other probabilities that the other phrases belong to the second semantic class.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
Specification