Identifying common co-occurring elements in lists
Abstract
One embodiment of the present invention provides a system for detecting correlations between terms. During operation, the system identifies one or more lists contained in one or more documents and identifies two terms co-occurring in the lists. The system further determines a correlation between the co-occurring terms, and places the co-occurring terms in a correlated-pair list based on the correlation.
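The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes the lists have already been extracted from the documents as lists of strings, and it uses the Jaccard coefficient as the correlation measure, which the abstract does not specify. The names `correlated_pairs` and `min_correlation` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def correlated_pairs(lists, min_correlation=0.5):
    """Return (term_a, term_b, correlation) for terms that co-occur in lists.

    Assumption: correlation is the Jaccard coefficient over the lists that
    contain each term. The source does not name a specific measure.
    """
    term_counts = Counter()   # number of lists containing each term
    pair_counts = Counter()   # number of lists containing both terms of a pair

    for items in lists:
        terms = sorted(set(items))                 # dedupe within one list
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))

    result = []
    for (a, b), both in pair_counts.items():
        either = term_counts[a] + term_counts[b] - both
        correlation = both / either
        if correlation >= min_correlation:
            result.append((a, b, correlation))

    # Strongest correlations first; ties broken alphabetically.
    result.sort(key=lambda r: (-r[2], r[0], r[1]))
    return result


example_lists = [
    ["red", "green", "blue"],
    ["red", "green"],
    ["red", "yellow"],
]
pairs = correlated_pairs(example_lists)
```

The threshold of 0.5 is arbitrary; a real system would presumably tune it, or swap in a different correlation measure entirely.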
15 Claims
1. A method comprising:
receiving a pair of terms that includes a term and a candidate synonym of the term;
determining that the terms of the pair have been identified as being potentially non-synonymous;
evaluating the pair of terms using one or more penalty criteria that are only applied to terms that are identified as being potentially non-synonymous, comprising determining a quantity of switches that have occurred between queries that include the term and queries that include the candidate synonym of the term;
selecting, by one or more computers, a first threshold if the pair of terms satisfies the penalty criteria, or a second threshold if the pair of terms does not satisfy the penalty criteria, wherein the first threshold is a same threshold that is applied to pairs of terms that are not identified as being potentially non-synonymous, and wherein the second threshold is higher than the first threshold; and
determining that the terms of the pair are synonyms when the selected threshold is exceeded.
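The threshold selection recited in claim 1 can be illustrated with a short sketch. Several details here are assumptions, since the claim leaves them open: the quantity compared against the threshold is modeled as a hypothetical confidence `score`, the penalty criterion is modeled as "at least `MIN_SWITCHES` query switches were observed", and all numeric values are placeholders.

```python
from dataclasses import dataclass

# Illustrative values only; the claim does not give concrete numbers.
FIRST_THRESHOLD = 0.5    # also applied to pairs that were never flagged
SECOND_THRESHOLD = 0.8   # higher bar for flagged pairs failing the criteria
MIN_SWITCHES = 10        # hypothetical cutoff for the switch-count criterion


@dataclass(frozen=True)
class CandidatePair:
    term: str
    candidate: str
    score: float          # hypothetical synonym-confidence score
    switch_count: int     # switches between queries with term and with candidate
    flagged: bool         # identified as potentially non-synonymous


def satisfies_penalty_criteria(pair: CandidatePair) -> bool:
    # Assumption: enough observed query switches counts as satisfying.
    return pair.switch_count >= MIN_SWITCHES


def select_threshold(pair: CandidatePair) -> float:
    # Penalty criteria are evaluated only for flagged pairs.
    if not pair.flagged:
        return FIRST_THRESHOLD
    if satisfies_penalty_criteria(pair):
        return FIRST_THRESHOLD
    return SECOND_THRESHOLD


def is_synonym(pair: CandidatePair) -> bool:
    # Synonyms only when the selected threshold is exceeded.
    return pair.score > select_threshold(pair)


many_switches = CandidatePair("car", "auto", 0.6, 25, flagged=True)
few_switches = CandidatePair("car", "cat", 0.6, 2, flagged=True)
unflagged = CandidatePair("car", "automobile", 0.6, 0, flagged=False)
```

With these placeholder values, the two flagged pairs have the same score but end up with different outcomes: the pair with many observed switches is held to the ordinary threshold, while the pair with few switches must clear the higher one.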
6. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
receiving a pair of terms that includes a term and a candidate synonym of the term;
determining that the terms of the pair have been identified as being potentially non-synonymous;
evaluating the pair of terms using one or more penalty criteria that are only applied to terms that are identified as being potentially non-synonymous, comprising determining a quantity of switches that have occurred between queries that include the term and queries that include the candidate synonym of the term;
selecting a first threshold if the pair of terms satisfies the penalty criteria, or a second threshold if the pair of terms does not satisfy the penalty criteria, wherein the first threshold is a same threshold that is applied to pairs of terms that are not identified as being potentially non-synonymous, and wherein the second threshold is higher than the first threshold; and
determining that the terms of the pair are synonyms when the selected threshold is exceeded.
11. A system comprising:
one or more computers; and
a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising:
receiving a pair of terms that includes a term and a candidate synonym of the term;
determining that the terms of the pair have been identified as being potentially non-synonymous;
evaluating the pair of terms using one or more penalty criteria that are only applied to terms that are identified as being potentially non-synonymous, comprising determining a quantity of switches that have occurred between queries that include the term and queries that include the candidate synonym of the term;
selecting a first threshold if the pair of terms satisfies the penalty criteria, or a second threshold if the pair of terms does not satisfy the penalty criteria, wherein the first threshold is a same threshold that is applied to pairs of terms that are not identified as being potentially non-synonymous, and wherein the second threshold is higher than the first threshold; and
determining that the terms of the pair are synonyms when the selected threshold is exceeded.
Specification