Method and Apparatus for Identifying Synonyms and Using Synonyms to Search
First Claim
1. A method for identifying synonyms, the method comprising:
obtaining, by a server, any two words to be identified;
determining that a shortest edit distance between the two words is less than or equal to an edit distance threshold;
determining whether both of the two words exist in a preset knowledge database;
if both of the two words exist in the preset knowledge database, then finding a smallest granularity type with highest weight value for each word in the knowledge database;
if the two words have the same smallest granularity type with highest weight value, then determining that the two words are synonyms; and
if the two words do not have the same smallest granularity type with highest weight value, then determining that the two words are non-synonyms.
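The steps of claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patentee's implementation: the knowledge-database layout (each word mapped to a list of `(type_name, depth, weight)` entries, where a larger `depth` means a finer granularity), the names `edit_distance`, `finest_top_type`, `are_synonyms`, and `SAMPLE_KB`, and the default threshold of 2 are all hypothetical. The edit distance is computed as the standard Levenshtein distance. Cases that claim 1 does not decide (distance above the threshold, or a word missing from the database) return `None`.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical knowledge-base layout: word -> list of (type_name, depth, weight).
# A larger depth is assumed to mean a finer (smaller) granularity type.
KnowledgeBase = Dict[str, List[Tuple[str, int, float]]]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, keeping only one previous row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def finest_top_type(word: str, kb: KnowledgeBase) -> str:
    """Among the word's finest-granularity types, return the highest-weight one."""
    entries = kb[word]
    max_depth = max(depth for _, depth, _ in entries)
    finest = [entry for entry in entries if entry[1] == max_depth]
    return max(finest, key=lambda entry: entry[2])[0]


def are_synonyms(w1: str, w2: str, kb: KnowledgeBase,
                 threshold: int = 2) -> Optional[bool]:
    """True/False per claim 1; None where claim 1 makes no determination."""
    if edit_distance(w1, w2) > threshold:
        return None  # edit-distance condition not met
    if w1 not in kb or w2 not in kb:
        return None  # at least one word is absent from the knowledge database
    return finest_top_type(w1, kb) == finest_top_type(w2, kb)


# Illustrative data only; the types and weights are invented for this sketch.
SAMPLE_KB: KnowledgeBase = {
    "color": [("attribute", 1, 0.9), ("visual attribute", 2, 0.8), ("brand", 2, 0.1)],
    "colour": [("attribute", 1, 0.7), ("visual attribute", 2, 0.6)],
    "collar": [("attribute", 1, 0.2), ("clothing part", 2, 0.9)],
}
```

With this sample data, `are_synonyms("color", "colour", SAMPLE_KB)` compares two words one edit apart that share the same finest type, while `are_synonyms("color", "collar", SAMPLE_KB)` compares two words that are close in spelling but differ in type, which is the case the knowledge-database check is meant to filter out.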
Abstract
A method and an apparatus for identifying synonyms and using such synonyms to conduct a search are disclosed. The disclosed method includes: obtaining any two words to be identified; determining whether a shortest edit distance between the two words is less than or equal to an edit distance threshold; determining whether the two words to be identified exist in a preset knowledge database and, if so, searching for a smallest granularity type with highest weight value for each word in the knowledge database; and, if the two words have the same smallest granularity type with highest weight value, determining that the two words are synonyms, and otherwise determining that they are non-synonyms. The disclosed techniques are intended to improve the accuracy of synonym identification.
40 Citations
13 Claims
1. A method for identifying synonyms, the method comprising:
obtaining, by a server, any two words to be identified;
determining that a shortest edit distance between the two words is less than or equal to an edit distance threshold;
determining whether both of the two words exist in a preset knowledge database;
if both of the two words exist in the preset knowledge database, then finding a smallest granularity type with highest weight value for each word in the knowledge database;
if the two words have the same smallest granularity type with highest weight value, then determining that the two words are synonyms; and
if the two words do not have the same smallest granularity type with highest weight value, then determining that the two words are non-synonyms.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus for identifying synonyms, the apparatus comprising:
a retrieval unit that obtains any two words to be identified;
a first determination unit that determines that a shortest edit distance between the two words is less than or equal to an edit distance threshold and informs a second determination unit;
the second determination unit, which determines that both of the two words exist in a preset knowledge database and informs a query unit;
the query unit, which finds a smallest granularity type with highest weight value for each word in the knowledge database; and
a third determination unit that determines that the two words are synonyms when the two words have the same smallest granularity type with highest weight value, and determines that the two words are non-synonyms when the two words do not have the same smallest granularity type with highest weight value.
Dependent claims: 9, 10, 11, 12, 13
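The chain of units in claim 8 can be pictured as methods on a single object, each one passing control to the next. The sketch below is only one possible reading: the class name `SynonymApparatus`, the helper `levenshtein`, the default threshold of 2, and the simplification that the knowledge database is a plain mapping from each word to its already-resolved smallest granularity type with highest weight value are all assumptions made for illustration. `None` is returned when an earlier unit never informs the next one, a case the claim does not address.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable, Dict, Optional


def levenshtein(a: str, b: str) -> int:
    """Memoised recursive Levenshtein distance (adequate for short words)."""
    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        if i == 0 or j == 0:
            return i + j  # only insertions or deletions remain
        return min(d(i - 1, j) + 1,
                   d(i, j - 1) + 1,
                   d(i - 1, j - 1) + (a[i - 1] != b[j - 1]))
    return d(len(a), len(b))


@dataclass
class SynonymApparatus:
    # Assumed stand-in for the preset knowledge database: word -> resolved type.
    type_of: Dict[str, str]
    distance: Callable[[str, str], int] = levenshtein
    threshold: int = 2  # hypothetical edit distance threshold

    def first_determination_unit(self, w1: str, w2: str) -> bool:
        return self.distance(w1, w2) <= self.threshold

    def second_determination_unit(self, w1: str, w2: str) -> bool:
        return w1 in self.type_of and w2 in self.type_of

    def query_unit(self, word: str) -> str:
        return self.type_of[word]

    def third_determination_unit(self, t1: str, t2: str) -> bool:
        return t1 == t2

    def identify(self, w1: str, w2: str) -> Optional[bool]:
        """Plays the retrieval unit, then runs the remaining units in order."""
        if not self.first_determination_unit(w1, w2):
            return None
        if not self.second_determination_unit(w1, w2):
            return None
        return self.third_determination_unit(self.query_unit(w1),
                                             self.query_unit(w2))
```

For example, `SynonymApparatus({"tshirt": "top", "t-shirt": "top", "skirt": "bottom", "shirt": "top"})` would treat "tshirt" and "t-shirt" as synonyms but reject "shirt" and "skirt", which are one edit apart yet have different types in this invented mapping.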
Specification