IDENTIFYING SYNONYMS OF ENTITIES USING WEB SEARCH
First Claim
1. A method of identifying synonyms of an entity name, the method comprising:
transmitting a search term to a web server, the search term being a candidate string selected from tokens of the entity name;
receiving a plurality of search results from the web server, the search results including at least one of a title, a uniform resource locator (URL), and a snippet of content of a web page;
identifying tokens of the entity name that are included in the plurality of search results;
generating a score for the plurality of search results based on the identified tokens;
comparing the generated score to a threshold value used to indicate whether the candidate string is a synonym of the entity name; and
storing the candidate string as a synonym when the generated score at least reaches the threshold value.
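The scoring and threshold steps of this claim can be illustrated with a minimal sketch. The claim does not fix a scoring formula, so everything here is an assumption made for illustration: the tokenizer, the function names `tokenize`, `score_results`, and `is_synonym`, the dictionary keys `title`, `url`, and `snippet`, the averaging formula, and the default threshold of 0.6.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (assumed tokenizer)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def score_results(entity_name, results):
    """Assumed score: the average, over all results, of the fraction of
    entity-name tokens found in a result's title, URL, or snippet."""
    entity_tokens = set(tokenize(entity_name))
    if not entity_tokens or not results:
        return 0.0
    total = 0.0
    for result in results:
        text = " ".join(result.get(key, "") for key in ("title", "url", "snippet"))
        found = entity_tokens & set(tokenize(text))
        total += len(found) / len(entity_tokens)
    return total / len(results)


def is_synonym(entity_name, results, threshold=0.6):
    """The candidate counts as a synonym when the score at least reaches the threshold."""
    return score_results(entity_name, results) >= threshold
```

Here `results` stands for the search results returned for the candidate string, not for the entity name itself. A candidate whose results keep mentioning the remaining tokens of the entity name scores high under this assumed formula; a candidate whose results mention unrelated pages scores low.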
Abstract
Identifying synonyms of entities using web search results is disclosed herein. In some aspects, a candidate string of tokens of an entity name is selected as a search term. The search term is transmitted by a server to a search engine, which, in turn, transmits search results back to the server after performing a search. The server analyzes the search results, generates a score based on the search results, and then determines a status (synonym or not a synonym) of the candidate string based on the score. In further aspects, additional candidate strings are designated as synonyms or not synonyms based on the status of the searched candidate string by using relationships of a lattice formed from all possible candidate strings of the entity name.
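As a rough illustration of what "all possible candidate strings of the entity name" could look like, the sketch below enumerates every non-empty combination of the entity name's tokens. It assumes whitespace tokenization and assumes that tokens keep their original order within a candidate; the abstract states neither, and the function name `candidate_strings` is hypothetical.

```python
from itertools import combinations


def candidate_strings(entity_name):
    """Enumerate candidate strings as order-preserving token combinations.

    Assumption: tokens are whitespace-separated and a candidate keeps the
    tokens in the order they appear in the entity name.
    """
    tokens = entity_name.split()
    n = len(tokens)
    candidates = []
    for size in range(1, n + 1):
        for indices in combinations(range(n), size):
            candidates.append(" ".join(tokens[i] for i in indices))
    # Drop duplicates (possible when a token repeats) while keeping order.
    return list(dict.fromkeys(candidates))
```

For an entity name with n distinct tokens this yields 2^n - 1 candidates, including the full name itself, which is why using lattice relationships to avoid searching every candidate is attractive.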
20 Claims
1. A method of identifying synonyms of an entity name, the method comprising:
transmitting a search term to a web server, the search term being a candidate string selected from tokens of the entity name;
receiving a plurality of search results from the web server, the search results including at least one of a title, a uniform resource locator (URL), and a snippet of content of a web page;
identifying tokens of the entity name that are included in the plurality of search results;
generating a score for the plurality of search results based on the identified tokens;
comparing the generated score to a threshold value used to indicate whether the candidate string is a synonym of the entity name; and
storing the candidate string as a synonym when the generated score at least reaches the threshold value.
Dependent claims: 2, 3, 4, 5, 6
7. One or more computer-readable media storing computer-executable instructions that, when executed on one or more processors, cause the one or more processors to perform acts comprising:
receiving search results for a candidate string from a search engine, the candidate string selected from a unique combination of tokens of an entity name;
generating a score for the candidate string based on instances of tokens present in the search results to determine a status of the candidate string of the entity name, the status being at least one of a synonym or not a synonym; and
storing the candidate string as a synonym when the score at least reaches a threshold value.
Dependent claims: 8, 9, 10, 11, 12, 13
14. A method, comprising:
creating a lattice of candidate strings selected from tokens of an entity name to establish a hierarchical relationship between the candidate strings;
sending a first candidate string of the lattice to a web search engine to perform a web search, the web search to return web search results;
generating a score based on instances of the tokens of the entity name included in the web search results;
designating the first candidate string as either a synonym of the entity name when the score at least reaches a threshold value or as not a synonym; and
updating the lattice with the status of the first candidate string.
Dependent claims: 15, 16, 17, 18, 19, 20
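The lattice steps of claim 14 might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation. Nodes are sets of token positions, and a node's children are the candidates obtained by dropping one token. The propagation rule is also an assumption, since the claim only requires recording the status of the searched candidate: a synonym marks every candidate that contains all of its tokens as a synonym, and a non-synonym marks every candidate made from a subset of its tokens as not a synonym. All names (`build_lattice`, `update_lattice`, the status constants) are hypothetical.

```python
from itertools import combinations

UNKNOWN, SYNONYM, NOT_SYNONYM = "unknown", "synonym", "not_synonym"


def build_lattice(tokens):
    """Create one node per non-empty set of token positions.

    Returns (status, children): status maps each node to its current
    status; children maps each node to the nodes with one token removed,
    which gives the hierarchical relationship between candidates.
    """
    n = len(tokens)
    nodes = [frozenset(c) for size in range(1, n + 1)
             for c in combinations(range(n), size)]
    status = {node: UNKNOWN for node in nodes}
    children = {node: [node - {i} for i in node if len(node) > 1]
                for node in nodes}
    return status, children


def update_lattice(status, node, score, threshold, propagate=True):
    """Record the status of a searched node, then optionally propagate it.

    Assumed propagation rule (not stated in the claim): a synonym implies
    its proper supersets are synonyms; a non-synonym implies its proper
    subsets are not synonyms. Only nodes still UNKNOWN are changed.
    """
    status[node] = SYNONYM if score >= threshold else NOT_SYNONYM
    if not propagate:
        return
    for other in status:
        if status[other] != UNKNOWN:
            continue
        if status[node] == SYNONYM and other > node:    # proper superset
            status[other] = SYNONYM
        elif status[node] == NOT_SYNONYM and other < node:  # proper subset
            status[other] = NOT_SYNONYM
```

With tokens `["Canon", "EOS", "350D"]`, marking the node for "EOS 350D" as a synonym would, under the assumed rule, also settle "Canon EOS 350D" without a further web search, while "EOS" and "350D" alone remain unknown.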
Specification