Disambiguation of entities
First Claim
1. A computer-implemented method of disambiguating entities using a computing system having processor, memory, and data storage subsystems, the computer-implemented method comprising:
receiving a user input search query;
detecting that ambiguity exists in an entity within the search query;
determining multiple senses that exist within the detected ambiguous entity;
for each of the multiple senses, computing an amount of network traffic to a webpage that represents one of the multiple senses, wherein computing includes calculating a number of webpage views of the webpage and a dwell time for each of the webpage views of the webpage;
computing a total amount of network traffic to all webpages that represent at least one of the multiple senses;
for each of the multiple senses, calculating a probability based on the amount of network traffic to the webpage that represents the one of the multiple senses and the total amount of network traffic to all the webpages that represent at least one of the multiple senses;
identifying a most probable sense of the multiple senses of the detected ambiguous entity, wherein the most probable sense has a highest probability compared to remaining senses of the multiple senses of the detected ambiguous entity; and
returning search results for the most probable sense of the multiple senses of the detected ambiguous entity based on the probability calculated for each of the multiple senses from the amount of network traffic to the webpage that represents the one of the multiple senses and the total amount of network traffic to all the webpages that represent at least one of the multiple senses.
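The traffic and probability steps of the claim can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `PageTraffic` structure, the function names, and the sample numbers are hypothetical. The claim does not say how page views and dwell time are combined into an "amount of network traffic", so the sketch assumes the amount is the sum of the dwell times over all views. The uniform fallback for zero traffic is likewise an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PageTraffic:
    """Traffic log for the webpage representing one sense (hypothetical structure)."""
    sense: str
    dwell_times_sec: list[float]  # one entry per webpage view


def traffic_amount(page: PageTraffic) -> float:
    # Assumption: each view contributes its dwell time, so the amount of
    # traffic reflects both the number of views and how long each lasted.
    return sum(page.dwell_times_sec)


def sense_probabilities(pages: list[PageTraffic]) -> dict[str, float]:
    if not pages:
        raise ValueError("at least one sense is required")
    amounts = {p.sense: traffic_amount(p) for p in pages}
    total = sum(amounts.values())  # total traffic across all senses
    if total == 0:
        # Assumption: with no traffic evidence, treat all senses as equally likely.
        return {sense: 1.0 / len(amounts) for sense in amounts}
    return {sense: amount / total for sense, amount in amounts.items()}


def most_probable_sense(pages: list[PageTraffic]) -> str:
    probs = sense_probabilities(pages)
    return max(probs, key=probs.get)


# Made-up example data for the ambiguous entity "jaguar".
pages = [
    PageTraffic("animal", [30.0, 45.0, 25.0]),   # 3 views, 100 s in total
    PageTraffic("car", [60.0, 120.0, 120.0]),    # 3 views, 300 s in total
]
probs = sense_probabilities(pages)
best = most_probable_sense(pages)
```

With these sample numbers both senses have the same number of views, but the "car" page accounts for 300 of the 400 seconds of dwell time, so it would be selected as the most probable sense and search results would be returned for it.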
Abstract
Methods, systems, algorithms, and media are provided for disambiguating entities present in a received search query. Lists of categories from semi-structured data from external sites as well as internal sources are used to detect if ambiguity exists in an entity within the search query. Multiple senses or categories of the ambiguous entity are determined by ascertaining the primary intent of an entity extracted from a main term of a document. The probability of each sense is calculated by computing a total amount of traffic received for each of the senses of the ambiguous entity. The sense with the highest amount of computed traffic is the most probable determined sense.
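As a rough illustration of the detection step described in the abstract, the sketch below flags an entity as ambiguous when it appears in more than one category list. The abstract only states that lists of categories from semi-structured data are used. The membership rule, the function names, and the sample categories here are assumptions made for illustration.

```python
from __future__ import annotations


def senses_for(entity: str, category_lists: dict[str, set[str]]) -> list[str]:
    """Return the categories whose member list contains the entity (case-insensitive)."""
    key = entity.strip().lower()
    return sorted(
        category
        for category, members in category_lists.items()
        if key in {member.lower() for member in members}
    )


def is_ambiguous(entity: str, category_lists: dict[str, set[str]]) -> bool:
    # Assumption: an entity is ambiguous when it belongs to two or more categories.
    return len(senses_for(entity, category_lists)) > 1


# Made-up category lists standing in for semi-structured data.
category_lists = {
    "Animals": {"Jaguar", "Lion"},
    "Car manufacturers": {"Jaguar", "Ford"},
    "Fruits": {"Apple"},
}
jaguar_senses = senses_for("jaguar", category_lists)
```

Here "jaguar" falls under two categories and would be treated as ambiguous, while "lion" falls under one and would not.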
20 Claims
1. A computer-implemented method of disambiguating entities using a computing system having processor, memory, and data storage subsystems, the computer-implemented method comprising:
receiving a user input search query;
detecting that ambiguity exists in an entity within the search query;
determining multiple senses that exist within the detected ambiguous entity;
for each of the multiple senses, computing an amount of network traffic to a webpage that represents one of the multiple senses, wherein computing includes calculating a number of webpage views of the webpage and a dwell time for each of the webpage views of the webpage;
computing a total amount of network traffic to all webpages that represent at least one of the multiple senses;
for each of the multiple senses, calculating a probability based on the amount of network traffic to the webpage that represents the one of the multiple senses and the total amount of network traffic to all the webpages that represent at least one of the multiple senses;
identifying a most probable sense of the multiple senses of the detected ambiguous entity, wherein the most probable sense has a highest probability compared to remaining senses of the multiple senses of the detected ambiguous entity; and
returning search results for the most probable sense of the multiple senses of the detected ambiguous entity based on the probability calculated for each of the multiple senses from the amount of network traffic to the webpage that represents the one of the multiple senses and the total amount of network traffic to all the webpages that represent at least one of the multiple senses.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. One or more computer hardware storage media containing computer readable instructions for an algorithm embodied thereon that, when executed by a computing device, perform steps for disambiguating entities, the algorithm comprising:
detecting that ambiguity exists for an entity obtained from a search query;
determining senses that exist within the detected ambiguous entity;
for each of the senses, computing an amount of network traffic to a webpage that represents one of the senses, wherein computing includes calculating a number of webpage views of the webpage and a dwell time for each of the webpage views of the webpage;
computing a total amount of network traffic to all webpages that represent at least one of the senses;
for each of the senses, calculating a probability based on the amount of network traffic to the webpage that represents the one of the senses and the total amount of network traffic to all the webpages that represent at least one of the senses;
identifying a most probable sense of the senses, wherein the most probable sense has a highest probability compared to remaining senses of the detected ambiguous entity; and
returning search results for the most probable sense of the senses of the detected ambiguous entity based on the probability calculated for each of the senses from the amount of network traffic to the webpage that represents the one of the senses and the total amount of network traffic to all the webpages that represent at least one of the senses.
Dependent claims: 10, 11, 12, 13
14. A computerized system comprising:
one or more processors; and
a non-transitory computer storage media storing computer-useable instructions that, when used by the one or more processors, cause the one or more processors to:
receive a search query from a user input via an interconnected computing network of the computing system;
identify an ambiguous term in the search query by utilizing lists of categories from semi-structured data containing the ambiguous term;
infer categories of the ambiguous term via extraction on the semi-structured data;
for each of the categories inferred for the ambiguous term, compute an amount of network traffic, wherein computing includes calculating a number of webpage views representing each category and a dwell time for each of the webpage views;
compute a total amount of network traffic to all of the webpages representing the categories inferred;
determine a probability for each category of the ambiguous term based on the amount of network traffic computed for each of the categories and the total amount of network traffic computed for all categories inferred of the ambiguous entity;
identify a most probable category of the ambiguous term, wherein the most probable category has a highest probability compared to remaining categories of the ambiguous term; and
return search results representing the most probable category of the ambiguous term to a user via a graphical user interface of the computing system based on the probability calculated for each category of the ambiguous term from the amount of network traffic computed for each of the categories and the total amount of network traffic computed for all categories inferred of the ambiguous entity.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification