System and method for determining semantically related terms
Abstract
The present disclosure is directed to systems and methods for determining semantically related terms. Generally, one or more seed terms are received from a user. A system searches a first index comprising a plurality of terms and one or more webpages associated with each term of the plurality of terms to determine a plurality of webpages associated with the seed terms. The system then searches a second index comprising a plurality of webpages and one or more terms associated with each webpage of the plurality of webpages to determine a plurality of potential terms associated with the plurality of webpages associated with the seed terms. At least one term of the plurality of potential terms is suggested to a user.
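The two-index lookup described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the index contents, the function name `suggest_terms`, and the ranking rule (count of matching webpages) are hypothetical and are not taken from the disclosure.

```python
from collections import Counter

# Hypothetical toy data. The first index maps each term to its webpages;
# the second index maps each webpage identifier to its terms.
TERM_TO_PAGES = {
    "camera": {"page1", "page2"},
    "lens": {"page1", "page3"},
    "tripod": {"page2"},
    "recipe": {"page4"},
}
PAGE_TO_TERMS = {
    "page1": {"camera", "lens", "zoom"},
    "page2": {"camera", "tripod", "flash", "zoom"},
    "page3": {"lens", "filter"},
    "page4": {"recipe", "oven"},
}


def suggest_terms(seed_terms, term_to_pages, page_to_terms):
    """Return candidate terms found on webpages associated with the seeds."""
    # Step 1: search the first index (term -> webpages).
    pages = set()
    for seed in seed_terms:
        pages |= term_to_pages.get(seed, set())

    # Step 2: search the second index (webpage -> terms).
    counts = Counter()
    for page in pages:
        for term in page_to_terms.get(page, set()):
            if term not in seed_terms:
                counts[term] += 1

    # Assumed ranking: more matching webpages first, then alphabetical.
    return sorted(counts, key=lambda t: (-counts[t], t))


suggestions = suggest_terms({"camera"}, TERM_TO_PAGES, PAGE_TO_TERMS)
print(suggestions)
```

With the toy data above, "zoom" ranks first because it appears on both webpages associated with the seed term "camera".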
23 Claims
1. A method for determining semantically related terms, comprising:
receiving one or more seed terms;
searching a first index to determine a plurality of webpages associated with the seed terms, the first index comprising a plurality of terms and for each term of the plurality of terms, an association between one or more webpages and the term;
searching a second index to determine a plurality of potential terms associated with the plurality of webpages associated with the seed terms, the second index comprising a plurality of identifiers for webpages and for each webpage of the plurality of identifiers for webpages, an association between one or more terms and the webpage;
sending at least one term of the plurality of potential terms to a user to suggest the at least one term of the plurality of potential terms to the user;
receiving an indication of relevance of at least one suggested term to the user;
modifying with a processor the terms which comprise the seed terms based at least in part on the received indication of relevance;
receiving an indication that a first term is relevant to the user; and
modifying with a processor the seed terms to comprise the first term as a positive seed term;
wherein receiving one or more seed terms comprises:
receiving a location of a webpage;
retrieving with a processor the content of the webpage from the location of the webpage;
stripping code from the content of the webpage with a processor;
pulling one or more terms from the content of the webpage; and
weighing each term of the one or more terms pulled from the content of the webpage with a processor based on a location of where the term was located on the webpage.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
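The seed-extraction steps of claim 1 (strip code, pull terms, weigh each term by where it appeared) might look something like the sketch below. It uses the standard-library `html.parser` module. The weight values, the class name `SeedTermExtractor`, and the sample page are assumptions made for illustration; the claim says only that the weight depends on the term's location. Retrieving the content from the webpage's location is omitted, so the HTML is passed in as a string.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical weights per location; not specified in the claim.
LOCATION_WEIGHTS = {"title": 3.0, "h1": 2.0}
DEFAULT_WEIGHT = 1.0
SKIPPED_TAGS = {"script", "style"}  # treated as code and stripped


class SeedTermExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []                    # currently open tags
        self.weights = defaultdict(float)  # term -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Strip code: ignore text inside script and style elements.
        if any(tag in SKIPPED_TAGS for tag in self.stack):
            return
        # Weight by location: use the heaviest enclosing element.
        weight = max(
            [LOCATION_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.stack],
            default=DEFAULT_WEIGHT,
        )
        # Pull terms: lowercase alphanumeric tokens.
        for term in re.findall(r"[a-z0-9]+", data.lower()):
            self.weights[term] += weight


def extract_seed_terms(html):
    """Return a dict mapping each pulled term to its location-based weight."""
    parser = SeedTermExtractor()
    parser.feed(html)
    parser.close()
    return dict(parser.weights)


SAMPLE_HTML = (
    "<html><head><title>Digital Camera</title>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Camera Lens</h1><p>A lens for your camera.</p></body></html>"
)
seed_weights = extract_seed_terms(SAMPLE_HTML)
print(sorted(seed_weights.items(), key=lambda kv: -kv[1]))
```

In this sample, "camera" accumulates the highest weight because it appears in the title, the heading, and the body, while the script content contributes no terms.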

10. A computer-readable storage medium comprising a set of instructions for determining semantically related terms, the set of instructions to direct a processor to perform acts of:
receiving one or more seed terms;
searching a first index to determine a plurality of webpages associated with the seed terms, the first index comprising a plurality of terms and for each term of the plurality of terms, an association between one or more webpages and the term;
searching a second index to determine a plurality of potential terms associated with the plurality of webpages associated with the seed terms, the second index comprising a plurality of identifiers for webpages and for each webpage of the plurality of identifiers for webpages, an association between one or more terms and the webpage;
sending at least one term of the plurality of potential terms to a user to suggest the at least one term of the plurality of potential terms to the user;
receiving an indication of relevance of at least one suggested term to the user;
modifying the terms which comprise the seed terms based at least in part on the received indication of relevance;
receiving an indication that a second term is not relevant to the user; and
modifying the seed terms to comprise the second term as a negative seed term;
wherein receiving one or more seed terms comprises:
receiving a location of a webpage;
retrieving the content of the webpage from the location of the webpage;
stripping code from the content of the webpage;
pulling one or more terms from the content of the webpage; and
weighing each term of the one or more terms pulled from the content of the webpage based on a location of where the term was located on the webpage.
Dependent claims: 11, 12
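The relevance-feedback steps in claims 1 and 10 (adding a term as a positive or a negative seed term) could be modeled along these lines. The `SeedSet` class and the `filter_suggestions` helper are hypothetical names; in particular, excluding negative seed terms from later suggestions is an assumption, since the claims do not say how a negative seed term is used.

```python
from dataclasses import dataclass, field


@dataclass
class SeedSet:
    """Seed terms split into positive and negative sets."""
    positive: set = field(default_factory=set)
    negative: set = field(default_factory=set)

    def apply_feedback(self, term, relevant):
        """Record a user's indication of relevance for a suggested term."""
        if relevant:
            # Relevant: the term becomes a positive seed term.
            self.negative.discard(term)
            self.positive.add(term)
        else:
            # Not relevant: the term becomes a negative seed term.
            self.positive.discard(term)
            self.negative.add(term)


def filter_suggestions(candidates, seeds):
    """Drop candidates the user has already judged (assumed behavior)."""
    return [
        t for t in candidates
        if t not in seeds.positive and t not in seeds.negative
    ]


seeds = SeedSet(positive={"camera"})
seeds.apply_feedback("lens", relevant=True)     # user marks "lens" relevant
seeds.apply_feedback("recipe", relevant=False)  # user marks "recipe" not relevant
remaining = filter_suggestions(["lens", "recipe", "tripod"], seeds)
print(seeds.positive, seeds.negative, remaining)
```

Keeping the two sets separate lets a later search expand on the positive seeds while treating the negative seeds differently, whatever the chosen treatment is.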

13. A system for determining semantically related terms comprising:
one or more database servers comprising at least one memory module storing a first database comprising a first index and a second database comprising a second index, wherein the first index comprises a plurality of terms and for each term of the plurality of terms, an association between one or more webpages and the term and wherein the second index comprises a plurality of identifiers for webpages and for each webpage of the plurality of identifiers for webpages, an association between one or more terms and the website; and
a keyword suggestion module running in conjunction with at least one processor of at least one server, the keyword suggestion module operative to access the first and second indexes stored at the one or more database servers, the keyword suggestion module configured to:
receive one or more seed terms;
search the first index to determine a plurality of webpages associated with the seed terms;
search the second index to determine a plurality of potential terms associated with the plurality of webpages;
send at least one of the plurality of potential terms to a user to suggest the at least one of the plurality of potential terms to the user;
receive an indication of relevance of at least one suggested term to a user;
modify the terms which comprise the seed terms based at least in part on the received indication of relevance;
receive an indication that a first term is relevant to the user; and
modify the seed terms to comprise the first term as a positive seed term;
wherein to receive one or more seed terms, the keyword suggestion module is further configured to:
receive a location of a webpage;
retrieve content of the webpage from the location of the webpage;
strip code from the content of the webpage;
pull one or more terms from the content of the webpage; and
weigh each term of the one or more terms pulled from the content of the webpage based on a location of where the term was located on the webpage.
Dependent claims: 14, 15

16. A method for determining semantically related terms, comprising:
receiving one or more seed terms;
determining with a processor one or more potential terms semantically related to seed terms based on one or more vectors comprising entries regarding a plurality of webpages at an advertisement campaign management system, terms associated with each webpage of the plurality of webpages, and a number of terms associated with each webpage of the plurality of webpages; and
sending with a processor at least a portion of the determined potential terms to a user to suggest the portion of the determined potential terms to the user;
wherein determining one or more potential terms semantically related to seed terms based on one or more vectors comprises:
creating a first set of vectors representing for each webpage at an advertisement campaign management system, whether a term at the advertisement campaign management system is associated with the webpage;
creating a second set of vectors representing for each webpage at the advertisement campaign management system, at least a weight of each term associated with the webpage; and
determining one or more potential terms semantically related to the seed terms based on the first and second sets of vectors, and the seed terms; and
wherein determining one or more potential terms semantically related to the seed terms based on the first and second sets of vectors, and the seed terms comprises:
calculating the equation:
T = Sum of (V1 * cosine(V2, S)),
wherein V1 * cosine(V2, S) is calculated for a number of webpages at the advertisement campaign management system;
V1 is the vector of the first set of vectors indicating for each term at the advertisement campaign management system, whether a term is associated with the webpage;
V2 is the vector of the second set of vectors including for each term at the advertisement campaign management system, an entry indicating a weight of a term associated with the webpage;
S is a seed term vector indicating for each term at the advertisement campaign management system, whether the term is a seed term; and
T is a vector indicating for each term at the advertisement campaign management system, how relevant the term is to the seed terms.
Dependent claims: 17, 18
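One possible reading of the equation in claim 16 is sketched below: for each webpage, the cosine similarity between its weight vector V2 and the seed vector S is a scalar that scales the webpage's binary vector V1, and the scaled vectors are summed over webpages to give T. This interpretation, the function names, and the three-term vocabulary are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance_vector(pages, seed_vector):
    """Compute T = Sum of (V1 * cosine(V2, S)) over (V1, V2) webpage pairs."""
    t = [0.0] * len(seed_vector)
    for v1, v2 in pages:
        similarity = cosine(v2, seed_vector)
        for i, present in enumerate(v1):
            t[i] += present * similarity
    return t


# Hypothetical vocabulary order: ["camera", "lens", "recipe"].
pages = [
    ([1, 1, 0], [1.0, 1.0, 0.0]),  # webpage A: camera and lens, equal weights
    ([0, 0, 1], [0.0, 0.0, 2.0]),  # webpage B: recipe only
]
seed = [1, 0, 0]  # "camera" is the only seed term
T = relevance_vector(pages, seed)
print(T)
```

Here "lens" receives the same score as "camera" because both appear on the one webpage whose weight vector overlaps the seed vector, while "recipe" scores zero.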

19. A computer-readable storage medium comprising a set of instructions for determining semantically related terms, the set of instructions to direct a processor to perform acts of:
receiving one or more seed terms;
determining one or more potential terms semantically related to seed terms based on one or more vectors comprising entries regarding a plurality of webpages at an advertisement campaign management system, terms associated with each webpage of the plurality of webpages, and a number of terms associated with each webpage of the plurality of webpages; and
sending at least a portion of the determined potential terms to a user to suggest the portion of the determined potential terms to the user;
wherein determining one or more potential terms semantically related to seed terms based on one or more vectors comprises:
creating a first set of vectors representing for each webpage at an advertisement campaign management system, whether a term at the advertisement campaign management system is associated with the webpage;
creating a second set of vectors representing for each webpage at the advertisement campaign management system, at least a weight of each term associated with the webpage; and
determining one or more potential terms semantically related to the seed terms based on the first and second sets of vectors, and the seed terms; and
wherein determining one or more potential terms semantically related to the seed terms based on the first and second sets of vectors, and the seed terms comprises:
calculating the equation:
T = Sum of (V1 * cosine(V2, S)),
wherein V1 * cosine(V2, S) is calculated for a number of webpages at the advertisement campaign management system;
V1 is the vector of the first set of vectors indicating for each term at the advertisement campaign management system, whether a term is associated with the webpage;
V2 is the vector of the second set of vectors including for each term at the advertisement campaign management system, an entry indicating a weight of a term associated with the webpage;
S is a seed term vector indicating for each term at the advertisement campaign management system, whether the term is a seed term; and
T is a vector indicating for each term at the advertisement campaign management system, how relevant the term is to the seed terms.
Dependent claims: 20, 21, 22, 23

Specification