Detecting spam related and biased contexts for programmable search engines
Abstract
A programmable search engine system is programmable by a variety of different entities, such as client devices and vertical content sites to customize search results for users. Context files store instructions for controlling the operations of the programmable search engine. The context files are processed by various context processors, which use the instructions therein to provide various pre-processing, post-processing, and search engine control operations. Spam related and biased contexts and search results are identified using offline and query time processing stages, and the context files from vertical content providers associated with such spam and biased context and results are excluded from processing on direct user queries.
21 Claims
1. A computer-implemented method, comprising:
- maintaining in a data processing system a collection of context files for use by a search engine system, each context file provided by a third-party content provider to the search engine system, each context file containing commands to control operation of the search engine system in processing search query inputs;
- receiving from a plurality of third-party content providers a plurality of new context files for addition to the collection, each of the providers providing at least one of the new context files, each new context file provided by the respective third-party content provider to the search engine system, each new context file containing commands to control operation of the search engine system in processing search query inputs;
- for a first new context file, determining that a number of spam web pages listed in the first new context file exceeds a threshold number and consequently not adding the first new context file to the collection;
- receiving in a search engine a search query input from a user, the search query input having been received through an interface provided to the user by the third-party content provider;
- identifying a first context file from the collection for processing the search query input;
- using commands in the first context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the first context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query; and
- providing the context processed search results in accordance with the commands in the first context file, wherein the identifying, processing, generating and providing are performed by one or more processors.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
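The offline admission step recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the `ContextFile` class, the `known_spam` set, and the function names are hypothetical, and the sketch assumes that a "spam web page" is simply a listed URL that appears in a precomputed set of known spam URLs.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ContextFile:
    """Hypothetical stand-in for a provider-supplied context file."""
    provider: str
    listed_urls: List[str]
    commands: List[str] = field(default_factory=list)


def count_spam_pages(context: ContextFile, known_spam: Set[str]) -> int:
    """Count listed URLs that appear in the (assumed) set of known spam URLs."""
    return sum(1 for url in context.listed_urls if url in known_spam)


def admit_context_files(collection: List[ContextFile],
                        new_files: List[ContextFile],
                        known_spam: Set[str],
                        threshold: int) -> List[ContextFile]:
    """Add each new file to the collection unless its spam count exceeds the threshold.

    Returns the list of rejected files; `collection` is extended in place.
    """
    rejected = []
    for ctx in new_files:
        if count_spam_pages(ctx, known_spam) > threshold:
            rejected.append(ctx)      # too many spam pages: not added
        else:
            collection.append(ctx)    # admitted to the collection
    return rejected
```

The comparison is strict (`>`), mirroring the claim language "exceeds a threshold number"; a file whose spam count equals the threshold is still admitted in this sketch.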
9. A computer-implemented method, comprising:
- receiving in a search engine a search query input from a user, the search query input having been received through an interface provided to the user by a third-party content provider;
- identifying a context file provided by the third-party content provider, the context file containing commands for processing the search query input;
- using the commands in the context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query;
- determining that a spam measure of the context processed search results exceeds a spam threshold; and
- excluding spam search results, links, annotations, or other context related content included in the context file from the context processed search results and providing the resulting context processed search results in accordance with the commands in the context file, wherein the identifying, processing, generating and excluding are performed by one or more processors.
Dependent claims: 10, 11.
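The query-time check in claim 9 can be sketched as follows. The claim does not define the spam measure, so this sketch assumes one possible choice: the fraction of context processed results flagged as spam by a caller-supplied predicate. The names `spam_measure`, `filter_context_results`, and `is_spam` are hypothetical.

```python
from typing import Callable, List


def spam_measure(results: List[str], is_spam: Callable[[str], bool]) -> float:
    """Assumed spam measure: fraction of results that the predicate flags as spam."""
    if not results:
        return 0.0
    return sum(1 for r in results if is_spam(r)) / len(results)


def filter_context_results(results: List[str],
                           is_spam: Callable[[str], bool],
                           spam_threshold: float) -> List[str]:
    """Exclude spam results only when the spam measure exceeds the threshold."""
    if spam_measure(results, is_spam) > spam_threshold:
        return [r for r in results if not is_spam(r)]
    return list(results)  # measure within threshold: results pass through unchanged
```

In this sketch the predicate is applied per result; a fuller treatment would also have to cover the links, annotations, and other context related content that the claim mentions.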
12. A computer-implemented method comprising:
- receiving in a search engine a search query input from a user through an interface provided to the user by a third-party content provider;
- identifying a context file provided by the third-party content provider, the context file containing commands for processing the search query input;
- using the commands in the context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query;
- generating native processed search results responsive to the search query input;
- calculating a distance measure between the context processed search results and the native processed search results;
- determining that the distance measure exceeds a distance threshold and consequently removing, from the context processed search results, context processed search results that are not found in the native search results to produce filtered context processed search results; and
- providing the filtered context processed search results in response to the search query input, wherein the identifying, processing, and providing are performed by one or more processors.
Dependent claims: 13.
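The bias check in claim 12 compares context processed results with native results. The claim does not say which distance measure is used; the sketch below assumes Jaccard distance over the two result sets purely for illustration, and the function names are hypothetical.

```python
from typing import List


def jaccard_distance(a: List[str], b: List[str]) -> float:
    """Assumed distance measure: 1 - |A ∩ B| / |A ∪ B| over the two result sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def filter_biased_results(context_results: List[str],
                          native_results: List[str],
                          distance_threshold: float) -> List[str]:
    """If the two result lists diverge too much, keep only context results
    that also appear in the native results; otherwise return them unchanged."""
    if jaccard_distance(context_results, native_results) > distance_threshold:
        native = set(native_results)
        return [r for r in context_results if r in native]
    return list(context_results)
```

A set-based distance ignores ranking; a rank-aware measure could be substituted in `jaccard_distance` without changing the filtering step.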
14. A system comprising:
- one or more computers;
- a computer-readable medium coupled to the one or more computers having instructions stored thereon executable by the one or more computers to cause the one or more computers to perform operations comprising:
- maintaining in a data processing system a collection of context files for use by a search engine system, each context file provided by a third-party content provider to the search engine system, each context file containing commands to control operation of the search engine system in processing search query inputs;
- receiving from a plurality of third-party content providers a plurality of new context files for addition to the collection, each of the providers providing at least one of the new context files, each new context file provided by the respective third-party content provider to the search engine system, each new context file containing commands to control operation of the search engine system in processing search query inputs;
- for a first new context file, determining that the number of spam web pages listed in the first new context file exceeds a threshold number and consequently not adding the first new context file to the collection;
- receiving in a search engine a search query input from a user, the search query input having been received through an interface provided to the user by the third-party content provider;
- identifying a first context file from the collection for processing the search query input;
- using commands in the first context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the first context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query; and
- providing the context processed search results in accordance with the commands in the first context file.
15. A system comprising:
- one or more computers;
- a computer-readable medium coupled to the one or more computers having instructions stored thereon executable by the one or more computers to cause the one or more computers to perform operations comprising:
- receiving in a search engine a search query input from a user, the search query input having been received through an interface provided to the user by a third-party content provider;
- identifying a context file provided by the third-party content provider, the context file containing commands for processing the search query input;
- using the commands in the context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query;
- determining that a spam measure of the context processed search results exceeds a spam threshold;
- including context content in the context processed search results provided to the user if the spam measure does not exceed the spam threshold; and
- excluding spam search results, links, annotations, or other context related content included in the context file from the context processed search results and providing the resulting context processed search results in accordance with the commands in the context file.
Dependent claims: 16.
17. A system comprising:
- one or more computers;
- a computer-readable medium coupled to the one or more computers having instructions stored thereon executable by the one or more computers to cause the one or more computers to perform operations comprising:
- receiving in a search engine a search query input from a user through an interface provided to the user by a third-party content provider;
- identifying a context file provided by the third-party content provider, the context file containing commands for processing the search query input;
- using the commands in the context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query;
- generating native processed search results responsive to the search query input;
- calculating a distance measure between the context processed search results and the native processed search results;
- determining that the distance measure exceeds a distance threshold and consequently removing, from the context processed search results, context processed search results that are not found in the native search results to produce filtered context processed search results; and
- providing the filtered context processed search results in response to the search query input.
Dependent claims: 18.
19. A computer-readable medium encoded with a computer program, the computer program comprising instructions that, when executed, operate to cause a computer to perform operations comprising:
- maintaining in a data processing system a collection of context files for use by a search engine system, each context file provided by a third-party content provider to the search engine system, each context file containing commands to control operation of the search engine system in processing search query inputs;
- receiving from a plurality of third-party content providers a plurality of new context files for addition to the collection, each of the providers providing at least one of the new context files, each new context file provided by the respective third-party content provider to the search engine system, each new context file containing commands to control operation of the search engine system in processing search query inputs;
- for a first new context file, determining that a number of spam web pages listed in the first new context file exceeds a threshold number and consequently not adding the first new context file to the collection;
- receiving in a search engine a search query input from a user, the search query input having been received through an interface provided to the user by the third-party content provider;
- identifying a first context file from the collection for processing the search query input;
- using commands in the first context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the first context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query; and
- providing the context processed search results in accordance with the commands in the first context file, wherein the identifying, processing, generating and providing are performed by one or more processors.
20. A computer-readable medium encoded with a computer program, the computer program comprising instructions that, when executed, operate to cause a computer to perform operations comprising:
- receiving in a search engine a search query input from a user, the search query input having been received through an interface provided to the user by a third-party content provider;
- identifying a context file provided by the third-party content provider, the context file containing commands for processing the search query input;
- using the commands in the context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query;
- determining that a spam measure of the context processed search results exceeds a spam threshold; and
- excluding spam search results, links, annotations, or other context related content included in the context file from the context processed search results and providing the resulting context processed search results in accordance with the commands in the context file, wherein the identifying, processing, generating and excluding are performed by one or more processors.
21. A computer-readable medium encoded with a computer program, the computer program comprising instructions that, when executed, operate to cause a computer to perform operations comprising:
- receiving in a search engine a search query input from a user through an interface provided to the user by a third-party content provider;
- identifying a context file provided by the third-party content provider, the context file containing commands for processing the search query input;
- using the commands in the context file to control an organization and a presentation of search results resulting from the processing of the search query input, including:
- processing the search query input using the commands in the context file to produce a context processed search query;
- generating context processed search results responsive to the context processed search query;
- generating native processed search results responsive to the search query input;
- calculating a distance measure between the context processed search results and the native processed search results;
- determining that the distance measure exceeds a distance threshold and consequently removing, from the context processed search results, context processed search results that are not found in the native search results to produce filtered context processed search results; and
- providing the filtered context processed search results in response to the search query input, wherein the identifying, processing, and providing are performed by one or more processors.
Specification