System and method for extracting content for submission to a search engine
Abstract
A system and a method for automatically submitting Web pages to a search engine, which is preferably used for submitting dynamic Web pages, but may optionally be used for any type of Web page. The present invention features a gateway server for providing these Web pages to the search engine, either directly or optionally through an autonomous software search program. Optionally and more preferably, the gateway server modifies the Web page before serving it to the autonomous software search program and/or search engine.
37 Claims
1. A method for providing a Web page to a search engine, comprising:
separating non-essential code from essential content of the Web page;
extracting said essential content from the Web page; and
providing said essential content of the Web page to the search engine.
Dependent claims: 2-18
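As an illustration only, the three steps of claim 1 can be sketched in Python with the standard-library HTML parser. Which tags count as "non-essential code" is an assumption made here (script, style and noscript blocks), and `submit` is a hypothetical callable standing in for whatever interface a search engine accepts; the claim does not specify either.

```python
from html.parser import HTMLParser


class EssentialContentExtractor(HTMLParser):
    """Collects visible text and skips tags treated here as non-essential code."""

    # Assumption: the claim does not say which code is non-essential.
    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped tag.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def essential_content(self):
        return " ".join(self._chunks)


def extract_essential_content(html):
    """Separate markup and code from text, and return the text."""
    parser = EssentialContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.essential_content()


def provide_to_search_engine(html, submit):
    """Extract the essential content and hand it to a hypothetical submit callable."""
    content = extract_essential_content(html)
    submit(content)
    return content
```

For example, passing `print` as `submit` simply writes the extracted text to standard output; a real deployment would replace it with a call to the search engine's own submission interface.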
19. A system for providing a Web page for indexing, comprising:
(a) a gateway Web server for modifying the Web page for enabling indexing to be performed; and
(b) a search engine for performing indexing.
Dependent claims: 20-23
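One way to picture element (a) of claim 19 is a small WSGI application that fetches a page, modifies it, and serves the result to whatever client does the indexing. This is a rough sketch, not the patented design: `fetch_page` and `modify_for_indexing` are hypothetical callables supplied by the caller, and the example modifier (dropping script blocks with a regular expression) is only an assumed kind of modification.

```python
import re


def make_gateway_app(fetch_page, modify_for_indexing):
    """Build a WSGI app that serves a modified copy of each requested page.

    fetch_page(path) -> str and modify_for_indexing(html) -> str are
    hypothetical hooks; the claim does not define either operation.
    """

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        html = modify_for_indexing(fetch_page(path))
        body = html.encode("utf-8")
        start_response(
            "200 OK",
            [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ],
        )
        return [body]

    return app


def strip_scripts(html):
    """Assumed example modification: remove <script> blocks from the page."""
    return re.sub(r"<script\b.*?</script>", "", html, flags=re.DOTALL | re.IGNORECASE)
```

The app returned by `make_gateway_app` can be served with any WSGI server, for instance `wsgiref.simple_server.make_server` from the standard library.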
24. A method for extracting unique content from a Web page, comprising:
determining a pattern of at least one repetitive element within at least the Web page; and
extracting the unique content from the Web page according to said pattern.
Dependent claims: 25-28
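A minimal sketch of the idea in claim 24, under a simplifying assumption: the "repetitive elements" are taken to be text lines that recur across several pages of the same site (navigation bars, footers), and the unique content is whatever remains. The claim itself also covers repetition within a single page, which this sketch does not model. The function names and the `min_fraction` parameter are illustrative.

```python
from collections import Counter


def find_repetitive_lines(pages, min_fraction=1.0):
    """Return the lines that occur in at least min_fraction of the pages."""
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    threshold = min_fraction * len(pages)
    return {line for line, seen in counts.items() if seen >= threshold}


def extract_unique_content(page, repetitive_lines):
    """Keep the lines of a page that are not part of the repetitive pattern."""
    unique = []
    for line in page.splitlines():
        stripped = line.strip()
        if stripped and stripped not in repetitive_lines:
            unique.append(stripped)
    return unique
```

Lowering `min_fraction` below 1.0 treats a line as repetitive even when a few pages lack it, which may help on sites where some templates omit the footer.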
29. A method for extracting an element from a Web page for serving the element to a search engine, the method comprising:
analyzing a structure of the Web page;
learning to extract the element from the Web page;
producing a set of instructions to extract the element;
extracting the element from a plurality of Web pages; and
creating a structured representation of the content of said plurality of Web pages for submission to the search engine.
Dependent claims: 30-33
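The steps of claim 29 can be illustrated with a very simple form of rule learning, chosen here purely for brevity and not taken from the patent: from one labeled sample page, record the literal text on either side of the element, treat that pair of delimiters as the "set of instructions", and apply it to other pages that share the same structure. All names below, and the `context` parameter, are hypothetical.

```python
import re


def learn_extraction_rule(sample_page, sample_value, context=12):
    """Learn left and right delimiters from one labeled example.

    Raises ValueError if sample_value does not occur in sample_page.
    """
    start = sample_page.index(sample_value)
    end = start + len(sample_value)
    return {
        "left": sample_page[max(0, start - context):start],
        "right": sample_page[end:end + context],
    }


def apply_rule(rule, page):
    """Extract the text between the learned delimiters, or None if absent."""
    pattern = re.escape(rule["left"]) + r"(.*?)" + re.escape(rule["right"])
    match = re.search(pattern, page, flags=re.DOTALL)
    return match.group(1) if match else None


def build_structured_representation(rule, pages, field_name):
    """Apply one rule to many pages and return one record per page.

    pages maps a URL to its HTML source.
    """
    return [
        {"url": url, field_name: apply_rule(rule, html)}
        for url, html in pages.items()
    ]
```

The resulting list of records can be serialized with `json.dumps` or any other format the target search engine accepts; pages that do not match the learned rule yield `None` for the field instead of failing.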
34. A method for feeding information about a plurality of Web pages to a search engine, comprising:
extracting at least one field from the plurality of Web pages;
automatically generating feed information for being fed to the search engine from said at least one field;
receiving information about a template common to the plurality of Web pages;
merging said template with said feed information; and
transmitting said feed information to the search engine.
Dependent claims: 35-37
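The feed pipeline of claim 34 might look roughly like the following. The field names (`url`, `title`, `body`), the XML-like template, and the `send` callable are all assumptions introduced for illustration; the claim does not name a feed format or a transport.

```python
from string import Template
from xml.sax.saxutils import escape


def generate_feed_info(records):
    """Build feed entries from already extracted fields, escaping XML characters."""
    return [
        {key: escape(str(value)) for key, value in record.items()}
        for record in records
    ]


def merge_with_template(template_text, feed_info):
    """Merge a template common to all pages with each feed entry."""
    template = Template(template_text)
    return [template.substitute(entry) for entry in feed_info]


def transmit(documents, send):
    """Pass each merged document to a hypothetical send callable."""
    for document in documents:
        send(document)
    return len(documents)


# Assumed template; a real feed format would be dictated by the search engine.
FEED_TEMPLATE = "<doc><url>$url</url><title>$title</title><body>$body</body></doc>"
```

`Template.substitute` raises `KeyError` when a record lacks a field named in the template, which surfaces incomplete extractions before anything is transmitted.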
Specification