Method and apparatus for categorizing and presenting documents of a distributed database
Abstract
Described herein are methods for creating categorized documents, categorizing documents in a distributed database and categorizing Resulting Pages. Also described herein is an apparatus for searching a distributed database. The method for creating categorized documents generally comprises: initially assuming all documents are of type 1; filtering out all type 2 documents and placing them in a first category; filtering out all type 3 documents and placing them in a second category; and defining all remaining documents as type 4 documents and placing all type 4 documents in a third category. The apparatus for searching a distributed database generally comprises: at least one memory device; a computing apparatus; an indexer; a transactional score generator; a category assignor; a search server; and a user interface in communication with the search server.
39 Claims
1. A search engine and database for a distributed database, comprising:
at least one memory device, comprising, at least one Internet cache; and
an Internet index;
a computing apparatus, comprising, a crawler in communication with the Internet cache and an Internet;
an indexer in communication with the Internet index and the at least one Internet cache;
a transactional score generator in communication with the Internet cache; and
a category assignor in communication with the Internet cache;
a search server in communication with the Internet cache, the Internet index; and
a user interface in communication with the search server.
Dependent claims: 2, 3
4. A method for searching a distributed database, comprising:
(a) entering search terms or phrases into a system;
(b) generating documents containing keywords that match the search terms or phrases;
(c) categorizing search results into categories according to categorization criteria to create categorized documents; and
(d) presenting the categorized documents.
Dependent claims: 5, 6, 7, 8
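Steps (a) through (d) of claim 4 can be illustrated with a minimal sketch. The function names `search`, `categorize`, and `present`, the substring matching rule, and the `category_of` callback are all hypothetical choices made for illustration; the claim does not specify how matching or the categorization criteria are implemented.

```python
from collections import defaultdict


def search(terms, corpus):
    """Steps (a)-(b): return documents containing at least one search term.

    Assumption: case-insensitive substring matching stands in for whatever
    keyword matching the real system uses.
    """
    lowered = [t.lower() for t in terms]
    return [doc for doc in corpus if any(t in doc.lower() for t in lowered)]


def categorize(results, category_of):
    """Step (c): group results by the label that category_of assigns."""
    grouped = defaultdict(list)
    for doc in results:
        grouped[category_of(doc)].append(doc)
    return dict(grouped)


def present(grouped):
    """Step (d): render the categories as plain text, one heading each."""
    lines = []
    for category in sorted(grouped):
        lines.append(f"[{category}]")
        lines.extend(f"  {doc}" for doc in grouped[category])
    return "\n".join(lines)
```

A caller would chain the three functions, for example `present(categorize(search(["running"], corpus), category_of))`, where `category_of` encodes whatever categorization criteria apply.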
9. A method for categorizing documents in a distributed database to create categorized documents, the method comprising:
initially assuming all documents are of type 1;
filtering out all type 2 documents and placing them in a first category;
filtering out all type 3 documents and placing them in a second category; and
defining all remaining documents as type 4 documents and placing all type 4 documents in a third category.
Dependent claims: 10
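The cascade in claim 9 can be sketched as a single pass in which each document is tested against two filters in order. The predicates `is_type2` and `is_type3` are hypothetical placeholders; the claim does not say what distinguishes the document types.

```python
def categorize_documents(documents, is_type2, is_type3):
    """Sort documents into three categories by sequential filtering.

    Every document starts out as type 1 (uncategorized). Type 2 documents
    are filtered out first, then type 3 documents are filtered out of what
    is left, and whatever remains is defined as type 4.
    """
    categories = {"first": [], "second": [], "third": []}
    for doc in documents:
        if is_type2(doc):
            categories["first"].append(doc)
        elif is_type3(doc):
            categories["second"].append(doc)
        else:
            # Not caught by either filter: defined as type 4.
            categories["third"].append(doc)
    return categories
```

Because the type 3 test only runs on documents that failed the type 2 test, a document matching both predicates lands in the first category, which mirrors the order of the filtering steps in the claim.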
11. A method for categorizing Resulting Pages into categories, comprising:
designating a first category as commercial pages and a second category as informational pages;
determining a quality score q(wi) for each Resulting Page;
determining a transactional rating τ(wi) for each Resulting Page;
deriving a propagation matrix P;
determining a commercial score κ for each Resulting Page;
filtering out all Resulting Pages that meet or exceed a commercial score threshold value;
wherein the Resulting Pages that meet or exceed the commercial page threshold value are placed in the first category and all remaining Resulting Pages are placed in the second category.
Dependent claims: 12-34
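The final thresholding step of claim 11 can be sketched as follows. The claim text here does not state how the commercial score κ is computed from q(wi), τ(wi), and P, so this sketch takes precomputed scores as input; the function name and the dictionary layout are assumptions made for illustration.

```python
def split_by_commercial_score(pages, kappa, threshold):
    """Place pages into commercial or informational categories.

    pages:     iterable of page identifiers
    kappa:     dict mapping each page identifier to its commercial score
               (assumed to be computed elsewhere)
    threshold: commercial score threshold value
    """
    commercial, informational = [], []
    for page in pages:
        # "Meet or exceed" the threshold, hence >= rather than >.
        if kappa[page] >= threshold:
            commercial.append(page)
        else:
            informational.append(page)
    return {"commercial": commercial, "informational": informational}
```

A page whose score exactly equals the threshold is treated as commercial, following the "meet or exceed" wording of the claim.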
35. A method for categorizing a plurality of Resulting Pages into categories, comprising:
determining whether each of the plurality of Resulting Pages is a spam page;
determining a quality score q(wi) for each of the plurality of Resulting Pages;
determining a transactional rating τ(wi) for each of the plurality of Resulting Pages;
deriving a propagation matrix P;
determining a commercial score κ for each of the plurality of Resulting Pages;
filtering out all spam-inclusive commercial pages from the plurality of Resulting Pages;
filtering out all spam pages from the spam-inclusive commercial pages;
placing all commercial pages in a commercial category; and
placing all remaining Resulting Pages into an information category.
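One possible reading of the two-stage filtering in claim 35 is sketched below. Several points are assumptions rather than anything the claim states: that commercial pages are selected by a score threshold (borrowed from claim 11), that "remaining" pages are those not pulled out in the first stage, and that spam found among the commercial pages is discarded. Scores and spam flags are taken as precomputed inputs.

```python
def categorize_with_spam_filter(pages, kappa, is_spam, threshold):
    """Two-stage categorization: commercial pages first, then spam removal.

    pages:     list of page identifiers
    kappa:     dict of page identifier -> commercial score (precomputed)
    is_spam:   dict of page identifier -> bool (precomputed)
    threshold: assumed cutoff for treating a page as commercial
    """
    # Stage 1: pull out every page that scores as commercial. This set may
    # still contain spam, i.e. the "spam-inclusive commercial pages".
    spam_inclusive = [p for p in pages if kappa[p] >= threshold]

    # Stage 2: drop spam pages from the spam-inclusive commercial set.
    commercial = [p for p in spam_inclusive if not is_spam[p]]

    # Everything not pulled out in stage 1 goes to the information category.
    pulled = set(spam_inclusive)
    information = [p for p in pages if p not in pulled]

    return {"commercial": commercial, "information": information}
```

Under this reading, spam pages that score below the threshold stay in the information category; a stricter variant could remove them as well.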
36. A method for categorizing documents in a distributed database, comprising:
assuming all documents in the distributed database are non-commercial in nature;
filtering out all documents that are commercial in nature from the documents, wherein the documents that are commercial in nature are commercial documents; and
creating sales leads from the commercial documents.
Dependent claims: 37, 38, 39
Specification