System and method for classifying an electronic document
Abstract
A method and search engine for classifying a source publishing a document on a portion of a network, includes steps of electronically receiving a document, based on the document, determining a source which published the document, and assigning a code to the document based on whether data associated with the document published by the source matches with data contained in a database. An intelligent geographic- and business topic-specific resource discovery system facilitates local commerce on the World-Wide Web and also reduces search time by accurately isolating information for end-users. Distinguishing and classifying business pages on the Web by business categories using Standard Industrial Classification (SIC) codes is achieved through an automatic iterative process.
26 Claims
1. A method comprising:
maintaining, within a database, information associated with a web page;
parsing, using at least one processor, the web page to identify an owner associated with the web page, wherein parsing the web page to identify the owner associated with the web page comprises identifying a name for the owner within text of the web page;
searching a third-party database for a listing associated with the identified owner;
when the third-party database includes a listing associated with the identified owner, utilizing the listing to determine a category associated with the web page, wherein the category comprises a standardized business classification category; and
assigning, within the database, the standardized business classification category to the web page.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
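The steps recited in claim 1 can be sketched in code. This is a minimal illustration under stated assumptions, not the patented implementation: the copyright-notice heuristic for finding the owner's name, the in-memory dictionaries standing in for the database and the third-party listing source, and the names `find_owner_name`, `classify_page`, and `THIRD_PARTY_LISTINGS` are all hypothetical. SIC code 5812 (eating places) serves as an example of a standardized business classification category.

```python
import re
from typing import Optional

# Hypothetical stand-in for a third-party business directory:
# lower-cased owner name -> SIC code.
THIRD_PARTY_LISTINGS = {
    "joe's diner": "5812",  # SIC 5812: eating places
}

# Assumed heuristic: read the owner's name from a copyright notice in the
# page text, skipping an optional copyright symbol and four-digit year.
OWNER_PATTERN = re.compile(
    r"(?:copyright|©|\(c\))(?:\s|©|\(c\))*(?:\d{4}\s*)?([^.\n]+)",
    re.IGNORECASE,
)


def find_owner_name(page_text: str) -> Optional[str]:
    """Return the owner name found in the page text, or None."""
    match = OWNER_PATTERN.search(page_text)
    return match.group(1).strip() if match else None


def classify_page(url: str, page_text: str, page_db: dict) -> Optional[str]:
    """Assign a SIC code to the page record when the owner has a listing."""
    # Maintain information associated with the web page.
    record = page_db.setdefault(url, {"text": page_text})
    # Parse the page to identify the owner by name.
    owner = find_owner_name(page_text)
    if owner is None:
        return None
    # Search the third-party listings for the identified owner.
    sic_code = THIRD_PARTY_LISTINGS.get(owner.lower())
    if sic_code is not None:
        # Assign the standardized category to the page within the database.
        record["category"] = sic_code
    return sic_code
```

Calling `classify_page` with a page whose text contains "Copyright 2001 Joe's Diner." stores the code `"5812"` on that page's record; a page with no recognizable owner name is left unclassified.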
13. A non-transitory computer-readable storage medium including a set of instructions that, when executed, cause at least one processor to perform steps comprising:
maintaining, within a database, information associated with a web page;
parsing the web page to identify an owner associated with the web page, wherein parsing the web page to identify the owner associated with the web page comprises identifying a name for the owner within text of the web page;
searching a third-party business database for a listing associated with the identified owner;
when the third-party database includes a listing associated with the identified owner, determining that the web page is a business page, wherein the category comprises a standardized business classification category; and
assigning, within the database, the standardized business classification category to the web page.
Dependent claims: 14, 15, 16, 17, 18, 19
20. A system, comprising:
at least one processor; and
a computer-readable storage medium storing instructions thereon that, when executed by the at least one processor, cause the at least one processor to:
    maintain, within a database, information associated with a web page;
    parse the web page to identify an owner associated with the web page, wherein parsing the web page to identify the owner associated with the web page comprises identifying a name for the owner within text of the web page;
    search a third-party database for a listing associated with the identified owner;
    when the third-party database includes a listing associated with the identified owner, determine that the web page is associated with a first classification, wherein the first classification comprises a standardized business classification category;
    when the third-party database does not include a listing associated with the identified owner, determine that the web page is associated with a second classification; and
    assign, within the database, the first classification or the second classification to the web page.
Dependent claims: 21, 22, 23, 24, 25, 26
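Claim 20 differs from claim 1 in that a page always receives a classification: a first one when a listing exists and a second one when it does not. A minimal sketch of that branch follows, assuming the owner's name has already been identified. The `Classification` type, the `"first"` and `"second"` labels, the function names, and the dictionary-based stores are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class Classification:
    kind: str                # "first" (listing found) or "second" (no listing)
    sic_code: Optional[str]  # populated only for the first classification


def classify_owner(owner: str, listings: Mapping[str, str]) -> Classification:
    """Pick the first or second classification from a listing lookup."""
    sic_code = listings.get(owner.strip().lower())
    if sic_code is not None:
        # Listing found: the first classification carries the SIC code.
        return Classification("first", sic_code)
    # No listing found: fall back to the second classification.
    return Classification("second", None)


def assign_classification(
    page_db: Dict[str, dict],
    url: str,
    owner: str,
    listings: Mapping[str, str],
) -> Classification:
    """Store whichever classification applies on the page's record."""
    result = classify_owner(owner, listings)
    page_db.setdefault(url, {})["classification"] = result
    return result
```

With a hypothetical listing table `{"acme hotel": "7011"}` (SIC 7011: hotels and motels), the owner "Acme Hotel" yields the first classification with code 7011, while an owner absent from the table yields the second classification with no code.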
Specification