System and method for automatically organizing and classifying businesses on the World-Wide Web
Abstract
A method and search engine for classifying a source publishing a document on a portion of a network includes the steps of electronically receiving a document; determining, based on the document, a source which published the document; and assigning a code to the document based on whether data associated with the document published by the source matches data contained in a database. An intelligent geographic- and business topic-specific resource discovery system facilitates local commerce on the World-Wide Web and also reduces search time by accurately isolating information for end-users. Business pages on the Web are distinguished and classified by business category, using Standard Industrial Classification (SIC) codes, through an automatic iterative process.
23 Claims
1. A method for performing a search of network accessible content, the method comprising:
traversing a network of web documents to identify business names and geographic data associated with the web documents;
identifying a business name and geographic data associated with a particular web page;
determining web page identifying data for the particular web page;
extracting the identified business name and geographic data associated with the particular web page;
based on the extracted business name and geographic data, accessing a business directory to determine a business category code that is associated with the extracted business name and geographic data;
storing the web page identifying data and the business category code in association with one another and within an entry in an electronic data store, thereby enabling identification of the web page identifying data when performing subsequent searches based on user queries associated with the business category code;
receiving a query from a user, the query being related to a business category and associated with at least one business category code;
searching the stored business category codes based on the received query;
identifying, within the electronic data store, one or more entries that include a business category code that is associated with the business category of the query; and
returning a result to the user, the result including web page identifying data included in the identified one or more entries.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
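For illustration only, the sequence of steps recited in claim 1 (traverse, extract, look up a category code, store, then answer a category query) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the in-memory `PAGES` map stands in for the network, `DIRECTORY` stands in for the business directory, and the rule that a business name sits in a `<title>` tag and geographic data is a five-digit ZIP code is purely hypothetical. All function and variable names are invented for this example.

```python
import re
from collections import defaultdict, deque

# Hypothetical stand-in for the network: URL -> (page text, outgoing links).
PAGES = {
    "http://example.com/": ("<title>Directory</title>",
                            ["http://example.com/joes", "http://example.com/acme"]),
    "http://example.com/joes": ("<title>Joe's Pizza</title> 12 Main St, Springfield, IL 62701", []),
    "http://example.com/acme": ("<title>Acme Plumbing</title> 9 Elm St, Springfield, IL 62701",
                                ["http://example.com/"]),
}

# Hypothetical business directory: (business name, ZIP code) -> SIC code.
DIRECTORY = {
    ("joe's pizza", "62701"): "5812",    # SIC 5812: eating places
    ("acme plumbing", "62701"): "1711",  # SIC 1711: plumbing, heating, air-conditioning
}

# Hypothetical mapping from a user's category query to a category code.
CATEGORY_TO_CODE = {"restaurants": "5812", "plumbers": "1711"}

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.I)
ZIP_RE = re.compile(r"\b(\d{5})\b")


def extract(text):
    """Return (business name, ZIP code) if both appear in the page, else None."""
    title = TITLE_RE.search(text)
    zip_code = ZIP_RE.search(text)
    if not title or not zip_code:
        return None
    return title.group(1).strip().lower(), zip_code.group(1)


def build_index(start_url):
    """Breadth-first traversal that stores URL entries under their category code."""
    index = defaultdict(list)  # category code -> list of page URLs
    seen, queue = {start_url}, deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        found = extract(text)
        if found is not None:
            code = DIRECTORY.get(found)  # directory lookup by name and geography
            if code is not None:
                index[code].append(url)
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Map a category query to a code and return the stored page URLs."""
    code = CATEGORY_TO_CODE.get(query.strip().lower())
    return list(index.get(code, [])) if code else []
```

In this sketch, calling `build_index("http://example.com/")` once and then `search(index, "restaurants")` corresponds to the storing and querying steps. Keying the store on the category code is one possible design choice that makes the query a single dictionary lookup; the claim itself does not prescribe a storage layout.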
12. A computer program stored on a computer readable medium for performing a search of network accessible content, the computer program comprising instructions that, when executed by a computer, cause the computer to:
traverse a network of web documents to identify business names and geographic data associated with the web documents;
identify a business name and geographic data associated with a particular web page;
determine web page identifying data for the particular web page;
extract the identified business name and geographic data associated with the particular web page;
based on the extracted business name and geographic data, access a business directory to determine a business category code that is associated with the extracted business name and geographic data;
store the web page identifying data and business category code in association with one another and within an entry in an electronic data store, thereby enabling identification of the web page identifying data when performing subsequent searches based on user queries associated with the business category code;
receive a query from a user, the query being related to a business category and associated with at least one business category code;
search the stored business category codes based on the received query;
identify, within the electronic data store, one or more entries that include a business category code that is associated with the business category of the query; and
return a result to the user, the result including web page identifying data included in the identified one or more entries.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A computer system for performing a search of network accessible content, the system comprising:
means for traversing a network of web documents to identify business names and geographic data associated with the web documents;
means for identifying a business name and geographic data associated with a particular web page;
means for determining web page identifying data for the particular web page;
means for extracting the identified business name and geographic data associated with the particular web page;
means for based on the extracted business name and geographic data, accessing a business directory to determine a business category code that is associated with the extracted business name and geographic data;
means for storing the web page identifying data and the business category code in association with one another and within an entry in an electronic data store, thereby enabling identification of the web page identifying data when performing subsequent searches based on user queries associated with the business category code;
means for receiving a query from a user, the query being related to a business category and associated with at least one business category code;
means for searching the stored business category codes based on the received query;
means for identifying, within the electronic data store, one or more entries that include a business category code that is associated with the business category of the query; and
means for returning a result to the user, the result including web page identifying data included in the identified one or more entries.
Dependent claims: 21, 22, 23
Specification