System and method for identifying an owner of a web page on the World-Wide Web
First Claim
1. A method comprising:
maintaining a database associated with one or more web pages;
retrieving, using at least one processor, a web page;
parsing the web page to identify a first portion of the web page and a second portion of the web page;
analyzing the first portion of the web page to identify a first potential name within the first portion of the web page;
analyzing the second portion of the web page to identify a second potential name within the second portion of the web page;
comparing the first potential name and the second potential name to identify, without user intervention, a name of an owner associated with the web page; and
associating, within the database, the name of the owner with the web page.
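The steps of this claim can be illustrated with a minimal sketch. It is not the patented implementation: the choice of the `<title>` text and the remaining page text as the two portions, the capitalized-word heuristic in `NAME_PATTERN`, the helper names, and the in-memory `database` dictionary are all assumptions made for illustration. Retrieval is assumed to have already produced the HTML string.

```python
import re
from html.parser import HTMLParser


class PortionParser(HTMLParser):
    """Splits a page into two portions: the <title> text and all other text."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []
        self.body_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        (self.title_parts if self.in_title else self.body_parts).append(data)


# Hypothetical heuristic: a "potential name" is a run of two or more
# capitalized words. A real system would need something more robust.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def potential_names(text):
    """Return the set of potential names found in one portion."""
    return set(NAME_PATTERN.findall(text))


def identify_owner(html):
    """Compare the potential names of both portions, with no user input."""
    parser = PortionParser()
    parser.feed(html)
    first = potential_names(" ".join(parser.title_parts))
    second = potential_names(" ".join(parser.body_parts))
    common = first & second
    # Accept a name only when exactly one candidate appears in both portions.
    return next(iter(common)) if len(common) == 1 else None


database = {}  # stand-in for the claimed database: URL -> owner name

page = (
    "<html><head><title>Acme Widgets - Home</title></head>"
    "<body><p>Copyright 2001 Acme Widgets. All rights reserved.</p></body></html>"
)
owner = identify_owner(page)
if owner is not None:
    database["http://www.example.com/"] = owner
```

In this sketch the comparison step is a set intersection; requiring a single shared candidate is one conservative way to avoid guessing when the portions disagree.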
Abstract
One or more embodiments of the disclosure include systems and methods for obtaining information from electronic documents (e.g., web pages). Example embodiments include retrieving an electronic document, parsing the electronic document to identify multiple portions of the electronic document, and comparing the portions to identify information about the electronic document, such as the owner of the electronic document. Further, the identified information can be associated with the electronic document within a database.
22 Claims
1. A method comprising:
maintaining a database associated with one or more web pages;
retrieving, using at least one processor, a web page;
parsing the web page to identify a first portion of the web page and a second portion of the web page;
analyzing the first portion of the web page to identify a first potential name within the first portion of the web page;
analyzing the second portion of the web page to identify a second potential name within the second portion of the web page;
comparing the first potential name and the second potential name to identify, without user intervention, a name of an owner associated with the web page; and
associating, within the database, the name of the owner with the web page.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 21
10. A system comprising:
at least one processor; and
a computer-readable storage medium storing instructions thereon that, when executed by the at least one processor, cause the at least one processor to:
maintain a database associated with one or more web pages;
parse a web page to identify a first portion of the web page and a second portion of the web page;
analyze the first portion of the web page to identify a first potential name within the first portion of the web page;
analyze the second portion of the web page to identify a second potential name within the second portion of the web page;
utilize the first potential name and the second potential name to identify, without user intervention, a name of an owner associated with the web page; and
associate, within the database, the name of the owner with the web page.
Dependent claims: 11, 12, 13, 14, 22
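The "maintain a database" and "associate within the database" limitations could be realized in many ways. The following is a minimal sketch using an in-memory SQLite database; the `pages` table, its columns, and the helper functions are hypothetical and are not taken from the patent.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, owner_name TEXT)")


def maintain_page(conn, url):
    """Record a web page in the database if it is not already present."""
    conn.execute("INSERT OR IGNORE INTO pages (url) VALUES (?)", (url,))


def associate_owner(conn, url, owner_name):
    """Associate an identified owner name with the web page's record."""
    maintain_page(conn, url)
    conn.execute(
        "UPDATE pages SET owner_name = ? WHERE url = ?", (owner_name, url)
    )


def owner_of(conn, url):
    """Return the stored owner name, or None if unknown."""
    row = conn.execute(
        "SELECT owner_name FROM pages WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None


maintain_page(conn, "http://www.example.com/")
associate_owner(conn, "http://www.example.com/", "Acme Widgets")
```

Here a page can exist in the database before any owner is known, which mirrors the claim's separation between maintaining the database and later associating a name.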
15. A method comprising:
maintaining a URL index comprising a plurality of URLs, wherein one or more URLs in the plurality of URLs are associated with an owner name;
retrieving, using at least one processor, a web page associated with a URL from the URL index;
parsing the web page to identify a first portion of the web page and a second portion of the web page;
analyzing the first portion to identify a first potential name within the first portion of the web page;
analyzing the second portion to identify a second potential name within the second portion of the web page;
determining, without user intervention, if the first potential name matches the second potential name;
if the first potential name matches the second potential name, determining that the first potential name represents an owner name associated with the web page; and
if the first potential name matches the second potential name, associating the owner name with the URL within the URL index.
Dependent claims: 16, 17, 18, 19, 20
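The conditional steps of this claim (associate the owner name with the URL only if the two potential names match) can be sketched as follows. The dictionary-based `url_index`, the case- and whitespace-insensitive definition of "matches" in `normalize`, and the function names are assumptions for illustration; the claim itself does not define how a match is determined.

```python
def normalize(name):
    """Hypothetical matching rule: ignore case and extra whitespace."""
    return " ".join(name.split()).casefold()


def names_match(first, second):
    """True only when both potential names exist and normalize identically."""
    return bool(first) and bool(second) and normalize(first) == normalize(second)


def update_url_index(url_index, url, first_name, second_name):
    """Associate the owner name with the URL only if the names match."""
    if url not in url_index:
        raise KeyError(url)
    if names_match(first_name, second_name):
        # The first potential name is taken to represent the owner name.
        url_index[url] = first_name
        return True
    return False


# URL index: URL -> owner name (None while no owner has been identified).
url_index = {"http://www.example.com/": None, "http://www.example.org/": None}

update_url_index(url_index, "http://www.example.com/", "Acme Widgets", "ACME  Widgets")
update_url_index(url_index, "http://www.example.org/", "Acme Widgets", "Globex Corporation")
```

With these inputs, only the first URL receives an owner name; the second is left unchanged because its two potential names do not match.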
Specification