System and method for identifying the owner of a document on the world-wide web

US 9,075,881 B2
Filed: 09/12/2012
Issued: 07/07/2015
Est. Priority Date: 05/10/1996
Status: Expired due to Fees

Abstract

A method and search engine for classifying a source publishing a document on a portion of a network, includes steps of electronically receiving a document, based on the document, determining a source which published the document, and assigning a code to the document based on whether data associated with the document published by the source matches with data contained in a database. An intelligent geographic- and business topic-specific resource discovery system facilitates local commerce on the World-Wide Web and also reduces search time by accurately isolating information for end-users. Distinguishing and classifying business pages on the Web by business categories using Standard Industrial Classification (SIC) codes is achieved through an automatic iterative process.

29 Claims

1. A method comprising:
- maintaining a database of root URLs;
  
  utilizing, using at least one processor, a root URL from the database of root URLs to retrieve a web page;
  
  parsing the text of the web page to identify a company name within the text of the web page;
  
  in response to identifying the company name, querying, using the identified company name, a third-party database to obtain information associated with the web page;
  
  utilizing the obtained information to identify a geographic location associated with the web page; and
  
  updating, within the database of root URLs, information associated with the web page to include the geographic location.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
  - 2. The method of claim 1, wherein parsing the text of the web page to identify the company name comprises analyzing a first portion of the web page to identify the company name.
  - 3. The method of claim 2, wherein the first portion of the web page comprises a title portion of the web page.
  - 4. The method of claim 3, wherein parsing the text of the web page to identify the company name comprises analyzing a second portion of the web page to identify the company name.
  - 5. The method of claim 4, wherein the second portion of the web page includes a copyright notice.
  - 6. The method of claim 1, further comprising associating the geographic location with the web page in a database.
  - 7. The method of claim 6, further comprising receiving a search request comprising a geographic location component.
  - 8. The method of claim 7, further comprising querying the database using the geographic location component.
  - 9. The method of claim 8, further comprising sending a search result referencing the web page associated with the geographic location if the geographic location relates to the geographic location component of the search request.
  - 10. The method of claim 1, wherein the third-party database is a business listing database.
  - 11. The method of claim 1, wherein the obtained information associated with the web page comprises address information.
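
The steps recited in claims 1 and 6 to 9 can be sketched roughly as follows. This is only an illustrative reading, not the patented implementation: the in-memory dictionaries standing in for the root URL database and the business listing database, the injected `fetch` callable, the `city_state` field, and all function names are assumptions made for the example. The company name is taken from the page title only, as in claim 3.

```python
import re
from typing import Callable, Dict, List, Optional

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def company_from_title(html: str) -> Optional[str]:
    """Treat the <title> text as the company name (a simplification)."""
    match = TITLE_RE.search(html)
    if not match:
        return None
    name = " ".join(match.group(1).split())  # collapse runs of whitespace
    return name or None


def annotate_root_urls(
    root_urls: Dict[str, dict],
    fetch: Callable[[str], str],
    listings: Dict[str, dict],
) -> None:
    """Retrieve each root URL, identify a company name, query the
    (hypothetical) listings database, and store the location found."""
    for url, record in root_urls.items():
        name = company_from_title(fetch(url))
        if name is None:
            continue  # no company name identified, nothing to query
        listing = listings.get(name.lower())
        if listing is None:
            continue  # company not present in the third-party database
        record["company"] = name
        record["location"] = listing["city_state"]


def search_by_location(root_urls: Dict[str, dict], location: str) -> List[str]:
    """Return root URLs whose stored location matches the request."""
    wanted = location.lower()
    return sorted(
        url
        for url, record in root_urls.items()
        if record.get("location", "").lower() == wanted
    )


if __name__ == "__main__":
    # Made-up sample data; no network access is performed.
    pages = {
        "http://acme.example/": "<html><head><title>Acme Widgets</title></head></html>"
    }
    db = {url: {} for url in pages}
    listings = {"acme widgets": {"city_state": "Springfield, IL"}}
    annotate_root_urls(db, pages.__getitem__, listings)
    print(search_by_location(db, "Springfield, IL"))
```

Passing `fetch` in as a parameter keeps the sketch testable without a network; a real crawler would substitute an HTTP client here.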

12. A non-transitory computer-readable storage medium including a set of instructions that, when executed, cause at least one processor to perform steps comprising:
- maintaining a database of root URLs;
  
  utilizing a root URL from the database of root URLs to retrieve a web page;
  
  parsing the text of the web page to identify a company name within the text of the web page;
  
  in response to identifying the company name, utilizing the company name associated with the web page to obtain additional information about the web page;
  
  associating the additional information with the web page; and
  
  updating, within the database of root URLs, to include the information associated with the web page.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 20):
  - 13. The non-transitory computer-readable storage medium of claim 12, wherein parsing the text of the web page comprises analyzing a first portion of the web page to identify the company name.
  - 14. The non-transitory computer-readable storage medium of claim 13, wherein parsing the text of the web page further comprises analyzing a second portion of the web page to identify the company name.
  - 15. The non-transitory computer-readable storage medium of claim 14, further comprising comparing the analysis of the first portion of the web page with the analysis of the second portion of the web page.
  - 16. The non-transitory computer-readable storage medium of claim 12, wherein utilizing the company name associated with the web page to determine additional information about the web page comprises querying a database using the company name.
  - 17. The non-transitory computer-readable storage medium of claim 16, wherein the database comprises a third-party database.
  - 18. The non-transitory computer-readable storage medium of claim 16, wherein the database comprises a business listing database.
  - 19. The non-transitory computer-readable storage medium of claim 16, wherein utilizing the company name associated with the web page to obtain additional information about the web page further comprises extracting from the database address information associated with the web page.
  - 20. The non-transitory computer-readable storage medium of claim 16, wherein utilizing the company name associated with the web page to obtain additional information about the web page further comprises extracting from the database telephone number information associated with the web page.
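
Claims 13 to 15 describe analyzing two portions of a page and comparing the results. One possible way to do that, assuming the first portion is the title and the second is a copyright notice (as in claims 3 and 5), is sketched below. The regular expressions, the suffix list, and the rule "accept the copyright name only if it also appears in the title" are assumptions chosen for illustration; the claims do not prescribe a particular comparison.

```python
import re
from typing import Optional, Pattern

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
# Matches notices such as "Copyright © 1996 Acme Widgets, Inc." and
# captures the name up to the first comma or period.
COPYRIGHT_RE = re.compile(
    r"(?:copyright|©|&copy;|\(c\))(?:\s|©|&copy;|\(c\))*"
    r"(?:\d{4}(?:-\d{4})?)?[\s,]*([A-Za-z0-9&' -]+)",
    re.IGNORECASE,
)
# Hypothetical list of corporate suffixes ignored when comparing names.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "llc", "ltd"}


def first_group(pattern: Pattern[str], text: str) -> Optional[str]:
    match = pattern.search(text)
    return match.group(1).strip() if match else None


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing corporate suffixes."""
    words = re.findall(r"[a-z0-9&]+", name.lower())
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def reconcile_company_name(html: str) -> Optional[str]:
    """Compare the title candidate with the copyright candidate."""
    title = first_group(TITLE_RE, html)
    notice = first_group(COPYRIGHT_RE, html)
    if title and notice:
        t, n = normalize(title), normalize(notice)
        # Whole-word containment: the notice name must occur in the title.
        if n and f" {n} " in f" {t} ":
            return n
        return None  # the two portions disagree, so report no name
    candidate = title or notice
    return (normalize(candidate) or None) if candidate else None
```

Returning `None` on disagreement is a conservative choice for the sketch; a production system might instead rank the candidates or defer to a human reviewer.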

21. A system comprising:
- at least one processor; and
  
  a computer-readable storage medium storing instructions thereon that, when executed by the at least one processor, cause the at least one processor to:
  
  maintain a database of root URLs;
  
  utilize a root URL from the database of root URLs to retrieve a web page;
  
  parse the text of the web page to identify a company name within the text of the web page;
  
  in response to identifying the company name, utilize the company name to obtain additional information;
  
  use the additional information to identify geographic location information associated with the company name;
  
  update, within the database of root URLs, information associated with the web page to include the geographic location.
- Dependent claims (22, 23, 24, 25, 26, 27, 28, 29):
  - 22. The system of claim 21, wherein utilizing the company name to obtain additional information comprises querying, using the identified company name, a database to obtain the additional information.
  - 23. The system of claim 21, wherein the additional information comprises address information.
  - 24. The system of claim 23, wherein using the additional information to identify the geographic location comprises extracting a city and a state from the address information.
  - 25. The system of claim 23, wherein using the additional information to identify the geographic location comprises extracting a zip code from the address information.
  - 26. The system of claim 21, wherein the additional information comprises telephone number information.
  - 27. The system of claim 26, wherein using the additional information to identify the geographic location comprises extracting an area code from the telephone number.
  - 28. The system of claim 21, further comprising instructions that, when executed by the at least one processor, causes the at least one processor to associate the geographic location information with the root URL in a root URL index.
  - 29. The method of claim 21, further comprising instructions that, when executed by the at least one processor, causes the at least one processor to associate the company name with the root URL in a root URL index.
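
Claims 24, 25, and 27 derive a geographic location from a city and state, a zip code, or a telephone area code. A minimal sketch of such extraction follows, assuming US-style addresses ending in "City, ST 12345" and ten-digit North American phone numbers; both formats and the function names are assumptions for the example and are not specified by the claims.

```python
import re
from typing import Dict, Optional

# Assumed address shape: "..., City, ST 12345" or "..., City, ST 12345-6789".
ADDRESS_RE = re.compile(r",\s*([^,]+),\s*([A-Z]{2})\s+(\d{5})(?:-\d{4})?\s*$")


def location_from_address(address: str) -> Optional[Dict[str, str]]:
    """Extract city, state, and five-digit zip code from an address string."""
    match = ADDRESS_RE.search(address)
    if not match:
        return None
    return {
        "city": match.group(1).strip(),
        "state": match.group(2),
        "zip": match.group(3),
    }


def area_code_from_phone(phone: str) -> Optional[str]:
    """Extract the three-digit area code from a North American number."""
    digits = re.sub(r"\D", "", phone)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits[:3] if len(digits) == 10 else None


if __name__ == "__main__":
    # Made-up sample values.
    print(location_from_address("12 Main St, Springfield, IL 62701"))
    print(area_code_from_phone("(217) 555-0123"))
```

An area code or zip code identifies only a coarse region, so a system along these lines would likely treat the address-derived location as the more specific signal when both are available.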

Current Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Original Assignee: Meta Platforms, Inc. (f/k/a Facebook, Inc.)
Inventors: Virdy, Ajaipal Singh
Primary Examiner(s): Morrison, Jay
Assistant Examiner(s): Ellis, Matthew J
Application Number: US13/612,390
Publication Number: US 20130173624A1
Time in Patent Office: 1,028 Days
Field of Search: 707/740, 707/711, 707/706
US Class Current: 1/1
CPC Class Codes:
- G06F 16/2228: Indexing structures
- G06F 16/29: Geographical information da...
- G06F 16/31: Indexing; Data structures t...
- G06F 16/313: Selection or weighting of t...
- G06F 16/35: Clustering; Classification
- G06F 16/355: Class or cluster creation o...
- G06F 16/951: Indexing; Web crawling tech...
- G06Q 30/02: Marketing; Price estimation...
- Y10S 707/99933: Query processing, i.e. sear...
- Y10S 707/99942: Manipulating data structure...
- Y10S 707/99943: Generating database or data...
- Y10S 707/99944: Object-oriented database st...
- Y10S 707/99945: Object-oriented database st...
