System and method for synchronized web scraping

US 9,836,775 B2
Filed: 05/24/2013
Issued: 12/05/2017
Est. Priority Date: 05/24/2013
Status: Active Grant

Abstract

A method includes obtaining information associated with a product, service, or event. The method also includes scraping data based on the obtained information substantially concurrently from two or more web pages associated with websites that list a same product, service, or event to produce scraped data for the same product, service, or event from each corresponding web page at substantially a same time.
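The "substantially concurrently" scraping described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the URLs, the `FAKE_PAGES` table, and the `scrape_price` and `scrape_concurrently` names are all hypothetical, and network access is simulated with a short sleep so the example runs offline.

```python
import asyncio
import time

# Hypothetical stand-in for real page content; a real scraper would fetch over HTTP.
FAKE_PAGES = {
    "https://shop-a.example/item/123": {"price": 19.99},
    "https://shop-b.example/item/123": {"price": 18.49},
}

async def scrape_price(url: str) -> dict:
    """Simulate scraping one page and stamp the result with the scrape time."""
    await asyncio.sleep(0.05)  # simulated network latency
    return {
        "url": url,
        "price": FAKE_PAGES[url]["price"],
        "scraped_at": time.monotonic(),
    }

async def scrape_concurrently(urls: list[str]) -> list[dict]:
    """Start every scrape together so the results reflect roughly the same moment."""
    return await asyncio.gather(*(scrape_price(u) for u in urls))

results = asyncio.run(scrape_concurrently(list(FAKE_PAGES)))

# Time between the earliest and latest completed scrape, in seconds.
spread = max(r["scraped_at"] for r in results) - min(r["scraped_at"] for r in results)
```

Because both coroutines are started together, `spread` stays well below the simulated latency of a single request; scraping the pages one after another would make it at least as large as that latency.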

20 Claims

1. A method comprising:
- obtaining, by at least one processing device, information associated with a product, service, or event from each of two or more web pages associated with websites that list the product, service, or event;
- determining, by the at least one processing device, that at least some of the information associated with the product, service, or event has changed at at least one of the two or more web pages;
- in response to the determining that the at least some information associated with the product, service, or event has changed at the at least one of the two or more web pages, performing synchronized scraping, by the at least one processing device, based on the obtained information, the synchronized scraping performed concurrently from the two or more web pages to obtain scraped data of the same type for the same product, service, or event from each corresponding web page at a same time;
- producing, by the at least one processing device, a comparison result based on a comparison of the scraped data for the same product, service, or event from each corresponding web page; and
- presenting the comparison result on a graphical user interface.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The method of claim 1, wherein obtaining information includes obtaining a uniform resource locator (URL) associated with the product, service, or event.
  - 3. The method of claim 1, further comprising categorizing the obtained information in accordance with one or more identifiers associated with the product, service, or event.
  - 4. The method of claim 3, further comprising comparing the obtained information based on the one or more identifiers to determine whether the websites list the same product, service, or event.
  - 5. The method of claim 3, wherein the one or more identifiers include a book title or an international standard book number (ISBN).
  - 6. The method of claim 3, wherein the one or more identifiers include a price and wherein the scraped data includes real-time price data.
  - 7. The method of claim 1, further comprising receiving search inquiries for the product, service, or event at the graphical user interface.
  - 8. The method of claim 1, wherein the scraped data is received via an internet connection coupled to the at least one processing device.
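The flow recited in claim 1 (detect a change on any page, then scrape every page and compare) can be sketched as follows. The claims do not say how a change is detected; hashing the obtained information is an assumption made here for illustration, and the `fingerprint`, `changed_pages`, and `compare` helpers and the example URLs are hypothetical.

```python
import hashlib
import json

def fingerprint(info: dict) -> str:
    """Stable hash of the information obtained from one page (assumed technique)."""
    return hashlib.sha256(json.dumps(info, sort_keys=True).encode()).hexdigest()

def changed_pages(previous: dict, current: dict) -> list:
    """Return URLs whose obtained information differs from the previous poll."""
    return [url for url, info in current.items()
            if previous.get(url) != fingerprint(info)]

def compare(scraped: dict) -> dict:
    """Build a comparison result from same-type data (here, price) for each page."""
    prices = scraped.values()
    return {
        "cheapest": min(scraped, key=scraped.get),
        "spread": round(max(prices) - min(prices), 2),
    }

# Hypothetical state: fingerprints from the last poll, and the latest information.
previous = {
    "https://shop-a.example/book": fingerprint({"price": 20.00}),
    "https://shop-b.example/book": fingerprint({"price": 21.50}),
}
current = {
    "https://shop-a.example/book": {"price": 18.00},
    "https://shop-b.example/book": {"price": 21.50},
}

result = None
triggered = changed_pages(previous, current)
if triggered:
    # A change on any one page triggers a scrape of every page, not only the changed one.
    scraped = {url: info["price"] for url, info in current.items()}
    result = compare(scraped)
```

In this sketch only the first page changed, yet both pages feed the comparison, which mirrors the claim language that the scraping is performed from the two or more web pages in response to a change at at least one of them.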

9. An apparatus comprising:
- at least one processing device configured to:
  - obtain information associated with a product, service, or event from each of two or more web pages associated with websites that list the product, service, or event;
  - determine that at least some of the information associated with the product, service, or event has changed at at least one of the two or more web pages;
  - in response to the determination that the at least some information associated with the product, service, or event has changed at the at least one of the two or more web pages, perform synchronized scraping of data based on the obtained information, the synchronized scraping performed concurrently from the two or more web pages to obtain scraped data of the same type for the same product, service, or event from each corresponding web page at a same time;
  - produce a comparison result based on a comparison of the scraped data for the same product, service, or event from each corresponding web page; and
  - present the comparison result on a graphical user interface.
- Dependent claims (10, 11, 12, 13, 14, 15, 16)
  - 10. The apparatus of claim 9, wherein obtaining information includes obtaining a uniform resource locator (URL) associated with the product, service, or event.
  - 11. The apparatus of claim 9, wherein the at least one processing device is further configured to categorize the obtained information in accordance with one or more identifiers associated with the product, service, or event.
  - 12. The apparatus of claim 11, wherein the at least one processing device is further configured to compare the obtained information based on the one or more identifiers to determine whether the websites list the same product, service, or event.
  - 13. The apparatus of claim 11, wherein the one or more identifiers include a book title or an international standard book number (ISBN).
  - 14. The apparatus of claim 11, wherein the one or more identifiers include price and wherein the scraped data includes real-time price data.
  - 15. The apparatus of claim 9, wherein the at least one processing device is further configured to:
    - receive search inquiries for the product, service, or event at the graphical user interface.
  - 16. The apparatus of claim 9, wherein the scraped data is received via an internet connection coupled to the at least one processing device.
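Claims 3 to 5 and 11 to 13 describe categorizing the obtained information by identifiers such as an ISBN and using those identifiers to decide whether the websites list the same product. A minimal sketch of that matching step is given below; the `normalize_isbn` and `group_by_isbn` helpers, the listing records, and the URLs are hypothetical, and normalization is limited to stripping hyphens and spaces.

```python
from collections import defaultdict

def normalize_isbn(raw: str) -> str:
    """Strip hyphens and spaces so differently formatted ISBNs compare equal."""
    return raw.replace("-", "").replace(" ", "").upper()

def group_by_isbn(listings: list) -> dict:
    """Map each ISBN to the pages listing it, keeping only ISBNs on two or more pages."""
    groups = defaultdict(list)
    for item in listings:
        groups[normalize_isbn(item["isbn"])].append(item["url"])
    return {isbn: urls for isbn, urls in groups.items() if len(urls) >= 2}

# Hypothetical listings obtained from three web pages.
listings = [
    {"url": "https://shop-a.example/b/1", "isbn": "978-0-13-110362-7"},
    {"url": "https://shop-b.example/b/9", "isbn": "9780131103627"},
    {"url": "https://shop-c.example/b/4", "isbn": "978-0-262-03384-8"},
]

matches = group_by_isbn(listings)
```

Only pages grouped under the same identifier would then be candidates for synchronized scraping and comparison.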

17. A non-transitory computer readable storage medium comprising instructions that, when executed by at least one processing device, cause the at least one processing device to:
- obtain information associated with a product, service, or event from each of two or more web pages associated with websites that list the product, service, or event;
- determine that at least some of the information associated with the product, service, or event has changed at at least one of the two or more web pages;
- in response to the determination that the at least some information associated with the product, service, or event has changed at the at least one of the two or more web pages, perform synchronized scraping of data based on the obtained information, the synchronized scraping performed concurrently from the two or more web pages to obtain scraped data of the same type for the same product, service, or event from each corresponding web page at a same time;
- produce a comparison result based on a comparison of the scraped data for the same product, service, or event from each corresponding web page; and
- present the comparison result on a graphical user interface.
- Dependent claims (18, 19, 20)
  - 18. The computer readable storage medium of claim 17, further comprising instructions that, when executed by the at least one processing device, cause the at least one processing device to obtain a uniform resource locator (URL) associated with the product, service, or event.
  - 19. The computer readable storage medium of claim 17, further comprising instructions that, when executed by the at least one processing device, cause the at least one processing device to:
    - receive search inquiries for the product, service, or event at the graphical user interface.
  - 20. The computer readable storage medium of claim 17, wherein the scraped data is received via an internet connection coupled to the at least one processing device.

Current Assignee: Ficstar Software, Inc.
Original Assignee: Ficstar Software, Inc.
Inventors: He, Wei
Primary Examiner(s): Dunham, Jason
Assistant Examiner(s): Loharikar, Anand R
Application Number: US13/901,879
Publication Number: US 20140351091A1
Time in Patent Office: 1,656 Days
Field of Search
CPC Class Codes: G06Q 30/0625 Directed, with specific int...
