Infrastructure enabling intelligent execution and crawling of a web application
Abstract
In particular embodiments, a method comprises accessing, by one or more computing systems associated with a social-networking system, a structured document of a network application, the structured document comprising structural information and content comprising one or more embedded scripts, resources, or identifiers for the resources. The method further comprises processing the structured document to generate a model representation of the structured document, executing at least some of the content of the structured document and logging multiple snapshots of the model representation of the structured document as the model representation is generated in response to one or more interactions initiated by execution of the content. The method further comprises creating a behavior model of the network application based on the multiple snapshots of the model representation of the structured document and determining, based on the behavior model, compliance by the network application with one or more requirements of the social-networking system.
45 Claims
-
1. A method comprising:
accessing, by one or more computing systems associated with a social-networking system, a structured document of a network application, the structured document comprising structural information and content items, each content item comprising one or more embedded scripts, resources, or identifiers for the resources;
processing, by the one or more computing systems, the structured document to generate a model representation of the structured document;
executing, by the one or more computing systems, a plurality of the content items of the structured document;
generating, by the one or more computing systems, a plurality of snapshots of the model representation of the structured document, each snapshot comprising a respective modified copy of the model representation and corresponding to a respective executed content item of the plurality of content items;
logging, by the one or more computing systems, the plurality of snapshots of the model representation of the structured document;
creating, by the one or more computing systems, a behavior model of the network application based on the plurality of snapshots of the model representation of the structured document, the behavior model representing at least a communication of data between the network application and one or more third-party servers; and
determining, by the one or more computing devices, based on the behavior model, compliance by the network application with one or more requirements of the social-networking system, wherein the determining compliance is further based on whether the network application is passing data received from the social-networking system to the one or more third-party servers.
-
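The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the model representation is assumed to be a plain dictionary, each content item is assumed to be a callable that mutates the model and returns the outgoing requests it made, and every name here (`Snapshot`, `BehaviorModel`, `crawl`, `build_behavior_model`, `is_compliant`, the example hosts) is hypothetical.

```python
import copy
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Snapshot:
    item_id: str    # executed content item this snapshot corresponds to
    model: dict     # modified copy of the model representation
    requests: list  # outgoing (url, payload) pairs observed during execution


@dataclass
class BehaviorModel:
    # Each entry is (host, payload) sent by the application to a non-platform host.
    third_party_traffic: list = field(default_factory=list)


def crawl(model, content_items):
    """Execute each content item and log one snapshot per executed item."""
    snapshots = []
    for item_id, run in content_items:
        working = copy.deepcopy(model)  # earlier snapshots stay untouched
        requests = run(working)         # assumption: item mutates model, returns requests
        snapshots.append(Snapshot(item_id, working, requests))
        model = working
    return snapshots


def build_behavior_model(snapshots, platform_host):
    """Collect traffic addressed to hosts other than the platform itself."""
    behavior = BehaviorModel()
    for snap in snapshots:
        for url, payload in snap.requests:
            host = urlparse(url).hostname
            if host != platform_host:
                behavior.third_party_traffic.append((host, payload))
    return behavior


def is_compliant(behavior, platform_values):
    """Non-compliant if any platform-supplied value reaches a third-party host."""
    for _host, payload in behavior.third_party_traffic:
        if any(value in platform_values for value in payload.values()):
            return False
    return True


# Two hypothetical content items: one sets a title, one loads an ad pixel.
def render_title(model):
    model["title"] = "Game"
    return []


def load_ad(model):
    model["body"].append("<img src='https://ads.example.net/pixel'>")
    return [("https://ads.example.net/pixel", {"uid": "test-user-123"})]


snapshots = crawl({"title": "", "body": []},
                  [("title", render_title), ("ad", load_ad)])
behavior = build_behavior_model(snapshots, platform_host="social.example.com")
compliant = is_compliant(behavior, platform_values={"test-user-123"})
```

In this sketch the second content item forwards a platform-supplied identifier to an outside host, so the compliance check fails. A real system would work on a rendered DOM and intercepted network traffic rather than on hand-written callables.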
2. The method of claim 1, wherein the behavior model further comprises one or more URLs or domain names of one or more resources for which requests were sent to or received from during execution of the plurality of content items of the structured document.
-
3. The method of claim 1, wherein the behavior model further comprises one or more URLs or domain names corresponding to one or more advertisement developers or advertisement provider networks to which requests for advertisements were transmitted by the network application or from which one or more incoming responses including advertisements were received during execution of the plurality of content items of the structured document.
-
4. The method of claim 1, further comprising:
generating a log that comprises URLs or domain names corresponding to ad networks ascertained after filtering the behavior model; and
querying the log against a list of known rogue ad networks.
-
5. The method of claim 4, further comprising:
capturing one or more parameters sent to one or more computers associated with the URLs or domain names.
-
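The filtering, querying, and parameter capture of claims 4 and 5 might look like the following sketch. The function names, the `is_ad_host` predicate, and the example URLs are assumptions made for illustration; the claims do not prescribe how ad-network hosts are recognized or how the rogue list is stored.

```python
from urllib.parse import urlparse, parse_qs


def ad_network_log(request_urls, is_ad_host):
    """Filter observed requests down to ad-network hosts.

    The log maps each ad-network host to the query parameters sent to it.
    """
    log = {}
    for url in request_urls:
        parsed = urlparse(url)
        host = parsed.hostname
        if host and is_ad_host(host):
            log.setdefault(host, []).append(parse_qs(parsed.query))
    return log


def flag_rogue(log, rogue_hosts):
    """Query the log against a set of known rogue ad networks."""
    return sorted(host for host in log if host in rogue_hosts)


# Hypothetical request URLs taken from a behavior model.
observed = [
    "https://cdn.example.org/app.js",
    "https://ads.good-network.example/serve?slot=top",
    "https://ads.rogue-network.example/serve?uid=42&slot=side",
]
log = ad_network_log(observed, is_ad_host=lambda h: h.startswith("ads."))
flagged = flag_rogue(log, {"ads.rogue-network.example"})
```

Keeping the captured parameters next to each host lets a reviewer see what was sent to a flagged network as well as the fact that it was contacted.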
6. The method of claim 1, further comprising:
determining, based on the behavior model, how the network application appears at one or more points in a particular user flow.
-
7. The method of claim 1, further comprising:
recording one or more variations in the behavior model of the network application based on characteristics of a logged-in user of the social-networking system.
-
8. The method of claim 7, wherein the characteristics comprise an identity of the logged-in user, demographics of the logged-in user, or profile information from the logged-in user's profile in the social-networking system.
-
9. The method of claim 7, wherein the characteristics comprise an identity of the logged-in user, demographics of the logged-in user, or profile information from the logged-in user's profile in the social-networking system.
-
10. The method of claim 1, further comprising:
recording one or more variations in the behavior model of the network application based on a geographic location, a browser type, or a type of computing device of the one or more computing systems.
-
11. The method of claim 1, wherein:
the one or more computing systems include a primary computing system and one or more secondary computing systems;
each of the one or more secondary computing systems hosts one or more crawler processes each operable to access and render the network application; and
wherein the method further comprises receiving, by one of the crawler processes executing within one of the secondary computing systems, a request from the primary computing system to access the network application.
-
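The primary/secondary arrangement of claim 11 can be sketched as a dispatcher feeding a pool of workers. This is an illustration under strong assumptions: threads and in-memory queues stand in for crawler processes on separate secondary computing systems, the rendering step is omitted, and the names `crawler_process` and `primary_dispatch` are hypothetical.

```python
import queue
import threading


def crawler_process(name, requests, results):
    """Stand-in for a crawler process hosted on a secondary system."""
    while True:
        app_url = requests.get()
        if app_url is None:  # sentinel from the primary: shut down
            break
        # A real crawler would access and render the application here.
        results.put((name, app_url))


def primary_dispatch(app_urls, crawler_count=2):
    """Primary system: send each access request to whichever crawler is free."""
    requests, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=crawler_process,
                         args=(f"crawler-{i}", requests, results))
        for i in range(crawler_count)
    ]
    for worker in workers:
        worker.start()
    for url in app_urls:
        requests.put(url)
    for _ in workers:
        requests.put(None)  # one shutdown sentinel per crawler
    for worker in workers:
        worker.join()
    return [results.get() for _ in app_urls]


apps = ["https://apps.example.com/quiz", "https://apps.example.com/farm"]
handled = primary_dispatch(apps)
```

Which crawler handles which request depends on scheduling, so the sketch only guarantees that every request is handled exactly once.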
12. The method of claim 11, further comprising accessing, by the one of the crawler processes, one or more servers hosting a canvas web page.
-
13. The method of claim 12, further comprising logging into, by the one of the crawler processes, the one or more servers using test user credentials.
-
14. The method of claim 11, wherein each crawler process is implemented, at least in part, with all or portions of a cross platform component model and a layout engine.
-
15. The method of claim 14, wherein:
each crawler process further comprises an overlying programming layer overtop of the cross platform component model and layout engine layers, and
the logging a plurality of snapshots of the model representation of the structured document further comprises tracking one or more interactions initiated by executing, by the overlying programming layer, the content items.
-
16. The method of claim 1, wherein the model representation is a Document Object Model (DOM) representation.
-
17. The method of claim 1, wherein the communication of data comprises data that is passed by the one or more third-party servers to the network application following the execution of embedded executable code within the structured document.
-
18. The method of claim 1, wherein the communication of data comprises data received from the social-networking system that is passed by the network application to the one or more third-party servers.
-
19. The method of claim 1, wherein the communication of data comprises data that is a URL redirect or frame redirect.
-
20. The method of claim 1, wherein the communication of data comprises one or more of (1) data received by the network application from the social-networking system and passed to the one or more third party servers, or (2) data produced by executing at least some of the content items of the structured document.
-
21. The method of claim 1, wherein each snapshot further comprises interaction data related to the structured document, the interaction data comprising outgoing requests and incoming responses generated at a particular time as the content items of the structured document are executed, and wherein the behavior model of the network application is further created based on the interaction data.
-
22. The method of claim 1, further comprising enumerating one or more attributes of the structured document, wherein one or more of the enumerated attributes are included in the behavior model.
-
23. The method of claim 22, wherein the determining compliance comprises comparing the one or more attributes to one or more profiles to identify an undesirable application.
-
24. The method of claim 23, wherein the one or more attributes are selected based on a scripted rule set.
-
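Claims 22 through 24 describe enumerating attributes, selecting some of them by a scripted rule set, and comparing them to profiles. One possible shape for that comparison is sketched below. The representation is an assumption: the rule set is taken to be a list of attribute names and each profile a mapping from attribute name to a predicate. The attribute and profile names are invented for illustration.

```python
def select_attributes(attributes, rule_set):
    """Keep only the enumerated attributes that the rule set names."""
    return {name: attributes[name] for name in rule_set if name in attributes}


def matching_profiles(selected, profiles):
    """Return the profiles whose every predicate accepts the selected attributes."""
    matches = []
    for profile_name, predicates in profiles.items():
        if all(name in selected and test(selected[name])
               for name, test in predicates.items()):
            matches.append(profile_name)
    return matches


# Hypothetical attributes enumerated from a structured document.
attributes = {"iframe_count": 7, "redirect_count": 3, "title": "Quiz"}
rule_set = ["iframe_count", "redirect_count"]
profiles = {
    "redirect-farm": {"redirect_count": lambda n: n >= 3},
    "frame-stuffer": {"iframe_count": lambda n: n > 10},
}

selected = select_attributes(attributes, rule_set)
matches = matching_profiles(selected, profiles)
```

An application that matches at least one profile would be a candidate for the "undesirable application" determination of claim 23.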
25. A system comprising:
one or more processors associated with a social-networking system; and
logic encoded in one or more computer-readable tangible storage media that, when executed by the one or more processors, is operable to:
access a structured document of a network application, the structured document comprising structural information and content items, each content item comprising one or more embedded scripts, resources, or identifiers for the resources;
process the structured document to generate a model representation of the structured document;
execute a plurality of the content items of the structured document;
generate a plurality of snapshots of the model representation of the structured document, each snapshot comprising a respective modified copy of the model representation and corresponding to a respective executed content item of the plurality of content items;
log the plurality of snapshots of the model representation of the structured document;
create a behavior model of the network application based on the plurality of snapshots of the model representation of the structured document, the behavior model representing at least a communication of data between the network application and one or more third-party servers; and
determine, based on the behavior model, compliance by the network application with one or more requirements of the social-networking system, wherein the determining compliance is further based on whether the network application is passing data received from the social-networking system to the one or more third-party servers.
-
26. The system of claim 25, wherein the determining compliance comprises comparing one or more attributes to one or more profiles to identify an undesirable application.
-
27. The system of claim 26, wherein the one or more attributes are selected based on a scripted rule set.
-
28. The system of claim 25, wherein the behavior model further comprises one or more URLs or domain names of one or more resources for which requests were sent to or received from during execution of the plurality of content items of the structured document.
-
29. The system of claim 25, wherein the behavior model further comprises one or more URLs or domain names corresponding to one or more advertisement developers or advertisement provider networks to which requests for advertisements were transmitted by the network application or from which one or more incoming responses including advertisements were received during execution of the plurality of content items of the structured document.
-
30. The system of claim 25, wherein the processors are further operable when executing the instructions to:
generate a log that comprises URLs or domain names corresponding to ad networks ascertained after filtering the behavior model; and
query the log against a list of known rogue ad networks.
-
31. The system of claim 30, wherein the processors are further operable when executing the instructions to:
capture one or more parameters sent to one or more computers associated with the URLs or domain names.
-
32. The system of claim 25, wherein the processors are further operable when executing the instructions to:
determine, based on the behavior model, how the network application appears at one or more points in a particular user flow.
-
33. The system of claim 25, wherein the processors are further operable when executing the instructions to:
record one or more variations in the behavior model of the network application based on characteristics of a logged-in user of the social-networking system.
-
34. The system of claim 25, wherein the processors are further operable when executing the instructions to:
record one or more variations in the behavior model of the network application based on a geographic location, a browser type, or a type of computing device of the one or more computing systems.
-
35. The system of claim 25, wherein:
the one or more computing systems include a primary computing system and one or more secondary computing systems;
each of the one or more secondary computing systems hosts one or more crawler processes each operable to access and render the network application; and
wherein the method further comprises receiving, by one of the crawler processes executing within one of the secondary computing systems, a request from the primary computing system to access the network application.
-
36. The system of claim 35, wherein the processors are further operable when executing the instructions to:
access, by the one of the crawler processes, one or more servers hosting a canvas web page.
-
37. The system of claim 36, wherein the processors are further operable when executing the instructions to:
log into, by the one of the crawler processes, the one or more servers using test user credentials.
-
38. The system of claim 35, wherein each crawler process is implemented, at least in part, with all or portions of a cross platform component model and a layout engine.
-
39. The system of claim 38, wherein:
each crawler process further comprises an overlying programming layer overtop of the cross platform component model and layout engine layers, and
the logging a plurality of snapshots of the model representation of the structured document further comprises tracking one or more interactions initiated by executing, by the overlying programming layer, the content items.
-
40. The system of claim 25, wherein the model representation is a Document Object Model (DOM) representation.
-
41. The system of claim 25, wherein the communication of data comprises data that is passed by the one or more third-party servers to the network application following the execution of embedded executable code within the structured document.
-
42. The system of claim 25, wherein the communication of data comprises data received from the social-networking system that is passed by the network application to the one or more third-party servers.
-
43. The system of claim 25, wherein the communication of data comprises data that is a URL redirect or frame redirect.
-
44. The system of claim 25, wherein the communication of data comprises one or more of (1) data received by the network application from the social-networking system and passed to the one or more third party servers, or (2) data produced by executing at least some of the content items of the structured document.
-
45. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
access a structured document of a network application, the structured document comprising structural information and content items, each content item comprising one or more embedded scripts, resources, or identifiers for the resources;
process the structured document to generate a model representation of the structured document;
execute a plurality of the content items of the structured document;
generate a plurality of snapshots of the model representation of the structured document, each snapshot comprising a respective modified copy of the model representation and corresponding to a respective executed content item of the plurality of content items;
log the plurality of snapshots of the model representation of the structured document;
create a behavior model of the network application based on the plurality of snapshots of the model representation of the structured document, the behavior model representing at least a communication of data between the network application and one or more third-party servers; and
determine, based on the behavior model, compliance by the network application with one or more requirements of the social-networking system, wherein the determining compliance is further based on whether the network application is passing data received from the social-networking system to the one or more third-party servers.
-
Specification