Identifying transient paths within websites
Abstract
Systems, methods and computer readable media for identifying transient paths within websites. Transient paths can be identified, for example, by identifying a path associated with known transient content and determining that the path exists on other pages associated with the website. If the path exists in other web pages associated with the website, the content associated with the path can be identified as transient content.
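As a rough illustration of the idea in the abstract, the following is a minimal sketch, not the patented implementation. It assumes a "path" can be modeled as the chain of tag names (plus any class attribute) leading to a piece of text. The names `PathCollector`, `paths_of`, `find_path`, and `probable_transient`, and the sample pages, are hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they are ignored when building paths.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class PathCollector(HTMLParser):
    """Maps each element path (e.g. 'html/body/div.ticker') to the text found at it."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.text_by_path = {}

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        cls = dict(attrs).get("class")
        self.stack.append(f"{tag}.{cls}" if cls else tag)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            path = "/".join(self.stack)
            self.text_by_path.setdefault(path, []).append(text)


def paths_of(html):
    """Returns {path: [text, ...]} for one page."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.text_by_path


def find_path(html, known_text):
    """Finds the path under which the known transient content appears."""
    for path, texts in paths_of(html).items():
        if known_text in texts:
            return path
    return None


def probable_transient(path, other_pages):
    """Returns {url: texts} for every other page of the site that contains the path."""
    found = {}
    for url, html in other_pages.items():
        texts = paths_of(html).get(path)
        if texts is not None:
            found[url] = texts
    return found


first_page = ("<html><body><div class='ticker'><span>Stocks up 2%</span></div>"
              "<p>About us</p></body></html>")
others = {
    "/a": ("<html><body><div class='ticker'><span>Rain expected</span></div>"
           "<p>Article A</p></body></html>"),
    "/b": "<html><body><p>Article B</p></body></html>",
}
path = find_path(first_page, "Stocks up 2%")
result = probable_transient(path, others)
```

In this sketch only page `/a` shares the ticker path with the first page, so only its ticker text is flagged as probable transient content.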
32 Citations
14 Claims
1. A method comprising:
- receiving identification of known transient content within a first web page, the first web page being associated with a website;
- identifying a path associated with the known transient content with respect to the first web page;
- receiving other web pages associated with the website;
- determining whether the path exists in any of the other web pages associated with the web site;
- if it is determined that the path exists in any of the other web pages associated with the website, identifying the content associated with the path in the other web pages as probable transient content;
- for each of the web pages including the path, identifying a transient frequency with which content identified by the path changes over multiple versions of each of the web pages that include path;
- comparing the transient frequency with a threshold frequency;
- if the transient frequency exceeds the threshold frequency, identifying the path as a transient path;
- identifying transient content from the probable transient content based upon the transient path; and
- identifying targeted advertisements for web pages associated with the website, wherein content identified as transient content is excluded from consideration in identifying the targeted advertisements.
Dependent claims: 2, 3, 4, 5
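The frequency test and the exclusion step recited in claim 1 could be sketched as below. This is an illustration under stated assumptions, not the claimed implementation: the claim does not say how a transient frequency is computed or how per-page frequencies are combined, so the sketch assumes "fraction of consecutive versions that differ" and a simple mean across pages. All function names and sample values are hypothetical.

```python
def transient_frequency(versions):
    """Fraction of consecutive version pairs in which the content at the path changed.

    Assumption: `versions` lists the content found at the path in each stored
    version of one page, oldest first.
    """
    if len(versions) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(versions, versions[1:]) if prev != cur)
    return changes / (len(versions) - 1)


def is_transient_path(versions_by_page, threshold):
    """Compares the mean per-page frequency with the threshold (averaging is an assumption)."""
    freqs = [transient_frequency(v) for v in versions_by_page.values()]
    return bool(freqs) and sum(freqs) / len(freqs) > threshold


def targeting_text(text_by_path, transient_paths):
    """Joins page text for ad targeting, skipping content under transient paths."""
    return " ".join(
        text
        for path, texts in text_by_path.items()
        if path not in transient_paths
        for text in texts
    )


ticker = "html/body/div.ticker/span"
versions_by_page = {
    "/a": ["Rain", "Sun", "Sun", "Snow"],        # 2 changes over 3 transitions
    "/b": ["Stocks up", "Stocks down", "Flat"],  # 2 changes over 2 transitions
}
transient_paths = {ticker} if is_transient_path(versions_by_page, threshold=0.5) else set()

page = {ticker: ["Rain"], "html/body/p": ["Hiking", "boots"]}
ad_text = targeting_text(page, transient_paths)
```

Here the ticker path changes often enough to be treated as transient, so only the article text ("Hiking boots") would be passed on for ad targeting.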
6. A method comprising:
- receiving identification of known transient content within a first web page, the first web page being associated with a website;
- identifying a path associated with the known transient content with respect to the first web page;
- receiving other web pages associated with the website;
- determining whether the path exists in any of the other web pages associated with the web site;
- if it is determined that the path exists in any of the other web pages associated with the website, identifying the content associated with the path in the other web pages as probable transient content;
- identifying a path associated with the known transient content on the first web page;
- identifying a subtree count comprising a number of times the path appears in other web pages associated with the website;
- identifying a marked subtree count comprising a number of times content associated with the path changes between versions of the respective web pages in which the path appears;
- comparing the subtree count with the marked subtree count; and
- identifying the path as a transient path based upon the comparison.
Dependent claims: 7
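The two counts in claim 6 can be illustrated with a short sketch. It assumes each page is available in exactly two versions and is already reduced to a `{path: content}` mapping; it also borrows the ratio test recited in claim 11 as one possible form of "comparing", with a threshold of 0.5 chosen purely for illustration. The function names and data are hypothetical.

```python
def count_subtrees(old_crawl, new_crawl, path):
    """Returns (subtree_count, marked_subtree_count) for one path.

    old_crawl / new_crawl: {url: {path: content}} for two versions of each page.
    subtree_count counts pages in which the path appears; marked_subtree_count
    counts those pages whose content at the path differs between the versions.
    """
    subtree_count = 0
    marked_count = 0
    for url, new_paths in new_crawl.items():
        if path not in new_paths:
            continue
        subtree_count += 1
        old_paths = old_crawl.get(url, {})
        if path in old_paths and old_paths[path] != new_paths[path]:
            marked_count += 1
    return subtree_count, marked_count


def is_transient(subtree_count, marked_count, threshold=0.5):
    """Ratio comparison; the default threshold is an illustrative assumption."""
    if subtree_count == 0:
        return False
    return marked_count / subtree_count > threshold


ticker = "body/div.ticker"
old_crawl = {
    "/a": {ticker: "Rain"},
    "/b": {ticker: "Stocks up"},
    "/c": {ticker: "Welcome"},
    "/d": {"body/p": "Contact"},
}
new_crawl = {
    "/a": {ticker: "Sun"},
    "/b": {ticker: "Stocks down"},
    "/c": {ticker: "Welcome"},
    "/d": {"body/p": "Contact"},
}
counts = count_subtrees(old_crawl, new_crawl, ticker)
flagged = is_transient(*counts)
```

In this example the path appears on three pages and its content changed on two of them, so the ratio of 2/3 exceeds the assumed threshold and the path is flagged.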
8. One or more computer readable media, operable to cause one or more data processing apparatuses to perform operations comprising:
- receiving identification of known transient content within a first web page, the first web page being associated with a website;
- identifying a path associated with the known transient content with respect to the first web page;
- receiving other web pages associated with the website;
- determining whether the path exists in any of the other web pages associated with the web site;
- if it is determined that the path exists in any of the other web pages associated with the website, identifying the content associated with the path in the other web pages as probable transient content;
- retrieving each of the pages including the path;
- identifying a transient frequency with which content identified by the path changes over multiple versions of each of the pages that include path;
- comparing the transient frequency with a threshold frequency;
- if the transient frequency exceeds the threshold frequency, identifying the path as a transient path;
- identifying transient content from the probable transient content based upon the transient path; and
- identifying targeted advertisements for web pages associated with the website, wherein content identified as transient content is excluded from consideration in identifying the targeted advertisements.
Dependent claims: 9
10. One or more computer readable media, operable to cause one or more data processing apparatuses to perform operations comprising:
- receiving identification of known transient content within a first web page, the first web page being associated with a website;
- identifying a path associated with the known transient content with respect to the first web page;
- receiving other web pages associated with the website;
- determining whether the path exists in any of the other web pages associated with the web site;
- if it is determined that the path exists in any of the other web pages associated with the website, identifying the content associated with the path in the other web pages as probable transient content;
- identifying a path associated with the known transient content on the web page;
- identifying a subtree count comprising a number of times the path appears in other web pages associated with the website;
- identifying a marked subtree count comprising a number of times content associated with the path changes between versions of the respective web pages in which the path appears;
- comparing the subtree count with the marked subtree count; and
- identifying the path as a transient path based upon the comparison.
11. One or more computer readable media, operable to cause one or more data processing apparatuses to perform operations comprising:
- receiving identification of known transient content within a first web page, the first web page being associated with a website;
- identifying a path associated with the known transient content with respect to the first web page;
- receiving other web pages associated with the website;
- determining whether the path exists in any of the other web pages associated with the web site;
- if it is determined that the path exists in any of the other web pages associated with the website, identifying the content associated with the path in the other web pages as probable transient content;
- wherein a path is identified as a transient path when a ratio between the number of changes associated with the path between versions of the respective web pages divided by the number of times the path appears in other web pages associated with the website is greater than a threshold frequency, whereby if the path changes more than a threshold frequency the path is identified as a transient path.
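Read as a formula, the condition in claim 11 can be written as follows. The symbols $M$, $S$, and $T$ are introduced here only for illustration and do not appear in the claims: $M$ is the number of changes associated with the path between versions of the respective web pages, $S$ is the number of times the path appears in other web pages associated with the website, and $T$ is the threshold frequency.

```latex
\[
\frac{M}{S} > T
\quad\Longrightarrow\quad
\text{the path is identified as a transient path}
\]
```

With hypothetical values $S = 40$, $M = 30$, and $T = 0.5$, the ratio is $30/40 = 0.75$, which is greater than $0.5$, so the path would be identified as transient.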
12. A method comprising:
- receiving identification of known transient content within a web page, the web page being associated with a web site;
- identifying a path associated with the known transient content on the web page;
- identifying a subtree count comprising a number of times the potential transient path appears in other web pages associated with the website;
- identifying a marked subtree count comprising a number of times content associated with the potential transient path changes between versions of the respective web pages in which the path appears;
- comparing the subtree count with the marked subtree count; and
- identifying the path as a transient path based upon the comparison.
Dependent claims: 13, 14
Specification