Systems and methods for analyzing boilerplate
First Claim
1. A computer-implemented method comprising:
identifying a common element in a plurality of articles comprising boilerplate elements and content elements;
determining that the common element is a boilerplate element of the plurality of articles by analyzing a link associated with the common element in an article of the plurality of articles, wherein analyzing the link comprises analyzing an address to which the link refers, and determining that the common element is a boilerplate element based at least in part on the link;
assigning a weight to the common element based at least in part on the determination that the common element is a boilerplate element, wherein a content element is given a higher weight than the boilerplate element;
establishing a search result set including one or more of the plurality of articles in response to a search query; and
ranking the articles included in the search result set responsive at least in part to the weight assigned to the common element.
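The steps recited in this claim can be illustrated with a minimal sketch. It is not the patented implementation: the path hints, the weight values (0.1 and 1.0), the substring matching, and every function and field name below are assumptions made for illustration. Each element is assumed to carry a flag saying whether it was already found to be common across the articles.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes suggesting a link targets a site-wide page.
BOILERPLATE_PATH_HINTS = ("/copyright", "/terms", "/privacy", "/about")


def looks_like_boilerplate(link_address):
    """Analyze the address a link refers to (assumed heuristic)."""
    path = urlparse(link_address).path.lower()
    return any(path.startswith(hint) for hint in BOILERPLATE_PATH_HINTS)


def element_weight(is_boilerplate):
    """Content elements get a higher weight than boilerplate elements."""
    return 0.1 if is_boilerplate else 1.0  # assumed values


def score(article, query):
    """Sum the weights of the elements whose text matches the query."""
    total = 0.0
    needle = query.lower()
    for text, link, is_common in article["elements"]:
        if needle in text.lower():
            is_bp = (
                is_common
                and link is not None
                and looks_like_boilerplate(link)
            )
            total += element_weight(is_bp)
    return total


def rank(articles, query):
    """Build the result set for a query and order it by weighted score."""
    hits = [a for a in articles if score(a, query) > 0]
    return sorted(hits, key=lambda a: score(a, query), reverse=True)


# Each element is (text, link address or None, is_common flag).
articles = [
    {"id": "a", "elements": [
        ("Copyright Example News", "http://example.com/copyright", True),
        ("Budget vote passes", None, False),
    ]},
    {"id": "b", "elements": [
        ("New copyright law explained", None, False),
        ("Copyright Example News", "http://example.com/copyright", True),
    ]},
]

ranked_ids = [a["id"] for a in rank(articles, "copyright")]
```

In this sketch, article "b" ranks above article "a" for the query "copyright", because "b" matches the query in a content element while "a" matches it only in the down-weighted copyright notice.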
Abstract
Systems and methods for analyzing boilerplate are described. In one described system, an indexer identifies a common element in a plurality of related articles. The indexer then classifies the common element as boilerplate. For example, the indexer may identify a copyright notice appearing in a plurality of related articles. The copyright notice in these articles is considered boilerplate.
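As a rough sketch of how an indexer might identify a common element such as the copyright notice in this example, one option is to count how many related articles contain each element. The abstract does not specify the method, so the frequency threshold, the text normalization, and the function name are all assumptions.

```python
from collections import Counter


def find_common_elements(articles, min_fraction=0.8):
    """Return elements that occur in at least min_fraction of the articles.

    articles: list of articles, each a list of element strings such as
    paragraphs or text blocks. The 0.8 threshold is an assumed value.
    """
    if not articles:
        return set()
    counts = Counter()
    for elements in articles:
        # Count each element once per article, however often it repeats.
        counts.update({e.strip().lower() for e in elements})
    needed = min_fraction * len(articles)
    return {e for e, n in counts.items() if n >= needed}


related = [
    ["Storm closes bridge", "Copyright 2004 Example News"],
    ["Council approves budget", "Copyright 2004 Example News"],
    ["Team wins final", "Copyright 2004 Example News"],
]
common = find_common_elements(related)
```

Here only the normalized copyright line appears in all three articles, so it is the only element returned as common. A later step would still have to decide whether it is boilerplate.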
12 Claims
1. A computer-implemented method comprising:
identifying a common element in a plurality of articles comprising boilerplate elements and content elements;
determining that the common element is a boilerplate element of the plurality of articles by analyzing a link associated with the common element in an article of the plurality of articles, wherein analyzing the link comprises analyzing an address to which the link refers, and determining that the common element is a boilerplate element based at least in part on the link;
assigning a weight to the common element based at least in part on the determination that the common element is a boilerplate element, wherein a content element is given a higher weight than the boilerplate element;
establishing a search result set including one or more of the plurality of articles in response to a search query; and
ranking the articles included in the search result set responsive at least in part to the weight assigned to the common element.
Dependent claims: 2, 3, 4.
5. A non-transitory computer-readable storage medium storing executable program code comprising code for:
identifying a common element in a plurality of articles comprising boilerplate elements and content elements;
determining that the common element is a boilerplate element of the plurality of articles by analyzing a link associated with the common element in an article of the plurality of articles, wherein analyzing the link comprises analyzing an address to which the link refers, and determining that the common element is a boilerplate element based at least in part on the link;
assigning a weight to the common element based at least in part on the determination that the common element is a boilerplate element, wherein a content element is given a higher weight than the boilerplate element;
establishing a search result set including one or more of the plurality of articles in response to a search query; and
ranking the articles included in the search result set responsive at least in part to the weight assigned to the common element.
Dependent claims: 6, 7, 8.
9. A computer system comprising:
a non-transitory computer-readable storage medium storing executable program code comprising code for:
identifying a common element in a plurality of articles comprising boilerplate elements and content elements;
determining that the common element is a boilerplate element of the plurality of articles at least by analyzing a link associated with the common element in an article of the plurality of articles, wherein analyzing the link comprises analyzing an address to which the link refers, and determining that the common element is a boilerplate element based at least in part on the link;
assigning a weight to the common element based at least in part on the determination that the common element is a boilerplate element, wherein a content element is given a higher weight than the boilerplate element;
establishing a search result set including one or more of the plurality of articles in response to a search query; and
ranking the articles included in the search result set responsive at least in part to the weight assigned to the common element; and
a processor for executing the program code.
Dependent claims: 10, 11, 12.
Specification