SYSTEM AND METHOD FOR IMPROVING WEBPAGE INDEXING AND OPTIMIZATION
Abstract
A system and method may include a processor that normalizes dynamic URLs by sorting URL parameters and removing duplicative URL parameters. The processor may additionally or alternatively provide redirects from one URL to another, where the two URLs are associated with duplicative content. The processor may additionally or alternatively insert a canonical tag into content associated with a URL, where the canonical tag points to another URL whose content is a near duplicate of the content associated with the first URL. The processor may additionally or alternatively apply transformation rules to content of a webpage based on the matching of portions of the URL of the webpage to various character strings.
28 Claims
1. A computer-implemented page request normalization method, comprising:
responsive to receipt of a page request from a requesting entity, modifying, by a computer processor, the request by at least one of (a) removing one or more of duplicative parameters included in the request, and (b) changing an order of parameters of the request; and
returning, by the processor and to the requesting entity, the modified request as a page redirect.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
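For illustration only, the normalize-and-redirect behavior recited in claim 1 might be sketched as follows. The function names `normalize_url` and `handle_request`, the choice of alphabetical ordering, the treatment of "duplicative" as an exact name/value repeat, and the use of HTTP status 301 are all assumptions made for this sketch; the claim itself does not specify them.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_url(url: str) -> str:
    """Drop exact duplicate name/value pairs and sort the remainder."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    unique_sorted = sorted(set(pairs))  # assumption: sort by name, then value
    return urlunsplit(parts._replace(query=urlencode(unique_sorted)))


def handle_request(url: str):
    """Return (status, location).

    If normalization changes the URL, answer with a redirect to the
    normalized form; otherwise signal that the request can be served as is.
    """
    normalized = normalize_url(url)
    if normalized != url:
        return 301, normalized  # assumption: permanent redirect
    return 200, url
```

Under these assumptions, a request for `http://example.com/p?b=2&a=1&b=2` would be answered with a redirect to `http://example.com/p?a=1&b=2`, and a request that is already in normalized form would pass through unchanged.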
9. A computer-implemented page request handling method, comprising:
where different ones of a plurality of received webpage requests differ with respect to at least one of (a) a number of included copies of a query parameter and (b) an order of included query parameters, and where each of the plurality of received webpage requests includes at least one copy of each query parameter of each of all others of the plurality of received webpage requests,
transmitting, for all of the plurality of received webpage requests, by a computer processor, and to a web server, a respective normalized webpage request, wherein all of the normalized webpage requests include an identical number of query parameters in an identical order.
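The property recited in claim 9, that request variants differing only in parameter order or copy count are all forwarded to the web server in one identical form, might be sketched as below. The helper name `upstream_url`, the example host `shop.example`, and the policy of keeping the first copy of each parameter name are hypothetical choices for this sketch.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def upstream_url(url: str) -> str:
    """Build the URL that would be forwarded to the web server."""
    parts = urlsplit(url)
    first_copy = {}
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        first_copy.setdefault(name, value)  # assumption: keep the first copy
    query = urlencode(sorted(first_copy.items()))
    return urlunsplit(parts._replace(query=query))


# Three variants that differ only in parameter order and copy count.
variants = [
    "http://shop.example/list?color=red&size=m",
    "http://shop.example/list?size=m&color=red",
    "http://shop.example/list?size=m&color=red&size=m",
]

# All variants collapse to a single forwarded form.
forwarded = {upstream_url(u) for u in variants}
```

One practical consequence of this design is that a cache or server sitting behind the normalizing processor sees one URL per logical page rather than one per parameter permutation.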
10. A computer-implemented page link normalization method, comprising:
responsive to receipt of a webpage addressed to a receiving entity and including a webpage link, modifying, by a computer processor, the webpage by at least one of (a) removing one or more of duplicative parameters included in the link, and (b) changing an order of parameters of the link; and
forwarding, by the processor and to the receiving entity, the modified webpage.
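Claim 10 applies the same normalization to links inside an outgoing page rather than to incoming requests. A minimal sketch, assuming links appear as double-quoted `href` attributes and that a regular expression is an acceptable stand-in for a full HTML parser (both are simplifications introduced here, as are the names `normalize_url` and `normalize_links`):

```python
import re
from html import escape, unescape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: links are written as href="..." with double quotes.
HREF = re.compile(r'href="([^"]*)"')


def normalize_url(url: str) -> str:
    """Drop exact duplicate parameters and sort the remainder."""
    parts = urlsplit(url)
    pairs = sorted(set(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def normalize_links(page: str) -> str:
    """Rewrite every href in the page to its normalized form."""
    def fix(match):
        url = unescape(match.group(1))  # decode &amp; before parsing
        return 'href="%s"' % escape(normalize_url(url), quote=True)
    return HREF.sub(fix, page)
```

The modified page returned by `normalize_links` is what would then be forwarded to the receiving entity.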
11. A computer-implemented method for duplicate content correction, comprising:
comparing, by a computer processor, fingerprints, each associated with a different one of a plurality of page source identifiers;
for a subset of the plurality of page source identifiers for which it is determined in the comparing that the fingerprints of the subset are identical, recording, by the processor, a selection of one of the page source identifiers of the subset as authoritative; and
responsive to a page request using one of the subset of page source identifiers other than the one selected as authoritative, returning a page redirect with the page source identifier selected as authoritative.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
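The three steps of claim 11 (compare fingerprints, record an authoritative identifier, redirect the others) might be sketched as follows. The claim does not say how fingerprints are computed or how the authoritative identifier is chosen; SHA-256 over the page content and "shortest URL wins" are assumptions made here, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: str) -> str:
    """Assumption: a fingerprint is a SHA-256 digest of the page content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def build_redirects(pages: dict) -> dict:
    """Map each non-authoritative URL to the authoritative URL of its group."""
    groups = defaultdict(list)
    for url, content in pages.items():
        groups[fingerprint(content)].append(url)

    redirects = {}
    for urls in groups.values():
        if len(urls) > 1:
            # Assumption: the shortest URL (ties broken alphabetically)
            # is recorded as authoritative.
            authoritative = min(urls, key=lambda u: (len(u), u))
            for url in urls:
                if url != authoritative:
                    redirects[url] = authoritative
    return redirects


def respond(url: str, redirects: dict):
    """Return (status, location) for a page request."""
    if url in redirects:
        return 301, redirects[url]
    return 200, url
```

An exact-hash fingerprint only groups byte-identical content; pages that differ slightly fall under the near-duplicate handling of claim 21 instead.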
21. A computer-implemented method for near-duplicate content correction, comprising:
determining, by a computer processor, that content associated with a subset of a plurality of page source identifiers is similar;
recording, by the processor, a selection of one of the page source identifiers of the subset as authoritative; and
providing, by the processor, a canonical tag to the authoritative page source identifier to each of the other page source identifiers of the subset.
Dependent claims: 22, 23, 24
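For near-duplicate pages, claim 21 provides a canonical tag rather than a redirect. The sketch below uses Jaccard similarity over word sets as one possible similarity test and inserts a standard `<link rel="canonical">` element before `</head>`. The similarity measure and the function names are assumptions for illustration; the claim does not name a particular measure or threshold.

```python
from html import escape


def similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (an assumed measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def canonical_tag(authoritative_url: str) -> str:
    """Build a canonical link element pointing at the authoritative URL."""
    return '<link rel="canonical" href="%s">' % escape(authoritative_url, quote=True)


def add_canonical(page_html: str, authoritative_url: str) -> str:
    """Insert the canonical tag just before </head>, or prepend it if absent."""
    tag = canonical_tag(authoritative_url)
    if "</head>" in page_html:
        return page_html.replace("</head>", tag + "</head>", 1)
    return tag + page_html
```

In use, pages whose pairwise `similarity` exceeds some chosen threshold would be grouped, one URL of the group recorded as authoritative, and `add_canonical` applied to each of the other pages of the group.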
25. A computer-implemented page link optimization method, comprising:
responsive to receipt of a webpage addressed to a receiving entity and including a first webpage link; determining, by a computer processor, that the first webpage link is part of a group of webpage links for which a second webpage link is recorded as being authoritative;
in accordance with the determination, modifying, by the processor, the webpage by replacing the first webpage link with the second webpage link; and
forwarding, by the processor and to the receiving entity, the modified webpage;
wherein the webpage links of the group are included in the group in response to a determination that content associated with the webpage links of the group are duplicative.
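The link replacement of claim 25 might be sketched as a lookup against a recorded mapping from duplicate links to their authoritative link. The mapping argument `authoritative`, the function name `replace_links`, and the assumption that links appear as double-quoted `href` attributes are all hypothetical; how the groups are built is covered by the duplicate-content determination and is not repeated here.

```python
import re
from html import escape, unescape

# Assumption: links are written as href="..." with double quotes.
HREF = re.compile(r'href="([^"]*)"')


def replace_links(page: str, authoritative: dict) -> str:
    """Replace each link that has a recorded authoritative equivalent."""
    def fix(match):
        url = unescape(match.group(1))
        target = authoritative.get(url, url)  # unknown links are left alone
        return 'href="%s"' % escape(target, quote=True)
    return HREF.sub(fix, page)
```

Rewriting links in the page itself means a crawler following them never requests the duplicate URL in the first place, which complements the redirect-based handling.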
26. A computer-implemented page optimization method, comprising:
determining, by a computer processor, that a page source identifier includes one or more of a plurality of character strings that are each associated with a respective transformation rule set; and
in accordance with the determination, modifying, by the processor, content of a page identified by the page source identifier by application of each of the respective one or more transformation rule sets.
Dependent claims: 27, 28
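The string-matched transformation of claim 26 might be sketched as a table that associates URL substrings with lists of content transformations. The two example rules (`add_lazy_loading`, `minify_whitespace`) and the substrings `/product/` and `/blog/` are invented for illustration; the claim does not say what the transformation rules do.

```python
def minify_whitespace(content: str) -> str:
    """Hypothetical rule: collapse runs of whitespace to single spaces."""
    return " ".join(content.split())


def add_lazy_loading(content: str) -> str:
    """Hypothetical rule: mark images for lazy loading."""
    return content.replace("<img ", '<img loading="lazy" ')


# Each character string is associated with its own rule set.
RULE_SETS = {
    "/product/": [add_lazy_loading, minify_whitespace],
    "/blog/": [minify_whitespace],
}


def optimize(url: str, content: str) -> str:
    """Apply every rule set whose character string occurs in the URL."""
    for needle, rules in RULE_SETS.items():
        if needle in url:
            for rule in rules:
                content = rule(content)
    return content
```

A URL that contains more than one of the configured strings receives each matching rule set in table order, which mirrors the "each of the respective one or more transformation rule sets" wording of the claim.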
Specification