SYSTEMS AND METHODS OF UNIVERSAL RESOURCE LOCATOR NORMALIZATION
Abstract
Disclosed herein are methods, systems and architectures for normalizing identifiers corresponding to resources using normalization rules that can be generalized for use with different resources. By way of a non-limiting example, an identifier can be a uniform resource locator (URL), and a normalization rule can be used to normalize URLs that correspond to different resources, e.g., content. A normalization rule can be generated by generalizing two or more normalization rules corresponding to different resources, such that a content determinative component is generalized. A normalization rule can be defined to include a context portion used to determine the rule's applicability to an identifier, and a transformation portion that identifies the transformations to be applied to an applicable identifier to yield a normalized form of the URL. A generalization of two or more normalization rules can include a normalization of one or both of the context and transformation portions.
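As a rough illustration of the rule structure described in the abstract, the Python sketch below models a normalization rule as a context portion and a transformation portion. The representation is an assumption made for this example only: the context is a regular expression matched against the whole URL, and the transformation is a set of query parameters to remove. The names `NormalizationRule`, `applies_to`, `normalize`, and the example.com URLs are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


@dataclass(frozen=True)
class NormalizationRule:
    # Context portion: decides whether the rule applies to a URL.
    context: str            # regular expression (assumed representation)
    # Transformation portion: what to change in an applicable URL.
    drop_params: frozenset  # query parameters to remove (assumed representation)

    def applies_to(self, url: str) -> bool:
        return re.fullmatch(self.context, url) is not None

    def normalize(self, url: str) -> str:
        if not self.applies_to(url):
            return url  # rule not applicable; leave the URL unchanged
        parts = urlsplit(url)
        kept = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in self.drop_params
        ]
        return urlunsplit(parts._replace(query=urlencode(kept)))


import re  # used by applies_to; imported here to keep the sketch in one block

rule = NormalizationRule(
    context=r"http://example\.com/story\?.*",
    drop_params=frozenset({"sessionid", "ref"}),
)

# Applicable URL: the session and referrer parameters are removed.
print(rule.normalize("http://example.com/story?id=42&sessionid=abc&ref=home"))
# Non-applicable URL: returned unchanged.
print(rule.normalize("http://example.com/other?id=42&sessionid=abc"))
```

In this sketch the `id` parameter plays the role of the content determinative component: it survives normalization, while parameters that do not affect the returned content are dropped.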
49 Citations
21 Claims
1. A method comprising:
- grouping a plurality of uniform resource locators (URLs) that correspond to a resource, each group having URLs whose resource is determined to correspond and each resource determined to be different between groups;
- examining each group of URLs to determine at least one normalization rule for the group based on the URLs in the group, each URL in the group comprising at least one component determinative of the resource represented by the URLs in that group;
- examining at least two normalization rules generated from different groups to determine whether the at least two normalization rules can be generalized into one generalized normalization rule for use with the different groups, the generalized normalization rule to be used to normalize URLs corresponding to both same and different resources and generalizes the at least one resource determinative component.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-readable medium storing computer-executable program code comprising code to:
- group a plurality of uniform resource locators (URLs) that correspond to a resource, each group having URLs whose resource is determined to correspond and each resource determined to be different between groups;
- examine each group of URLs to determine at least one normalization rule for the group based on the URLs in the group, each URL in the group comprising at least one component determinative of the resource represented by the URLs in that group;
- examine at least two normalization rules generated from different groups to determine whether the at least two normalization rules can be generalized into one generalized normalization rule for use with the different groups, the generalized normalization rule to be used to normalize URLs corresponding to both same and different resources and generalizes the at least one resource determinative component.
Dependent claims: 9, 10, 11, 12, 13, 14
15. An apparatus comprising:
one or more processors configured to:
- group a plurality of uniform resource locators (URLs) that correspond to a resource, each group having URLs whose resource is determined to correspond and each resource determined to be different between groups;
- examine each group of URLs to determine at least one normalization rule for the group based on the URLs in the group, each URL in the group comprising at least one component determinative of the resource represented by the URLs in that group;
- examine at least two normalization rules generated from different groups to determine whether the at least two normalization rules can be generalized into one generalized normalization rule for use with the different groups, the generalized normalization rule to be used to normalize URLs corresponding to both same and different resources and generalizes the at least one resource determinative component.
Dependent claims: 16, 17, 18, 19, 20, 21
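The three steps recited in the independent claims (grouping, per-group rule determination, and generalization across groups) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented method: resources are assumed to be identified by a precomputed content fingerprint, a query parameter is assumed to be resource determinative when its value is constant within a group, and a rule is represented as a plain tuple. The function names, the `"*"` wildcard, and the sample data are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl

# Hypothetical input: (URL, fingerprint of the content it returned).
# URLs with the same fingerprint are treated as the same resource.
observations = [
    ("http://example.com/story?id=1&sid=aaa", "hash-A"),
    ("http://example.com/story?id=1&sid=bbb", "hash-A"),
    ("http://example.com/story?id=2&sid=ccc", "hash-B"),
    ("http://example.com/story?id=2&sid=ddd", "hash-B"),
]


def group_by_resource(obs):
    """Step 1: group URLs whose resource is determined to correspond."""
    groups = defaultdict(list)
    for url, fingerprint in obs:
        groups[fingerprint].append(url)
    return list(groups.values())


def derive_rule(urls):
    """Step 2: derive one rule for a group (assumed heuristic).

    Parameters with a constant value across the group are treated as
    resource determinative (context); parameters whose value varies are
    treated as irrelevant to the content and dropped (transformation).
    """
    values = defaultdict(set)
    for url in urls:
        for key, value in parse_qsl(urlsplit(url).query):
            values[key].add(value)
    parts = urlsplit(urls[0])
    context = {k: next(iter(v)) for k, v in values.items() if len(v) == 1}
    drop = frozenset(k for k, v in values.items() if len(v) > 1)
    return (parts.netloc + parts.path, context, drop)


def generalize(rule_a, rule_b):
    """Step 3: merge two rules that differ only in the values of their
    resource-determinative parameters; return None if they cannot merge."""
    prefix_a, ctx_a, drop_a = rule_a
    prefix_b, ctx_b, drop_b = rule_b
    if prefix_a != prefix_b or drop_a != drop_b or ctx_a.keys() != ctx_b.keys():
        return None
    # "*" stands for "any value": the determinative component is generalized.
    context = {k: (ctx_a[k] if ctx_a[k] == ctx_b[k] else "*") for k in ctx_a}
    return (prefix_a, context, drop_a)


rules = [derive_rule(group) for group in group_by_resource(observations)]
general = generalize(rules[0], rules[1])
print(general)
```

With the sample data, the two per-group rules differ only in the value of `id`, so the generalized rule keeps `id` as a wildcarded context component and still drops `sid`. It could then be applied to URLs for stories that were never observed. Note that the constant-value heuristic needs at least two URLs per group to tell determinative parameters apart from irrelevant ones.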
Specification