UNIFORM RESOURCE IDENTIFIER ALIGNMENT
Abstract
Subject matter disclosed herein may relate to alignment of uniform resource identifiers associated with web pages, and further may relate to multiple sequence alignment of uniform resource identifiers. In one or more example embodiments, multiple sequence alignment techniques may provide improved tokenization of uniform resource identifiers associated with web pages, which may provide improved performance of applications such as, for example, uniform resource identifier normalization, sitemap construction, etc.
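The two steps named in the abstract, segmenting uniform resource identifiers into tokens and aligning the resulting sequences, can be illustrated with a minimal sketch. It is not the alignment process of the disclosed embodiments: it assumes a simple progressive alignment in which each new token sequence is aligned against the columns built so far using Needleman-Wunsch-style dynamic programming. The function names (`tokenize`, `add_to_profile`, `align`), the `GAP` symbol, the scoring values, and the example.com URIs are all hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit

GAP = "-"  # hypothetical gap symbol; assumes no URI token is literally "-"


def tokenize(uri):
    """Segment a URI into tokens: scheme, host, path segments, query keys and values."""
    parts = urlsplit(uri)
    tokens = [parts.scheme, parts.netloc]
    tokens += [seg for seg in parts.path.split("/") if seg]
    if parts.query:
        for pair in parts.query.split("&"):
            tokens += pair.split("=", 1)
    return tokens


def column_score(column, token):
    """Fraction of already-aligned rows whose entry in this column equals the token."""
    return sum(1 for t in column if t == token) / len(column)


def add_to_profile(columns, seq, gap_penalty=-0.5):
    """Align one token sequence against the existing columns and return the new columns."""
    n, m = len(columns), len(seq)
    rows = len(columns[0]) if columns else 0
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap_penalty
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(columns[i - 1], seq[j - 1]),
                score[i - 1][j] + gap_penalty,  # gap in the new sequence
                score[i][j - 1] + gap_penalty,  # gap in every earlier row
            )
    # Trace back from the bottom-right cell to rebuild the columns.
    out = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + column_score(columns[i - 1], seq[j - 1])):
            out.append(columns[i - 1] + [seq[j - 1]])
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap_penalty):
            out.append(columns[i - 1] + [GAP])
            i -= 1
        else:
            out.append([GAP] * rows + [seq[j - 1]])
            j -= 1
    out.reverse()
    return out


def align(uris):
    """Tokenize each URI, align progressively, and return one aligned row per URI."""
    columns = []
    for uri in uris:
        columns = add_to_profile(columns, tokenize(uri))
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    example_uris = [
        "http://example.com/shop/item/42",
        "http://example.com/shop/item/57",
        "http://example.com/shop/sale/item/99",
    ]
    for row in align(example_uris):
        print(row)
```

In this example the third URI carries an extra `sale` path segment, so the alignment opens a gap column in the first two rows and the `item` tokens and trailing identifiers still line up. Aligned columns of this kind are the sort of structure that downstream uses such as URI normalization could inspect, although how the disclosed embodiments score and merge sequences is not specified here.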
33 Claims
1. A method, comprising:
- segmenting a plurality of uniform resource identifiers into one or more tokens to produce one or more sequences; and
- analyzing the one or more sequences using a multiple sequence alignment process to produce a plurality of aligned sequence sets corresponding to the plurality of uniform resource identifiers.
Dependent claims: 2-11.
12. An article, comprising:
- a storage medium having stored thereon instructions that, if executed, direct a computing platform to:
  - segment a plurality of uniform resource identifiers into one or more tokens to produce one or more sequences; and
  - analyze the one or more sequences using a multiple sequence alignment process to produce a plurality of aligned sequence sets corresponding to the plurality of uniform resource identifiers.
Dependent claims: 13-22.
23. An apparatus, comprising:
- means for segmenting a plurality of uniform resource identifiers into one or more tokens to produce one or more sequences; and
- means for analyzing the one or more sequences using a multiple sequence alignment process to produce a plurality of aligned sequence sets corresponding to the plurality of uniform resource identifiers.
Dependent claims: 24-33.
Specification