System for and method of identifying closely matching textual identifiers, such as domain names
First Claim
1. A computer-implemented method of identifying a set of textual identifiers comprising:
maintaining a log of requests for a predetermined date range to resolve unresolvable textual identifiers in one or more domain name requests, wherein the log comprises geolocation information for each request in the log;
identifying a set of unique unresolvable textual identifiers based on the log;
determining a number of requests for each unique textual identifier in the log;
creating a first association between each unique unresolvable textual identifier and one or more requests corresponding to each unique unresolvable textual identifier;
creating one or more tokens for each of the unique unresolvable textual identifiers contained within the first association;
creating a second association between the one or more tokens and each unique unresolvable textual identifier for which the one or more tokens were created by inverting the first association, wherein the second association comprises the number of the requests and geolocation information corresponding to each request; and
sorting the second association according to the number of the requests associated with each of the one or more tokens, wherein each of the unique unresolvable textual identifiers comprises a sequence of symbols, wherein each of the one or more tokens comprises an n-gram, and wherein the n-gram is a subsequence of n items from the sequence of symbols forming each of the unique unresolvable textual identifiers.
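The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ngrams` and `build_index`, the choice of n = 3, the use of contiguous character n-grams, and the representation of the log as (domain, geolocation) pairs already filtered to the date range are all assumptions made for illustration. The claim's sorting step is read here as ordering each token's entries by request count, which is one possible interpretation.

```python
from collections import Counter, defaultdict


def ngrams(text, n=3):
    """Return the distinct contiguous n-symbol substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(log, n=3):
    """Build a token -> entries mapping from a log of failed lookups.

    log is an iterable of (domain, geolocation) pairs, one per request,
    assumed to be restricted to the date range of interest already.
    """
    # First association: each unique identifier -> its requests
    # (each request is represented only by its geolocation here).
    requests_by_domain = defaultdict(list)
    for domain, geo in log:
        requests_by_domain[domain].append(geo)

    # Second association (the inversion): each token -> the identifiers
    # containing it, with request count and per-location counts.
    index = defaultdict(list)
    for domain, geos in requests_by_domain.items():
        entry = (domain, len(geos), Counter(geos))
        for token in ngrams(domain, n):
            index[token].append(entry)

    # Order each token's entries by request count, highest first;
    # ties are broken alphabetically so the order is deterministic.
    for entries in index.values():
        entries.sort(key=lambda e: (-e[1], e[0]))
    return dict(index)


log = [
    ("gogle.com", "US"),
    ("gogle.com", "DE"),
    ("gogle.com", "US"),
    ("goggle.com", "US"),
]
index = build_index(log)
for domain, count, geos in index["gog"]:
    print(domain, count, dict(geos))
```

Storing the count and the per-location tally in every entry keeps lookups simple at the cost of memory; a larger system would more likely store identifier IDs and keep the counts in a separate table.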
1 Assignment
0 Petitions
Abstract
Methods and systems provide for tracking or logging requests to resolve non-existent domains (NXDomains) and for organizing the NXDomains to support searching of the domain names, including ranking the NXDomains by popularity, e.g., number of hits or potential traffic based on the number of requests made for the NXDomain. NXDomain logs may be organized to support searching by creating an inverted index including n-grams of the NXDomains. Searching includes identifying a target substring in one or more of the indexes, selecting those matching NXDomains that satisfy some threshold criteria, and displaying the NXDomains in a selected order, such as by the demand or popularity associated with, for example, a selected geographical location from which resolution requests targeting the respective NXDomains originate.
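The search flow described in the abstract can be sketched as follows. This is a hedged illustration only: `search`, `build_index`, the `min_requests` threshold, the n = 3 token length, and the `{domain: Counter({location: count})}` input shape are hypothetical choices, not details taken from the specification. Targets shorter than n produce no tokens and return no results in this simplified version.

```python
from collections import Counter, defaultdict

N = 3  # assumed n-gram length


def ngrams(text, n=N):
    """Return the distinct contiguous n-symbol substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(counts_by_domain, n=N):
    """Map each n-gram to the set of NXDomains that contain it."""
    index = defaultdict(set)
    for domain in counts_by_domain:
        for token in ngrams(domain, n):
            index[token].add(domain)
    return index


def search(target, index, counts_by_domain, geo=None, min_requests=1, n=N):
    """Return (domain, demand) pairs for NXDomains containing target.

    demand is the request count from geo, or the total when geo is None.
    """
    tokens = ngrams(target, n)
    if not tokens:
        return []
    # Candidates must contain every n-gram of the target.
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    results = []
    for domain in candidates:
        # Sharing all n-grams does not guarantee the substring itself.
        if target not in domain:
            continue
        geos = counts_by_domain[domain]
        demand = geos[geo] if geo is not None else sum(geos.values())
        if demand >= min_requests:  # assumed threshold criterion
            results.append((domain, demand))
    # Highest demand first; ties broken alphabetically.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


counts = {
    "gogle.com": Counter(US=2, DE=1),
    "goggle.com": Counter(US=1),
    "example.net": Counter(DE=5),
}
index = build_index(counts)
print(search("gle", index, counts))
print(search("gle", index, counts, geo="DE"))
```

Intersecting the per-token sets narrows the candidates cheaply, and the final substring check removes any identifier that contains the right n-grams in the wrong arrangement.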
14 Citations
10 Claims
1. A computer-implemented method of identifying a set of textual identifiers comprising:
maintaining a log of requests for a predetermined date range to resolve unresolvable textual identifiers in one or more domain name requests, wherein the log comprises geolocation information for each request in the log;
identifying a set of unique unresolvable textual identifiers based on the log;
determining a number of requests for each unique textual identifier in the log;
creating a first association between each unique unresolvable textual identifier and one or more requests corresponding to each unique unresolvable textual identifier;
creating one or more tokens for each of the unique unresolvable textual identifiers contained within the first association;
creating a second association between the one or more tokens and each unique unresolvable textual identifier for which the one or more tokens were created by inverting the first association, wherein the second association comprises the number of the requests and geolocation information corresponding to each request; and
sorting the second association according to the number of the requests associated with each of the one or more tokens, wherein each of the unique unresolvable textual identifiers comprises a sequence of symbols, wherein each of the one or more tokens comprises an n-gram, and wherein the n-gram is a subsequence of n items from the sequence of symbols forming each of the unique unresolvable textual identifiers.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A data processing system for identifying a set of textual identifiers, the data processing system comprising a storage device including a storage medium, wherein the storage device stores computer usable program code;
and a processor, wherein the processor executes the computer usable program code, and wherein the computer usable program code is operable to perform a method comprising:
maintaining a log of requests for a predetermined date range to resolve unresolvable textual identifiers in one or more domain name requests, wherein the log comprises geolocation information for each request in the log;
identifying a unique identifier set of unique unresolvable textual identifiers based on the log;
determining a number of requests for each unique textual identifier in the log;
creating a first association between each unique unresolvable textual identifier and one or more requests corresponding to each unique unresolvable textual identifier;
creating one or more tokens for each of the unique unresolvable textual identifiers contained within the first association;
creating a second association between the one or more tokens and each unique unresolvable textual identifier for which the one or more tokens were created by inverting the first association, wherein the second association comprises the number of the requests and geolocation information corresponding to each request; and
sorting the second association according to the number of the requests associated with each of the one or more tokens, wherein each of the unique unresolvable textual identifiers comprises a sequence of symbols, wherein each of the one or more tokens comprises an n-gram, and wherein the n-gram is a subsequence of n items from the sequence of symbols forming each of the unique unresolvable textual identifiers.
Dependent claims: 9, 10
Specification