Identifying related concepts of URLs and domain names
Abstract
A solution for identifying related concepts of URLs and domain names includes using structural parsing to extract information from user input comprising a URL or domain name. The information includes one or more of a protocol, a location, and a subdirectory. Semantic parsing of the information is used to identify a first one or more concepts represented by one or more tokens within the extracted information. A concept association map is queried to retrieve a second one or more concepts related to the first one or more concepts. Each of the concepts represents a unit of thought, expressed by a term, letter, or symbol. The concept association map includes a representation of concepts, concept metadata, and relationships between the concepts. The first one or more concepts and the second one or more concepts are ranked, and the ranked concepts are stored for displaying to one or more users of the computer platform.
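As an illustration of the structural parsing step named in the abstract, the following is a minimal sketch, not the patented implementation. It assumes that "protocol" corresponds to the URL scheme, "location" to the host name, and "subdirectory" to the path segments; the function name `structural_parse` and the returned dictionary keys are hypothetical.

```python
from urllib.parse import urlsplit


def structural_parse(user_input: str) -> dict:
    """Split a URL or bare domain name into protocol, location, and subdirectories.

    Assumption: protocol = scheme, location = host, subdirectory = path segment.
    """
    text = user_input.strip()
    # urlsplit only recognises the host when "//" precedes it, so a bare
    # domain such as "example.com/shop" is prefixed with "//" before parsing.
    has_scheme = "://" in text
    parts = urlsplit(text if has_scheme else "//" + text)
    return {
        "protocol": parts.scheme if has_scheme else None,
        "location": parts.hostname,
        "subdirectories": [seg for seg in parts.path.split("/") if seg],
    }
```

Under these assumptions, `structural_parse("http://www.example.com/shop/shoes")` yields the protocol `"http"`, the location `"www.example.com"`, and the subdirectories `["shop", "shoes"]`, which a later semantic parsing step could tokenize further.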
21 Claims
1. A computer implemented method comprising:
- using structural parsing to extract information from user input comprising a URL or domain name, the information comprising one or more of a protocol, a location, and a subdirectory;
- using semantic parsing of the information to identify a first one or more concepts represented by one or more tokens within the extracted information;
- determining whether the domain name can be mapped to one or more concepts in the concept association map by switching term positions or changing numbers;
- when the domain name can be mapped and if the mapped concepts have high score, identifying the concepts as seed concepts for further querying the concept association map;
- when the mapped concepts do not have a high enough score, or the domain name cannot be mapped, then determining whether the input domain name can be mapped to a concept in the concept association map by typographical error correction, the correction comprising one or more of insertion, deletion, and switching or replacement of 1 or 2 characters; and
- when the input domain name cannot be mapped by typographical error correction, or if concepts mapped as a result of typographical error correction do not have a high score, determining how to break the domain name into URL tokens by inserting separators at correction positions and correcting the tokens;
- querying a concept association map to retrieve a second one or more concepts related to the first one or more concepts, each of the concepts representing a unit of thought, expressed by a term, letter, or symbol, the concept association map comprising a representation of concepts, concept metadata, and relationships between the concepts;
- ranking the first one or more concepts and the second one or more concepts to create ranked concepts; and
- storing the ranked concepts for displaying to one or more users of the computer platform.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
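The typographical error correction step in claim 1 (insertion, deletion, and switching or replacement of 1 or 2 characters) might be sketched as a bounded edit search. This is a sketch under stated assumptions rather than the claimed method itself: the concept association map is reduced to a hypothetical dictionary `CONCEPT_SCORES` from concept to score, `HIGH_SCORE` is an assumed threshold, and the function names are invented for illustration.

```python
import string

# Hypothetical stand-in for the concept association map: concept -> score.
CONCEPT_SCORES = {"travel": 0.9, "hotel": 0.8, "flights": 0.7}
HIGH_SCORE = 0.5  # assumed cut-off for a "high score"


def edits1(word):
    """All strings one insertion, deletion, switch, or replacement away from word."""
    letters = string.ascii_lowercase + string.digits
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    switches = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    return deletes | switches | replaces | inserts


def correct_to_concept(name, max_edits=2):
    """Return (concept, score, edits) for the best high-scoring concept within
    max_edits single-character edits of name, or None if there is none."""
    frontier = {name.lower()}
    for n_edits in range(1, max_edits + 1):
        # Expand every candidate by one more edit.
        frontier = {e for w in frontier for e in edits1(w)}
        hits = [(CONCEPT_SCORES[w], w) for w in frontier
                if CONCEPT_SCORES.get(w, 0.0) >= HIGH_SCORE]
        if hits:
            score, concept = max(hits)
            return concept, score, n_edits
    return None
```

With the sample data above, `correct_to_concept("travle")` resolves to `"travel"` after one switch, while a name with no concept within two edits returns `None`, which in the claim's terms would trigger the token-breaking fallback.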
11. An apparatus comprising:
- a memory; and
- one or more processors configured to:
  - use structural parsing to extract information from user input comprising a URL or domain name, the information comprising one or more of a protocol, a location, and a subdirectory, wherein structural parsing comprises:
    - determining whether the domain name can be mapped to one or more concepts in the concept association map by switching term positions or changing numbers;
    - when the domain name can be mapped and if the mapped concepts have high score, identifying the concepts as seed concepts for further querying the concept association map;
    - when the mapped concepts do not have a high enough score, or the domain name cannot be mapped, then determining whether the input domain name can be mapped to a concept in the concept association map by typographical error correction, the correction comprising one or more of insertion, deletion, and switching or replacement of 1 or 2 characters; and
    - when the input domain name cannot be mapped by typographical error correction, or if concepts mapped as a result of typographical error correction do not have a high score, determining how to break the domain name into URL tokens by inserting separators at correction positions and correcting the tokens;
  - use semantic parsing of the information to identify a first one or more concepts represented by one or more tokens within the extracted information;
  - query a concept association map to retrieve a second one or more concepts related to the first one or more concepts, each of the concepts representing a unit of thought, expressed by a term, letter, or symbol, the concept association map comprising a representation of concepts, concept metadata, and relationships between the concepts;
  - rank the first one or more concepts and the second one or more concepts to create ranked concepts; and
  - store the ranked concepts for displaying to one or more users of the computer platform.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
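The final fallback in the structural parsing of claim 11, breaking a domain name into URL tokens by inserting separators, could be approximated by dictionary-based word segmentation. The sketch below swaps in a standard dynamic-programming word break, which is not necessarily how the claim places separators "at correction positions", and it omits the token correction part. The vocabulary `VOCAB` and the function name `segment` are hypothetical.

```python
# Hypothetical vocabulary of terms known to the concept association map.
VOCAB = {"cheap", "flight", "flights", "hotel", "hotels", "new", "york"}


def segment(name, vocab=VOCAB):
    """Split name into the fewest vocabulary terms, or return None if impossible."""
    # best[i] holds the shortest token list covering name[:i], if any.
    best = [None] * (len(name) + 1)
    best[0] = []
    for end in range(1, len(name) + 1):
        for start in range(end):
            piece = name[start:end]
            if best[start] is not None and piece in vocab:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[-1]
```

Preferring the fewest tokens is a design choice made here for simplicity; a scoring scheme based on concept scores would be an equally plausible reading of the claim.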
21. A program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method, the method comprising:
- using structural parsing to extract information from user input comprising a URL or domain name, the information comprising one or more of a protocol, a location, and a subdirectory, wherein structural parsing comprises:
  - determining whether the domain name can be mapped to one or more concepts in the concept association map by switching term positions or changing numbers;
  - when the domain name can be mapped and if the mapped concepts have high score, identifying the concepts as seed concepts for further querying the concept association map;
  - when the mapped concepts do not have a high enough score, or the domain name cannot be mapped, then determining whether the input domain name can be mapped to a concept in the concept association map by typographical error correction, the correction comprising one or more of insertion, deletion, and switching or replacement of 1 or 2 characters; and
  - when the input domain name cannot be mapped by typographical error correction, or if concepts mapped as a result of typographical error correction do not have a high score, determining how to break the domain name into URL tokens by inserting separators at correction positions and correcting the tokens;
- using semantic parsing of the information to identify a first one or more concepts represented by one or more tokens within the extracted information;
- querying a concept association map to retrieve a second one or more concepts related to the first one or more concepts, each of the concepts representing a unit of thought, expressed by a term, letter, or symbol, the concept association map comprising a representation of concepts, concept metadata, and relationships between the concepts;
- ranking the first one or more concepts and the second one or more concepts to create ranked concepts; and
- storing the ranked concepts for displaying to one or more users of the computer platform.
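The querying and ranking steps shared by the independent claims can be pictured with a small weighted graph. This is a rough sketch only: the claims do not specify a storage layout or a ranking formula, so the nested dictionary `CONCEPT_MAP`, the relationship weights, and the scoring rule (seed score multiplied by relationship weight, summed per related concept) are all assumptions, and concept metadata is left out.

```python
# Hypothetical concept association map: concept -> {related concept: weight}.
CONCEPT_MAP = {
    "flights": {"airline": 0.9, "travel": 0.8},
    "hotel": {"travel": 0.7, "resort": 0.6},
}


def related_and_ranked(seed_scores, concept_map=CONCEPT_MAP):
    """Rank the seed (first) concepts together with their related (second) concepts.

    seed_scores maps each first concept to its score. A related concept
    accumulates seed_score * weight from every seed that points to it.
    """
    scores = dict(seed_scores)
    for seed, seed_score in seed_scores.items():
        for neighbour, weight in concept_map.get(seed, {}).items():
            if neighbour in seed_scores:
                continue  # seeds keep their own score
            scores[neighbour] = scores.get(neighbour, 0.0) + seed_score * weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

For the seeds `{"flights": 1.0, "hotel": 0.5}`, the concept `"travel"` is reachable from both seeds and therefore ranks above either seed, which shows how a related concept can outrank the concepts parsed directly from the URL under this assumed formula.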
Specification