GENERATING RESOURCES FOR SUPPORT OF ONLINE SERVICES
First Claim
1. A machine-implemented method for analyzing Wikipedia concepts and Wikipedia categories, comprising:
- counting for each Wikipedia category, a number of first ones of the Wikipedia concepts for which the Wikipedia category is a first-level Wikipedia category that directly includes the first Wikipedia concepts, a number of second ones of the Wikipedia concepts for which the Wikipedia category includes the second Wikipedia concepts only through the second Wikipedia concepts being members of other ones of the Wikipedia categories that in turn include the second Wikipedia concepts and so on up to the number of nth ones of the Wikipedia concepts for which the Wikipedia category is an nth-level Wikipedia category, n being a plural positive integer;
- for each Wikipedia category, determining which of the n levels has a highest count and classifying the Wikipedia category into the level having the highest count; and
- for each level, determining which Wikipedia categories classified into the level have the most significant ones of the concepts based at least upon a page rank of the Wikipedia category's concepts to determine a set of the classified Wikipedia categories for each level having the most significant concepts.
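The three steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The input dictionaries (`members`, `subcats`, `page_rank`), the function names, the tie-break toward the lowest level, and the scoring of a category by the summed page rank of its direct concepts are all assumptions made for illustration; the claim only requires that significance be based at least upon page rank.

```python
from collections import Counter


def level_counts(category, members, subcats, max_level):
    """Count the concepts first reached at each depth below `category`.

    Level 1 holds concepts the category includes directly; level k holds
    concepts reachable only through k-1 layers of subcategories.
    """
    counts = Counter()
    seen_concepts = set()
    seen_cats = {category}
    frontier = [category]
    for level in range(1, max_level + 1):
        new_concepts = set()
        next_frontier = []
        for cat in frontier:
            new_concepts |= members.get(cat, set()) - seen_concepts
            for child in subcats.get(cat, ()):
                if child not in seen_cats:  # guard against category cycles
                    seen_cats.add(child)
                    next_frontier.append(child)
        counts[level] = len(new_concepts)
        seen_concepts |= new_concepts
        frontier = next_frontier
    return counts


def classify(members, subcats, max_level):
    """Assign each category to the level with its highest concept count."""
    result = {}
    for cat in set(members) | set(subcats):
        counts = level_counts(cat, members, subcats, max_level)
        # Assumption: ties are resolved in favor of the lowest level.
        result[cat] = max(range(1, max_level + 1),
                          key=lambda lv: (counts[lv], -lv))
    return result


def top_categories_per_level(levels, members, page_rank, k):
    """Keep the k categories per level with the highest summed page rank."""
    by_level = {}
    for cat, lv in levels.items():
        # Assumption: a category's score is the sum over its direct concepts.
        score = sum(page_rank.get(c, 0.0) for c in members.get(cat, ()))
        by_level.setdefault(lv, []).append((score, cat))
    return {lv: [cat for _, cat in sorted(items, key=lambda t: (-t[0], t[1]))[:k]]
            for lv, items in by_level.items()}


# Hypothetical toy data, not taken from Wikipedia.
members = {
    "Science": {"Science"},
    "Physics": {"Quantum mechanics", "Optics", "Relativity"},
    "Chemistry": {"Acid"},
}
subcats = {"Science": {"Physics", "Chemistry"}}
page_rank = {"Science": 0.9, "Quantum mechanics": 0.5, "Optics": 0.2,
             "Relativity": 0.4, "Acid": 0.3}

levels = classify(members, subcats, max_level=3)
top = top_categories_per_level(levels, members, page_rank, k=1)
```

In this toy data, "Science" holds one concept directly and four through its subcategories, so it is classified as a second-level category, while "Physics" and "Chemistry" are first-level categories.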
Abstract
A system and method are provided to analyze a database of concepts organized into categories, wherein each concept is an online textual document, to determine a numerical relationship between the concepts and to determine a hierarchy for the categories.
14 Claims
Claims 2 to 12 depend on claim 1 (recited above under First Claim).
13. A system comprising:
- a parser module configured to parse Wikipedia concepts to identify for each Wikipedia concept, all other Wikipedia concepts that the Wikipedia concept hyperlinks to so as to generate a concept reference map listing all referenced Wikipedia concepts for each Wikipedia concept;
- a disambiguation page extractor module configured to identify all disambiguation pages in Wikipedia that list Wikipedia concepts that are phrased the same but correspond to different textual pages; and
- a disambiguation module configured to filter the map of referenced Wikipedia concepts to remove disambiguation pages to form a filtered Wikipedia concept reference map; and
- a similarity computation module configured to process the filtered Wikipedia concept reference map to identify, for each Wikipedia concept, a list of similarity-weighted Wikipedia concepts based at least on an intersection between referenced Wikipedia concepts for each Wikipedia concept and referenced Wikipedia concepts for the similarity-weighted Wikipedia concepts.
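The four modules of claim 13 can be sketched as a small Python pipeline. This is an illustration under stated assumptions, not the patented system: the input is assumed to be a dictionary of page title to wikitext, links are read from the `[[Target|label]]` syntax with a simple regular expression, disambiguation pages are detected by a literal `{{disambiguation}}` marker, and the similarity weight is taken to be the Jaccard ratio of the two reference sets. The claim itself only requires a weight based at least on the intersection. All function names are hypothetical.

```python
import re

# Captures the link target in [[Target]], [[Target|label]] or [[Target#Section]].
LINK_RE = re.compile(r"\[\[([^\]|#]+)")


def build_reference_map(pages):
    """Parser module: map each concept to the set of concepts it links to."""
    return {title: {m.strip() for m in LINK_RE.findall(text)}
            for title, text in pages.items()}


def find_disambiguation_pages(pages):
    """Extractor module. Assumption: such pages carry a literal marker."""
    return {title for title, text in pages.items()
            if "{{disambiguation}}" in text.lower()}


def filter_reference_map(ref_map, disambig):
    """Disambiguation module: drop disambiguation pages as keys and targets."""
    return {title: refs - disambig
            for title, refs in ref_map.items() if title not in disambig}


def similarity_lists(ref_map):
    """Similarity module: weight concept pairs by shared references."""
    result = {}
    for a, refs_a in ref_map.items():
        scored = []
        for b, refs_b in ref_map.items():
            union = refs_a | refs_b
            if a == b or not union:
                continue
            # Assumption: Jaccard ratio, intersection size over union size.
            weight = len(refs_a & refs_b) / len(union)
            if weight > 0:
                scored.append((b, weight))
        result[a] = sorted(scored, key=lambda p: (-p[1], p[0]))
    return result


# Hypothetical toy pages, not real Wikipedia content.
pages = {
    "Mercury": "{{disambiguation}} [[Mercury (planet)]] [[Mercury (element)]]",
    "Mercury (planet)": "A [[Planet]] orbiting the [[Sun]]. See [[Mercury]].",
    "Venus": "A [[Planet]] orbiting the [[Sun]].",
    "Mercury (element)": "A [[Metal]]. See [[Mercury]].",
}

ref_map = build_reference_map(pages)
filtered = filter_reference_map(ref_map, find_disambiguation_pages(pages))
sims = similarity_lists(filtered)
```

The toy data shows why the filtering step matters: without it, the planet and the element would appear related solely because both link to the shared "Mercury" disambiguation page.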
14. The system of claim 13, wherein the system is further configured to process an input from a user to identify a set of Wikipedia concepts related to the input and to further process the set of Wikipedia concepts with regard to the list of similarity-weighted Wikipedia concepts to identify a set of related Wikipedia concepts to the set of Wikipedia concepts.
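The two-stage lookup of claim 14 might look roughly like the following Python sketch. The claim does not say how the input is matched to concepts or how weights are combined, so both choices here are assumptions: concepts are matched by case-insensitive substring search on their titles, and candidate concepts are ranked by the sum of their similarity weights across the matched concepts. The function names and the `sims` data are hypothetical.

```python
from collections import defaultdict


def concepts_for_input(text, titles):
    """Assumption: a concept matches when its title occurs in the input."""
    lowered = text.lower()
    return {title for title in titles if title.lower() in lowered}


def related_concepts(seed, sims, top_k=5):
    """Rank concepts outside `seed` by their summed similarity weight."""
    scores = defaultdict(float)
    for concept in seed:
        for other, weight in sims.get(concept, ()):
            if other not in seed:
                scores[other] += weight
    ranked = sorted(scores.items(), key=lambda p: (-p[1], p[0]))
    return ranked[:top_k]


# Hypothetical similarity lists: concept -> [(similar concept, weight), ...].
sims = {
    "Venus": [("Mercury (planet)", 0.8), ("Mars", 0.6)],
    "Mars": [("Venus", 0.6), ("Phobos", 0.5)],
}
titles = {"Venus", "Mars", "Phobos", "Mercury (planet)"}

seed = concepts_for_input("How far is Venus from Mars?", titles)
related = related_concepts(seed, sims)
```

Here the input matches "Venus" and "Mars", and the related set contains the concepts that those two point to but that the user did not mention.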
Specification