GENERATING RESOURCES FOR SUPPORT OF ONLINE SERVICES

US 20160026723A1
Filed: 09/30/2015
Published: 01/28/2016
Est. Priority Date: 11/27/2013
Status: Active Grant

Abstract

A system and method is provided to analyze a database of concepts organized into categories, wherein each concept is an online textual document, to determine a numerical relationship between the concepts and to determine a hierarchy for the categories.

14 Claims

1. A machine-implemented method for analyzing Wikipedia concepts and Wikipedia categories, comprising:
- counting for each Wikipedia category, a number of first ones of the Wikipedia concepts for which the Wikipedia category is a first-level Wikipedia category that directly includes the first Wikipedia concepts, a number of second ones of the Wikipedia concepts for which the Wikipedia category includes the second Wikipedia concepts only through the second Wikipedia concepts being members of other ones of the Wikipedia categories that in turn include the second Wikipedia concepts and so on up to the number of nth ones of the Wikipedia concepts for which the Wikipedia category is an nth-level Wikipedia category, n being a plural positive integer;
  
  for each Wikipedia category, determining which of the n levels has a highest count and classifying the Wikipedia category into the level having the highest count; and
  
  for each level, determining which Wikipedia categories classified into the level have the most significant ones of the concepts based at least upon a page rank of the Wikipedia category's concepts to determine a set of the classified Wikipedia categories for each level having the most significant concepts.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12):
  - 2. The machine-implemented method of claim 1, wherein a subset of the Wikipedia categories have a cyclic organization, the method further comprising breaking the cyclic organization for the subset of Wikipedia categories prior to classifying the Wikipedia categories.
  - 3. The machine-implemented method of claim 1, further comprising, for each Wikipedia concept, identifying all other Wikipedia concepts that the Wikipedia concept hyperlinks to so as to generate a map of referenced Wikipedia concepts for each Wikipedia concept.
  - 4. The machine-implemented method of claim 3, further comprising:
    - receiving an input from a user;
      
      analyzing the input to identify a first set of Wikipedia concepts suggested by the input; and
      
      referring the first set of Wikipedia concepts through the map of referenced Wikipedia concepts to identify additional Wikipedia concepts related to the first set of Wikipedia concepts.
  - 5. The machine-implemented method of claim 1, further comprising:
    - receiving an input from a user;
      
      analyzing the input to identify a first set of Wikipedia concepts suggested by the input; and
      
      reducing the first set of Wikipedia concepts according to whether each Wikipedia concept in the first set exceeds a threshold page rank concept to form a reduced set of Wikipedia concepts having significant page ranks.
  - 6. The machine-implemented method of claim 7, further comprising:
    - analyzing use statistics to identify a usage popularity for each Wikipedia concept, wherein reducing the first set of Wikipedia concepts to form the reduced set of Wikipedia concepts further comprises applying a usage popularity threshold to the first set of Wikipedia concepts such that only Wikipedia concepts in the first set that have a usage popularity exceeding the usage popularity threshold can belong to the reduced set of Wikipedia concepts.
  - 7. The machine-implemented method of claim 2, further comprising:
    - forming a string from each Wikipedia category by using all non-capitalized letters; and
      
      comparing the strings from all the Wikipedia categories to identify those Wikipedia categories having identical strings; and
      
      merging any Wikipedia categories having identical strings into a single corresponding merged Wikipedia category.
  - 8. The machine-implemented method of claim 1, further comprising:
    - forming an inverted index from each Wikipedia concept.
  - 9. The machine-implemented method of claim 10, further comprising:
    - receiving a textual input from a user; and
      
      comparing words in the textual input to the inverted index to identify a set of related Wikipedia concepts to the textual input.
  - 10. The machine-implemented method of claim 11, further comprising comparing the related Wikipedia concepts to the set of classified Wikipedia categories for each level having the most significant Wikipedia concepts to identify a reduced set of classified Wikipedia categories including the related Wikipedia concepts.
  - 11. The machine-implemented method of claim 12, further comprising using the set of related Wikipedia concepts and the reduced set of classified Wikipedia categories to suggest content to the user that is related to their textual input.
  - 12. The machine-implemented method of claim 3, further comprising:
    - analyzing the map of referenced Wikipedia concepts to identify any intersection between the referenced Wikipedia concepts from each Wikipedia concept to the referenced Wikipedia concepts to all the remaining Wikipedia concepts to determine a similarity-weighted concept relationship between all the Wikipedia concepts.
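The counting, classifying and ranking steps of claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names, the `subcats`/`members`/`page_rank` dictionaries, the cap `max_level` (standing in for n), the cutoff `k`, the tie-break toward the lowest level, and the use of a summed page rank as the significance score are all hypothetical choices. The visited-set guard is only a stand-in for the cycle breaking of claim 2.

```python
from collections import Counter, defaultdict


def level_counts(category, subcats, members, max_level):
    """Count concepts by the shallowest level at which `category` reaches them.

    Level 1 means direct membership; level 2 means membership only via a
    subcategory, and so on. The visited sets keep cyclic category graphs
    from looping (an assumption standing in for explicit cycle breaking).
    """
    counts = Counter()
    seen_concepts = set()
    seen_cats = {category}
    frontier = [category]
    for level in range(1, max_level + 1):
        next_frontier = []
        for cat in frontier:
            for concept in members.get(cat, ()):
                if concept not in seen_concepts:
                    seen_concepts.add(concept)
                    counts[level] += 1
            for child in subcats.get(cat, ()):
                if child not in seen_cats:
                    seen_cats.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return counts


def classify(category, subcats, members, max_level):
    """Return the level with the highest count (ties go to the lowest level)."""
    counts = level_counts(category, subcats, members, max_level)
    if not counts:
        return None
    return max(sorted(counts), key=counts.__getitem__)


def top_categories_per_level(subcats, members, page_rank, max_level, k):
    """Group categories by classified level and keep the k highest scoring.

    The score is the summed page rank of a category's direct concepts,
    which is an assumed measure of significance.
    """
    categories = set(subcats) | set(members)
    for children in subcats.values():
        categories.update(children)
    by_level = defaultdict(list)
    for cat in categories:
        level = classify(cat, subcats, members, max_level)
        if level is None:
            continue
        score = sum(page_rank.get(c, 0.0) for c in members.get(cat, ()))
        by_level[level].append((score, cat))
    return {
        level: [cat for _, cat in sorted(items, key=lambda t: (-t[0], t[1]))[:k]]
        for level, items in by_level.items()
    }


# Toy data, invented for illustration only.
subcats = {"Science": ["Physics", "Biology"], "Physics": ["Optics"]}
members = {
    "Science": ["Scientific method"],
    "Physics": ["Force", "Energy"],
    "Biology": ["Cell"],
    "Optics": ["Lens"],
}
page_rank = {"Scientific method": 0.2, "Force": 0.3, "Energy": 0.4,
             "Cell": 0.5, "Lens": 0.1}

levels = top_categories_per_level(subcats, members, page_rank, max_level=3, k=2)
print(levels)
```

In this toy data "Science" reaches one concept directly and three through its immediate subcategories, so it is classified as a second-level category, while "Physics", "Biology" and "Optics" are classified as first-level categories.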

13. A system comprising:
- a parser module configured to parse Wikipedia concepts to identify for each Wikipedia concept, all other Wikipedia concepts that the Wikipedia concept hyperlinks to so as to generate a concept reference map listing all referenced Wikipedia concepts for each Wikipedia concept;
  
  a disambiguation page extractor module configured to identify all disambiguation pages in Wikipedia that list Wikipedia concepts that are phrased the same but correspond to different textual pages; and
  
  a disambiguation module configured to filter the map of referenced Wikipedia concepts to remove disambiguation pages to form a filtered Wikipedia concept reference map; and
  
  a similarity computation module configured to process the filtered Wikipedia concept reference map to identify, for each Wikipedia concept, a list of similarity-weighted Wikipedia concepts based at least on an intersection between referenced Wikipedia concepts for each Wikipedia concept and referenced Wikipedia concepts for the similarity-weighted Wikipedia concepts.

14. The system of claim 18, wherein the system is further configured to process an input from a user to identify a set of Wikipedia concepts related to the input and to further process the set of Wikipedia concepts with regard to the list of similarity-weighted Wikipedia concepts to identify a set of related Wikipedia concepts to the set of Wikipedia concepts.
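The disambiguation filter and similarity computation modules of claim 13 can be sketched in the same way. The claim only requires a weight based at least on the intersection of referenced concepts; the Jaccard ratio used here is an assumed choice, and the function names and toy data are hypothetical.

```python
def filter_reference_map(ref_map, disambiguation_pages):
    """Drop disambiguation pages both as sources and as referenced targets."""
    return {
        concept: {r for r in refs if r not in disambiguation_pages}
        for concept, refs in ref_map.items()
        if concept not in disambiguation_pages
    }


def similarity_weighted(ref_map):
    """For each concept, list the other concepts that share references.

    The weight is |A & B| / |A | B| (Jaccard), an assumed weighting.
    Concept pairs with no shared reference are omitted.
    """
    result = {}
    for a, refs_a in ref_map.items():
        scored = []
        for b, refs_b in ref_map.items():
            if a == b:
                continue
            shared = refs_a & refs_b
            if shared:
                scored.append((b, len(shared) / len(refs_a | refs_b)))
        # Highest weight first; names break ties for a stable order.
        scored.sort(key=lambda t: (-t[1], t[0]))
        result[a] = scored
    return result


# Toy concept reference map, invented for illustration only.
ref_map = {
    "Lens": {"Optics", "Glass", "Mercury"},
    "Prism": {"Optics", "Glass", "Light"},
    "Cell": {"Biology"},
    "Mercury": {"Lens", "Cell"},  # treated as a disambiguation page
}
filtered = filter_reference_map(ref_map, {"Mercury"})
similar = similarity_weighted(filtered)
print(similar["Lens"])
```

The pairwise loop is quadratic in the number of concepts, which is fine for a sketch; a corpus the size of Wikipedia would need an index from referenced concept to referring concepts so that only pairs with a shared reference are compared.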

Current Assignee
NTT Docomo Incorporated (Nippon Telegraph and Telephone Corporation)
Original Assignee
NTT Docomo Incorporated (Nippon Telegraph and Telephone Corporation)
Inventors
Subasic, Pero; Shin, Hyung Sik; Sujithan, Ronald; Yin, Hongfeng; Mukherjee, Sayandev; Akinaga, Yoshikazu

Granted Patent

US 9,646,099 B2
US Class Current

1/1
CPC Class Codes

G06F 16/22   Indexing; Data structures t...

G06F 16/285   Clustering or classification

G06F 16/335   Filtering based on addition...

G06F 16/355   Class or cluster creation o...

G06F 16/36   Creation of semantic tools,...

G06F 16/954   Navigation, e.g. using cate...
