Method and system for ranking words and concepts in a text using graph-based ranking
Abstract
The present invention is a method and system for identifying words, text fragments, or concepts of interest in a corpus of text. A graph is built which covers the corpus of text. The graph includes nodes and links, where nodes represent a word or a concept and links between the nodes represent directed relation names. A score is then computed for each node in the graph. Scores can also be computed for larger sub-graph portions of the graph (such as tuples). The scores are used to identify desired sub-graph portions of the graph, those sub-graph portions being referred to as graph fragments.
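As an illustration of the graph-building and node-scoring steps described in the abstract, the following is a minimal sketch only. It assumes the textual input has already been reduced to (source, relation, target) triples, and it uses a PageRank-style iteration as a stand-in scoring scheme; the abstract does not state which ranking formula is used. The function names `build_graph` and `score_nodes`, the relation labels, and the `damping` and `iterations` parameters are hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Collect nodes and directed links from (source, relation, target) triples."""
    nodes = set()
    out_links = defaultdict(set)
    for source, _relation, target in triples:
        nodes.update((source, target))
        # The relation name is ignored here; only the direction of the link is kept.
        out_links[source].add(target)
    return nodes, out_links


def score_nodes(nodes, out_links, damping=0.85, iterations=50):
    """Assign a score to every node with a PageRank-style iteration (assumed scheme)."""
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(scores[v] for v in nodes if not out_links.get(v))
        new_scores = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for source in nodes:
            targets = out_links.get(source, ())
            for target in targets:
                new_scores[target] += damping * scores[source] / len(targets)
        scores = new_scores
    return scores


# Hypothetical triples standing in for a parsed sentence.
triples = [
    ("dog", "subject_of", "chase"),
    ("cat", "object_of", "chase"),
    ("chase", "time", "yesterday"),
]
nodes, out_links = build_graph(triples)
for node, score in sorted(score_nodes(nodes, out_links).items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

In this sketch, parallel links with different relation names between the same pair of nodes collapse into one link; a fuller version would keep the relation name on each link.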
25 Claims
1. A method of identifying a characteristic of interest represented by a textual input, comprising:
- building a graph with nodes and links corresponding to the textual input, a pair of nodes and a link between the nodes comprising a tuple;
- scoring sub-graph components of the graph by assigning a score to each node and each tuple in the graph, the score for each tuple being based on a score of an initial node in the tuple, scores for nodes linking to a target node in the tuple, and a frequency of the tuple in the textual input;
- identifying graph fragments of interest based on the scores; and
- performing text manipulation based on the identified graph fragments.
Dependent claims: 2-24.
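Claim 1 names three inputs to a tuple score: the score of the tuple's initial node, the scores of the nodes linking to its target node, and the tuple's frequency in the textual input. The claim does not say how these are combined, so the sketch below picks one plausible combination purely for illustration: frequency multiplied by the initial node's share of the total score flowing into the target. The formula, the function names `score_tuples` and `top_fragments`, and the sample data are all assumptions, and node scores are assumed to be positive.

```python
from collections import Counter, defaultdict


def score_tuples(tuple_occurrences, node_scores):
    """Score each distinct (source, relation, target) tuple (assumed formula)."""
    frequency = Counter(tuple_occurrences)

    # For every target node, record which nodes link to it.
    linking_nodes = defaultdict(set)
    for source, _relation, target in frequency:
        linking_nodes[target].add(source)

    tuple_scores = {}
    for (source, relation, target), count in frequency.items():
        incoming_total = sum(node_scores[s] for s in linking_nodes[target])
        # Assumed combination: frequency times the initial node's share of the
        # total score of all nodes linking to the target.
        tuple_scores[(source, relation, target)] = (
            count * node_scores[source] / incoming_total
        )
    return tuple_scores


def top_fragments(tuple_scores, k=2):
    """Return the k highest-scoring tuples as the graph fragments of interest."""
    return sorted(tuple_scores, key=tuple_scores.get, reverse=True)[:k]


# Hypothetical node scores and tuple occurrences.
node_scores = {"dog": 0.5, "cat": 0.25, "chase": 0.25}
occurrences = [
    ("dog", "subject_of", "chase"),
    ("dog", "subject_of", "chase"),
    ("cat", "object_of", "chase"),
]
print(top_fragments(score_tuples(occurrences, node_scores), k=1))
```

Selecting the top-k tuples is only one way to identify fragments of interest; a score threshold would serve equally well in this sketch.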
25. A method of identifying a characteristic of interest comprising one of words, text fragments, concepts, events, entities and topics, said characteristic of interest represented by a textual input, said method comprising:
- building a graph comprising nodes linked by links corresponding to the textual input;
- scoring sub-graph components of the graph;
- identifying graph fragments of interest based on the scores;
- ordering the graph fragments based on factors in addition to the scores, the factors comprising at least one of placement of nodes and an order in which two nodes related through part-of-speech will occur, an event timeline determined from the textual input, and a topic determined for the textual input; and
- performing text manipulation based on the identified graph fragments.
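The ordering step in claim 25 can be pictured with a small sketch in which fragments are first selected by score and then arranged by a second factor. Here the second factor is a position on an event timeline, one of the factors the claim lists. The `Fragment` class, its `event_time` field, and the `select_and_order` function are hypothetical names introduced for illustration; how the timeline is derived from the text is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    text: str
    score: float
    event_time: int  # hypothetical position on an event timeline


def select_and_order(fragments, k):
    """Keep the k highest-scoring fragments, then order them by timeline position."""
    chosen = sorted(fragments, key=lambda f: f.score, reverse=True)[:k]
    return sorted(chosen, key=lambda f: f.event_time)


# Hypothetical fragments with made-up scores and timeline positions.
fragments = [
    Fragment("the verdict was announced", score=0.9, event_time=3),
    Fragment("the trial began", score=0.7, event_time=1),
    Fragment("reporters gathered outside", score=0.2, event_time=2),
]
for fragment in select_and_order(fragments, k=2):
    print(fragment.text)
```

Separating selection (by score) from ordering (by timeline) keeps a high-scoring later event from being placed ahead of an earlier one in the output text.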
Specification