SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR AUTOMATIC TOPIC IDENTIFICATION USING A HYPERTEXT CORPUS
Abstract
A system, method, and/or computer program product for automatic topic identification using a hypertext corpus may include a) receiving a content document(s); b) identifying or lexically scoring candidate topic(s) in the received content document based on label(s) used in a corpus to link to or relate to the candidate topics; c) evaluating or semantically scoring the candidate topic(s) of the received document based on a relationship between two or more candidate topics in the corpus; and d) weighting candidate topics for relevance based on algorithmic or statistical analysis of links or relationships in the corpus.
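For illustration only, steps (b) through (d) of the abstract can be sketched in Python. Everything below is an assumption rather than the patented implementation: the toy corpus `CORPUS_LINKS`, the function names, the use of link-label frequency as the lexical score, Jaccard overlap of link neighborhoods as the semantic score, inbound-link share as the relevance weight, and the way the three values are multiplied together.

```python
import re
from collections import defaultdict

# Hypothetical toy hypertext corpus: page -> list of (link label, linked topic).
CORPUS_LINKS = {
    "Jaguar (animal)": [("big cat", "Felidae"), ("rainforest", "Amazon rainforest")],
    "Jaguar Cars": [("car", "Automobile"), ("engine", "Engine")],
    "Felidae": [("jaguar", "Jaguar (animal)"), ("rainforest", "Amazon rainforest")],
    "Amazon rainforest": [("jaguar", "Jaguar (animal)"), ("big cats", "Felidae")],
    "Automobile": [("jaguar", "Jaguar Cars"), ("engine", "Engine")],
    "Engine": [("car", "Automobile")],
}


def build_label_index(corpus):
    """Map each lowercase link label to the topics it points at, with counts."""
    index = defaultdict(lambda: defaultdict(int))
    for links in corpus.values():
        for label, target in links:
            index[label.lower()][target] += 1
    return index


def lexical_scores(text, index):
    """Step (b): a label found in the text proposes every topic it links to,
    scored by the fraction of that label's links pointing at the topic."""
    text = text.lower()
    scores = defaultdict(float)
    for label, targets in index.items():
        if re.search(r"\b" + re.escape(label) + r"\b", text):
            total = sum(targets.values())
            for topic, count in targets.items():
                scores[topic] = max(scores[topic], count / total)
    return dict(scores)


def neighborhood(corpus, topic):
    """The topic itself plus every topic it links to or is linked from."""
    outgoing = {target for _, target in corpus.get(topic, [])}
    incoming = {src for src, links in corpus.items()
                if any(target == topic for _, target in links)}
    return outgoing | incoming | {topic}


def semantic_scores(candidates, corpus):
    """Step (c): mean Jaccard overlap between a candidate's link neighborhood
    and the neighborhoods of the other candidates."""
    hoods = {c: neighborhood(corpus, c) for c in candidates}
    scores = {}
    for c in candidates:
        others = [o for o in candidates if o != c]
        sims = [len(hoods[c] & hoods[o]) / len(hoods[c] | hoods[o]) for o in others]
        scores[c] = sum(sims) / len(sims) if sims else 0.0
    return scores


def link_weights(candidates, corpus):
    """Step (d): share of all corpus links that point at each candidate."""
    inbound = defaultdict(int)
    for links in corpus.values():
        for _, target in links:
            inbound[target] += 1
    total = sum(inbound.values()) or 1
    return {c: inbound[c] / total for c in candidates}


def rank_topics(text, corpus):
    """Combine the three scores (an assumed formula) and sort best-first."""
    lex = lexical_scores(text, build_label_index(corpus))
    candidates = list(lex)
    sem = semantic_scores(candidates, corpus)
    weight = link_weights(candidates, corpus)
    combined = {t: lex[t] * (1 + sem[t]) * weight[t] for t in candidates}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    doc = "The jaguar is a big cat that lives in the rainforest."
    for topic, score in rank_topics(doc, CORPUS_LINKS):
        print(f"{topic}: {score:.3f}")
```

On this toy corpus the label "jaguar" proposes both "Jaguar (animal)" and "Jaguar Cars", and the animal sense ranks higher because it shares link neighbors with the other candidates found in the document.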
108 Citations
19 Claims
1. A method for performing workflow process on at least one content document comprising:
a) receiving, by at least one processor, the at least one document content;
b) organizing, by the at least one processor, the at least one document content;
c) discovering, by the at least one processor, at least one metadata relating to the at least one document content; and
d) taking action, by the at least one processor, on the at least one document content, based on said (a), (b), and (c).
Dependent claims: 2, 3, 4
5. A method comprising:
a) receiving, by at least one processor, at least one content document;
b) identifying or lexically scoring, by the at least one processor, at least one candidate topic in said at least one received content document based on at least one label used in a corpus to link to or relate to said at least one candidate topic;
c) evaluating or semantically scoring, by the at least one processor, the at least one candidate topic of the at least one received document based on at least one relationship between at least two of said at least one candidate topics in the corpus; and
d) weighting, by the at least one processor, said at least one candidate topics for relevance based on algorithmic analysis or statistical analysis of the at least one hypertext link in the corpus.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A system comprising:
at least one memory; and
at least one processor coupled to said at least one memory, said at least one processor adapted to:
receive at least one content document;
identify or lexically score at least one candidate topic in said at least one received content document based on at least one label used in a corpus to link to or relate to said at least one candidate topic;
evaluate or semantically score the at least one candidate topic of the at least one received document based on at least one relationship between at least two of said at least one candidate topics in the corpus; and
weight said at least one candidate topics for relevance based on algorithmic analysis or statistical analysis of the at least one hypertext link in the corpus.
19. A nontransitory computer program product embodied on a computer readable medium, said computer program product comprising program logic, which when executed on at least one computer processor performs a method comprising:
a) receiving, by at least one processor, at least one content document;
b) identifying or lexically scoring, by the at least one processor, at least one candidate topic in said at least one received content document based on at least one label used in a corpus to link to or relate to said at least one candidate topic;
c) evaluating or semantically scoring, by the at least one processor, the at least one candidate topic of the at least one received document based on at least one relationship between at least two of said at least one candidate topics in the corpus; and
d) weighting, by the at least one processor, said at least one candidate topics for relevance based on algorithmic analysis or statistical analysis of the at least one hypertext link in the corpus.
Specification