System and method for determining concepts in a content item using context
Abstract
The present invention is directed to systems and methods for indexing one or more items of content. The method of the present invention comprises extracting one or more items of text from a given item of content. The one or more items of extracted text are tokenized into one or more concepts. One or more related concepts associated with the one or more concepts are identified. A support score is generated for the one or more concepts, and the item of content is indexed with the one or more concepts and the one or more associated support scores.
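The steps named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `CONCEPTS` and `RELATED` dictionaries, the function names, and in particular the scoring rule (a concept's support score is the number of its related concepts that also occur in the same item) are hypothetical, since this excerpt does not define how the support score is calculated.

```python
import re
from collections import Counter

# Hypothetical dictionaries; the excerpt does not specify their contents.
CONCEPTS = {"jaguar", "engine", "jungle", "sedan"}
RELATED = {
    "jaguar": {"engine", "sedan"},
    "engine": {"sedan"},
}


def extract_text(item):
    """Extract the items of text from a content item (here, a plain dict)."""
    return [item.get("title", ""), item.get("body", "")]


def tokenize(texts):
    """Map words in the extracted text to known concepts, counting occurrences."""
    words = re.findall(r"[a-z]+", " ".join(texts).lower())
    return Counter(w for w in words if w in CONCEPTS)


def support_score(concept, found):
    """Assumed scoring rule: count related concepts that also occur in the item."""
    return sum(1 for r in RELATED.get(concept, ()) if r in found)


def index_item(item_id, item, index):
    """Index the item under each concept found, together with its support score."""
    found = tokenize(extract_text(item))
    for concept in found:
        index.setdefault(concept, {})[item_id] = support_score(concept, found)
    return index


index = index_item(
    "doc1",
    {"title": "Jaguar sedan review", "body": "The engine is quiet."},
    {},
)
print(index)
```

In this toy example "jaguar" is supported by two co-occurring related concepts ("engine" and "sedan"), which suggests the item is about the car rather than the animal. A real scoring function could weight occurrences or related concepts differently.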
47 Claims
1. A method for indexing one or more items of content, the method comprising:
extracting one or more items of text from a given item of content;
tokenizing the one or more items of extracted text into one or more concepts;
identifying one or more related concepts associated with the one or more concepts;
calculating a support score for the one or more concepts; and
indexing the item of content with the one or more concepts and the one or more associated support scores.
Dependent claims: 2 to 22.
23. A system for indexing one or more items of content, the system comprising:
a text extractor operative to extract one or more items of text from an item of content;
a concept dictionary operative to maintain one or more concepts;
a context dictionary operative to maintain one or more related concepts associated with the one or more concepts maintained in the concept dictionary; and
an aboutness extractor operative to:
  tokenize the one or more items of text extracted from the item of content according to the one or more concepts maintained in the concept dictionary;
  identify one or more related concepts associated with the one or more concepts in the item of content;
  generate support scores for the one or more concepts associated with the item of content; and
  index the item of content, the one or more concepts associated with the item of content, and the one or more corresponding support scores.
Dependent claims: 24 to 47.
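Claim 23 has the aboutness extractor tokenize text "according to" the concepts held in the concept dictionary. One plausible reading, offered here purely as an assumption, is a greedy longest-match lookup, so that a multi-word concept is preferred over the shorter concepts it contains. The class names, method names, and sample entries below are hypothetical and do not come from the patent.

```python
import re


class ConceptDictionary:
    """Maintains concepts, each stored as a tuple of lowercase words."""

    def __init__(self, concepts):
        self.concepts = {tuple(c.lower().split()) for c in concepts}
        self.max_len = max((len(c) for c in self.concepts), default=0)

    def __contains__(self, words):
        return tuple(words) in self.concepts


class ContextDictionary:
    """Maintains the related concepts associated with each concept."""

    def __init__(self, related):
        self.related = {
            k.lower(): {r.lower() for r in v} for k, v in related.items()
        }

    def related_to(self, concept):
        return self.related.get(concept, set())


def tokenize(text, concepts):
    """Greedy longest-match tokenization against the concept dictionary."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    found, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(concepts.max_len, len(words) - i), 0, -1):
            if words[i:i + n] in concepts:
                found.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            i += 1  # No concept starts at this word; skip it.
    return found


concepts = ConceptDictionary(["new york", "new york yankees", "baseball"])
context = ContextDictionary({"new york yankees": ["baseball"]})

found = tokenize("The New York Yankees play baseball in New York.", concepts)
print(found)
print(context.related_to("new york yankees"))
```

Keeping the two dictionaries as separate components mirrors the claim's structure: the concept dictionary decides what counts as a token, while the context dictionary is consulted afterwards to find related concepts for scoring.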
Specification