SYSTEM, METHOD, AND APPARATUS FOR MULTIDIMENSIONAL EXPLORATION OF CONTENT ITEMS IN A CONTENT STORE
Abstract
A computer-implemented method for accessing content items in a content store is described. In one embodiment, the computer-implemented method includes maintaining a text index of content items in a content store to enable a keyword search on the content items, receiving a query having a keyword and generating a hit list from the text index using the keyword, and extracting frequent phrases from text within content items of the hit list. The computer-implemented method also includes assigning a relative relevance to the frequent phrases and grouping content items into topics based on presence of relevant phrases within the content items of the hit list. The hit list includes one or more content items of the content store. The frequent phrases having a relatively high relevance are relevant phrases.
20 Claims
1. A computer-implemented method for accessing content items in a content store comprising:
maintaining a text index of content items in a content store to enable a keyword search on the content items;
receiving a query having a keyword and generating a hit list from the text index using the keyword, the hit list comprising one or more content items of the content store;
extracting frequent phrases from text within content items of the hit list;
assigning a relative relevance to the frequent phrases wherein frequent phrases having a relatively high relevance are relevant phrases; and
grouping content items into topics based on presence of relevant phrases within the content items of the hit list.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
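The steps of claim 1 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the claim: the names tokenize, phrases_of, build_text_index, and explore are hypothetical, phrases are taken to be two-word sequences, "frequent" means appearing in at least min_items items of the hit list, and relative relevance is approximated as the share of hit-list items containing a phrase divided by the share of all store items containing it. The claim itself does not define any of these.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrases_of(text, n=2):
    """Set of n-word phrases in a text; n=2 is an illustrative choice."""
    toks = tokenize(text)
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}


def build_text_index(store):
    """Inverted index: token -> ids of the content items containing it."""
    index = defaultdict(set)
    for item_id, text in store.items():
        for tok in tokenize(text):
            index[tok].add(item_id)
    return index


def explore(store, index, keyword, min_items=2, top=3):
    # Keyword query -> hit list of content item ids.
    hits = sorted(index.get(keyword.lower(), set()))

    # Frequent phrases: phrases found in at least min_items hit-list items.
    hit_df = Counter()
    for item_id in hits:
        hit_df.update(phrases_of(store[item_id]))
    frequent = {p for p, c in hit_df.items() if c >= min_items}

    # Store-wide frequency of the same phrases, used for relative relevance.
    store_df = Counter()
    for text in store.values():
        store_df.update(phrases_of(text) & frequent)

    def relevance(p):
        # Assumed measure: hit-list share divided by store-wide share.
        return (hit_df[p] / len(hits)) / (store_df[p] / len(store))

    relevant = sorted(frequent, key=lambda p: (-relevance(p), p))[:top]

    # Group hit-list items into topics keyed by relevant phrase.
    topics = {p: [i for i in hits if p in phrases_of(store[i])]
              for p in relevant}
    return hits, topics


# Hypothetical content store for illustration only.
STORE = {
    "d1": "Solar panel installation guide for home roofs",
    "d2": "Solar panel installation costs and rebates",
    "d3": "Solar eclipse viewing safety",
    "d4": "Solar eclipse viewing times for Europe",
    "d5": "Wall panel installation guide",
}
INDEX = build_text_index(STORE)
hits, topics = explore(STORE, INDEX, "solar")
```

In this toy store the phrase "panel installation" is frequent within the hit list but also occurs outside it, so under the assumed measure it ranks below phrases that are specific to the hit list and is not selected as a topic.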
14. A computer program product comprising a computer useable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for finding top-k sets in a collection of static sets P, wherein the top-k sets have maximal overlap with an input dynamic set H, the operations comprising:
during a preprocessing state independent of the input dynamic set H, randomizing items L in each static set P[i] in a collection of static sets P, wherein the randomizing uses a technique comprising:
  hashing each item L[i] into a random number in a [0,1] domain to obtain hashed values; and
  sorting the hashed values in increasing order;
at query time for a given dynamic set H, using the technique to randomize an input dynamic set H;
for each static set P[i], estimating an intersection size of an intersection of the static set P[i] and the input dynamic set H; and
maintaining a priority queue of top-k static sets having largest intersection sizes.
Dependent claims: 15, 16, 17, 18, 19
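The operations of claim 14 can be sketched as follows. The claim does not say how the intersection size is estimated from the sorted hashed values, so the estimator below is an assumption: both sorted lists are merged only up to a hash threshold, the matches found are counted, and the count is scaled by 1/threshold (with threshold = 1.0 the count is exact). The names hash01, randomize, estimate_intersection, and top_k_sets are hypothetical, and SHA-256 stands in for whatever hash function an implementation would choose.

```python
import hashlib
import heapq


def hash01(item):
    """Map an item to a pseudo-random number in [0, 1] (SHA-256 is an assumption)."""
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def randomize(items):
    """Hash every item and sort the hashed values in increasing order."""
    return sorted(hash01(x) for x in set(items))


def estimate_intersection(sorted_p, sorted_h, threshold=1.0):
    """Assumed estimator: merge both lists up to `threshold`, scale the matches."""
    i = j = matches = 0
    while i < len(sorted_p) and j < len(sorted_h):
        a, b = sorted_p[i], sorted_h[j]
        if a > threshold or b > threshold:
            break  # lists are sorted, so no further match can lie below threshold
        if a == b:
            matches += 1
            i += 1
            j += 1
        elif a < b:
            i += 1
        else:
            j += 1
    return matches / threshold


def top_k_sets(preprocessed, h_items, k, threshold=1.0):
    """Return up to k (estimate, set index) pairs, largest estimate first."""
    sorted_h = randomize(h_items)  # same technique applied at query time
    heap = []  # min-heap acting as the priority queue of the current top-k
    for idx, sorted_p in enumerate(preprocessed):
        est = estimate_intersection(sorted_p, sorted_h, threshold)
        if len(heap) < k:
            heapq.heappush(heap, (est, idx))
        elif (est, idx) > heap[0]:
            heapq.heapreplace(heap, (est, idx))
    return sorted(heap, reverse=True)


# Hypothetical static sets P and dynamic set H for illustration only.
P = [{"a", "b", "c"}, {"c", "d", "e", "f"}, {"x", "y"}, {"e", "f", "g"}]
PRE = [randomize(p) for p in P]  # preprocessing, independent of H
H = {"d", "e", "f", "z"}
result = top_k_sets(PRE, H, k=2)
```

Lowering the threshold trades accuracy for speed, since only a prefix of each sorted list is scanned. Because the hashed values are approximately uniform on [0,1], that prefix behaves like a random sample of each set.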
20. A system comprising:
a content management system (CMS) comprising:
  a plurality of content items;
  a content store to store content items;
  a text index of text within the content items; and
  an exploration server coupled to the content store and the text index and configured to:
    generate a hit list from the text index using the keyword, the hit list comprising one or more content items of the content store;
    extract frequent phrases from text within content items of the hit list;
    assign a relative relevance to the frequent phrases wherein frequent phrases having a relatively high relevance are relevant phrases; and
    group content items into topics based on presence of relevant phrases within the content items of the hit list; and
a multidimensional schema manager to manage a multidimensional schema comprising a schema of a fact table and schemata of static dimensions, wherein dynamic dimensions of the multidimensional schema are populated in response to the exploration server returning the hit list, and wherein dynamic dimensions of the multidimensional schema are identified in response to the exploration server returning the hit list based upon content dynamically extracted from the subset of content items identified in the hit list.
Specification