System, method, and apparatus for multidimensional exploration of content items in a content store
First Claim
1. A computer-implemented method for accessing content items in a content store comprising:
maintaining a text index of content items in a content store to enable a keyword search on the content items;
receiving a query having a keyword and generating a hit list from the text index using the keyword, the hit list comprising two or more content items of the content store;
extracting frequent phrases from text within content items of the hit list by estimating an intersection size, wherein the estimating comprises executing an algorithm to intersect a first posting list generated from globally frequent phrases with a second posting list generated from the hit list, wherein the algorithm terminates the executing in response to the earlier of identifying a predetermined M maximum number of comparisons or a predetermined I maximum number of common points;
assigning a relative relevance to the frequent phrases wherein frequent phrases having a relatively high relevance are relevant phrases; and
grouping content items into topics based on presence of relevant phrases within the content items of the hit list.
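The early-terminating intersection recited in the extracting step can be illustrated with a minimal sketch. It assumes both posting lists are sorted sequences of integer item identifiers and counts one three-way comparison per step; the names `estimate_intersection`, `max_comparisons` (standing in for M), and `max_common` (standing in for I) are hypothetical and do not come from the claim. The claim does not say how the partial count is turned into a size estimate, so the sketch only reports what was observed before stopping.

```python
from typing import NamedTuple, Sequence


class IntersectionEstimate(NamedTuple):
    common: int        # common points found before stopping
    comparisons: int   # comparisons performed before stopping
    exhausted: bool    # True if a list ran out, so the count is exact


def estimate_intersection(
    first: Sequence[int],
    second: Sequence[int],
    max_comparisons: int,   # hypothetical name for the claim's M
    max_common: int,        # hypothetical name for the claim's I
) -> IntersectionEstimate:
    """Merge-walk two sorted posting lists, stopping at whichever limit is hit first."""
    i = j = common = comparisons = 0
    while i < len(first) and j < len(second):
        # Terminate on the earlier of M comparisons or I common points.
        if comparisons >= max_comparisons or common >= max_common:
            return IntersectionEstimate(common, comparisons, False)
        comparisons += 1
        if first[i] == second[j]:
            common += 1
            i += 1
            j += 1
        elif first[i] < second[j]:
            i += 1
        else:
            j += 1
    return IntersectionEstimate(common, comparisons, True)


globally_frequent = [1, 3, 5, 7, 9, 11]   # illustrative posting list
from_hit_list = [3, 4, 5, 6, 7, 8, 9]     # illustrative posting list
print(estimate_intersection(globally_frequent, from_hit_list, 100, 2))
```

Capping both the comparison count and the common-point count bounds the work per phrase regardless of posting-list length, which is presumably why the claim specifies two limits rather than one.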
Abstract
A computer-implemented method for accessing content items in a content store is described. In one embodiment, the computer-implemented method includes maintaining a text index of content items in a content store to enable a keyword search on the content items, receiving a query having a keyword and generating a hit list from the text index using the keyword, and extracting frequent phrases from text within content items of the hit list. The computer-implemented method also includes assigning a relative relevance to the frequent phrases and grouping content items into topics based on presence of relevant phrases within the content items of the hit list. The hit list includes one or more content items of the content store. The frequent phrases having a relatively high relevance are relevant phrases.
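The grouping step described in the abstract might look like the following sketch. It assumes the relevant phrases have already been selected, that the hit list is available as a mapping from item identifier to item text, and that "presence" means a case-insensitive substring match; all three are simplifying assumptions, and the function name `group_by_relevant_phrases` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_by_relevant_phrases(
    hit_list: Mapping[str, str],        # assumed shape: item id -> item text
    relevant_phrases: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each relevant phrase (treated as a topic) to the hit-list items containing it."""
    topics: Dict[str, List[str]] = defaultdict(list)
    for phrase in relevant_phrases:
        needle = phrase.lower()
        for item_id, text in hit_list.items():
            # Assumption: presence is a case-insensitive substring match.
            if needle in text.lower():
                topics[phrase].append(item_id)
    return dict(topics)


hits = {
    "d1": "Solar panel efficiency report",
    "d2": "Wind turbine and solar panel costs",
    "d3": "Wind turbine maintenance",
}
print(group_by_relevant_phrases(hits, ["solar panel", "wind turbine", "nuclear"]))
```

An item can land in more than one topic, and a phrase that matches nothing produces no topic at all.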
13 Claims
1. A computer-implemented method for accessing content items in a content store comprising:
maintaining a text index of content items in a content store to enable a keyword search on the content items;
receiving a query having a keyword and generating a hit list from the text index using the keyword, the hit list comprising two or more content items of the content store;
extracting frequent phrases from text within content items of the hit list by estimating an intersection size, wherein the estimating comprises executing an algorithm to intersect a first posting list generated from globally frequent phrases with a second posting list generated from the hit list, wherein the algorithm terminates the executing in response to the earlier of identifying a predetermined M maximum number of comparisons or a predetermined I maximum number of common points;
assigning a relative relevance to the frequent phrases wherein frequent phrases having a relatively high relevance are relevant phrases; and
grouping content items into topics based on presence of relevant phrases within the content items of the hit list.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
13. A system comprising:
a content management system (CMS) comprising:
  a plurality of content items;
  a content store to store content items;
  a text index of text within the content items; and
  an exploration server coupled to the content store and the text index and configured to:
    generate a hit list from the text index using the keyword, the hit list comprising two or more content items of the content store;
    extract frequent phrases from text within content items of the hit list by estimating an intersection size, wherein the estimating comprises executing an algorithm to intersect a first posting list generated from globally frequent phrases with a second posting list generated from the hit list, wherein the algorithm terminates the executing in response to the earlier of identifying a predetermined M maximum number of comparisons or a predetermined I maximum number of common points;
    assign a relative relevance to the frequent phrases wherein frequent phrases having a relatively high relevance are relevant phrases; and
    group content items into topics based on presence of relevant phrases within the content items of the hit list; and
a multidimensional schema manager to manage a multidimensional schema comprising a schema of a fact table and schemata of static dimensions, wherein dynamic dimensions of the multidimensional schema are populated in response to the exploration server returning the hit list, and wherein dynamic dimensions of the multidimensional schema are identified in response to the exploration server returning the hit list based upon content dynamically extracted from the subset of content items identified in the hit list.
Specification