Using a community generated web site for metadata
Abstract
A category dataset includes names of categories and relation data, where the relation data defines a relationship between the categories and content. The categories for the content are generated by retrieving a web page from an online community-generated web site, such as the WIKIPEDIA web site, that is associated with a particular piece of content, and analyzing the web page for content metadata. The category data for that piece of content is extracted from the content metadata. In addition, the terms in the category dataset are reduced based on the categories and the relation data.
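The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the term-matching regular expression, and the use of a fixed category vocabulary to decide which terms count as category data are all assumptions made for illustration, since this section does not say how category data is recognized. The page is passed in as a string so that no network access is needed.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class _TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_terms(html):
    """Extract a plurality of terms (with counts) from a web page."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Hypothetical tokenization: words of two or more characters.
    return Counter(re.findall(r"[a-z][a-z0-9'-]+", text))


def add_terms_to_metadata(metadata, terms):
    """Return a copy of the content metadata with the terms merged in."""
    merged = dict(metadata)
    merged["terms"] = Counter(metadata.get("terms", {})) + terms
    return merged


def extract_categories(metadata, vocabulary):
    """Assumption: a term is category data if it is in a known vocabulary."""
    return {t for t in metadata.get("terms", {}) if t in vocabulary}


def load_categories(dataset, content_id, categories):
    """Dataset maps category name -> set of content ids (the relation data)."""
    for category in categories:
        dataset.setdefault(category, set()).add(content_id)


page = (
    "<html><body><h1>Blade Runner</h1>"
    "<p>A 1982 science-fiction film noir.</p>"
    "<script>var x = 'drama';</script></body></html>"
)
metadata = add_terms_to_metadata({"title": "Blade Runner"}, extract_terms(page))
dataset = {}
load_categories(
    dataset,
    "blade-runner",
    extract_categories(metadata, {"noir", "drama", "comedy"}),
)
```

In this sketch the dataset stores each category name together with the set of content identifiers it relates to, which is one simple way to hold both the category names and the relation data in a single structure.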
16 Claims
1. A computerized method comprising:
receiving a web page from a community-generated web site, the web page associated with a particular piece of content;
extracting a plurality of terms from the web page;
adding the plurality of terms to content metadata associated with the piece of content;
extracting specific category data from the content metadata;
loading the specific category data into a category dataset; and
reducing a dimensionality of the category dataset based on the category dataset and relation data, wherein the relation data defines a relationship between the category dataset and the content associated with the category dataset.
Dependent claims: 2, 3, 4.
5. A machine readable medium comprising:
receiving a web page from a community-generated web site, the web page associated with a particular piece of content;
extracting a plurality of terms from the web page;
adding the plurality of terms to content metadata associated with the piece of content;
extracting specific category data from the content metadata;
loading the specific category data into a category dataset; and
reducing a dimensionality of the category dataset based on the category dataset and relation data, wherein the relation data defines a relationship between the category dataset and the content associated with the category dataset.
Dependent claims: 6, 7, 8.
9. An apparatus comprising:
means for receiving a web page from a community-generated web site, the web page associated with a particular piece of content;
means for extracting a plurality of terms from the web page;
means for adding the plurality of terms to content metadata associated with the piece of content;
means for extracting specific category data from the content metadata;
means for loading the specific category data into a category dataset; and
means for reducing a dimensionality of the category dataset based on the category dataset and relation data, wherein the relation data defines a relationship between the category dataset and the content associated with the category dataset.
Dependent claims: 10, 11, 12.
13. A system comprising:
a processor;
a memory coupled to the processor through a bus; and
a process executed from the memory by the processor to cause the processor to receive a web page from a community-generated web site, the web page associated with a particular piece of content, to extract a plurality of terms from the web page, to add the plurality of terms to content metadata associated with the piece of content, to extract specific category data from the content metadata, to load the specific category data into a category dataset, and reducing a dimensionality of the category dataset based on the category dataset and relation data, wherein the relation data defines a relationship between the category dataset and the content associated with the category dataset.
Dependent claims: 14, 15, 16.
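The last step of each independent claim, reducing the dimensionality of the category dataset based on the dataset and its relation data, is not spelled out in this section. As a hedged illustration only, the sketch below uses a simple pruning heuristic that is swapped in for whatever technique the specification actually describes: it drops categories related to too few pieces of content and merges categories whose relation data is identical. The function name `reduce_categories`, the `min_support` threshold, and the `|` separator for merged names are hypothetical.

```python
def reduce_categories(dataset, min_support=2):
    """Shrink a category dataset using its relation data.

    `dataset` maps a category name to the set of content ids it relates to.
    """
    # Step 1: drop categories related to too few pieces of content.
    kept = {c: ids for c, ids in dataset.items() if len(ids) >= min_support}

    # Step 2: merge categories with identical relation data, since they
    # cannot tell any two pieces of content apart.
    merged = {}
    for name in sorted(kept):
        merged.setdefault(frozenset(kept[name]), []).append(name)

    return {"|".join(names): set(ids) for ids, names in merged.items()}


example = {
    "noir": {"a", "b"},
    "film-noir": {"a", "b"},
    "sci-fi": {"a", "c"},
    "1982": {"a"},
}
reduced = reduce_categories(example)
```

Here four category dimensions collapse to two: `1982` is dropped for relating to a single piece of content, and `noir` and `film-noir` are merged because they relate to exactly the same content.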
Specification