AUTOMATED CATEGORIZATION OF SEMI-STRUCTURED DATA
Abstract
Mechanisms are provided for generating an inverse vector space search engine to automatically categorize and/or tag semi-structured data. In particular examples, an inverse vector space search engine includes multiple genres each associated with multiple keywords. Metadata such as media content description, caption information, review information, etc., are identified to determine distance between the media content and the various genres. Genres having a closer distance to media content are determined to be genres more closely describing the media content. Post filtering, alternate category determination, and user profiling may also be applied to the results.
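The abstract's idea of genres as keyword sets, with closer genres describing the content better, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the genre names, the keyword lists, the raw term-count weighting, and the choice of cosine distance (the text above does not name a distance measure). The names `GENRE_KEYWORDS`, `to_vector`, `cosine_distance`, and `rank_genres` are hypothetical.

```python
import math

# Hypothetical genre -> keyword lists; the source gives no concrete data.
GENRE_KEYWORDS = {
    "sports": ["game", "team", "score", "league"],
    "cooking": ["recipe", "chef", "kitchen", "bake"],
    "news": ["report", "election", "weather", "breaking"],
}

# The shared vocabulary defines the columns of the matrix.
VOCAB = sorted({kw for kws in GENRE_KEYWORDS.values() for kw in kws})


def to_vector(keywords):
    """Count how often each vocabulary term occurs in `keywords`."""
    return [keywords.count(term) for term in VOCAB]


# One row (category vector) per genre.
GENRE_MATRIX = {genre: to_vector(kws) for genre, kws in GENRE_KEYWORDS.items()}


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def rank_genres(keywords):
    """Return (genre, distance) pairs, closest genre first."""
    search = to_vector(keywords)
    distances = {g: cosine_distance(search, v) for g, v in GENRE_MATRIX.items()}
    return sorted(distances.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for genre, dist in rank_genres(["chef", "recipe", "game"]):
        print(f"{genre}: {dist:.3f}")
```

In this sketch, keywords outside the vocabulary are simply ignored, and a search vector with no known keywords is treated as maximally distant from every genre.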
20 Claims
1. A method, comprising:
- receiving metadata associated with media content;
- generating a search vector using keywords associated with the media content;
- determining a plurality of distances between the search vector and a plurality of category vectors in an inverse vector space search engine matrix;
- categorizing the media content using the plurality of distances between the search vector and the plurality of category vectors.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

11. A system, comprising:
- an interface configured to receive metadata associated with media content;
- a processor configured to generate a search vector using keywords associated with the media content, determine a plurality of distances between the search vector and a plurality of category vectors in an inverse vector space search engine matrix, and categorize the media content using the plurality of distances between the search vector and the plurality of category vectors.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19

20. A computer readable storage medium having computer code embodied therein, the computer readable storage medium comprising:
- computer code for receiving metadata associated with media content;
- computer code for generating a search vector using keywords associated with the media content;
- computer code for determining a plurality of distances between the search vector and a plurality of category vectors in an inverse vector space search engine matrix;
- computer code for categorizing the media content using the plurality of distances between the search vector and the plurality of category vectors.
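The four steps recited in the independent claims (receive metadata, generate a search vector, determine distances, categorize) could be wired together roughly as follows. This is only a sketch under stated assumptions: the category vectors, the stopword list, the regular-expression tokenizer, the cosine distance, and the `max_distance` cutoff are all hypothetical choices, not details taken from the claims.

```python
import math
import re
from collections import Counter

# Hypothetical category vectors stored as sparse keyword -> weight maps.
CATEGORY_VECTORS = {
    "comedy": {"funny": 1.0, "laugh": 1.0, "sitcom": 1.0},
    "drama": {"family": 1.0, "tragic": 1.0, "love": 1.0},
    "documentary": {"history": 1.0, "nature": 1.0, "interview": 1.0},
}

# Assumed stopword list; a real system would likely use a larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "with"}


def extract_keywords(metadata):
    """Step 1: pull lowercase word tokens from every metadata field."""
    text = " ".join(str(value) for value in metadata.values())
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_search_vector(keywords):
    """Step 2: represent the keywords as a sparse term-count vector."""
    return Counter(keywords)


def cosine_distance(a, b):
    """Step 3 helper: 1 - cosine similarity of two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def categorize(metadata, max_distance=0.9):
    """Step 4: return categories within `max_distance`, closest first."""
    search = build_search_vector(extract_keywords(metadata))
    distances = {
        name: cosine_distance(search, vector)
        for name, vector in CATEGORY_VECTORS.items()
    }
    close = [name for name, dist in distances.items() if dist <= max_distance]
    return sorted(close, key=distances.get)


if __name__ == "__main__":
    item = {
        "description": "A funny sitcom about a family",
        "caption": "laugh track",
        "review": "So funny",
    }
    print(categorize(item))
```

Returning every category under a cutoff, rather than only the single nearest one, is one way to let an item carry several tags; the cutoff value itself would need tuning on real data.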
Specification