Methods and systems for enhancing metadata
Abstract
A method and system for utilizing metadata to search for media, such as multimedia and streaming media, includes searching for the media, receiving results, extracting metadata associated with the media, enhancing the extracted metadata, and grouping the search results in accordance with attributes of the enhanced metadata. Enhancing and grouping include adding related metadata to the database of metadata, iteratively using metadata to search for more media related data, removing duplicate URLs, collapsing URLs that are variants of each other, and masking out superfluous terms from URLs. The resultant metadata and media files are available to users and search engines.
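The URL clean-up steps named in the abstract (removing duplicate URLs, collapsing variants, masking superfluous terms) could be sketched in Python as below. This is only an illustration under stated assumptions: the abstract does not say which terms count as superfluous or which variants are collapsed, so the `SUPERFLUOUS_PARAMS` set, the `www.` and trailing-slash rules, and the function names `canonical_url` and `collapse_urls` are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters treated as superfluous (an assumption,
# not taken from the patent text).
SUPERFLUOUS_PARAMS = {"sessionid", "sid", "ref"}


def canonical_url(url: str) -> str:
    """Reduce a URL to a canonical form so that variants compare equal."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com as variants
    if parts.port is not None:
        host = f"{host}:{parts.port}"  # keep an explicit port
    path = parts.path.rstrip("/") or "/"
    # Mask out superfluous terms and sort what remains for a stable order.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SUPERFLUOUS_PARAMS
    )
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


def collapse_urls(urls):
    """Keep the first URL seen for each canonical form, dropping duplicates."""
    first_seen = {}
    for url in urls:
        first_seen.setdefault(canonical_url(url), url)
    return list(first_seen.values())
```

Keeping the first-seen original URL, rather than the canonical string, is a design choice of this sketch; a real system might prefer to store the canonical form.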
16 Claims
1. A system for enhancing metadata, the system comprising:
  a metadata obtaining section that obtains metadata of a file that includes media; and
  a digital storage storing processing instructions;
  one or more processing sections that execute the processing instructions so as to implement functions of an extraction agent that;
    separates noisy items of the metadata into keywords,
    performs, using separated noisy items of the metadata, a full-text query against the potential ground truth database,
    calculates a score quantifying a degree of similarity between the separated noisy items of metadata and potential ground truth data, and
    qualifies the potential ground truth database, based on a comparison of the calculated score to a threshold score, as a ground truth database;
    identifies, in the ground truth database, valid metadata that at least partially matches the obtained metadata; and
    modifies the obtained metadata using at least a portion of the valid metadata so as to generate enhanced metadata, by
      comparing contents in the obtained metadata with corresponding contents in the valid metadata,
      identifying contents in the valid metadata that are not in the obtained metadata, and
      adding the identified contents in the valid metadata to the obtained metadata.
  Dependent claims: 2, 3, 4, 5
6. A method of enhancing metadata, the method comprising:
  separating noisy items of the metadata into keywords,
  performing, using separated noisy items of the metadata, a full-text query against the potential ground truth database,
  calculating a score quantifying a degree of similarity between the separated noisy items of metadata and potential ground truth data, and
  qualifying the potential ground truth database, based on a comparison of the calculated score to a threshold score, as a ground truth database;
  identifying, in the ground truth database, valid metadata that at least partially matches obtained metadata of a file that includes media; and
  modifying the obtained metadata using at least a portion of the valid metadata so as to generate enhanced metadata, by
    comparing contents in the obtained metadata with corresponding contents in the valid metadata,
    identifying contents in the valid metadata that are not in the obtained metadata, and
    adding the identified contents in the valid metadata to the obtained metadata.
  Dependent claims: 7, 8, 9
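The steps recited in claim 6 can be sketched in Python roughly as follows. This is an illustration only, not the patented implementation. The in-memory list of dictionaries standing in for the potential ground truth database, the keyword-overlap score, the default threshold of 0.5, the use of the best-scoring record as the valid metadata, and every function name are assumptions made for the example.

```python
import re


def split_keywords(noisy_items):
    """Separate noisy metadata strings into a set of lowercase keywords."""
    keywords = set()
    for item in noisy_items:
        keywords.update(re.findall(r"[a-z0-9]+", str(item).lower()))
    return keywords


def full_text_query(keywords, database):
    """Return (score, record) pairs, best first, for records sharing a keyword.

    The score is the fraction of query keywords found anywhere in the
    record (an assumed similarity measure).
    """
    hits = []
    for record in database:
        overlap = keywords & split_keywords(record.values())
        if overlap:
            hits.append((len(overlap) / len(keywords), record))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


def enhance(obtained, database, threshold=0.5):
    """Add fields from the best-matching record that the obtained metadata lacks."""
    keywords = split_keywords(obtained.values())
    hits = full_text_query(keywords, database) if keywords else []
    # Qualify the database only if its best score reaches the threshold.
    if not hits or hits[0][0] < threshold:
        return dict(obtained)
    valid = hits[0][1]
    enhanced = dict(obtained)
    for field, value in valid.items():
        if field not in enhanced:  # add only contents that are missing
            enhanced[field] = value
    return enhanced


candidate_db = [
    {"title": "Blue Train", "artist": "John Coltrane", "year": "1957"},
    {"title": "Kind of Blue", "artist": "Miles Davis", "year": "1959"},
]
noisy = {"title": "01_blue-train.mp3", "artist": "coltrane"}
enhanced = enhance(noisy, candidate_db)
```

In this sketch the noisy title and artist are left untouched and only the missing `year` field is added, which mirrors the "adding" language of the claim rather than overwriting existing values.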
10. The system of method 6, wherein the identifying valid metadata includes:
  calculating a score based on a degree of similarity between the obtained metadata and the valid metadata; and
  determining whether the valid metadata at least partially matches the obtained metadata based at least in part on the calculated score.
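One way to picture the score-based matching of claim 10 is a per-field string similarity averaged over the fields two records share. The sketch below uses Python's standard `difflib.SequenceMatcher` for this. The claim does not specify a similarity measure, so the averaging scheme, the 0.6 cut-off, and the names `similarity_score` and `partially_matches` are assumptions.

```python
from difflib import SequenceMatcher


def similarity_score(obtained, valid):
    """Average case-insensitive string similarity over fields present in both records."""
    shared = [field for field in obtained if field in valid]
    if not shared:
        return 0.0
    ratios = [
        SequenceMatcher(
            None, str(obtained[field]).lower(), str(valid[field]).lower()
        ).ratio()
        for field in shared
    ]
    return sum(ratios) / len(ratios)


def partially_matches(obtained, valid, min_score=0.6):
    """Decide whether the valid record at least partially matches, based on the score."""
    return similarity_score(obtained, valid) >= min_score
```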
11. A device for enhancing metadata, the device comprising:
  a metadata obtainer that obtains metadata of a file that includes media;
  a digital storage that stores processing instructions; and
  one or more hardware processors that execute the processing instructions so as to provide functions of an extractor that;
    separates noisy items of the metadata into keywords,
    performs, using separated noisy items of the metadata, a full-text query against the potential ground truth database,
    calculates a score quantifying a degree of similarity between the separated noisy items of metadata and potential ground truth data, and
    qualifies the potential ground truth database, based on a comparison of the calculated score to a threshold score, as a ground truth database;
    identifies, in the ground truth database, valid metadata that at least partially matches the obtained metadata; and
    enhances the obtained metadata using at least a portion of the valid metadata so as to generate enhanced metadata, by
      comparing contents in the obtained metadata with corresponding contents in the valid metadata,
      identifying contents in the valid metadata that are not in the obtained metadata, and
      adding the identified contents in the valid metadata to the obtained metadata.
  Dependent claims: 12, 13, 14, 15
16. A device for enhancing metadata, the device comprising:
  a metadata obtainer that obtains metadata of a file that includes media;
  a digital storage that stores processing instructions;
  one or more hardware processors that execute the processing instructions so as to provide functions of a potential ground truth database qualifier that
    separates noisy items of the metadata into keywords,
    performing, using separated noisy items of the metadata, a full-text query against the potential ground truth database,
    calculating a score quantifying a degree of similarity between the separated noisy items of metadata and potential ground truth data, and
    qualifying the potential ground truth database, based on a comparison of the calculated score to a threshold score, as a ground truth database; and
  one or more hardware processors that execute the processing instructions so as to provide functions of an extractor that;
    identifies, in the ground truth database, valid metadata that at least partially matches the obtained metadata; and
    enhances the obtained metadata using at least a portion of the valid metadata so as to generate enhanced metadata, by
      comparing contents in the obtained metadata with corresponding contents in the valid metadata,
      identifying contents in the valid metadata that do not match contents in the obtained metadata, and
      replacing the non-matching contents in the obtained metadata with the identified contents in the valid metadata.
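Claim 16 differs from the other independent claims in that non-matching contents are replaced rather than missing contents added. A minimal Python sketch of that replacement step might look like this. It assumes metadata is a flat dictionary of fields and that only fields present in both records are compared. Both points, and the name `replace_mismatches`, are assumptions not drawn from the claim.

```python
def replace_mismatches(obtained, valid):
    """Overwrite obtained fields whose values differ from the valid record."""
    enhanced = dict(obtained)
    for field, value in valid.items():
        # Only fields present in both records are compared (an assumption).
        if field in enhanced and enhanced[field] != value:
            enhanced[field] = value
    return enhanced
```

For example, an obtained title of "Blu Trane" would be overwritten by a valid title of "Blue Train", while a `year` field found only in the valid record would be left out under this reading.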
Specification