Hybrid audio-visual categorization system and method
Abstract
Meta-data (tags) for an audiovisual file can be generated by producing an initial estimate of the tags and then revising that estimate (notably to expand it and/or make it more precise), on the assumption that the relationships that hold between the different tags of a set of manually-tagged training examples also hold for the tags of the input file being tagged. The fully-automatic method and system are a hybrid of signal-based and machine-learning approaches, because the initial tag estimate is based on the physical properties of the signal representing the audiovisual file. The initial tag estimate may be produced by inferring that the input content has the same tags as those files of the same kind in the training database that are globally similar to the input audiovisual file in terms of signal properties.
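As a rough illustration of the last sentence of the abstract, the initial estimate can be sketched as a nearest-neighbour vote over a manually-tagged training database. This is a minimal sketch under stated assumptions, not the patented implementation: the Euclidean distance, the value of `k`, the two-element feature vectors, and the names `initial_tag_estimate` and `TRAINING` are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Distance between two feature vectors (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_tag_estimate(features, training_set, k=3):
    """Give the input file the majority tag values of its k nearest training files."""
    neighbours = sorted(
        training_set, key=lambda ex: euclidean(features, ex["features"])
    )[:k]
    attributes = {attr for ex in neighbours for attr in ex["tags"]}
    estimate = {}
    for attr in attributes:
        votes = Counter(ex["tags"][attr] for ex in neighbours if attr in ex["tags"])
        estimate[attr] = votes.most_common(1)[0][0]
    return estimate


# Hypothetical manually-tagged training examples with made-up signal features.
TRAINING = [
    {"features": [0.9, 0.8], "tags": {"genre": "rock", "mood": "energetic"}},
    {"features": [0.8, 0.9], "tags": {"genre": "rock", "mood": "energetic"}},
    {"features": [0.1, 0.2], "tags": {"genre": "classical", "mood": "calm"}},
    {"features": [0.2, 0.1], "tags": {"genre": "classical", "mood": "calm"}},
]

estimate = initial_tag_estimate([0.85, 0.85], TRAINING, k=3)
print(estimate)
```

Here two of the three nearest training files are tagged as rock and energetic, so the input file inherits those values as its initial estimate.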
16 Claims
1. A fully-automatic method of producing a set of tags for an input audiovisual file, the set of tags indicating values of a plurality of attributes of an audiovisual work of a defined type represented by said audiovisual file, the method comprising:
analyzing properties of the audiovisual work represented by said input audiovisual file and evaluating a set of one or more features characterizing said properties of said audiovisual work;
providing an initial estimate of said set of tags by automatically converting said set of features evaluated in the analyzing step to said initial estimate based on first correlations between physical properties of audiovisual works of said defined type and tags applicable to audiovisual works of said defined type;
automatically applying, to the tags of said initial estimate, a set of one or more correlation functions defining a correlation among different tags of a set of training examples, to produce a revised tag estimate, said training examples being audiovisual works of said defined type corresponding to manually-tagged audiovisual files; and
outputting a final result of the applying step as the set of tags for said input audiovisual file;
wherein the correlation-function application step applies said correlation functions selectively to the tags of said initial estimate by applying said correlation functions to tags having a correlation with the physical properties of audiovisual works of said defined type and not applying said correlation functions to tags that are poorly correlated with the physical properties of audiovisual works of said defined type.
Dependent claims: 2, 3, 4, 5.
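One possible reading of the selective application step in claim 1 can be sketched as follows. The claim does not specify the form of the correlation functions, so this example substitutes a simple conditional-frequency lookup over the manually-tagged training examples. The per-tag `reliability` scores (how well each tag is predicted from the signal), the `threshold`, and the name `revise_tags` are hypothetical and exist only for this illustration.

```python
from collections import Counter


def revise_tags(initial, reliability, training_tags, threshold=0.7):
    """Revise an initial tag estimate using tag-to-tag relationships.

    Only tags that are well correlated with the signal (assumed to be
    those with reliability >= threshold) are used as inputs; tags that
    are poorly correlated with the signal are not fed into the lookup.
    """
    trusted = {a: v for a, v in initial.items() if reliability.get(a, 0.0) >= threshold}
    revised = dict(trusted)
    all_attrs = {a for ex in training_tags for a in ex} | set(initial)
    for attr in sorted(all_attrs - set(trusted)):
        # Training examples that agree with the input on every trusted tag.
        matching = [
            ex for ex in training_tags
            if attr in ex and all(ex.get(a) == v for a, v in trusted.items())
        ]
        if matching:
            revised[attr] = Counter(ex[attr] for ex in matching).most_common(1)[0][0]
        elif attr in initial:
            revised[attr] = initial[attr]  # no evidence: keep the initial value
    return revised


# Hypothetical manually-assigned tags of the training examples.
TRAINING_TAGS = [
    {"genre": "rock", "mood": "energetic", "setting": "party"},
    {"genre": "rock", "mood": "energetic", "setting": "party"},
    {"genre": "rock", "mood": "energetic", "setting": "driving"},
    {"genre": "classical", "mood": "calm", "setting": "study"},
]

initial = {"genre": "rock", "mood": "energetic", "setting": "study"}
reliability = {"genre": 0.9, "mood": 0.8, "setting": 0.3}  # made-up scores

revised = revise_tags(initial, reliability, TRAINING_TAGS)
print(revised)
```

In this toy run the signal-derived genre and mood tags are trusted, while the weakly signal-correlated setting tag is replaced by the value most often seen alongside them in the training examples.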
6. A non-transitory computer readable medium storing a computer program having a set of instructions which, executed by a computer apparatus, cause the computer apparatus to perform a fully-automatic method of producing a set of tags for an input audiovisual file, the set of tags indicating values of a plurality of attributes of an audiovisual work of defined type represented by said audiovisual file, the method comprising:
analyzing properties of the audiovisual work represented by said input audiovisual file and evaluating a set of one or more features characterizing said properties of said audiovisual work;
providing an initial estimate of said set of tags by automatically converting said set of features evaluated in the analyzing step to said initial estimate based on first correlations between physical properties of audiovisual works of said defined type and tags applicable to audiovisual works of said defined type;
automatically applying, to the tags of said initial estimate, a set of one or more correlation functions defining a correlation among different tags of a set of training examples, to produce a revised tag estimate, said training examples being audiovisual works of said defined type corresponding to manually-tagged audiovisual files; and
outputting a final result of the applying step as the set of tags for said input audiovisual file;
wherein the correlation-function application step applies said correlation functions selectively to the tags of said initial estimate by applying said correlation functions to tags having a correlation with the physical properties of audiovisual works of said defined type and not applying said correlation functions to tags that are poorly correlated with the physical properties of audiovisual works of said defined type.
Dependent claims: 7, 8, 9, 10.
11. A fully-automatic audiovisual-file-tagging system implemented by an information processing apparatus configured to output a set of tags for an input audiovisual file that indicate values of a plurality of attributes of an audiovisual work of a defined type represented by said audiovisual file, the system comprising:
an analyzing unit configured to analyze properties of the audiovisual work represented by said input audiovisual file and evaluate a set of one or more features that characterize said properties of said audiovisual work;
an initial estimate providing unit configured to provide an initial estimate of said set of tags by the automatic conversion of said set of features evaluated by the analyzing unit to said initial estimate based on first correlations between physical properties of audiovisual works of said defined type and tags applicable to audiovisual works of said defined type;
a correlation function application unit configured to automatically apply, to the tags of said initial estimate, a set of one or more correlation functions configured to define a correlation among different tags of a set of training examples, to produce a revised tag estimate, wherein said training examples are audiovisual works of said defined type that correspond to manually-tagged audiovisual files; and
a final result outputting unit configured to output a final result of the correlation function application unit as the set of tags for said input audiovisual file;
wherein the correlation function application unit is further configured to apply said correlation functions selectively to the tags of said initial estimate by applying said correlation functions to tags having a correlation with the physical properties of audiovisual works of said defined type and not applying said correlation functions to tags that are poorly correlated with the physical properties of audiovisual works of said defined type.
Dependent claims: 12, 13, 14, 15, 16.
Specification