Information processing device, information processing method and program
First Claim
1. An information processing device, comprising:
a learning module configured to extract an image feature amount of each frame of an image of learning content and extracting word frequency information regarding a frequency of appearance of each word in a description text describing a content of the image of the learning content as a text feature amount of the description text, wherein the learning content includes a text of a caption, and wherein the description text is the text of the caption included in the learning content, and learn an annotation model, which is a multi-stream HMM (Hidden Markov Model), by using an annotation sequence for annotation, which is a multi-stream including the image feature amount and the text feature amount; and
a browsing controller configured to extract a scene, which is a group of one or more temporally continuous frames, from target content from which the scene is to be extracted by using the annotation model, and display representative images of scenes so as to be arranged in chronological order.
Abstract
An information processing device, method, and program add an annotation to content and provide an application that utilizes the annotation. A learning module extracts an image feature amount of each frame of an image of learning content, extracts word frequency information regarding a frequency of appearance of each word in a description text as a text feature amount, and learns an annotation model, which is a multi-stream Hidden Markov Model (HMM), by using a multi-stream including the image feature amount and the text feature amount. A browsing controller extracts a scene, which is a group of one or more temporally continuous frames, from target content by using the annotation model and displays representative images of the scenes arranged in chronological order.
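As a rough illustration of two steps named in the abstract, the sketch below computes a word-frequency vector from caption text and groups temporally continuous frames into scenes. It is not the patented implementation. The function names, the fixed `vocabulary`, the simple tokenizer, and the idea that a scene is a run of frames sharing the same decoded model state are all assumptions made for this example; the abstract itself only says that scenes are extracted "by using the annotation model".

```python
from collections import Counter
from itertools import groupby
import re


def word_frequencies(caption_text, vocabulary):
    """Return a normalized word-frequency vector over a fixed vocabulary.

    `vocabulary` is a hypothetical, pre-chosen list of words; the source
    does not say how the word set is selected.
    """
    words = re.findall(r"[a-z']+", caption_text.lower())
    counts = Counter(w for w in words if w in vocabulary)
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in vocabulary]


def extract_scenes(state_sequence):
    """Group consecutive frames with the same state label into scenes.

    Assumption: `state_sequence` holds one state label per frame (for
    example, from decoding the target content with a trained model).
    Returns (state, first_frame, last_frame) tuples in chronological order.
    """
    scenes = []
    frame = 0
    for state, run in groupby(state_sequence):
        length = len(list(run))
        scenes.append((state, frame, frame + length - 1))
        frame += length
    return scenes
```

Because `groupby` walks the sequence in order, the returned scenes are already chronological, which matches the display order described for the representative images.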
18 Claims
1. An information processing device, comprising:
a learning module configured to extract an image feature amount of each frame of an image of learning content and extracting word frequency information regarding a frequency of appearance of each word in a description text describing a content of the image of the learning content as a text feature amount of the description text, wherein the learning content includes a text of a caption, and wherein the description text is the text of the caption included in the learning content, and learn an annotation model, which is a multi-stream HMM (Hidden Markov Model), by using an annotation sequence for annotation, which is a multi-stream including the image feature amount and the text feature amount; and
a browsing controller configured to extract a scene, which is a group of one or more temporally continuous frames, from target content from which the scene is to be extracted by using the annotation model, and display representative images of scenes so as to be arranged in chronological order.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. An information processing method to be performed by an information processing device, comprising the steps of:
extracting an image feature amount of each frame of an image of learning content and extracting word frequency information regarding a frequency of appearance of each word in a description text describing a content of the image of the learning content as a text feature amount of the description text, wherein the learning content includes a text of a caption, and wherein the description text is the text of the caption included in the learning content; learning an annotation model, which is a multi-stream HMM (Hidden Markov Model), by using an annotation sequence for annotation, which is a multi-stream including the image feature amount and the text feature amount; extracting a scene, which is a group of one or more temporally continuous frames, from target content from which the scene is to be extracted by using the annotation model; and displaying representative images of scenes so as to be arranged in chronological order.
18. A non-transitory computer-readable medium having a set of computer-executable instructions embodied thereon to perform a method in a computing device comprising:
extract an image feature amount of each frame of an image of learning content and extracting word frequency information regarding a frequency of appearance of each word in a description text describing a content of the image of the learning content as a text feature amount of the description text, wherein the learning content includes a text of a caption, and wherein the description text is the text of the caption included in the learning content; learn an annotation model, which is a multi-stream HMM (Hidden Markov Model), by using an annotation sequence for annotation, which is a multi-stream including the image feature amount and the text feature amount; extract a scene, which is a group of one or more temporally continuous frames, from target content from which the scene is to be extracted by using the annotation model; and display representative images of scenes so as to be arranged in chronological order.
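For orientation, a multi-stream HMM is commonly formulated so that each state scores every stream separately and combines the scores with per-stream weights. The expression below is that general textbook form, not a formula taken from the claims; the symbols b_j, o_[m], W_m and M are notation assumed here. For the annotation sequence recited above, M = 2, with one stream for the image feature amount and one for the text feature amount.

```latex
% Output probability of state s_j for a multi-stream observation
% (o_{[1]}, ..., o_{[M]}); W_m is the weight assigned to stream m.
b_j\bigl(o_{[1]}, \dots, o_{[M]}\bigr)
  = \prod_{m=1}^{M} \bigl( b_{[m]j}(o_{[m]}) \bigr)^{W_m},
\qquad W_m \ge 0, \quad \sum_{m=1}^{M} W_m = 1
```

Under this form, setting one stream's weight to zero makes the state score depend only on the other stream. This is one way a model trained on both streams could still score content that has no caption text, though the claims do not say how such content is handled.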
Specification