System and method for rich media annotation
Abstract
Disclosed herein are systems, methods, and computer-readable media for rich media annotation, the method comprising receiving first recorded media content, receiving at least one audio annotation about the first recorded media content, extracting metadata from the at least one audio annotation, and associating all or part of the metadata with the first recorded media content. Additional data elements may also be associated with the first recorded media content. Where the audio annotation is a telephone conversation, the recorded media content may be captured via the telephone. The recorded media content, audio annotations, and/or metadata may be stored in a central, modifiable repository. Speech characteristics such as prosody may be analyzed to extract additional metadata. In one aspect, a specially trained grammar identifies and recognizes metadata.
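The parsing step described above can be sketched in Python. This is a purely illustrative reduction of the claimed "grammar": the category names and keyword lists below are hypothetical stand-ins for a trained speech-recognition grammar, not part of the disclosure.

```python
import re

# Hypothetical keyword "grammar": maps a metadata category to phrases the
# parser recognizes. A deployed system would use a trained grammar; these
# lists exist only to make the sketch runnable.
GRAMMAR = {
    "location": ["beach", "park", "office"],
    "individual": ["alice", "bob"],
    "activity": ["hiking", "birthday", "meeting"],
}

def parse_annotation(text):
    """Extract {category: [terms]} metadata from recognized annotation text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    metadata = {}
    for category, terms in GRAMMAR.items():
        hits = [t for t in tokens if t in terms]
        if hits:
            metadata[category] = hits
    return metadata
```

For example, the transcript "Alice and Bob went hiking near the beach" would yield location, individual, and activity metadata in one pass.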
18 Claims
1. A method comprising:
receiving, at a processor, first media content for annotation by the processor based on an associated audio annotation;
identifying, by the processor, the audio annotation, wherein the audio annotation is stored within a digital file;
generating, via the processor, text from the audio annotation using a grammar;
parsing, via the processor, the text to identify first metadata indicative of one or more of a location, an individual, or an activity;
based on weights assigned to the first metadata, identifying, via the processor, second media content having associated weighted second metadata corresponding to the first metadata;
generating, via the processor, at least one descriptive annotation for the first media content based on the first metadata and the second metadata;
receiving, by the processor, additional metadata via a dialog with a user who recorded the first media content;
annotating, via the processor, the first media content using the at least one descriptive annotation and the additional metadata to yield annotated first media content; and
providing the annotated first media content to users.
Dependent claims: 2, 3, 4, 5, 6.
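The claim's weighted-matching step — using weights assigned to the first metadata to identify second media content whose metadata corresponds — can be sketched as a simple scoring function. The field names, weights, and candidate library below are illustrative assumptions, not the patented implementation.

```python
def best_match(first_metadata, weights, candidates):
    """Return the candidate whose metadata best matches, by summed weights.

    first_metadata: {category: value} for the media being annotated
    weights:        {category: float} importance assigned to each category
    candidates:     [{"id": ..., "metadata": {category: value}}, ...]
    """
    def score(item):
        # Sum the weight of every category whose value matches exactly.
        return sum(weights.get(cat, 0.0)
                   for cat, val in first_metadata.items()
                   if item["metadata"].get(cat) == val)
    return max(candidates, key=score)

# Illustrative media library and query.
library = [
    {"id": "img_07", "metadata": {"location": "beach", "individual": "alice"}},
    {"id": "img_12", "metadata": {"location": "office", "activity": "meeting"}},
]
match = best_match({"location": "beach", "activity": "hiking"},
                   {"location": 2.0, "individual": 1.0, "activity": 1.5},
                   library)
```

Here the beach photo wins on the weighted location match even though its activity metadata does not correspond.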
7. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
receiving first media content for annotation by the processor based on an associated audio annotation;
identifying the audio annotation, wherein the audio annotation is stored within a digital file;
generating text from the audio annotation using a grammar;
parsing the text to identify first metadata indicative of one or more of a location, an individual, or an activity;
based on weights assigned to the first metadata, identifying second media content having associated weighted second metadata corresponding to the first metadata;
generating at least one descriptive annotation for the first media content based on the first metadata and the second metadata;
receiving additional metadata via a dialog with a user who recorded the first media content;
annotating the first media content using the at least one descriptive annotation and the additional metadata to yield annotated first media content; and
providing the annotated first media content to users.
Dependent claims: 8, 9, 10, 11, 12.
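The final generation step — combining the first metadata, the second media's metadata, and any additional metadata gathered through the user dialog into a descriptive annotation — can be sketched as a merge with a fixed precedence. The precedence order and string format are assumptions made for illustration.

```python
def descriptive_annotation(first_metadata, second_metadata, additional=None):
    """Merge metadata sources into one descriptive annotation string.

    second_metadata fills gaps, first_metadata wins conflicts, and any
    additional (dialog-supplied) metadata overrides both. Illustrative only.
    """
    merged = dict(second_metadata)
    merged.update(first_metadata)
    if additional:
        merged.update(additional)
    return "; ".join(f"{k}: {v}" for k, v in sorted(merged.items()))
```

For instance, a dialog-supplied activity is appended to the location extracted from the audio annotation and the individual inherited from the matched second media content.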
13. A non-transitory computer-readable storage device having instructions for performing annotation stored thereon which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving first media content for annotation by the computing device based on an associated audio annotation;
identifying the audio annotation, wherein the audio annotation is stored within a digital file;
generating text from the audio annotation using a grammar;
parsing the text to identify first metadata indicative of one or more of a location, an individual, or an activity;
based on weights assigned to the first metadata, identifying second media content having associated weighted second metadata corresponding to the first metadata;
generating at least one descriptive annotation for the first media content based on the first metadata and the second metadata;
receiving additional metadata via a dialog with a user who recorded the first media content;
annotating the first media content using the at least one descriptive annotation and the additional metadata to yield annotated first media content; and
providing the annotated first media content to users.
Dependent claims: 14, 15, 16, 17, 18.
Specification