TEXTUAL ANNOTATION OF ACOUSTIC EFFECTS
Abstract
Accommodation for color or visual impairments may be implemented by selective color substitution. A color accommodation module receives an image frame from a host system and generates a color-adapted version of the image frame. The color accommodation module may include a rule based filter that substitutes one or more colors within the image frame with one or more corresponding alternative colors.
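The abstract's rule-based color substitution can be sketched roughly as follows. The rule table, tolerance value, and function names below are illustrative assumptions, not the patent's implementation:

```python
# Hypothetical substitution rules: map a problem color to an alternative
# that remains distinguishable for the viewer. The specific colors and
# tolerance are illustrative assumptions only.
SUBSTITUTION_RULES = {
    (255, 0, 0): (255, 128, 0),   # pure red -> orange
    (0, 128, 0): (0, 0, 255),     # mid green -> blue
}
TOLERANCE = 30  # max per-channel distance for a rule to apply


def _matches(pixel, color):
    """True if every channel of pixel is within TOLERANCE of color."""
    return all(abs(p - c) <= TOLERANCE for p, c in zip(pixel, color))


def color_accommodate(frame):
    """Return a color-adapted copy of a frame given as rows of (r, g, b) tuples."""
    adapted = []
    for row in frame:
        new_row = []
        for pixel in row:
            for src, dst in SUBSTITUTION_RULES.items():
                if _matches(pixel, src):
                    pixel = dst  # substitute the alternative color
                    break
            new_row.append(pixel)
        adapted.append(new_row)
    return adapted
```

A near-red pixel such as (250, 10, 5) would be replaced by the orange alternative, while pixels matching no rule pass through unchanged.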
17 Claims
1. A system for enhancing the accessibility of audio-visual content, the system comprising:
an acoustic effects annotation module configured to classify primary audio events occurring within an audio segment to generate one or more tags describing the primary audio events occurring within the audio segment.
2. The system of claim 1 wherein the one or more primary audio events include the top three most important sounds within the audio segment.
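Claims 1 and 2 can be read as scoring candidate sound classes for a segment and keeping the highest-scoring three as tags. A minimal sketch, assuming per-class confidence scores are available from some upstream classifier (the labels and scoring API are hypothetical):

```python
import heapq


def tag_audio_segment(class_scores, top_k=3):
    """Return tags for the top_k highest-scoring sound classes.

    class_scores: dict mapping a sound label (e.g. "gunshot") to a
    classifier confidence in [0, 1]. The labels and the dict-based
    interface are illustrative; the claims do not specify a scoring API.
    """
    top = heapq.nlargest(top_k, class_scores.items(), key=lambda kv: kv[1])
    return [label for label, _score in top]
```

For example, scores of {"gunshot": 0.9, "footsteps": 0.7, "rain": 0.4, "music": 0.1} would yield the tags ["gunshot", "footsteps", "rain"], dropping the least important sound.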
3. The system of claim 1 wherein the audio segment is a clip of video game audio having multiple sounds associated with multiple sources.
4. The system of claim 1 wherein the acoustic effects annotation module includes a neural network configured to classify the primary audio events occurring within the audio segment and wherein the neural network is trained with both supervised and unsupervised learning techniques.
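Claim 4's combination of supervised and unsupervised training is commonly realized as unsupervised pre-training followed by supervised fitting. The toy stand-in below learns feature statistics from unlabeled segments (the unsupervised phase) and then fits per-label centroids on labeled segments (the supervised phase); it is illustrative only and not the claimed neural network:

```python
import math


def fit_normalizer(unlabeled_features):
    """Unsupervised phase: learn per-feature mean/std from unlabeled segments."""
    n = len(unlabeled_features)
    dims = len(unlabeled_features[0])
    means = [sum(f[d] for f in unlabeled_features) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in unlabeled_features) / n) or 1.0
        for d in range(dims)
    ]
    return means, stds


def fit_centroids(labeled_features, means, stds):
    """Supervised phase: one normalized centroid per sound label."""
    def norm(f):
        return [(x - m) / s for x, m, s in zip(f, means, stds)]

    groups = {}
    for feats, label in labeled_features:
        groups.setdefault(label, []).append(norm(feats))
    return {
        label: [sum(col) / len(col) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def classify(feats, centroids, means, stds):
    """Label a segment's features by the nearest centroid."""
    v = [(x - m) / s for x, m, s in zip(feats, means, stds)]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(v, centroids[lab])),
    )
```

The design point is that the normalizer needs no labels, so it can be fit on plentiful unlabeled audio, while the centroid fit uses the smaller labeled set.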
5. The system of claim 4 wherein a duration of the audio segment is less than or equal to a time for the neural network to classify the primary audio events occurring within the audio segment.
6. The system of claim 1, further comprising a controller coupled to the acoustic effects annotation module, wherein the controller is configured to provide the one or more tags to a host system for display on a display screen and synchronize the output of the acoustic effects annotation module with one or more other neural network modules.
7. The system of claim 6 wherein the one or more other neural network modules include a Graphical Style Modification module configured to apply a style adapted from a reference image frame to a source image frame, wherein the source image frame is synchronized to appear during the audio segment.
8. The system of claim 1, further comprising a controller coupled to a host system and the acoustic effects annotation module, wherein the controller is configured to synchronize presentation of text corresponding to the one or more tags with display of a sequence of image frames associated with the audio segment.
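Claims 6 and 8 describe synchronizing tag text with the image frames displayed during the tagged segment. One way to sketch that alignment, assuming each tag carries the start and end time of its audio segment and the video has a fixed frame rate (both assumptions not stated in the claims):

```python
def tags_for_frame(frame_index, fps, tag_spans):
    """Return the tag texts whose audio segment overlaps this frame.

    tag_spans: list of (start_sec, end_sec, tag_text) tuples; the tuple
    layout and the fps parameter are illustrative assumptions.
    """
    t = frame_index / fps  # presentation time of the frame
    return [text for start, end, text in tag_spans if start <= t < end]
```

At 30 fps, frame 45 presents at 1.5 s, so a tag spanning 1.0-2.0 s would be shown with it while tags for other segments would not.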
9. A method for enhancing accessibility of audio-visual content, comprising:
classifying, with an acoustic effects annotation module, primary audio events occurring within an audio segment to generate one or more tags describing the primary audio events occurring within the audio segment.
10. The method of claim 9 wherein the one or more primary audio events include the top three most important sounds within the audio segment.
11. The method of claim 9 wherein the audio segment is a clip of video game audio having multiple sounds associated with multiple sources.
12. The method of claim 9 wherein classifying primary audio events occurring within the audio segment with the acoustic effects annotation module includes using a neural network to classify the primary audio events occurring within the audio segment and wherein the neural network is trained with both supervised and unsupervised learning techniques.
13. The method of claim 12 wherein a duration of the audio segment is less than or equal to a time for the neural network to classify the primary audio events occurring within the audio segment.
14. The method of claim 9, further comprising providing the one or more tags to a host system for display on a display screen and synchronizing the output of the acoustic effects annotation module with one or more other neural network modules with a controller coupled to the acoustic effects annotation module.
15. The method of claim 14 wherein the one or more other neural network modules include a Graphical Style Modification module configured to apply a style adapted from a reference image frame to a source image frame, wherein the source image frame is synchronized to appear during the audio segment.
16. The method of claim 9, further comprising synchronizing, with a controller coupled to a host system and the acoustic effects annotation module, presentation of text corresponding to the one or more tags with display of a sequence of image frames associated with the audio segment.
17. A non-transitory computer-readable medium having computer-readable instructions embodied therein, the instructions being configured upon execution to implement a method for enhancing accessibility of audio-visual content, the method comprising classifying, with an audio description module, primary audio events occurring within an audio segment to generate one or more tags describing the primary audio events occurring within the audio segment.