Audio tagging
First Claim
1. A method implemented by a data processing apparatus, the method comprising:
- obtaining, at one or more processors, an audio message associated with one or more image files, wherein the obtaining comprises:
detecting that a first image file is being displayed on a device associated with a user, determining a first period of time when the first image file is displayed on the device associated with the user, and time stamping the obtained audio message;
processing, at the one or more processors, the audio message using speech recognition technology to detect a text component of the audio message;
determining, at the one or more processors, one or more textual tags for the one or more image files based on the detected text component, wherein the determining comprises determining a first portion of the detected text component corresponding to the first period of time using the time stamps of the obtained audio message and identifying a first set of the one or more textual tags that were determined based on the first portion of the detected text component; and
assigning, at the one or more processors, the one or more textual tags to the one or more image files, wherein the assigning comprises assigning one or more of the textual tags from the first set of the one or more textual tags to the first image file.
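The time-stamp alignment recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the speech recognizer has already produced words with start and end times, and that the device has logged when each image was on screen. The names `Word`, `DisplayInterval`, `STOP_WORDS`, `words_in_interval`, `tags_from_words`, and `assign_tags` are hypothetical, and the stop-word filter is only a stand-in for whatever tag determination an actual system would use.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with time stamps (seconds from start of recording)."""
    text: str
    start: float
    end: float


@dataclass
class DisplayInterval:
    """Period of time during which an image file was displayed (assumed logged by the device)."""
    image_id: str
    start: float
    end: float


# Hypothetical stop-word list; a real system would likely use something richer.
STOP_WORDS = {"a", "an", "and", "at", "in", "is", "of", "on", "the", "this", "with"}


def words_in_interval(words, interval):
    """Return the portion of the text whose words start inside the display interval."""
    return [w for w in words if interval.start <= w.start < interval.end]


def tags_from_words(words):
    """Naive tag determination: lowercase, strip punctuation, drop stop words, dedupe in order."""
    tags = []
    for w in words:
        token = w.text.lower().strip(".,!?")
        if token and token not in STOP_WORDS and token not in tags:
            tags.append(token)
    return tags


def assign_tags(words, intervals):
    """Map each image to the tags derived from the words spoken while it was displayed."""
    assignments = {}
    for interval in intervals:
        image_tags = assignments.setdefault(interval.image_id, [])
        for tag in tags_from_words(words_in_interval(words, interval)):
            if tag not in image_tags:
                image_tags.append(tag)
    return assignments


# Example: the user talks while two images are shown one after the other.
words = [
    Word("This", 0.0, 0.3), Word("is", 0.3, 0.5), Word("Grandma", 0.5, 1.0),
    Word("at", 1.0, 1.2), Word("the", 1.2, 1.3), Word("beach.", 1.3, 1.9),
    Word("Birthday", 5.2, 5.9), Word("cake", 5.9, 6.4),
]
intervals = [
    DisplayInterval("img_001.jpg", 0.0, 5.0),
    DisplayInterval("img_002.jpg", 5.0, 10.0),
]
result = assign_tags(words, intervals)
# result == {"img_001.jpg": ["grandma", "beach"], "img_002.jpg": ["birthday", "cake"]}
```

In this sketch a word belongs to an image when its start time falls inside that image's display period; other choices, such as overlap-based matching, would fit the claim language equally well.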
Abstract
Various embodiments are provided for enabling audio tagging of image files. The system obtains audio messages, usually by recording an audio message from a user, and converts them into textual tags using speech recognition technology. In some implementations, semantic analysis of the text component of these messages is performed. In some implementations, the textual tags are then propagated to other image files associated with the user.
27 Claims
1. A method implemented by a data processing apparatus, the method comprising:
obtaining, at one or more processors, an audio message associated with one or more image files, wherein the obtaining comprises:
detecting that a first image file is being displayed on a device associated with a user, determining a first period of time when the first image file is displayed on the device associated with the user, and time stamping the obtained audio message;
processing, at the one or more processors, the audio message using speech recognition technology to detect a text component of the audio message;
determining, at the one or more processors, one or more textual tags for the one or more image files based on the detected text component, wherein the determining comprises determining a first portion of the detected text component corresponding to the first period of time using the time stamps of the obtained audio message and identifying a first set of the one or more textual tags that were determined based on the first portion of the detected text component; and
assigning, at the one or more processors, the one or more textual tags to the one or more image files, wherein the assigning comprises assigning one or more of the textual tags from the first set of the one or more textual tags to the first image file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system comprising:
a machine-readable storage device having instructions stored thereon; and
a data processing apparatus in communication with the machine-readable storage device and operable to execute the instructions to perform operations comprising:
obtaining an audio message associated with one or more image files, wherein the obtaining comprises detecting that a first image file is being displayed on a device associated with a user, determining a first period of time when the first image file is displayed on the device associated with the user, and time stamping the obtained audio message;
processing the audio message using speech recognition technology to detect a text component of the audio message;
determining one or more textual tags for the one or more image files based on the detected text component, wherein the determining comprises determining a first portion of the detected text component corresponding to the first period of time using the time stamps of the obtained audio message and identifying a first set of the one or more textual tags that were determined based on the first portion of the detected text component; and
assigning the one or more textual tags to the one or more image files, wherein the assigning comprises assigning one or more of the textual tags from the first set of the one or more textual tags to the first image file.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A storage device having instructions stored thereon that, when executed by a data processing apparatus, cause the data processing apparatus to perform operations comprising:
obtaining an audio message associated with one or more image files, wherein the obtaining comprises detecting that a first image file is being displayed on a device associated with a user, determining a first period of time when the first image file is displayed on the device associated with the user, and time stamping the obtained audio message;
processing the audio message using speech recognition technology to detect a text component of the audio message;
determining one or more textual tags for the one or more image files based on the detected text component, wherein the determining comprises determining a first portion of the detected text component corresponding to the first period of time using the time stamps of the obtained audio message and identifying a first set of the one or more textual tags that were determined based on the first portion of the detected text component; and
assigning the one or more textual tags to the one or more image files, wherein the assigning comprises assigning one or more of the textual tags from the first set of the one or more textual tags to the first image file.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification