Training image-recognition systems using a joint embedding model on online social networks
First Claim
1. A method comprising, by one or more computing systems:
identifying a shared visual concept in two or more visual-media items, wherein each visual-media item comprises one or more images, each image comprising one or more visual features, and wherein each visual-media item comprises one or more visual concepts, the shared visual concept being identified based on one or more shared visual features in the respective images of the visual-media items;
extracting, for each of the visual-media items, one or more n-grams from one or more communications associated with the visual-media item;
generating, in a d-dimensional space, an embedding for each of the visual-media items, wherein a location of the embedding for the visual-media item is based on the one or more visual concepts included in the visual-media item;
generating, in the d-dimensional space, an embedding for each of the extracted n-grams, wherein a location of the embedding for the n-gram is based on a frequency of occurrence of the n-gram in the communications associated with the visual-media items;
associating with the shared visual concept, one or more of the extracted n-grams that have embeddings within a threshold area of the embeddings for the identified visual-media items;
populating a visual-concept index that indexes visual concepts with their respective associated n-grams;
receiving, from a client system of a user, a search query comprising one or more n-grams;
determining, based on the visual-concept index, one or more visual concepts associated with the n-grams of the search query; and
sending, to the client system of the user, one or more search results comprising visual-media items in which the determined visual concepts are identified.
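The last four steps of the claim (populating the visual-concept index, receiving a query, determining the matching concepts, and returning items) can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from the patent: the function names `populate_index` and `search`, the `items_by_concept` mapping, the whitespace tokenizer, and the toy data are assumptions made only for illustration.

```python
from collections import defaultdict


def populate_index(associations):
    """Build an n-gram -> visual-concepts lookup from concept -> n-grams associations."""
    index = defaultdict(set)
    for concept, ngrams in associations.items():
        for ngram in ngrams:
            index[ngram.lower()].add(concept)
    return index


def search(query, index, items_by_concept, max_n=2):
    """Return the visual-media items whose concepts match any n-gram of the query."""
    tokens = query.lower().split()  # assumption: naive whitespace tokenization
    concepts = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            # .get() avoids inserting empty entries into the defaultdict
            concepts |= index.get(" ".join(tokens[i:i + n]), set())
    results = set()
    for concept in concepts:
        results |= set(items_by_concept.get(concept, ()))
    return sorted(results)


# Toy data (hypothetical): concepts with their associated n-grams,
# and the items in which each concept was identified.
associations = {"dog": ["dog", "puppy", "cute dog"], "car": ["car", "road trip"]}
items_by_concept = {"dog": ["photo1", "photo2"], "car": ["photo3"]}
index = populate_index(associations)
```

With this toy data, a query such as "cute puppy pics" matches the concept "dog" through the unigram "puppy", while "road trip" matches "car" only as a bigram, which is why the lookup tries every n-gram length up to `max_n`.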
2 Assignments
0 Petitions
Abstract
In one embodiment, a method includes identifying a shared visual concept in visual-media items based on shared visual features in images of the visual-media items; extracting, for each of the visual-media items, n-grams from communications associated with the visual-media item; generating, in a d-dimensional space, an embedding for each of the visual-media items at a location based on the visual concepts included in the visual-media item; generating, in the d-dimensional space, an embedding for each of the extracted n-grams at a location based on a frequency of occurrence of the n-gram in the communications associated with the visual-media items; and associating, with the shared visual concept, the extracted n-grams that have embeddings within a threshold area of the embeddings for the identified visual-media items.
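The extraction, embedding, and association steps summarized in the abstract can be sketched roughly as follows. This is an illustrative stand-in, not the patented method: the item embeddings are hand-written rather than learned, each n-gram is placed at the count-weighted mean of the embeddings of the items whose communications contain it, and the "threshold area" is assumed to mean a Euclidean distance to every identified item. All function names and data are hypothetical.

```python
import math
from collections import Counter


def extract_ngrams(text, max_n=2):
    """Return all word n-grams of length 1..max_n from a communication."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def embed_ngrams(item_embeddings, communications, max_n=2):
    """Place each n-gram at the count-weighted mean of the item embeddings it co-occurs with."""
    sums, totals = {}, Counter()
    for item_id, texts in communications.items():
        counts = Counter(ng for t in texts for ng in extract_ngrams(t, max_n))
        vec = item_embeddings[item_id]
        for ngram, count in counts.items():
            acc = sums.setdefault(ngram, [0.0] * len(vec))
            for k, x in enumerate(vec):
                acc[k] += count * x
            totals[ngram] += count
    return {ng: [x / totals[ng] for x in acc] for ng, acc in sums.items()}


def associate(concept_items, item_embeddings, ngram_embeddings, threshold):
    """Return n-grams lying within `threshold` (Euclidean) of every item sharing the concept."""
    anchors = [item_embeddings[i] for i in concept_items]
    return sorted(ng for ng, v in ngram_embeddings.items()
                  if all(math.dist(v, a) <= threshold for a in anchors))


# Toy data (hypothetical): 2-dimensional embeddings and one comment per item.
item_embeddings = {"photo1": [1.0, 0.0], "photo2": [0.9, 0.1], "photo3": [0.0, 1.0]}
communications = {"photo1": ["cute dog"], "photo2": ["my dog"], "photo3": ["my car"]}
ngram_embeddings = embed_ngrams(item_embeddings, communications)
dog_ngrams = associate(["photo1", "photo2"], item_embeddings, ngram_embeddings, 0.2)
```

In this toy setup the n-gram "my" appears with both a dog photo and a car photo, so its embedding lands between the two clusters and falls outside the threshold, whereas "dog" and "cute dog" stay close to the items that share the concept.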
198 Citations
20 Claims
1. A method comprising, by one or more computing systems:
identifying a shared visual concept in two or more visual-media items, wherein each visual-media item comprises one or more images, each image comprising one or more visual features, and wherein each visual-media item comprises one or more visual concepts, the shared visual concept being identified based on one or more shared visual features in the respective images of the visual-media items;
extracting, for each of the visual-media items, one or more n-grams from one or more communications associated with the visual-media item;
generating, in a d-dimensional space, an embedding for each of the visual-media items, wherein a location of the embedding for the visual-media item is based on the one or more visual concepts included in the visual-media item;
generating, in the d-dimensional space, an embedding for each of the extracted n-grams, wherein a location of the embedding for the n-gram is based on a frequency of occurrence of the n-gram in the communications associated with the visual-media items;
associating with the shared visual concept, one or more of the extracted n-grams that have embeddings within a threshold area of the embeddings for the identified visual-media items;
populating a visual-concept index that indexes visual concepts with their respective associated n-grams;
receiving, from a client system of a user, a search query comprising one or more n-grams;
determining, based on the visual-concept index, one or more visual concepts associated with the n-grams of the search query; and
sending, to the client system of the user, one or more search results comprising visual-media items in which the determined visual concepts are identified.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
identify a shared visual concept in two or more visual-media items, wherein each visual-media item comprises one or more images, each image comprising one or more visual features, and wherein each visual-media item comprises one or more visual concepts, the shared visual concept being identified based on one or more shared visual features in the respective images of the visual-media items;
extract, for each of the visual-media items, one or more n-grams from one or more communications associated with the visual-media item;
generate, in a d-dimensional space, an embedding for each of the visual-media items, wherein a location of the embedding for the visual-media item is based on the one or more visual concepts included in the visual-media item;
generate, in the d-dimensional space, an embedding for each of the extracted n-grams, wherein a location of the embedding for the n-gram is based on a frequency of occurrence of the n-gram in the communications associated with the visual-media items;
associate, with the shared visual concept, one or more of the extracted n-grams that have embeddings within a threshold area of the embeddings for the identified visual-media items;
populate a visual-concept index that indexes visual concepts with their respective associated n-grams;
receive, from a client system of a user, a search query comprising one or more n-grams;
determine, based on the visual-concept index, one or more visual concepts associated with the n-grams of the search query; and
send, to the client system of the user, one or more search results comprising visual-media items in which the determined visual concepts are identified.
Dependent claims: 16, 17, 19, 20
18. A system comprising:
one or more processors; and
a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to:
identify a shared visual concept in two or more visual-media items, wherein each visual-media item comprises one or more images, each image comprising one or more visual features, and wherein each visual-media item comprises one or more visual concepts, the shared visual concept being identified based on one or more shared visual features in the respective images of the visual-media items;
extract, for each of the visual-media items, one or more n-grams from one or more communications associated with the visual-media item;
generate, in a d-dimensional space, an embedding for each of the visual-media items, wherein a location of the embedding for the visual-media item is based on the one or more visual concepts included in the visual-media item;
generate, in the d-dimensional space, an embedding for each of the extracted n-grams, wherein a location of the embedding for the n-gram is based on a frequency of occurrence of the n-gram in the communications associated with the visual-media items;
associate with the shared visual concept, one or more of the extracted n-grams that have embeddings within a threshold area of the embeddings for the identified visual-media items;
populate a visual-concept index that indexes visual concepts with their respective associated n-grams;
receive, from a client system of a user, a search query comprising one or more n-grams;
determine, based on the visual-concept index, one or more visual concepts associated with the n-grams of the search query; and
send, to the client system of the user, one or more search results comprising visual-media items in which the determined visual concepts are identified.
Specification