MACHINE LEARNING IMAGE PROCESSING
First Claim
1. A machine learning image processing system comprising:
a data repository storing images and tags for each image, wherein the tags for each image describe attributes of an object in the image;
a network interface to connect the machine learning image processing system to at least one network;
at least one processor to execute machine readable instructions stored on at least one non-transitory computer readable medium;
at least one data storage to store a plurality of image attribute machine learning classifiers, wherein the plurality of image attribute machine learning classifiers comprise convolutional neural networks trained to identify the attributes;
wherein the machine readable instructions comprise machine readable instructions for an auto-tagging subsystem, and the at least one processor is to execute the machine readable instructions for the auto-tagging subsystem to:
apply each image stored in the data repository to the plurality of image attribute machine learning classifiers;
determine predictions for a plurality of image attribute categories from outputs of the plurality of image attribute machine learning classifiers;
determine the attributes of the object in each image stored in the data repository from the predictions; and
tag each image stored in the data repository with the determined attributes for the object in the image;
wherein the machine readable instructions comprise machine readable instructions for an image matching subsystem, and the at least one processor is to execute the machine readable instructions for the image matching subsystem to:
receive, via the network interface, a target image from a mobile application connected to the machine learning image processing system via the at least one network;
receive, via the network interface, supplemental user input associated with the target image from the mobile application connected to the image processing computer via the at least one network;
apply the target image to the plurality of image attribute machine learning classifiers;
determine predictions for the plurality of image attribute categories from applying the target image to the plurality of image attribute machine learning classifiers; and
determine target image attributes for an object in the target image from the predictions for the target image determined by the plurality of image attribute machine learning classifiers;
apply the supplemental user input to a natural language processing model to determine at least one supplemental image search attribute;
identify a matching subset of the images stored in the data repository that match the target image based on image search attributes determined from the target image attributes and the at least one supplemental image search attribute; and
transmit, via the network interface, the matching subset of images to the mobile application for display by the mobile application.
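The auto-tagging steps recited in the claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Classifier` type, the function names, and the toy classifiers are hypothetical, and plain functions with fixed outputs stand in for the trained convolutional neural networks. Taking the highest-probability class per category is one simple way to turn predictions into attributes; the claim does not prescribe it.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a trained CNN: takes raw image bytes and returns
# a probability for each class in one attribute category.
Classifier = Callable[[bytes], Dict[str, float]]


def predict_attributes(image: bytes, classifiers: Dict[str, Classifier]) -> Dict[str, str]:
    """Apply one image to every classifier and keep the top class per category."""
    attributes = {}
    for category, classify in classifiers.items():
        probabilities = classify(image)
        # Assumption: the attribute is the class with the highest probability.
        attributes[category] = max(probabilities, key=probabilities.get)
    return attributes


def auto_tag(repository: Dict[str, bytes],
             classifiers: Dict[str, Classifier]) -> Dict[str, Dict[str, str]]:
    """Tag every stored image with its predicted attributes."""
    return {image_id: predict_attributes(image, classifiers)
            for image_id, image in repository.items()}


# Toy classifiers with fixed outputs, used only to exercise the functions.
def toy_color(image: bytes) -> Dict[str, float]:
    if image.startswith(b"R"):
        return {"red": 0.8, "blue": 0.2}
    return {"red": 0.1, "blue": 0.9}


def toy_sleeve(image: bytes) -> Dict[str, float]:
    return {"long": 0.7, "short": 0.3}


CLASSIFIERS = {"color": toy_color, "sleeve": toy_sleeve}
TAGS = auto_tag({"img-1": b"R...", "img-2": b"B..."}, CLASSIFIERS)
```

The same `predict_attributes` function could be reused for the target image received from the mobile application, since the claim applies the target image to the same plurality of classifiers.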
Abstract
A machine learning image processing system performs natural language processing (NLP) and auto-tagging for an image matching process. The system facilitates an interactive process, e.g., through a mobile application, to obtain an image and supplemental user input from a user to execute an image search. The supplemental user input may be provided from a user as speech or text, and NLP is performed on the supplemental user input to determine user intent and additional search attributes for the image search. Using the user intent and the additional search attributes, the system performs image matching on stored images that are tagged with attributes through an auto-tagging process.
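As a rough illustration of how supplemental user input might yield additional search attributes, the following Python sketch uses a keyword lookup. This is a deliberately simplified stand-in for the natural language processing model, which the abstract does not specify; the `VOCABULARY` table and the function name are hypothetical.

```python
import re
from typing import Dict, Set

# Hypothetical attribute vocabulary; a real system would use a trained NLP model.
VOCABULARY: Dict[str, Set[str]] = {
    "color": {"red", "blue", "black"},
    "sleeve": {"long", "short"},
}


def extract_search_attributes(text: str) -> Dict[str, str]:
    """Return the first vocabulary word found in the text for each category."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = {}
    for category, values in VOCABULARY.items():
        for token in tokens:
            if token in values:
                found[category] = token
                break  # keep the first match per category
    return found


SUPPLEMENTAL = extract_search_attributes("Show me this in blue with long sleeves")
```

Speech input would first need to be converted to text before a function like this could be applied; that step is outside the sketch.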
20 Claims
1. A machine learning image processing system (independent claim; full text under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A visual recommendation system comprising:
a data repository storing images, wherein the stored images include meta data comprised of tags describing attributes of the stored images, and wherein the tags are determined from applying the stored images to a plurality of image attribute machine learning classifiers classifying the stored images in classes for the tags;
at least one processor to:
receive a digital image of an object of interest to a user;
apply the digital image to a first machine learning classifier to identify the object;
apply an image of the object to a second machine learning classifier to determine attributes of the image of the object;
determine an initial subset of images stored in the data repository that are visually similar to the object based on a comparison of the attributes of the image of the object and attributes of the images;
receive supplemental user input associated with the initial subset of images and the object;
apply the supplemental user input to a natural language processing model to determine at least one supplemental image search attribute;
determine object search criteria from the at least one supplemental image search attribute and at least one of the attributes of the image of the object and the attributes of the initial subset of images;
search the tags in the meta data of the stored images according to the object search criteria to identify a matching subset of the images stored in the data repository; and
transmit visual recommendations for the object to a device via a network, wherein the visual recommendations comprise the matching subset of images.
Dependent claims: 13, 14, 15, 16, 17
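The search steps of claim 12 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: all names are hypothetical, visual similarity is approximated by counting shared attribute values, and supplemental attributes are assumed to override image attributes in the same category (the claim only requires that the criteria be determined from both).

```python
from typing import Dict, List

Tags = Dict[str, str]


def shared_attribute_count(a: Tags, b: Tags) -> int:
    """Count the categories in which two attribute sets have the same value."""
    return sum(1 for category, value in a.items() if b.get(category) == value)


def initial_subset(object_attributes: Tags, stored_tags: Dict[str, Tags],
                   limit: int = 3) -> List[str]:
    """Rank stored images by shared attributes and keep the best matches."""
    ranked = sorted(
        stored_tags,
        key=lambda image_id: shared_attribute_count(object_attributes, stored_tags[image_id]),
        reverse=True,
    )
    return [image_id for image_id in ranked[:limit]
            if shared_attribute_count(object_attributes, stored_tags[image_id]) > 0]


def build_search_criteria(object_attributes: Tags, supplemental: Tags) -> Tags:
    """Assumption: supplemental attributes override image attributes."""
    criteria = dict(object_attributes)
    criteria.update(supplemental)
    return criteria


def search_tags(stored_tags: Dict[str, Tags], criteria: Tags) -> List[str]:
    """Return the images whose tags satisfy every search criterion."""
    return [image_id for image_id, tags in stored_tags.items()
            if all(tags.get(category) == value for category, value in criteria.items())]


STORED = {
    "a": {"type": "shirt", "color": "red", "sleeve": "long"},
    "b": {"type": "shirt", "color": "blue", "sleeve": "long"},
    "c": {"type": "shoe", "color": "blue"},
}
OBJECT = {"type": "shirt", "color": "red", "sleeve": "long"}

FIRST_PASS = initial_subset(OBJECT, STORED, limit=2)
CRITERIA = build_search_criteria(OBJECT, {"color": "blue"})
MATCHES = search_tags(STORED, CRITERIA)
```

In this toy data the first pass returns the two shirts, and the supplemental attribute "blue" narrows the final match to the blue long-sleeved shirt.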
18. A mobile device comprising:
a camera;
a display;
a microphone;
at least one processor; and
a non-transitory computer readable medium storing machine readable instructions for a mobile application, wherein the at least one processor is to execute the machine readable instructions to:
cause the camera to capture an image of an object;
transmit, via a network interface, the image of the object to a machine learning image processing system, wherein the machine learning image processing system stores images and meta data comprised of tags describing attributes of the stored images, and wherein the tags are determined from applying the stored images to a plurality of image attribute machine learning classifiers classifying the stored images in classes for the tags;
receive an initial subset of the stored images from the machine learning image processing system that are visually similar to the object, wherein the machine learning image processing system determines the initial subset of the stored images based on a comparison of attributes of the image of the object and the attributes of the stored images;
display the initial subset of the stored images on the display;
receive, via the microphone, speech describing supplemental user input in response to displaying the initial subset of the stored images;
transmit the speech or text determined from the speech, via the network interface, to the machine learning image processing system, wherein the machine learning image processing system applies the speech or the text to a natural language processing model to determine object search criteria, and identifies a matching subset of the stored images from the object search criteria and at least one of the attributes of the image of the object and the attributes of the initial subset of the stored images;
receive the matching subset of the stored images, via the network interface, from the machine learning image processing system; and
display the matching subset of the stored images on the display.
Dependent claims: 19, 20
Specification