Real-time deployment of machine learning systems
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for real-time deployment of machine learning systems. One of the operations is performed by the system receiving video data from a video image capturing device. The received video data is converted into multiple video frames. These video frames are encoded into a particular color space format. The system renders a first display output depicting imagery from the multiple encoded video frames. The system performs an inference on the video frames using a machine learning network to determine the occurrence of one or more objects in the video frames. The system renders a second display output depicting graphical information corresponding to the determined one or more objects from the multiple encoded video frames. The system then generates a composite display output including the imagery of the first display output overlaid with the graphical information of the second display output.
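The abstract does not say which color space the frames are encoded into. As a minimal sketch of the conversion step only, the example below assumes the capture device delivers full-range YCbCr pixels and that the target format is 8-bit RGB, using the JFIF (BT.601) coefficients. The function names `ycbcr_to_rgb` and `convert_frame` are hypothetical and do not come from the patent.

```python
# Hypothetical sketch: convert one frame from full-range YCbCr to 8-bit RGB.
# Assumption: a frame is a list of rows, each row a list of (Y, Cb, Cr) tuples.

def clamp(value):
    """Round to the nearest integer and clip to the 0..255 range."""
    return max(0, min(255, round(value)))


def ycbcr_to_rgb(y, cb, cr):
    """Convert a single full-range YCbCr pixel to RGB (JFIF / BT.601 coefficients)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)


def convert_frame(ycbcr_frame):
    """Convert every pixel of a frame, preserving its row and column layout."""
    return [[ycbcr_to_rgb(*pixel) for pixel in row] for row in ycbcr_frame]


# A 1x3 test frame: black, mid-gray, and white have neutral chroma (Cb = Cr = 128).
rgb_frame = convert_frame([[(0, 128, 128), (128, 128, 128), (255, 128, 128)]])
```

A real deployment would more likely perform this conversion on the GPU or in the capture driver; the per-pixel loop is only meant to make the arithmetic visible.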
21 Claims
1. A system comprising one or more processors, and a non-transitory computer-readable medium including one or more sequences of instructions that, when executed by the one or more processors, cause the system to perform operations comprising:
receiving video data, the video data having been obtained from a video image capture device;
converting the received video data into multiple video frames encoded into a particular color space format;
rendering a first display output depicting imagery from the multiple encoded video frames;
performing an inference on the multiple video frames using a machine learning network;
determining the occurrence of one or more objects in the multiple encoded video frames based on the performed inference on the multiple video frames;
in response to determining the occurrence of one or more objects, generating for a determined object, coordinates describing a bounding perimeter about the determined object;
rendering a second display output depicting graphical information in a form corresponding to the coordinates of the bounding perimeter for the determined one or more objects from the multiple encoded video frames; and
generating a composite display output, wherein the composite display output includes the imagery of the first display output overlaid with the graphical information of the second display output.
Dependent claims: 2, 3, 4, 5, 6, 7
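The last two operations of claim 1 can be read as a two-layer rendering scheme: the imagery is one layer, the bounding-perimeter graphics are a second, mostly transparent layer, and the composite is the second drawn over the first. The sketch below illustrates that reading under several assumptions that are not in the claim: frames are nested lists of RGB tuples, the bounding perimeter is an axis-aligned rectangle given as inclusive pixel coordinates, and the overlay uses a plain alpha-over blend. All function names are hypothetical.

```python
# Hypothetical sketch of the "second display output" and "composite display output".
# Assumptions: a frame is a list of rows of (R, G, B) tuples; the overlay layer
# uses (R, G, B, A) tuples with A in 0..255, where 0 means fully transparent.

def render_box_layer(width, height, box, color=(0, 255, 0)):
    """Second display output: a transparent layer holding only the box outline."""
    x0, y0, x1, y1 = box  # inclusive corners of the bounding perimeter
    layer = [[(0, 0, 0, 0)] * width for _ in range(height)]
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if x in (x0, x1) or y in (y0, y1):  # pixel lies on the rectangle's edge
                layer[y][x] = color + (255,)
    return layer


def composite(frame, layer):
    """Composite display output: alpha-blend the overlay layer over the frame."""
    out = []
    for frame_row, layer_row in zip(frame, layer):
        out_row = []
        for (r, g, b), (lr, lg, lb, a) in zip(frame_row, layer_row):
            alpha = a / 255
            out_row.append((
                round(lr * alpha + r * (1 - alpha)),
                round(lg * alpha + g * (1 - alpha)),
                round(lb * alpha + b * (1 - alpha)),
            ))
        out.append(out_row)
    return out


frame = [[(10, 10, 10)] * 6 for _ in range(5)]    # first display output (imagery)
layer = render_box_layer(6, 5, box=(1, 1, 4, 3))  # second display output (graphics)
final = composite(frame, layer)                   # imagery overlaid with graphics
```

Keeping the graphics on their own layer means the imagery can be displayed as soon as a frame is decoded, while the overlay is redrawn only when new inference results arrive.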
8. A method implemented by a system comprising one or more processors, the method comprising:
receiving video data, the video data having been obtained from a video image capture device;
converting the received video data into multiple video frames encoded into a particular color space format;
rendering a first display output depicting imagery from the multiple encoded video frames;
performing an inference on the multiple video frames using a machine learning network;
determining the occurrence of one or more objects in the multiple encoded video frames based on the performed inference on the multiple video frames;
in response to determining the occurrence of one or more objects, generating for a determined object, coordinates describing a bounding perimeter about the determined object;
rendering a second display output depicting graphical information in a form corresponding to the coordinates of the bounding perimeter for the determined one or more objects from the multiple encoded video frames; and
generating a composite display output, wherein the composite display output includes the imagery of the first display output overlaid with the graphical information of the second display output.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer storage medium comprising instructions that when executed by a system comprising one or more processors, cause the one or more processors to perform operations comprising:
receiving video data, the video data having been obtained from a video image capture device;
converting the received video data into multiple video frames encoded into a particular color space format;
rendering a first display output depicting imagery from the multiple encoded video frames;
performing an inference on the multiple video frames using a machine learning network;
determining the occurrence of one or more objects in the multiple encoded video frames based on the performed inference on the multiple video frames;
in response to determining the occurrence of one or more objects, generating for a determined object, coordinates describing a bounding perimeter about the determined object;
rendering a second display output depicting graphical information in a form corresponding to the coordinates of the bounding perimeter for the determined one or more objects from the multiple encoded video frames; and
generating a composite display output, wherein the composite display output includes the imagery of the first display output overlaid with the graphical information of the second display output.
Dependent claims: 16, 17, 18, 19, 20, 21
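The independent claims leave open what the machine learning network outputs and how the bounding perimeter is derived from it. As one illustrative possibility, the sketch below assumes the network returns a per-cell score map for a frame, treats any cell at or above a threshold as part of an object, and takes the smallest axis-aligned rectangle enclosing those cells as the bounding perimeter. The score-map format, the threshold value, and the name `detect_object` are all assumptions made for this example.

```python
# Hypothetical sketch: derive bounding-perimeter coordinates from inference output.
# Assumption: score_map[y][x] is the network's confidence (0.0 to 1.0) that the
# cell at column x, row y belongs to an object.

def detect_object(score_map, threshold=0.5):
    """Return (x0, y0, x1, y1) enclosing all cells at or above threshold, else None."""
    hits = [
        (x, y)
        for y, row in enumerate(score_map)
        for x, score in enumerate(row)
        if score >= threshold
    ]
    if not hits:
        return None  # no object determined to occur in this frame
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


scores = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.1, 0.1],
]
box = detect_object(scores)  # inclusive corners of the enclosing rectangle
```

Returning `None` when nothing crosses the threshold mirrors the conditional wording of the claims, where coordinates are generated only in response to determining that an object occurs.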
Specification