Sensor based semantic object generation
First Claim
1. A computer-implemented method of object recognition, the method comprising:
receiving, by a computing system comprising one or more computing devices, state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
generating, by the computing system, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
determining, by the computing system, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
generating, by the computing system, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
Abstract
Provided are methods, systems, and devices for generating semantic objects and an output based on the detection or recognition of the state of an environment that includes objects. State data, based in part on sensor output, can be received from one or more sensors that detect a state of an environment including objects. Based in part on the state data, semantic objects are generated. The semantic objects can correspond to the objects and include a set of attributes. Based in part on the set of attributes of the semantic objects, one or more operating modes associated with the semantic objects can be determined. Based in part on the one or more operating modes, object outputs associated with the semantic objects can be generated. The object outputs can include one or more visual indications or one or more audio indications.
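The flow summarized in the abstract (state data, then semantic objects, then operating modes, then object outputs) can be illustrated with a minimal Python sketch. This is not the patented implementation: the machine-learned model is replaced by a stub that reads pre-labeled detections, and every name here (`SemanticObject`, `TEXT_WORDS`, `EVENT_PATTERN`, the three functions, and the layout of `state_data`) is a hypothetical chosen for illustration only.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumption: attribute words that suggest readable text is present.
TEXT_WORDS = {"poster", "sign", "text", "ticket"}

# Assumption: event text looks like "<HH:MM> at <place>".
EVENT_PATTERN = re.compile(r"(?P<time>\d{1,2}:\d{2})\s+at\s+(?P<location>.+)")


@dataclass
class SemanticObject:
    label: str            # what the object was recognized as
    attributes: set[str]  # words associated with the object
    text: str = ""        # text visible on the object, if any


def generate_semantic_objects(state_data: dict) -> list[SemanticObject]:
    """Stub for the machine-learned model: wraps pre-labeled detections."""
    return [
        SemanticObject(d["label"], set(d["words"]), d.get("text", ""))
        for d in state_data["detections"]
    ]


def determine_operating_modes(objects: list[SemanticObject]) -> set[str]:
    """Select the text recognition mode when any attribute word implies text."""
    modes: set[str] = set()
    for obj in objects:
        if obj.attributes & TEXT_WORDS:
            modes.add("text_recognition")
    return modes


def generate_object_outputs(objects: list[SemanticObject], modes: set[str]) -> list[dict]:
    """In text recognition mode, emit a visual indication per event found."""
    outputs: list[dict] = []
    if "text_recognition" not in modes:
        return outputs
    for obj in objects:
        match = EVENT_PATTERN.search(obj.text)
        if match:
            outputs.append({
                "kind": "visual",
                "time": match["time"],
                "location": match["location"],
                "message": f"Event at {match['location']} at {match['time']}",
            })
    return outputs


# Hypothetical state data standing in for processed sensor output.
state_data = {
    "detections": [
        {"label": "poster", "words": ["poster", "text"], "text": "Concert 19:30 at City Hall"},
        {"label": "chair", "words": ["furniture"]},
    ]
}
objects = generate_semantic_objects(state_data)
modes = determine_operating_modes(objects)
outputs = generate_object_outputs(objects, modes)
```

Keeping the three stages as separate functions mirrors the separate receiving, determining, and generating steps of the claims; an audio indication could be produced at the same point as the visual one by changing the hypothetical `kind` field.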
20 Claims
1. A computer-implemented method of object recognition, the method comprising:
receiving, by a computing system comprising one or more computing devices, state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
generating, by the computing system, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
determining, by the computing system, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
generating, by the computing system, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. One or more tangible, non-transitory computer-readable media storing computer-readable instructions that when executed by one or more processors cause the one or more processors to perform operations, the operations comprising:
receiving state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
generating, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
determining, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
generating, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
Dependent claims: 14, 15, 16
17. A computing system comprising:
one or more processors;
one or more non-transitory computer-readable media storing instructions that when executed by the one or more processors cause the one or more processors to perform operations comprising:
receiving state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
generating, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
determining, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
generating, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
Dependent claims: 18, 19, 20
Specification