Sensor based semantic object generation

US 10,685,233 B2
Filed: 10/24/2017
Issued: 06/16/2020
Est. Priority Date: 10/24/2017
Status: Active Grant

Abstract

Provided are methods, systems, and devices for generating semantic objects and an output based on the detection or recognition of the state of an environment that includes objects. State data, based in part on sensor output, can be received from one or more sensors that detect a state of an environment including objects. Based in part on the state data, semantic objects are generated. The semantic objects can correspond to the objects and include a set of attributes. Based in part on the set of attributes of the semantic objects, one or more operating modes associated with the semantic objects can be determined. Based in part on the one or more operating modes, object outputs associated with the semantic objects can be generated. The object outputs can include one or more visual indications or one or more audio indications.
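The four stages named in the abstract (receive state data, generate semantic objects, determine operating modes, generate object outputs) can be pictured with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the `SemanticObject` class, the function names, the `detections` input layout, and the rule that recognized words enable a text recognition mode. The recognition step is a stub standing in for the machine-learned model; the patent does not specify any of these details.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticObject:
    """Hypothetical container: a label plus a set of attributes."""
    label: str
    attributes: dict = field(default_factory=dict)


def generate_semantic_objects(state_data):
    # Stub for the recognition step. A real system would run a
    # machine-learned model over the sensor-derived state data.
    return [
        SemanticObject(d["label"], {"words": d.get("words", [])})
        for d in state_data["detections"]
    ]


def determine_operating_modes(semantic_objects):
    # Assumed rule: objects with recognized words enable a text
    # recognition mode; objects without words enable object recognition.
    modes = set()
    for obj in semantic_objects:
        if obj.attributes.get("words"):
            modes.add("text_recognition")
        else:
            modes.add("object_recognition")
    return modes


def generate_object_outputs(semantic_objects, modes):
    # Emit one simple visual indication per semantic object.
    outputs = []
    for obj in semantic_objects:
        words = obj.attributes.get("words", [])
        if "text_recognition" in modes and words:
            outputs.append({"kind": "visual", "text": " ".join(words)})
        else:
            outputs.append({"kind": "visual", "text": obj.label})
    return outputs


def run_pipeline(state_data):
    objects = generate_semantic_objects(state_data)
    modes = determine_operating_modes(objects)
    return generate_object_outputs(objects, modes)
```

Calling `run_pipeline` on a dictionary with a `detections` list returns one indication per detected object, using the recognized words where they exist and the object label otherwise.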

20 Claims

1. A computer-implemented method of object recognition, the method comprising:
- receiving, by a computing system comprising one or more computing devices, state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
  
  generating, by the computing system, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
  
  determining, by the computing system, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
  
  generating, by the computing system, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
  - 2. The computer-implemented method of claim 1, wherein the computing system comprises a display component configured to display one or more images comprising images of the environment including the one or more objects that are detected by the one or more sensors.
  - 3. The computer-implemented method of claim 2, wherein the one or more sensors comprise one or more periscopic cameras that are positioned to capture the one or more images including the one or more objects or portions of the one or more objects that are not within a visual plane of the display component.
  - 4. The computer-implemented method of claim 1, further comprising:
    - determining, by the computing system, based in part on the set of attributes of the one or more semantic objects, object data that matches the one or more semantic objects, wherein the object data comprises information associated with one or more related objects, one or more remote data sources, one or more locations, or one or more events; and
      
      accessing, by the computing system, one or more portions of the object data that matches the one or more semantic objects, wherein the one or more object outputs are based in part on the one or more portions of the object data that matches the one or more semantic objects.
  - 5. The computer-implemented method of claim 4, further comprising:
    - generating, by the computing system, based in part on the state data or the one or more semantic objects, one or more interface elements associated with the one or more objects, wherein the one or more interface elements comprise one or more images responsive to one or more inputs; and
      
      responsive to receiving the one or more inputs to the one or more interface elements, determining, by the computing system, one or more remote computing devices that comprise at least a portion of the object data, wherein the one or more object outputs comprise one or more remote source indications associated with the one or more remote computing devices that comprise at least a portion of the object data.
  - 6. The computer-implemented method of claim 1, wherein the one or more operating modes comprise a location recognition mode associated with recognizing one or more locations in the environment, an object recognition mode associated with recognizing the one or more objects in the environment, or an event recognition mode associated with recognizing an occurrence of one or more events in the environment.
  - 7. The computer-implemented method of claim 1, further comprising:
    - determining, by the computing system, based in part on the state data or the one or more semantic objects, the one or more objects that comprise one or more semantic symbols, wherein the one or more semantic symbols comprise one or more letters, one or more logograms, one or more syllabic characters, or one or more pictograms; and
      
      determining, by the computing system, based in part on the one or more semantic symbols, one or more words associated with the one or more semantic symbols, wherein the set of attributes of the one or more semantic objects comprises the one or more words.
  - 8. The computer-implemented method of claim 7, further comprising:
    - determining, by the computing system, a detected language associated with the one or more words; and
      
      generating, by the computing system, based in part on translation data, translated output when the detected language is not associated with a default language, the translation data comprising one or more words in the default language and one or more words in the detected language, the translated output comprising the one or more words in the default language that correspond to a portion of the one or more words in the detected language, wherein the one or more object outputs are based in part on the translated output.
  - 9. The computer-implemented method of claim 1, further comprising:
    - receiving location data comprising information associated with a current location of the environment and a destination location;
      
      determining, by the computing system, based in part on the location data and the state of the environment comprising the one or more objects within a field of view of the one or more sensors, a path from the current location to the destination location; and
      
      generating, by the computing system, one or more directions based in part on the one or more semantic objects and the path from the current location to the destination location, wherein the one or more object outputs are based in part on the one or more directions.
  - 10. The computer-implemented method of claim 1, further comprising:
    - determining, by the computing system, based in part on an extent to which each of the one or more semantic objects is associated with context data, one or more relevance values corresponding to the one or more semantic objects, the context data comprising data associated with a time of day, a current location, one or more scheduled events, one or more user locations, or one or more user preferences, wherein the one or more object outputs are based in part on the one or more relevance values that correspond to the one or more semantic objects.
  - 11. The computer-implemented method of claim 1, further comprising:
    - modifying, by the computing system, based in part on the state data or the semantic data, the one or more visual indications or the one or more audio indications, wherein the modifying comprises transforming the one or more visual indications into one or more modified audio indications, transforming the one or more audio indications into one or more modified visual indications, modifying a size of the one or more visual indications, modifying one or more color characteristics of the one or more visual indications, or modifying an amplitude of the one or more audio indications.
  - 12. The computer-implemented method of claim 1, wherein the set of attributes associated with the one or more semantic objects comprises one or more object identities, one or more object types, an object location, a monetary value, an ownership status, a stock keeping unit, or a set of physical characteristics.
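Claim 10 describes relevance values based on the extent to which each semantic object is associated with context data. One way to picture that is the Python sketch below, which scores an object by the fraction of context entries its attributes match. The scoring rule, the function names, and the flat key/value layout for attributes and context are all assumptions made for illustration; the claim does not define how the extent of association is measured.

```python
def relevance_value(semantic_attrs, context):
    """Return the fraction of context entries matched by the attributes."""
    if not context:
        return 0.0
    matches = sum(
        1 for key, value in context.items() if semantic_attrs.get(key) == value
    )
    return matches / len(context)


def rank_by_relevance(objects, context):
    """Order object names from most to least relevant to the context."""
    scored = [
        (relevance_value(attrs, context), name) for name, attrs in objects.items()
    ]
    # sorted() is stable, so objects with equal scores keep input order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [name for _, name in ranked]
```

An output stage could then present only the highest-ranked objects, which is one reading of outputs being "based in part on" the relevance values.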

13. One or more tangible, non-transitory computer-readable media storing computer-readable instructions that when executed by one or more processors cause the one or more processors to perform operations, the operations comprising:
- receiving state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
  
  generating, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
  
  determining, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
  
  generating, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
  - 14. The one or more tangible, non-transitory computer-readable media of claim 13, further comprising:
    - determining, based in part on the set of attributes of the one or more semantic objects, object data that matches the one or more semantic objects, wherein the object data comprises information associated with one or more related objects, one or more remote data sources, one or more locations, or one or more events; and
      
      accessing one or more portions of the object data that matches the one or more semantic objects, wherein the one or more object outputs are based in part on the one or more portions of the object data that matches the one or more semantic objects.
  - 15. The one or more tangible, non-transitory computer-readable media of claim 14, further comprising:
    - generating, based in part on the state data or the one or more semantic objects, one or more interface elements associated with the one or more objects, wherein the one or more interface elements comprise one or more images responsive to one or more inputs; and
      
      responsive to receiving the one or more inputs to the one or more interface elements, determining one or more remote computing devices that comprise at least a portion of the object data, wherein the one or more object outputs comprise one or more remote source indications associated with the one or more remote computing devices that comprise at least a portion of the object data.
  - 16. The one or more tangible, non-transitory computer-readable media of claim 13, further comprising:
    - modifying, based in part on the state data or the semantic data, the one or more visual indications or the one or more audio indications, wherein the modifying comprises transforming the one or more visual indications into one or more modified audio indications, transforming the one or more audio indications into one or more modified visual indications, modifying a size of the one or more visual indications, or modifying an amplitude of the one or more audio indications.
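The modifications listed in claims 11, 16, and 20 (visual to audio, audio to visual, resizing a visual indication, changing audio amplitude) can be sketched in Python as small pure functions over two record types. The `VisualIndication` and `AudioIndication` classes, their fields, and the clamping of amplitude to the range 0.0 to 1.0 are assumptions for illustration only and are not taken from the patent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VisualIndication:
    text: str
    size: float = 1.0  # hypothetical relative display scale


@dataclass(frozen=True)
class AudioIndication:
    speech: str
    amplitude: float = 1.0  # hypothetical normalized volume, 0.0 to 1.0


def to_audio(visual):
    # Transform a visual indication into an audio indication.
    return AudioIndication(speech=visual.text)


def to_visual(audio):
    # Transform an audio indication into a visual indication.
    return VisualIndication(text=audio.speech)


def scale_size(visual, factor):
    # Return a copy with the size multiplied by the given factor.
    return replace(visual, size=visual.size * factor)


def scale_amplitude(audio, factor):
    # Return a copy with the amplitude scaled and clamped to [0.0, 1.0].
    new_amplitude = min(1.0, max(0.0, audio.amplitude * factor))
    return replace(audio, amplitude=new_amplitude)
```

The frozen dataclasses keep each modification side-effect free, so the original indication stays available if a different modification is chosen later.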

17. A computing system comprising:
- one or more processors;
  
  one or more non-transitory computer-readable media storing instructions that when executed by the one or more processors cause the one or more processors to perform operations comprising:
  
  receiving state data based in part on sensor output from one or more sensors that detect a state of an environment including one or more objects;
  
  generating, based in part on the state data and a machine-learned model, one or more semantic objects corresponding to the one or more objects, wherein the machine-learned model is configured to recognize the one or more objects, and wherein the one or more semantic objects comprise a set of attributes associated with one or more words;
  
  determining, based in part on the set of attributes associated with the one or more words, one or more operating modes comprising a text recognition mode associated with recognizing textual information in the environment and associating the textual information with a time and a location of an event; and
  
  generating, based in part on the one or more operating modes comprising the text recognition mode, one or more object outputs associated with the one or more semantic objects, wherein the one or more object outputs comprise one or more visual indications or one or more audio indications associated with the time and the location of the event.
  - 18. The computing system of claim 17, further comprising:
    - determining, based in part on the set of attributes of the one or more semantic objects, object data that matches the one or more semantic objects, wherein the object data comprises information associated with one or more related objects, one or more remote data sources, one or more locations, or one or more events; and
      
      accessing one or more portions of the object data that matches the one or more semantic objects, wherein the one or more object outputs are based in part on the one or more portions of the object data that matches the one or more semantic objects.
  - 19. The computing system of claim 18, further comprising:
    - generating, based in part on the state data or the one or more semantic objects, one or more interface elements associated with the one or more objects, wherein the one or more interface elements comprise one or more images responsive to one or more inputs; and
      
      responsive to receiving the one or more inputs to the one or more interface elements, determining one or more remote computing devices that comprise at least a portion of the object data, wherein the one or more object outputs comprise one or more remote source indications associated with the one or more remote computing devices that comprise at least a portion of the object data.
  - 20. The computing system of claim 17, further comprising:
    - modifying, based in part on the state data or the semantic data, the one or more visual indications or the one or more audio indications, wherein the modifying comprises transforming the one or more visual indications into one or more modified audio indications, transforming the one or more audio indications into one or more modified visual indications, modifying a size of the one or more visual indications, or modifying an amplitude of the one or more audio indications.
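Claims 4, 14, and 18 recite determining object data that matches the semantic objects and then accessing portions of it. A rough Python sketch of that matching step follows. It assumes an in-memory catalog of records and treats a record as a match when it shares at least one attribute value with the semantic object. The `catalog` layout, the function names, and the matching rule are hypothetical; the claims leave the matching criterion open.

```python
def match_object_data(attributes, catalog):
    """Return catalog records that share at least one attribute value."""
    matches = []
    for record in catalog:
        shared = {
            key for key, value in attributes.items() if record.get(key) == value
        }
        if shared:
            matches.append(record)
    return matches


def access_portions(matches, wanted_fields):
    """Keep only the requested fields from each matching record."""
    return [
        {key: record[key] for key in wanted_fields if key in record}
        for record in matches
    ]
```

In this sketch the object outputs would be built from the reduced records returned by `access_portions`, for example a related event or location attached to a recognized object.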

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Wantland, Tim; Barnett, Donald A.; Jones, David Matthew
Primary Examiner(s): Bernardi, Brenda C
Application Number: US15/792,393
Publication Number: US 20190122046A1
Time in Patent Office: 966 Days
CPC Class Codes

G06F 16/7837   using objects detected or r...

G06F 18/214   Generating training pattern...

G06F 40/30   Semantic analysis

G06F 40/42   Data-driven translation

G06N 20/00   Machine learning

G06V 20/20   in augmented reality scenes

G06V 30/274   Syntactic or semantic conte...
