Suboptimal immediate navigational response based on long term planning
Abstract
Systems and methods are provided for navigating an autonomous vehicle using reinforcement learning techniques. In one implementation, a navigation system for a host vehicle may include at least one processing device programmed to: receive, from a camera, a plurality of images representative of an environment of the host vehicle; analyze the plurality of images to identify a navigational state associated with the host vehicle; provide the navigational state to a trained navigational system; receive, from the trained navigational system, a desired navigational action for execution by the host vehicle in response to the identified navigational state; analyze the desired navigational action relative to one or more predefined navigational constraints; determine an actual navigational action for the host vehicle, wherein the actual navigational action includes at least one modification of the desired navigational action determined based on the one or more predefined navigational constraints; and cause at least one adjustment of a navigational actuator of the host vehicle in response to the determined actual navigational action for the host vehicle.
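The constraint step in the abstract, in which a desired navigational action is modified into an actual navigational action, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Action` fields, the numeric limits, and the minimum-gap rule are hypothetical stand-ins for the "predefined navigational constraints".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    acceleration: float  # m/s^2, positive means speed up (assumed unit)
    steering: float      # rad, positive means steer left (assumed unit)


# Hypothetical hard limits standing in for predefined navigational constraints.
MAX_ACCEL = 2.0
MAX_BRAKE = -6.0
MAX_STEER = 0.5


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def apply_constraints(desired: Action, gap_m: float, min_gap_m: float = 10.0) -> Action:
    """Return the actual action: the desired action modified to satisfy the constraints."""
    accel = clamp(desired.acceleration, MAX_BRAKE, MAX_ACCEL)
    steer = clamp(desired.steering, -MAX_STEER, MAX_STEER)
    if gap_m < min_gap_m:
        # Too close to the vehicle ahead: do not allow any positive acceleration.
        accel = min(accel, 0.0)
    return Action(accel, steer)


# The desired action would come from the trained navigational system; here it is a fixed value.
desired = Action(acceleration=3.5, steering=0.1)
actual_far = apply_constraints(desired, gap_m=50.0)
actual_near = apply_constraints(desired, gap_m=5.0)
```

In this sketch the trained system stays free to propose any action, and the constraints are applied afterwards as a separate layer.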
11 Claims
1. A navigation system for a host vehicle, the system comprising:
at least one processing device programmed to:
receive, from a camera, a plurality of images representative of an environment of the host vehicle;
analyze the plurality of images to identify a present navigational state associated with the host vehicle;
determine a first potential navigational action for the host vehicle based on the identified present navigational state;
determine a first indicator of an expected reward based on the first potential navigational action and the identified present navigational state;
predict a first future navigational state based on the first potential navigational action;
determine a second indicator of an expected reward associated with at least one future action determined to be available to the host vehicle in response to the first future navigational state;
determine a second potential navigational action for the host vehicle based on the identified present navigational state;
determine a third indicator of an expected reward based on the second potential navigational action and the identified present navigational state;
predict a second future navigational state based on the second potential navigational action;
determine a fourth indicator of an expected reward associated with at least one future action determined to be available to the host vehicle in response to the second future navigational state;
select the second potential navigational action based on a determination that the expected reward associated with the fourth indicator is greater than the expected reward associated with the second indicator; and
cause at least one adjustment of a navigational actuator of the host vehicle in response to the selected second potential navigational action.
Dependent claims: 2, 3, 4, 5
6. An autonomous vehicle, the autonomous vehicle comprising:
a frame;
a body attached to the frame;
a camera; and
at least one processing device programmed to:
receive, from the camera, a plurality of images representative of an environment of the autonomous vehicle;
analyze the plurality of images to identify a present navigational state associated with the autonomous vehicle;
determine a first potential navigational action for the autonomous vehicle based on the identified present navigational state;
determine a first indicator of an expected reward based on the first potential navigational action and the identified present navigational state;
predict a first future navigational state based on the first potential navigational action;
determine a second indicator of an expected reward associated with at least one future action determined to be available to the autonomous vehicle in response to the first future navigational state;
determine a second potential navigational action for the autonomous vehicle based on the identified present navigational state;
determine a third indicator of an expected reward based on the second potential navigational action and the identified present navigational state;
predict a second future navigational state based on the second potential navigational action;
determine a fourth indicator of an expected reward associated with at least one future action determined to be available to the autonomous vehicle in response to the second future navigational state;
select the second potential navigational action based on a determination that the expected reward associated with the fourth indicator is greater than the expected reward associated with the second indicator; and
cause at least one adjustment of a navigational actuator of the autonomous vehicle in response to the selected second potential navigational action.
Dependent claims: 7, 8, 9, 10
11. A method for navigating an autonomous vehicle, the method comprising:
receiving, from a camera, a plurality of images representative of an environment of the autonomous vehicle;
analyzing the plurality of images to identify a present navigational state associated with the autonomous vehicle;
determining a first potential navigational action for the autonomous vehicle based on the identified present navigational state;
determining a first indicator of an expected reward based on the first potential navigational action and the identified present navigational state;
predicting a first future navigational state based on the first potential navigational action;
determining a second indicator of an expected reward associated with at least one future action determined to be available to the autonomous vehicle in response to the first future navigational state;
determining a second potential navigational action for the autonomous vehicle based on the identified present navigational state;
determining a third indicator of an expected reward based on the second potential navigational action and the identified present navigational state;
predicting a second future navigational state based on the second potential navigational action;
determining a fourth indicator of an expected reward associated with at least one future action determined to be available to the autonomous vehicle in response to the second future navigational state;
selecting the second potential navigational action based on a determination that the expected reward associated with the fourth indicator is greater than the expected reward associated with the second indicator; and
causing at least one adjustment of a navigational actuator of the autonomous vehicle in response to the selected second potential navigational action.
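The selection logic recited in the method claim can be sketched as a one-step lookahead over candidate actions. This is an illustrative sketch under stated assumptions, not the patented implementation: the function name `select_by_future_reward`, the toy state and action labels, and all reward values are hypothetical. Image analysis and actuator control are outside its scope.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple


def select_by_future_reward(
    state: Hashable,
    actions: Iterable[Hashable],
    immediate_reward: Callable[[Hashable, Hashable], float],
    predict_next_state: Callable[[Hashable, Hashable], Hashable],
    future_reward: Callable[[Hashable], float],
) -> Tuple[Hashable, Dict[Hashable, Tuple[float, float]]]:
    """Pick the action whose predicted future state has the highest expected reward.

    Returns the chosen action and, per action, a pair of indicators:
    (immediate expected reward, expected reward available from the predicted state).
    """
    indicators: Dict[Hashable, Tuple[float, float]] = {}
    for action in actions:
        now = immediate_reward(state, action)          # first / third indicator
        next_state = predict_next_state(state, action) # first / second future state
        later = future_reward(next_state)              # second / fourth indicator
        indicators[action] = (now, later)
    # Selection compares only the future-reward indicators, as in the claim.
    chosen = max(indicators, key=lambda a: indicators[a][1])
    return chosen, indicators


# Toy tables with made-up values: staying behind a slow truck is comfortable now
# but poor later; changing lanes costs more now but opens the road.
IMMEDIATE = {("behind_truck", "keep_lane"): 1.0, ("behind_truck", "change_lane"): 0.4}
NEXT_STATE = {("behind_truck", "keep_lane"): "still_behind_truck",
              ("behind_truck", "change_lane"): "open_lane"}
FUTURE = {"still_behind_truck": 0.2, "open_lane": 0.9}

chosen, indicators = select_by_future_reward(
    "behind_truck",
    ["keep_lane", "change_lane"],
    lambda s, a: IMMEDIATE[(s, a)],
    lambda s, a: NEXT_STATE[(s, a)],
    lambda s: FUTURE[s],
)
print(chosen)  # change_lane
```

Here `change_lane` is selected even though its immediate indicator (0.4) is lower than that of `keep_lane` (1.0), because its future indicator (0.9) exceeds the alternative's (0.2). This corresponds to the "suboptimal immediate response" named in the title.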
Specification