AUTOMATED CONTROL AND PARALLEL LEARNING HVAC APPARATUSES, METHODS AND SYSTEMS
First Claim
1. A processor-implemented method, comprising:
establishing a connection between a first thermostat and a thermostat network, the thermostat network including a plurality of thermostats;
initializing the first thermostat;
receiving sensor data associated with the first thermostat;
receiving user feedback associated with the first thermostat;
receiving connected unit aggregated learning probability distributions;
selecting an HVAC action from a predefined set of available HVAC actions from a transition policy based on internal and external transition probabilities;
entering a new state following implementation of the selected HVAC action;
receiving a reward in the form of a positive or negative scalar associated with the selected HVAC action and new state;
updating an internal probability distribution;
determining divergence between internal probability distribution and external probability distributions; and
transmitting the internal probability distribution to the thermostat network if divergence is above a divergence threshold.
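The selection and transmission steps of claim 1 can be illustrated with a minimal sketch. The claim does not name a divergence measure, a blending rule, or a data layout, so everything below is an assumption made for illustration: Kullback-Leibler divergence stands in for "divergence", a fixed-weight mixture stands in for combining internal and external transition probabilities, and the names `blend`, `select_action`, `kl_divergence` and `should_transmit` are hypothetical.

```python
import math


def blend(internal, external, weight=0.5):
    """Mix two discrete distributions; `weight` is the share given to the internal one (assumed rule)."""
    return [weight * p + (1.0 - weight) * q for p, q in zip(internal, external)]


def select_action(state_values, internal, external, weight=0.5):
    """Pick the action whose blended next-state distribution has the highest expected value.

    `internal` and `external` map each action name to a list of next-state probabilities.
    """
    def expected(action):
        probs = blend(internal[action], external[action], weight)
        return sum(p * v for p, v in zip(probs, state_values))

    return max(internal, key=expected)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; KL is an assumed choice of divergence."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def should_transmit(internal, external, threshold):
    """Share the internal distribution only when it has drifted past the threshold."""
    return kl_divergence(internal, external) > threshold


# Hypothetical two-state example: state 0 is "uncomfortable", state 1 is "comfortable".
state_values = [0.0, 1.0]
internal = {"heat": [0.2, 0.8], "idle": [0.9, 0.1]}
external = {"heat": [0.4, 0.6], "idle": [0.8, 0.2]}
chosen = select_action(state_values, internal, external)
```

Gating transmission on a divergence threshold means a thermostat only sends its distribution to the network when it differs materially from what the network already holds.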
Abstract
The AUTOMATED CONTROL AND PARALLEL LEARNING HVAC APPARATUSES, METHODS AND SYSTEMS (“ACPLHVAC”) updates real time value function estimates through parallel and reinforcement learning, via ACPLHVAC components, by observing a defined state action space to maximize user Quality of Experience (QoE) and minimize associated energy required with regulating environmental spaces.
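The abstract refers to updating value function estimates through reinforcement learning without stating the update rule. As one common possibility, and not necessarily the rule used by the ACPLHVAC, a temporal-difference TD(0) update looks like the sketch below. The state names, the learning rate `alpha` and the discount factor `gamma` are illustrative assumptions.

```python
def td_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Apply one TD(0) step: move V(state) toward reward + gamma * V(next_state)."""
    target = reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


# Hypothetical states; the source does not enumerate a state space.
values = {"cold": 0.0, "comfortable": 1.0}
updated = td_update(values, "cold", "comfortable", reward=1.0)
```

Each call nudges the estimate for the state just left toward the observed reward plus the discounted estimate of the state just entered, so the estimates can be refined as the thermostat operates.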
6 Claims
5. A processor-implemented method, comprising:
initializing a thermostat to arbitrary or pre-loaded state values;
receiving multiple user feedback in the form of scalars denoting comfort;
aggregating user feedback into a scalar denoting mean user comfort;
selecting an HVAC action from a predefined set of available HVAC actions based on a transition policy informed by internal and external transition probabilities;
entering the next state following the selected HVAC action;
receiving a reward in the form of a positive or negative scalar associated with the selected HVAC action and new state as it approaches mean user comfort levels at minimized required energy; and
updating and exporting probability distributions.
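The feedback aggregation and reward steps of claim 5 can be sketched as follows. The claim specifies only that feedback arrives as scalars, that it is aggregated into a mean, and that the reward reflects comfort at minimized energy. The comfort scale, the linear energy penalty, the `energy_weight` parameter and the function names are all assumptions introduced for illustration.

```python
from statistics import fmean


def mean_comfort(feedback):
    """Aggregate per-user comfort scalars into one mean comfort scalar."""
    return fmean(feedback)


def comfort_reward(feedback, energy_kwh, energy_weight=0.1):
    """Signed scalar reward: mean comfort minus a linear energy penalty (assumed form)."""
    return mean_comfort(feedback) - energy_weight * energy_kwh


# Hypothetical feedback from three occupants on an assumed -1.0 to 1.0 comfort scale.
feedback = [1.0, 0.5, -0.5]
reward = comfort_reward(feedback, energy_kwh=2.0)
```

Under these assumptions the reward turns negative when the energy penalty outweighs the mean comfort reported, which matches the claim's description of the reward as a positive or negative scalar.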
6. A non-transitory computer readable medium having computer readable instructions stored thereon that, when executed by a processor of a computing device, cause the computing device to:
initialize to arbitrary or pre-loaded state values;
receive multi-user feedback in the form of scalars denoting comfort;
aggregate user feedback into a scalar denoting mean user comfort;
select an HVAC action from a predefined set of available HVAC actions based on a transition policy informed by internal and external transition probabilities;
enter the next state following the selected HVAC action;
receiving a positive or negative scalar associated with the selected HVAC action and new state to meet at minimized energy cost without exceeding threshold user comfort levels.
Specification