Autonomous reinforcement learning method of receiver scan schedule control
First Claim
1. A scanning system configured to detect signals within a spectrum of interest, the system comprising:
an electromagnetic (“EM”) signal receiver;
a controller configured to determine and implement a scan schedule that dictates a pattern of frequencies to which the EM signal receiver is tuned and a timing thereof; and
an agent instantiated in the controller;
the controller being configured to cause the signal receiver to perform at least one initial scan of the spectrum of interest according to an initial scan schedule, and then perform a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created by the agent during the initial and subsequent scans;
the agent being configured to create each of the updated scan schedules by determining and applying a schedule update to a preceding scan schedule, said schedule update being determined according to an application of reinforcement learning by the agent to EM signal data arising from detections by the signal receiver of the signals of interest;
said reinforcement learning including:
estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
according to said reinforcement, determining by the agent of the schedule update;
applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
determining an expected degree of success for the updated scan schedule.
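The reinforcement learning steps recited above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the claim: the reward class names, the numeric value scales, the representation of a scan schedule as per-band dwell fractions, and the simple update and expectation rules.

```python
from dataclasses import dataclass

# Hypothetical reward classes and value scales; the claim does not name any.
VALUE_SCALES = {"routine": 1.0, "notable": 10.0, "critical": 100.0}


@dataclass
class Detection:
    band: str          # frequency band in which the signal was detected
    reward_class: str  # key into VALUE_SCALES


def actual_success(detections):
    """Estimate actual scanning success by applying the per-class value scales."""
    return sum(VALUE_SCALES[d.reward_class] for d in detections)


def reinforcement(actual, expected):
    """Compare actual success with the previously expected success."""
    return actual - expected


def update_schedule(schedule, detections, reinforce, step=0.1):
    """Toy update rule: shift dwell fractions (band -> share of scan time)."""
    credit = {band: 0.0 for band in schedule}
    for d in detections:
        credit[d.band] += VALUE_SCALES[d.reward_class]
    total = sum(credit.values()) or 1.0
    direction = 1.0 if reinforce >= 0 else -1.0
    raw = {
        band: max(share + direction * step * credit[band] / total, 0.01)
        for band, share in schedule.items()
    }
    norm = sum(raw.values())
    return {band: share / norm for band, share in raw.items()}


def update_expectation(expected, reinforce, rate=0.5):
    """Toy estimate of the expected success for the updated schedule."""
    return expected + rate * reinforce


# One pass through the claimed steps, using made-up data.
schedule = {"low": 0.5, "high": 0.5}
expected = 50.0
detections = [Detection("high", "critical"), Detection("low", "routine")]

actual = actual_success(detections)
reinforce = reinforcement(actual, expected)
schedule = update_schedule(schedule, detections, reinforce)
expected = update_expectation(expected, reinforce)
```

In this sketch a positive reinforcement moves dwell time toward the bands that produced the most valuable detections, and the expectation is nudged toward the observed success so that the next comparison has a baseline. A real agent would replace both toy rules with learned actor and critic functions.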
Abstract
A method of detecting electromagnetic signal sources of interest includes applying reinforcement learning to automatically and continuously update a receiver scan schedule wherein an agent is reinforced according to comparisons between expected and actual degrees of success after each schedule update, actual degrees of success being estimated by applying to signal data a plurality of value scales applicable to a plurality of reward classes. An exponential scale can be applied across the plurality of reward classes. A companion system can provide data analysis to the agent. The agent can include an actor module that determines schedule updates and a critic module that determines the degrees of scanning success and awards the reinforcements. Embodiments implement a plurality of agents according to asynchronous multiple-worker actor/critic reinforcement learning. The method can be initially applied to training data comprising synthetic and/or previously measured signal data for which the signal sources are fully characterized.
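One possible reading of an exponential scale applied across the reward classes is sketched below. The base of 10, the number of classes, and the function names are assumptions chosen for illustration; the abstract does not specify them.

```python
def exponential_value_scales(num_classes, base=10.0):
    """Return one value per reward class, growing exponentially with class rank."""
    return [base ** rank for rank in range(num_classes)]


def score(counts, scales):
    """Total success: detections per class weighted by that class's value."""
    return sum(count * scale for count, scale in zip(counts, scales))


scales = exponential_value_scales(3)      # lowest to highest reward class
many_low = score([9, 0, 0], scales)       # nine detections in the lowest class
one_high = score([0, 0, 1], scales)       # one detection in the highest class
```

With scales of this shape, a single detection in the highest reward class outweighs many detections in the lowest class, which biases the agent toward schedules that find the most valuable signals.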
24 Claims
1. A scanning system configured to detect signals within a spectrum of interest, the system comprising:
an electromagnetic (“EM”) signal receiver;
a controller configured to determine and implement a scan schedule that dictates a pattern of frequencies to which the EM signal receiver is tuned and a timing thereof; and
an agent instantiated in the controller;
the controller being configured to cause the signal receiver to perform at least one initial scan of the spectrum of interest according to an initial scan schedule, and then perform a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created by the agent during the initial and subsequent scans;
the agent being configured to create each of the updated scan schedules by determining and applying a schedule update to a preceding scan schedule, said schedule update being determined according to an application of reinforcement learning by the agent to EM signal data arising from detections by the signal receiver of the signals of interest;
said reinforcement learning including:
estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
according to said reinforcement, determining by the agent of the schedule update;
applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
determining an expected degree of success for the updated scan schedule.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method of automatically determining and implementing updates to a scan schedule that dictates a pattern of frequencies to which an electromagnetic (“EM”) signal receiver is tuned and a timing thereof, the method comprising:
performing by the signal receiver of at least one initial scan of a spectrum of interest according to an initial scan schedule implemented by a controller of the signal receiver; and
performing by the signal receiver of a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created during the initial and subsequent scans by an agent instantiated in the controller, the agent creating each of the updated scan schedules by applying a schedule update to a preceding scan schedule, said schedule update being determined according to application by the agent of reinforcement learning to EM signal data arising from detections by the signal receiver of the signals of interest;
said reinforcement learning including:
estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
according to said reinforcement, determining the schedule update;
applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
determining an expected degree of success for the updated scan schedule.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
24. Non-transitory computer readable media comprising:
software recorded thereupon that when executed by a controller configures the controller to:
cause an electromagnetic (“EM”) signal receiver to scan a spectrum of interest according to a scan schedule that dictates a pattern of frequencies to which the signal receiver is tuned and a timing thereof; and
implement updates to the scan schedule by causing the signal receiver to perform at least one initial scan of the spectrum of interest according to an initial scan schedule, followed by a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created during the initial and subsequent scans by an agent that is instantiated in the controller, wherein the executed software configures the agent to create each of the updated scan schedules by applying a schedule update to a preceding scan schedule, said schedule update being determined according to application by the agent of reinforcement learning to EM signal data arising from detections by the signal receiver of the signals of interest;
said reinforcement learning including:
estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
according to said reinforcement, determining the schedule update;
applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
determining an expected degree of success for the updated scan schedule.
Specification