Autonomous reinforcement learning method of receiver scan schedule control

US 10,523,342 B1
Filed: 03/12/2019
Issued: 12/31/2019
Est. Priority Date: 03/12/2019
Status: Active Grant

Abstract

A method of detecting electromagnetic signal sources of interest includes applying reinforcement learning to automatically and continuously update a receiver scan schedule wherein an agent is reinforced according to comparisons between expected and actual degrees of success after each schedule update, actual degrees of success being estimated by applying to signal data a plurality of value scales applicable to a plurality of reward classes. An exponential scale can be applied across the plurality of reward classes. A companion system can provide data analysis to the agent. The agent can include an actor module that determines schedule updates and a critic module that determines the degrees of scanning success and awards the reinforcements. Embodiments implement a plurality of agents according to asynchronous multiple-worker actor/critic reinforcement learning. The method can be initially applied to training data comprising synthetic and/or previously measured signal data for which the signal sources are fully characterized.
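
The scoring described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the reward class names, the base of 10 for the exponential scale, and all function names are assumptions chosen for illustration. The sketch weights detections in each reward class by an exponentially growing value, sums them into an actual degree of success, and derives a reinforcement from the difference between actual and expected success.

```python
from typing import Dict

# Hypothetical reward classes, ordered from least to most valuable.
REWARD_CLASSES = ["known_emitter", "new_emitter", "priority_emitter"]
BASE = 10.0  # assumed base of the exponential scale across classes


def class_value(class_index: int, base: float = BASE) -> float:
    """Value of one detection in a class: base ** class_index."""
    return base ** class_index


def actual_success(detections: Dict[str, int]) -> float:
    """Sum per-class detection counts weighted by the exponential scale."""
    return sum(
        class_value(i) * detections.get(name, 0)
        for i, name in enumerate(REWARD_CLASSES)
    )


def reinforcement(actual: float, expected: float) -> float:
    """Positive when a scan beat expectations, negative when it fell short."""
    return actual - expected


# Illustrative detection counts for one scan (made-up numbers).
detections = {"known_emitter": 3, "new_emitter": 1, "priority_emitter": 1}
score = actual_success(detections)
signal = reinforcement(score, expected=100.0)
```

With an exponential scale, a single detection in a higher class outweighs several detections in a lower class, which is one plausible reason for choosing such a scale.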

24 Claims

1. A scanning system configured to detect signals within a spectrum of interest, the system comprising:
- an electromagnetic (“EM”) signal receiver;
  
  a controller configured to determine and implement a scan schedule that dictates a pattern of frequencies to which the EM signal receiver is tuned and a timing thereof; and
  
  an agent instantiated in the controller;
  
  the controller being configured to cause the signal receiver to perform at least one initial scan of the spectrum of interest according to an initial scan schedule, and then perform a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created by the agent during the initial and subsequent scans;
  
  the agent being configured to create each of the updated scan schedules by determining and applying a schedule update to a preceding scan schedule, said schedule update being determined according to an application of reinforcement learning by the agent to EM signal data arising from detections by the signal receiver of the signals of interest;
  
  said reinforcement learning including:
  
  estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
  
  awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
  
  according to said reinforcement, determining by the agent of the schedule update;
  
  applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
  
  determining an expected degree of success for the updated scan schedule.
- - 2. The system of claim 1, wherein the agent is configured to apply an exponential scale to the value scales across the plurality of reward classes.
  - 3. The system of claim 1, wherein the reinforcements include negative rewards applied to the agent when a specified degree of scanning success is not achieved within a specified number of scans or within a specified time period.
  - 4. The system of claim 1, wherein the spectrum of interest is divided into a plurality of frequency “channels,” each of which is narrower than a bandwidth of the signal receiver, and wherein the initial and updated scan schedules dictate patterns of retuning of the EM signal receiver to center frequencies of the channels and a timing thereof.
  - 5. The system of claim 1, wherein the agent comprises an actor module that is configured to determine the scan schedule updates and apply the scan schedule updates to the preceding scan schedules, the agent further comprising a critic module that is configured to estimate the actual degrees of scanning success after the scans and to award the reinforcements to the actor module.
  - 6. The system of claim 1, wherein the controller is configured to determine the initial scan schedule by applying reinforcement learning during a training session to a training EM data set comprising at least one of synthetic signal data and previously obtained EM signal data.
  - 7. The system of claim 6, wherein the training EM data set corresponds to a known set of actual and/or theoretical signal sources that have known characteristics.
  - 8. The system of claim 7, wherein the agent is configured to apply negative reinforcement during the training session whenever a scan of the training EM data fails to detect an EM signal of interest that is known to be present in the training EM data set.
  - 9. The system of claim 1, further comprising a companion system that is configured to provide EM signal data companion analysis to the agent.
  - 10. The system of claim 9, wherein the companion analysis includes at least one of distinguishing types of EM sources and determining their relative degrees of interest.
  - 11. The system of claim 1, wherein:
    - the agent comprises a global agent and a plurality of worker agents;
      
      each of the worker agents is configured to:
      
      independently apply reinforcement learning to the EM signal data according to a data analysis strategy;
      
      derive therefrom a scan schedule gradient; and
      
      provide the scan schedule gradient to the global agent; and
      
      the global agent is configured to create the updated scan schedules by applying the scan schedule gradients received from the worker agents.
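
The global/worker arrangement of claim 11 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the schedule is assumed to be a list of per-channel dwell weights, `worker_gradient` is a hypothetical stand-in for each worker's reinforcement learning step, and the workers run synchronously here for simplicity, whereas the abstract describes asynchronous operation.

```python
from typing import List

Schedule = List[float]  # assumed representation: one dwell weight per channel


def worker_gradient(detections_per_channel: List[int]) -> List[float]:
    """Hypothetical worker step: favour channels with above-average detections."""
    mean = sum(detections_per_channel) / len(detections_per_channel)
    return [d - mean for d in detections_per_channel]


def apply_gradients(schedule: Schedule, gradients: List[List[float]],
                    lr: float = 0.1) -> Schedule:
    """Global agent: average the worker gradients, step, and renormalise."""
    n = len(gradients)
    avg = [sum(g[i] for g in gradients) / n for i in range(len(schedule))]
    # Clamp at zero so that no channel gets a negative dwell weight.
    stepped = [max(w + lr * g, 0.0) for w, g in zip(schedule, avg)]
    total = sum(stepped)
    if total == 0.0:
        return list(schedule)  # degenerate step: keep the preceding schedule
    return [w / total for w in stepped]


# Two workers see different (made-up) detection counts over four channels.
schedule = [0.25, 0.25, 0.25, 0.25]
grads = [worker_gradient([0, 2, 0, 0]), worker_gradient([0, 1, 1, 0])]
updated = apply_gradients(schedule, grads)
```

Averaging is only one way to combine worker gradients. An asynchronous variant would apply each gradient as it arrives instead of waiting for all workers.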

12. A method of automatically determining and implementing updates to a scan schedule that dictates a pattern of frequencies to which an electromagnetic (“EM”) signal receiver is tuned and a timing thereof, the method comprising:
- performing by the signal receiver of at least one initial scan of a spectrum of interest according to an initial scan schedule implemented by a controller of the signal receiver; and
  
  performing by the signal receiver of a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created during the initial and subsequent scans by an agent instantiated in the controller, the agent creating each of the updated scan schedules by applying a schedule update to a preceding scan schedule, said schedule update being determined according to application by the agent of reinforcement learning to EM signal data arising from detections by the signal receiver of the signals of interest;
  
  said reinforcement learning including:
  
  estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
  
  awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
  
  according to said reinforcement, determining the schedule update;
  
  applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
  
  determining an expected degree of success for the updated scan schedule.
- - 13. The method of claim 12, wherein an exponential scale is applied across the value scales of the plurality of reward classes.
  - 14. The method of claim 12, wherein the reinforcements include negative rewards applied to the agent when a specified degree of scanning success is not achieved within a specified number of scans or within a specified time period.
  - 15. The method of claim 12, wherein the spectrum of interest is divided into a plurality of frequency “channels,” each of which is narrower than a bandwidth of the signal receiver, and wherein the initial and updated scan schedules dictate patterns of retuning of the EM signal receiver to center frequencies of the channels and a timing thereof.
  - 16. The method of claim 12, wherein the agent comprises an actor module that determines the scan schedule updates and applies the scan schedule updates to the preceding scan schedules, the agent further comprising a critic module that estimates the actual degrees of scanning success after the scans and awards the reinforcements to the actor module.
  - 17. The method of claim 12, wherein the initial scan schedule is determined by the controller by applying reinforcement learning during a training session to a training EM data set comprising at least one of synthetic signal data and previously obtained EM signal data.
  - 18. The method of claim 17, wherein the training EM data set corresponds to a known set of actual and/or theoretical signal sources that have known characteristics.
  - 19. The method of claim 18, wherein negative reinforcement is applied to the agent during the training session whenever a scan of the training EM data fails to detect an EM signal of interest that is known to be present in the training EM data set.
  - 20. The method of claim 12, wherein a goal of the scan schedule includes at least one of:
    - detecting all EM sources of interest that are within range of the EM signal receiver and are transmitting signals of interest within the spectrum of interest;
      
      detecting all EM sources of interest that are within range of the EM signal receiver and that transmit at least a minimum number of events during a specified time interval within the spectrum of interest;
      
      receiving a specified percentage of signals of interest transmitted by at least one EM source of interest; and
      
      detecting all EM sources of interest that are within range of the EM signal receiver and are transmitting signals of interest and then repeating detection of the EM sources within specified time ranges.
  - 21. The method of claim 12, further comprising providing by a companion system to the agent of EM signal data companion analysis.
  - 22. The method of claim 21, wherein the companion analysis includes at least one of distinguishing types of EM sources and determining their relative degrees of interest.
  - 23. The method of claim 12, wherein:
    - the agent comprises a global agent and a plurality of worker agents;
      
      the method includes each of the worker agents:
      
      independently applying reinforcement learning to the EM signal data according to a data analysis strategy;
      
      deriving therefrom a scan schedule gradient; and
      
      providing the scan schedule gradient to the global agent; and
      
      creating the scan schedule updates includes applying by the global agent of the scan schedule gradients received from the worker agents.
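
Claims 14 and 15 describe a channelized schedule and a negative reward for missing a success target within a set number of scans. A minimal Python sketch of both ideas follows. The `Dwell` record, the round-robin starting schedule, the frequency values, and the penalty of -1.0 are hypothetical choices for illustration, not details from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dwell:
    center_hz: float  # channel center frequency the receiver is tuned to
    dwell_s: float    # time spent on that channel


def channel_centers(start_hz: float, stop_hz: float, channel_bw_hz: float,
                    receiver_bw_hz: float) -> List[float]:
    """Split the spectrum of interest into channels narrower than the receiver."""
    if channel_bw_hz >= receiver_bw_hz:
        raise ValueError("channel must be narrower than the receiver bandwidth")
    n = int((stop_hz - start_hz) // channel_bw_hz)
    return [start_hz + (i + 0.5) * channel_bw_hz for i in range(n)]


def round_robin_schedule(centers: List[float], dwell_s: float) -> List[Dwell]:
    """Assumed initial schedule: visit every channel once, for equal time."""
    return [Dwell(c, dwell_s) for c in centers]


def timeout_penalty(success_history: List[float], target: float,
                    max_scans: int, penalty: float = -1.0) -> float:
    """Negative reward if none of the last max_scans scans reached the target."""
    recent = success_history[-max_scans:]
    if len(recent) == max_scans and max(recent) < target:
        return penalty
    return 0.0


centers = channel_centers(100e6, 200e6, channel_bw_hz=20e6, receiver_bw_hz=25e6)
initial = round_robin_schedule(centers, dwell_s=0.01)
```

An agent would then reorder the dwells or change their durations. The time-period variant of claim 14 could be handled the same way by keying the history on timestamps instead of scan counts.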

24. Non-transitory computer readable media comprising:
- software recorded thereupon that when executed by a controller configures the controller to:
  
  cause an electromagnetic (“EM”) signal receiver to scan a spectrum of interest according to a scan schedule that dictates a pattern of frequencies to which the signal receiver is tuned and a timing thereof; and
  
  implement updates to the scan schedule by causing the signal receiver to perform at least one initial scan of the spectrum of interest according to an initial scan schedule, followed by a plurality of subsequent scans of the spectrum of interest according to a series of periodically updated scan schedules that are automatically created during the initial and subsequent scans by an agent that is instantiated in the controller, wherein the executed software configures the agent to create each of the updated scan schedules by applying a schedule update to a preceding scan schedule, said schedule update being determined according to application by the agent of reinforcement learning to EM signal data arising from detections by the signal receiver of the signals of interest;
  
  said reinforcement learning including:
  
  estimating an actual degree of scanning success applicable to the preceding scan schedule by applying to the EM signal data a plurality of value scales applicable to a corresponding plurality of reward classes;
  
  awarding at least one reinforcement to the agent according to a comparison between the actual degree of scanning success and a previously determined expected degree of scanning success applicable to the preceding scan schedule;
  
  according to said reinforcement, determining the schedule update;
  
  applying the schedule update to the preceding scan schedule to create the updated scan schedule; and
  
  determining an expected degree of success for the updated scan schedule.
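
The five reinforcement learning steps recited in the independent claims (estimate actual success, compare with expected success, determine the update, apply it, set a new expectation) can be traced in a toy Python loop. This sketch substitutes a simple gradient-bandit update with a running-average baseline for the actor and critic modules, so it should be read as an analogy, not as the patented method. The single always-active channel, the learning rates, and all names are assumptions.

```python
import math
import random
from typing import List, Tuple


def softmax(prefs: List[float]) -> List[float]:
    """Turn channel preferences into a schedule of visit probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run(num_channels: int = 4, active_channel: int = 2, scans: int = 500,
        alpha: float = 0.2, beta: float = 0.1,
        seed: int = 0) -> Tuple[List[float], List[float], float]:
    rng = random.Random(seed)
    prefs = [0.0] * num_channels  # "actor" state behind the schedule
    expected = 0.0                # "critic" estimate of expected success
    for _ in range(scans):
        schedule = softmax(prefs)
        visited = rng.choices(range(num_channels), weights=schedule)[0]
        # Step 1: actual degree of success (toy: 1 if the emitter was seen).
        actual = 1.0 if visited == active_channel else 0.0
        # Step 2: reinforcement from comparing actual with expected success.
        reinforcement = actual - expected
        # Steps 3 and 4: determine the schedule update and apply it.
        for ch in range(num_channels):
            indicator = 1.0 if ch == visited else 0.0
            prefs[ch] += alpha * reinforcement * (indicator - schedule[ch])
        # Step 5: expected degree of success for the updated schedule.
        expected += beta * (actual - expected)
    return prefs, softmax(prefs), expected


prefs, probs, expected = run()
```

In this toy setting the preference for the active channel never falls below the others, so the schedule drifts toward the channel where the emitter transmits. A real receiver would face many emitters, time-varying activity, and several reward classes.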

Current Assignee: BAE Systems Information and Electronic Systems Integration Incorporated (BAE Systems Plc)
Original Assignee: BAE Systems Information and Electronic Systems Integration Incorporated (BAE Systems Plc)
Inventors: Kuzdeba, Scott A; Sussman-Fort, Jonathan M.
Primary Examiner(s): Perez, James M
Application Number: US16/351,037
Time in Patent Office: 294 Days
CPC Class Codes

G01S 7/021   Auxiliary means for detecti...

G01S 7/4817   relating to scanning

G06N 3/045   Combinations of networks

G06N 3/08   Learning methods

H04B 17/0085   using test signal generators

H04B 17/11   for calibration

H04B 17/21   for calibration; for correc...

H04B 17/27   for locating or positioning...

H04W 24/02   Arrangements for optimising...

H04W 24/08   Testing, supervising or mon...

H04W 48/16   Discovering, processing acc...
