Application of multi-agent deep reinforcement learning method to solve the dynamic weapon target assignment problem
Abstract
This paper presents a multi-agent deep reinforcement learning method for solving the dynamic weapon target assignment (DWTA) problem in an air defense command and control system. The weapon model is built on predictions of the optimal trajectories of air targets and the status of objects on the ground, together with an optimal plan for coordinating the activities of the weapons in the system. It is implemented with the OpenAI Gym library, describes the rules of the dynamic air defense combat environment, and uses a deep reinforcement learning algorithm (Deep Q-Learning) to optimize the policy. Simulation experiments with different air defense scenarios demonstrate that, once trained, the deep reinforcement learning model of an air defense weapon can automatically analyze and perceive the situation, coordinate with the other air defense weapons in the system, build a dynamic engagement plan, and select the optimal plan under practical constraints so that the overall loss function is minimized over the entire combat process. The reinforcement learning model can therefore be applied to develop decision-support software modules for the air defense command and control system.
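To make the Gym-style environment and the Deep Q-Learning update mentioned above more concrete, the following is a minimal sketch, not the paper's implementation. It assumes a single weapon, a fixed kill probability, and a reward of 1 per destroyed target; the names `ToyDWTAEnv` and `td_target` are hypothetical, and the class only mimics the `reset()`/`step()` convention of OpenAI Gym without importing the library.

```python
import random


class ToyDWTAEnv:
    """Toy stand-in for a dynamic weapon target assignment environment.

    Mimics the reset()/step() convention of OpenAI Gym without importing it.
    The kill probability, ammunition count, and reward are illustrative
    assumptions, not values taken from the paper.
    """

    def __init__(self, n_targets=3, ammo=4, kill_prob=0.7, seed=0):
        self.n_targets = n_targets
        self.initial_ammo = ammo
        self.kill_prob = kill_prob
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # All targets start alive and the weapon has full ammunition.
        self.alive = [True] * self.n_targets
        self.ammo = self.initial_ammo
        return self._obs()

    def _obs(self):
        # Observation: which targets are still alive, plus remaining ammo.
        return (tuple(self.alive), self.ammo)

    def step(self, action):
        """Fire one shot at the target with index `action`."""
        self.ammo -= 1
        reward = 0.0
        if self.alive[action] and self.rng.random() < self.kill_prob:
            self.alive[action] = False
            reward = 1.0  # assumed reward for destroying a target
        # The episode ends when ammo runs out or no target remains.
        done = self.ammo == 0 or not any(self.alive)
        return self._obs(), reward, done, {}


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    `next_q_values` stands in for the Q-network output at the next state.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full Deep Q-Learning loop, the value returned by `td_target` would serve as the regression target for the Q-network at the sampled state and action. The multi-agent coordination described in the abstract would require one such learner per weapon (or a shared network), which this sketch omits.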