Exploration of Competitive and Cooperative Strategies of Reinforcement Learning Algorithms in Multi-Agent Systems
As artificial intelligence matures, many machine learning algorithms have been proposed and have successfully solved problems in a variety of fields. However, which algorithm suits which environment remains a question of concern to many users. In this paper, we therefore examine the effectiveness of different types of reinforcement learning architectures in multi-agent systems across different environments. We first review the basic theory of reinforcement learning and then examine algorithms that extend the reinforcement learning architecture: the value-based Hysteretic Q-Learning and CPM, and the policy-based WoLF-PHC. After presenting the core concepts of each algorithm, we conduct experiments on one competitive task, Matching Pennies, and two cooperative tasks, Ball Balance and Pursuit Domain, and discuss possible reasons for the results based on the experimental data. Finally, we draw conclusions from this discussion.