North Korea's long-range firepower is widely regarded as a representative asymmetric threat, and South Korea is prioritizing the development of a Korean-style missile defense system to counter it. Previous research modeled North Korean long-range artillery attacks as a Markov Decision Process (MDP) and used Approximate Dynamic Programming as the missile defense algorithm; because of the limitations of that approach, we turn to deep reinforcement learning. In this paper, we develop a missile defense algorithm by applying a modified DQN within a multi-agent deep reinforcement learning framework. We examine whether an efficient missile defense system can be implemented that reflects the attack patterns seen in recent wars, in particular how effectively it responds to enemy missile attacks, and show that the results learned through deep reinforcement learning are superior.
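With the advance of multi-agent reinforcement learning, research on applying reinforcement learning to level design in games has continued. Although the shape of the platform is an important element of level design, studies to date have applied reinforcement learning with a focus on player metrics such as skill level and skill composition. This paper therefore studies how the platform affects the play experience, taking into account the visibility of visual sensors and the complexity of structures, so that platform shape can be used in level design. To this end, we developed a 2 vs 2 competitive shooting game environment based on the Unity ML-Agents Toolkit, the MA-POCA algorithm, and self-play, and constructed a variety of platform shapes. The analysis confirmed that differences in visibility and complexity arising from platform shape do not greatly affect win-rate balance, but do significantly affect the total number of episodes, the draw rate, and the size of the Elo increase.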
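As background for the Elo metric mentioned above, the following is a minimal sketch of the standard Elo rating update. It is not taken from the paper or from the Unity ML-Agents source; the function names, the K-factor of 32, and the starting ratings are illustrative assumptions.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    The K-factor k is an assumed value, not one reported in the paper.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# A win between equally rated teams moves both ratings by k / 2.
winner, loser = update_elo(1200.0, 1200.0, 1.0)
# A draw between equally rated teams leaves both ratings unchanged.
drawn_a, drawn_b = update_elo(1200.0, 1200.0, 0.5)
```

Under this formula a draw between equally rated teams changes nothing, which is one way a higher draw rate could coincide with a smaller Elo increase.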
Reinforcement learning (RL) is widely applied in various engineering fields. In particular, RL has performed well on control problems such as vehicles, robotics, and active structural control systems. However, little research has been conducted to date on applying RL to optimal structural design. This study investigates the feasibility of applying RL to the structural design of reinforced concrete (RC) beams. An RC beam design problem introduced in a previous study was used for comparison. Deep Q-network (DQN) is a well-known RL algorithm that performs well in discrete action spaces, and it was therefore used in this study. The action of a DQN agent must represent the design variables of the RC beam, but the RC beam has too many design variables to be represented by the action of a conventional DQN. To solve this problem, multi-agent DQN was used. For a more effective learning process, DDQN (double Q-learning), an advanced version of conventional DQN, was employed. The multi-agent DDQN was trained, without any hand-labeled dataset, to produce optimal structural designs of RC beams that satisfy the American Concrete Institute code (ACI 318). Five DDQN agents provide actions for beam width, beam depth, main rebar size, number of main rebars, and shear stirrup size, respectively. The five agents were trained for 10,000 episodes, and the performance of the multi-agent DDQN was evaluated on 100 test design cases. The results show that the multi-agent DDQN algorithm can successfully produce structural designs for RC beams.
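To illustrate how DDQN differs from conventional DQN, the following minimal sketch computes the learning target for a single transition under each scheme. It is not the authors' implementation: the function names, the discount factor, and the example Q-values are hypothetical, and in a five-agent setting each agent would presumably apply the same rule to its own Q-values.

```python
def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Conventional DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for the next state with two discrete actions.
q_online_next = [1.0, 2.0]    # online network prefers action 1
q_target_next = [30.0, 20.0]  # target network would prefer action 0

vanilla = dqn_target(1.0, False, q_target_next, gamma=0.5)
double = double_dqn_target(1.0, False, q_online_next, q_target_next, gamma=0.5)
```

Decoupling action selection from evaluation is what reduces the overestimation that arises when the same network takes the maximum over its own noisy estimates.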
PURPOSES : The purpose of this study is to examine the feasibility of traffic pattern analysis using MatSIM for urban road network operation under incident conditions. METHODS : MatSIM is a transportation simulation tool based on a stochastic dynamic model and an activity-based model. It is open-source software developed by IVT, ETH Zurich, Switzerland. MatSIM supports comparative analysis of various scenarios, and the results are presented through a visualizer that shows individual vehicle movements and traffic patterns. In this study, the 24-hour trip distribution, traffic volumes, and travel speeds produced by MatSIM are similar to the measured values, so the MatSIM results are reasonable. Traffic patterns change in response to an incident through changes in individual behavior. RESULTS : The simulation results and the actual measured values are similar. The simulation results fall within reasonable ranges and can be used for traffic pattern analysis. CONCLUSIONS : Changes in traffic patterns, including trip distribution, traffic volumes, and speeds, under various incident scenarios can inform traffic control policy decisions for the effective operation of urban road networks.
In this paper, we present a finite-time sliding mode control (FSMC) with an integral finite-time sliding surface for applying graph theory to a distributed wheeled mobile robot (WMR) system. The kinematic and dynamic properties of the WMR system are considered simultaneously in designing the finite-time sliding mode controller. Next, consensus and formation control laws for distributed WMR systems are derived using graph theory. The kinematic and dynamic controllers are applied simultaneously to compensate for the dynamic effects of the WMR system. Compared with conventional sliding mode control (SMC), fast convergence is assured, and the finite-time performance index is derived using an extended Lyapunov function with an adaptive law to describe the uncertainty. Numerical simulation results of formation control for WMR systems show the efficacy of the proposed controller.
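As background for the graph-theoretic consensus idea, the following minimal sketch simulates the textbook linear consensus protocol x' = -Lx for single-integrator agents, where L is the graph Laplacian. This is a deliberately simplified stand-in and not the finite-time sliding mode controller of the paper; the function names, the three-node path graph, the step size, and the initial states are all assumptions for illustration.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for an adjacency matrix with a zero diagonal."""
    n = len(adjacency)
    return [
        [sum(adjacency[i]) if i == j else -adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def consensus_step(x, lap, dt):
    """One explicit Euler step of the linear consensus protocol x' = -L x."""
    n = len(x)
    return [x[i] - dt * sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]


def simulate(x0, adjacency, dt=0.1, steps=200):
    """Iterate the consensus protocol from the initial states x0."""
    lap = laplacian(adjacency)
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, lap, dt)
    return x


# Three agents on an undirected path graph 0 - 1 - 2 (assumed topology).
path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
final_states = simulate([0.0, 3.0, 6.0], path_graph)
```

On a connected undirected graph the states approach the average of the initial values; a formation can be obtained from the same protocol by running consensus on each state minus its desired offset.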
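In real-time, complex multi-agent environments such as games, task allocation is performed repeatedly to maximize system efficiency. In this paper, we propose a coordination agent that applies the A* algorithm as an approach suited to real-time multi-agent architectures and capable of optimized task allocation. The proposed coordination agent builds a state graph from all combinations of agents and tasks, filtered down to agents that are able to act and tasks that can be allocated, and performs optimized task allocation by applying an evaluation function based on the A* algorithm. In addition, to avoid delays caused by real-time reallocation, it selectively uses a greedy method, which enables fast handling of reallocation requests. Finally, simulations demonstrate that optimized task allocation through the coordination agent improves performance by 25% over greedy task allocation.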
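The contrast between A*-based and greedy allocation can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's coordination agent: it assumes a square cost matrix in which `cost[a][t]` is the cost of agent `a` performing task `t`, assigns agents in index order, and uses a simple admissible heuristic (each unassigned agent's cheapest remaining task). The function names and the cost values are hypothetical.

```python
import heapq
from itertools import count


def astar_assign(cost):
    """Minimum-total-cost one-to-one assignment found by A* over partial assignments."""
    n = len(cost)

    def heuristic(depth, used):
        # Lower bound: every unassigned agent takes its cheapest still-free task.
        return sum(
            min(cost[a][t] for t in range(n) if t not in used)
            for a in range(depth, n)
        )

    tie = count()  # tie-breaker so the heap never compares tuples of tasks
    frontier = [(heuristic(0, frozenset()), next(tie), 0, ())]
    while frontier:
        _, _, g, assigned = heapq.heappop(frontier)
        depth = len(assigned)
        if depth == n:
            return list(assigned), g
        used = frozenset(assigned)
        for t in range(n):
            if t in used:
                continue
            new_g = g + cost[depth][t]
            h = heuristic(depth + 1, used | {t})
            heapq.heappush(frontier, (new_g + h, next(tie), new_g, assigned + (t,)))
    return [], 0


def greedy_assign(cost):
    """Each agent, in index order, takes its cheapest remaining task."""
    n = len(cost)
    used, assigned, total = set(), [], 0
    for a in range(n):
        t = min((t for t in range(n) if t not in used), key=lambda t: cost[a][t])
        used.add(t)
        assigned.append(t)
        total += cost[a][t]
    return assigned, total


# Hypothetical costs: rows are agents, columns are tasks.
costs = [[1, 2, 9], [1, 10, 9], [9, 3, 2]]
optimal_plan, optimal_cost = astar_assign(costs)
greedy_plan, greedy_cost = greedy_assign(costs)
```

On this example the greedy pass commits the first agent to the task the second agent needs most, so its total cost is higher than the A* result; the greedy pass, however, does a fixed amount of work per agent, which is consistent with using it selectively when a reallocation must be answered quickly.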
Effective tools that alleviate the complexity and computational load of collision-free motion planning for multi-agent systems have been in steady demand in the robotics field. To reduce the complexity, the extended collision map (ECM), which adopts a decoupled approach and prioritization, has already been proposed. In the ECM, however, the collision regions that represent potential collisions between robots are calculated by relying on computational power, so the complexity problem is not completely resolved. In this paper, we propose a mathematical analysis of the extended collision map; as a result, we formulate the collision region as an equation with 5–8 variables. For the mathematical analysis, we introduce the following realistic assumptions: the paths of the robots can be approximated by a straight line or an arc, and the robots move with uniform velocity or constant acceleration near the intersection between paths. Our result reduces the computational complexity compared with the previous result without losing optimality, because we use simple but exact equations for the collision regions. This result is widely applicable to coordinated multi-agent motion planning.
It is well known that mathematical solutions to multi-agent planning problems are very difficult to obtain because of the complexity of the mutual interactions among agents. Most past research results are therefore based on probabilistic completeness. However, the practicality and effectiveness of solutions based on probabilistic completeness are significantly reduced by their heavy computational burden. In this paper, we propose a practically applicable solution technique for multi-agent planning problems that assures a reasonable computation time and real-world applicability for more than three agents moving along paths of general shape. First, to reduce the computation time, a collision map is used to detect potential collisions and obtain collision-free solutions for the agents. Second, to minimize the maximum task execution time over all agents, a method is developed for selecting an optimal priority order. Finally, simulations with more than 20 agents are provided to demonstrate the effectiveness of the proposed interactive approach to multi-agent planning problems.
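Since the establishment of the World Trade Organization (WTO), trade has become globalized, and as the WTO lowered trade barriers and economic exchange between countries steadily increased, an international logistics system became necessary. As container ships came to be used as a means of mass transport to cut costs, large container shipping lines introduced automated systems as an alternative for international logistics, providing companies with cargo-tracking information systems and building information system networks for managing equipment and devices. For container terminal automation, this paper proposes a communication model among multiple agents that uses XML (eXtensible Markup Language) and JMS (Java Message Service) and can recognize frequently changing information and respond flexibly for information exchange between agents. The paper also analyzes existing cases of automated container terminal systems, the difficulties in developing automation systems, and the communication and automation that container terminal systems require.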