Search results - koreascholar

1.

2025.04 Free for subscribing institutions and individual members

Probabilistic Reinforcement Learning Framework for Predicting River Habitat Degradation and Optimizing Adaptive Management Strategies

Sung-Ho Lim, Ji Won Woo, Yuno Do

한국응용곤충학회 Conference Proceedings 2025 한국곤충학회 and 한국응용곤충학회 Joint Spring Conference p.88 한국응용곤충학회

2.

2024.12 Free for subscribing institutions and individual members

Advancing Maritime Route Optimization: Using Reinforcement Learning for Ensuring Safety and Fuel Efficiency

Jisoo Kim, Byeonggong Hwang, Gi-Hyun Kim, Ung-Gyu Kim

International Journal of e-Navigation and Maritime Economy Volume 23 pp.413-51 국제이네비해양경제학회

Efficient and safe maritime navigation in complex and congested coastal regions requires advanced route optimization methods that surpass the limitations of traditional shortest-path algorithms. This study applies Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) reinforcement learning (RL) algorithms to generate and refine optimal ship routes in East Asian waters, focusing on passages from Shanghai to Busan and Ulsan to Daesan. Operating within a grid-based representation of the marine environment and considering constraints such as restricted areas and Traffic Separation Schemes (TSS), both DQN and PPO learn policies prioritizing safety and operational efficiency. Comparative analyses with actual vessel routes demonstrate that RL-based methods yield shorter and safer paths. Among these methods, PPO outperforms DQN, providing more stable and coherent routes. Post-processing with the Douglas-Peucker (DP) algorithm further simplifies the paths for practical navigational use. The findings underscore the potential of RL in enhancing navigational safety, reducing travel distance, and advancing autonomous ship navigation technologies.
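The Douglas-Peucker post-processing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the planar (x, y) coordinates, and the `epsilon` tolerance are assumptions chosen for illustration, and a real route would need distances computed on geographic coordinates.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b.

    Falls back to the point-to-point distance when a and b coincide.
    """
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord first-to-last.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    # If every interior point is within tolerance, keep only the endpoints.
    if max_dist <= epsilon:
        return [points[0], points[-1]]
    # Otherwise split at the farthest point and simplify both halves.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point
```

A larger `epsilon` removes more waypoints, so the tolerance trades route fidelity against the number of turning points a navigator has to follow.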

3.

2024.10 KCI-listed; free for subscribing institutions, paid for individual members

다양한 상황 최적 전이기법 패턴 점검 및 강화학습 기반 최적 전이시간 선택모형 연구

Reinforcement Learning and Selection Patterns for Optimal Traffic Signal Transition Techniques

조용빈, 김진태

한국도로학회논문집 Vol.26 No.5 (Serial No.127) pp.113-122 한국도로학회

PURPOSES : In this study, the existence of an optimal pattern among transition methods applied during changes in traffic signal timing was investigated. We aimed to develop this pattern into an artificial intelligence reinforcement-learning model to assess its effectiveness. METHODS : By developing various traffic signal transition scenarios and considering 19 different traffic signal transition situations that can be applied to these scenarios, a simulation analysis was performed to identify patterns through statistical analysis. Subsequently, a reinforcement-learning model was developed to select an optimal transition time model suitable for various traffic conditions. This model was then tested by simulating a virtual experimental center environment and conducting performance comparison evaluations on a daily basis. RESULTS : The results indicated that when the change in the traffic signal cycle length was less than 50% in the negative direction, the subtraction method was efficient. In cases where the transition was less than 15% in the positive direction, the proposed center method for traffic signal transition was found to be advantageous. By applying the proposed optimal transition model selection, we observed that the transition time decreased by approximately 70%. CONCLUSIONS : The findings of this study provide guidance for the next level of traffic signal transitions. The importance of traffic signal transition will increase in future AI-based traffic signal control methods, requiring ongoing research in this field.

4,000 KRW

4.

2024.06 KCI-listed; free for subscribing institutions, paid for individual members

A Reinforcement Learning Model for Dispatching System through Agent-based Simulation

에이전트 기반 시뮬레이션을 통한 디스패칭 시스템의 강화학습 모델

Minjung Kim, Moonsoo Shin

한국산업경영시스템학회지 Vol.47 No.2 pp.116-123 한국산업경영시스템학회

In the manufacturing industry, dispatching systems play a crucial role in enhancing production efficiency and optimizing production volume. However, in dynamic production environments, conventional static dispatching methods struggle to adapt to various environmental conditions and constraints, leading to problems such as reduced production volume, delays, and resource wastage. Therefore, there is a need for dynamic dispatching methods that can quickly adapt to changes in the environment. In this study, we aim to develop an agent-based model that considers dynamic situations through interaction between agents. Additionally, we intend to utilize the Q-learning algorithm, which possesses the characteristics of temporal difference (TD) learning, to automatically update and adapt to dynamic situations. This means that Q-learning can effectively consider dynamic environments by sensitively responding to changes in the state space and selecting optimal dispatching rules accordingly. The state space includes information such as inventory and work-in-process levels, order fulfilment status, and machine status, which are used to select the optimal dispatching rules. Furthermore, we aim to minimize total tardiness and the number of setup changes using reinforcement learning. Finally, we will develop a dynamic dispatching system using Q-learning and compare its performance with conventional static dispatching methods.
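The temporal-difference update that Q-learning uses to pick a dispatching rule can be sketched as follows. This is an illustrative sketch only: the rule names in `RULES`, the string state labels, the hyperparameters, and the idea of using a negative tardiness or setup penalty as the reward are assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical dispatching rules the agent can choose between.
RULES = ["FIFO", "EDD", "SPT"]


def choose_rule(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(RULES)
    return max(RULES, key=lambda rule: q[(state, rule)])


def q_update(q, state, rule, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning (TD) update for the (state, rule) pair; returns the new value."""
    best_next = max(q[(next_state, r)] for r in RULES)
    td_target = reward + gamma * best_next
    q[(state, rule)] += alpha * (td_target - q[(state, rule)])
    return q[(state, rule)]


# Usage: a reward of -5.0 stands in for a tardiness/setup penalty (assumed).
q_table = defaultdict(float)
new_value = q_update(q_table, "high_wip", "EDD", -5.0, "low_wip")
greedy_rule = choose_rule(q_table, "high_wip", 0.0, random.Random(0))
```

Because the update uses only the current transition, the table can keep adapting while the simulated shop floor changes, which is the property the abstract attributes to TD learning.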

4,000 KRW

5.

2024.06 Free for subscribing institutions and individual members

자율운항선박의 지역경로 생성을 위한 역강화학습 시나리오 구현

Implementation of Inverse Reinforcement Learning Scenarios for Local Path Planning of Autonomous Ships

이재용, 남궁호, 김주성, 이진석, 장다운

해양환경안전학회 Conference Proceedings 2024 Spring Conference p.42 해양환경안전학회

6.

2024.05 Free for subscribing institutions, paid for individual members

에이전트 기반 시뮬레이션을 통한 디스패칭 시스템의 강화학습 모델

A Reinforcement Learning Model for Dispatching System through Agent-based Simulation

김민정, 신문수

한국산업경영시스템학회 Conference: Innovation Strategies for SMEs in the Era of Digital Transformation pp.189-197 한국산업경영시스템학회

4,000 KRW

7.

2024.05 Free for subscribing institutions, paid for individual members

심층 강화학습을 이용한 미사일 방어체계 연구 - 멀티에이전트 기반의 개선된 DQN을 활용하여 -

Research on Missile Defense Systems Using Deep Reinforcement Learning

김민국, 최봉완, 경지훈

한국산업경영시스템학회 Conference: Innovation Strategies for SMEs in the Era of Digital Transformation pp.34-40 한국산업경영시스템학회

4,000 KRW

8.

2023.12 KCI-listed; free for subscribing institutions and individual members

A Study on the Influence of Platform Design in Level Design by utilizing Multi-agent Reinforcement Learning

강화학습을 이용한 대전 슈팅 게임의 플랫폼 형태에 따른 레벨 디자인 영향 분석

Jun Ho KIM, Hanul Sung

한국컴퓨터게임학회 논문지 Vol.36 No.4 pp.236-29 한국컴퓨터게임학회

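With the advance of multi-agent reinforcement learning, research on applying reinforcement learning to level design in games continues. Although platform shape is an important element of level design, previous studies have applied reinforcement learning with a focus on player metrics such as player skill level or skill composition. This paper therefore studies how the platform affects the play experience, considering the visibility of visual sensors and the complexity of structures, so that platform shape can be used in level design. To this end, we developed a 2vs2 competitive shooting game environment based on the Unity ML-Agents Toolkit, the MA-POCA algorithm, and Self-play, and constructed a variety of platform shapes. The analysis confirmed that differences in visibility and complexity across platform shapes do not greatly affect win-rate balance but do significantly affect the total number of episodes, the draw rate, and the increase in Elo.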

9.

2023.12 KCI-listed; free for subscribing institutions, paid for individual members

A Player-Like StarCraft II AI for Enhanced Fun and Diversity using Reinforcement Learning

향상된 유닛 생산과 게임 플레이 경험을 위한 스타크래프트 II 강화학습 AI

Kyo Seoung KOO, Hanul Sung

한국컴퓨터게임학회 논문지 Vol.36 No.4 pp.31-36 한국컴퓨터게임학회

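The built-in AI of StarCraft II follows predefined behavior patterns, so users can easily figure out its strategy, which makes it hard to hold their interest for long. To address this, many studies on reinforcement learning-based StarCraft II AI have been conducted. However, existing reinforcement learning AIs train agents with a focus on win rate alone, so they use only a few unit types or only standardized strategies, which still limits how much fun users get from the game. To make the game more fun, this paper proposes an AI that resembles a real player, built with reinforcement learning. The agent is trained on the StarCraft II unit counter table and is rewarded based on scouted information so that it changes its strategy flexibly. In experiments, the proposed agent was rated higher than an agent using a fixed strategy in terms of user-perceived fun, difficulty, and similarity.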

4,000 KRW

10.

2023.12 KCI-listed; free for subscribing institutions, paid for individual members

이종 병렬설비에서 총납기지연 최소화를 위한 강화학습 기반 일정계획 알고리즘

Scheduling Algorithm, Based on Reinforcement Learning for Minimizing Total Tardiness in Unrelated Parallel Machines

이태희, 김재곤, 유우식

대한안전경영과학회지 Vol.25 No.4 pp.131-140 대한안전경영과학회

This paper proposes an algorithm for the Unrelated Parallel Machine Scheduling Problem (UPMSP) without setup times, aiming to minimize total tardiness. Because the UPMSP is NP-hard, an optimal solution is difficult to obtain. Consequently, practical instances are solved by relying on operators' experience or simple heuristics. The proposed algorithm combines two components: a Transformer-based policy network that computes the correlation between individual jobs and machines, and a training procedure based on the REINFORCE with Baseline reinforcement learning algorithm. The proposed algorithm was evaluated on randomly generated problems, and the results were compared with those obtained using CPLEX and three scheduling algorithms. The test results confirm that the proposed algorithm outperforms the comparison algorithms.
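The REINFORCE with Baseline update named in the abstract can be sketched on a toy policy. This is a deliberately simplified stand-in, not the paper's method: a plain list of logits over machines replaces the Transformer policy network, and the function names, the learning rate, and the use of negative total tardiness as the return are assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_with_baseline_step(logits, action, episode_return, baseline, lr=0.1):
    """One gradient-ascent step on (return - baseline) * log pi(action).

    For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i.
    """
    probs = softmax(logits)
    advantage = episode_return - baseline
    return [
        z + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (z, p) in enumerate(zip(logits, probs))
    ]


# Usage: the chosen machine (index 0) did better than the baseline,
# so its logit rises and the other machine's logit falls.
updated = reinforce_with_baseline_step([0.0, 0.0], action=0,
                                       episode_return=1.0, baseline=0.0)
```

Subtracting the baseline leaves the expected gradient unchanged while reducing its variance, which is the usual reason for preferring this variant over plain REINFORCE.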

4,000 KRW

11.

2023.12 Free for subscribing institutions and individual members

픽업 대기 시간의 감소를 위한 강화 학습 기반의 승차 공유 서비스의 매칭 및 요금 책정 알고리즘

Matching and Pricing Algorithm in a Ride-hailing Service Based on Reinforcement Learning to Reduce Pickup Waiting Time

정다운

한국도로학회지 Vol.25 No.4 pp.117-118 한국도로학회

12.

2023.12 Free for subscribing institutions and individual members

강화학습을 활용한 도시부 도로의 Speed Zoning Model 개발 연구

Development of a Speed Zoning Model on Urban Roads Using Reinforcement Learning

강민지

한국도로학회지 Vol.25 No.4 pp.115-116 한국도로학회

13.

2023.11 Free for subscribing institutions and individual members

강화학습기반 인명안전피난 알고리즘 개발 가능성

Possibility of developing a reinforcement learning-based human safety evacuation algorithm

황광일, 김별

해양환경안전학회 Conference Proceedings 2023 Autumn Conference p.82 해양환경안전학회

14.

2023.10 Free for subscribing institutions and individual members

GPR 영상을 활용한 콘크리트 기반시설 철근 탐지 훈련 데이터의 품질 보증을 위한 딥러닝 기반 프로그램 개발

Development of Deep Learning-based Program for Quality Assurance of Training Data for Reinforcement Bar Detection on Concrete Infrastructures Using GPR Images

엘립세 카를로, 김영민, 신재철, 백유진

한국도로학회 Conference Abstracts 2023 (23rd 한국도로학회 Conference) pp.121-122 한국도로학회

15.

2023.06 KCI-listed; free for subscribing institutions, paid for individual members

다중 에이전트 강화학습을 이용한 RC보 최적설계 기술개발

Development of Optimal Design Technique of RC Beam using Multi-Agent Reinforcement Learning

강주원, 김현수

한국공간구조학회지 Vol.23 No.2 pp.29-36 한국공간구조학회

Reinforcement learning (RL) is widely applied in various engineering fields. In particular, RL has performed well on control problems such as vehicles, robotics, and active structural control systems. However, little research on applying RL to optimal structural design has been conducted to date. In this study, the feasibility of applying RL to the structural design of a reinforced concrete (RC) beam was investigated. An RC beam structural design problem introduced in a previous study was used for comparison. Deep Q-network (DQN) is a well-known RL algorithm that performs well in discrete action spaces, and thus it was used in this study. The action of a DQN agent is required to represent the design variables of the RC beam. However, the RC beam has too many design variables to be represented by the action of a conventional DQN. To solve this problem, a multi-agent DQN was used. For a more effective reinforcement learning process, Double DQN (DDQN), an advanced version of conventional DQN, was employed. The multi-agent DDQN was trained to produce optimal structural designs of the RC beam that satisfy the American Concrete Institute code (ACI 318) without any hand-labeled dataset. Five DDQN agents provide actions for beam width, beam depth, main rebar size, number of main rebars, and shear stirrup size, respectively. The five agents were trained for 10,000 episodes, and the performance of the multi-agent DDQN was evaluated on 100 test design cases. This study shows that the multi-agent DDQN algorithm can successfully provide structural design results for RC beams.
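The difference between the conventional DQN target and the Double DQN target can be shown in a few lines. This is a generic sketch of the two target formulas, not the paper's implementation: the function names, the plain Python lists standing in for network outputs, and the discount factor are assumptions.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Conventional DQN: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Usage with made-up Q-values for three discrete actions in the next state.
online_q = [1.0, 3.0, 2.0]   # online network prefers action 1
target_q = [5.0, 0.5, 4.0]   # target network overrates action 0
plain = dqn_target(1.0, target_q, gamma=0.9)
double = double_dqn_target(1.0, online_q, target_q, gamma=0.9)
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of the plain `max` target. In a multi-agent setup like the one described, each agent would presumably compute such a target for its own design variable.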

4,000 KRW

16.

2023.04 Free for subscribing institutions and individual members

역강화학습을 활용한 자율운항선박의 지역경로 생성 방안

Local Path Planning of Maritime Autonomous Surface Ships using Inverse Reinforcement Learning

이재용, 남궁호, 김주성

해양환경안전학회 Conference Proceedings 2023 Joint Conference, 해양환경안전학회 Spring Conference p.32 해양환경안전학회

17.

2022.12 KCI-listed; free for subscribing institutions, paid for individual members

Reinforcement Learning-based Dynamic Weapon Assignment to Multi-Caliber Long-Range Artillery Attacks

다종 장사정포 공격에 대한 강화학습 기반의 동적 무기할당

Hyeonho Kim, Jung Hun Kim, Joohoe Kong, Ji Hoon Kyung

한국산업경영시스템학회지 Vol. 45 No. 4 pp.42-52 한국산업경영시스템학회

North Korea continues to upgrade and display its long-range rocket launchers to emphasize its military strength. Recently, the Republic of Korea began developing an anti-artillery interception system similar to Israel’s “Iron Dome”, designed to protect against North Korea’s arsenal of long-range rockets. The system may not work smoothly without a function that assigns interceptors to incoming artillery rockets of various calibers. We view the assignment task as a dynamic weapon target assignment (DWTA) problem. DWTA is a multistage decision process in which the decision in one stage affects the decision processes and results in subsequent stages. We represent the DWTA problem as a Markov decision process (MDP). The distance from Seoul to North Korea’s multiple rocket launchers positioned near the border limits the processing time of the model solver to only a few seconds. It is impossible to compute the exact optimal solution within the allowed time because of the curse of dimensionality inherent in the MDP model of a practical DWTA problem. We apply two reinforcement learning-based algorithms to obtain an approximate solution of the MDP model within the time limit. To check the quality of the approximate solution, we adopt the Shoot-Shoot-Look (SSL) policy as a baseline. Simulation results showed that both algorithms provide better solutions than the baseline strategy.

4,200 KRW

18.

2022.10 Free for subscribing institutions and individual members

Application of deep reinforcement learning to major solar flare forecast

Kangwoo Yi, Yong-Jae Moon

천문학회보 Vol.47 No.2 pp.55-56 한국천문학회

19.

2022.06 Free for subscribing institutions and individual members

강화학습을 이용한 해양수색구조 의사결정지원 시스템 개발

Development of a Decision Support System for Maritime Search and Rescue using Reinforcement Learning

최보라, 우동한, 임남균

해양환경안전학회 Conference Proceedings 2022 Spring Conference p.122 해양환경안전학회

20.

2022.06 KCI-listed; free for subscribing institutions, paid for individual members

스마트 제어알고리즘 개발을 위한 강화학습 리워드 설계

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm

김현수, 윤기용

한국공간구조학회지 Vol.22 No.2 pp.39-46 한국공간구조학회

Recently, machine learning has been widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to the development of a control algorithm for a smart control device that reduces seismic responses. For this purpose, Deep Q-network (DQN), a reinforcement learning algorithm, was employed to develop the control algorithm. A single degree of freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as an example structure. The smart TMD system used a magnetorheological (MR) damper instead of a passive damper. The reward design of reinforcement learning has a major effect on the control performance of the smart TMD. Various hyperparameters were investigated to optimize the control performance of the DQN-based control algorithm. Usually, decreasing the time step of a numerical simulation is desirable because it increases the accuracy of the simulation results. However, the numerical simulation results showed that decreasing the time step for reward calculation might degrade the control performance of the DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in the DQN training process.

4,000 KRW