Journal of Korea Robotics Society, Vol. 14, No. 2 (Serial No. 52), pp. 81-86

Designing an Efficient Reward Function for Robot Reinforcement Learning of the Water Bottle Flipping Task
Keywords:
Robotic Arm, Reinforcement Learning, Motion Tracking, Bottle Flipping

Table of Contents

Abstract
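1. Introduction
2. Reinforcement Learning Task and System Design
  2.1 Task
  2.2 System Design
3. Initial Motion Generation and Learning Method
  3.1 Initial Motion Generation
  3.2 Learning Process
4. Reward Functions for Bottle Flipping
  4.1 Reward Function for the Peak and the Landing Moment
  4.2 Reward Function for the Landing Moment
5. Results of Reinforcement Learning
  5.1 Results of Reinforcement Learning
6. Conclusion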
References

Abstract

Robots are used at many industrial sites, but traditional methods of operating a robot are limited for certain kinds of tasks. For a robot to accomplish a task, an accurate formulation of the relationship between the robot and its environment must be found and solved, which is complicated work. Accordingly, reinforcement learning for robots is being actively studied to overcome this difficulty. This study describes the process and results of applying reinforcement learning to such a task. The task the robot learns is bottle flipping, an activity that involves throwing a plastic bottle in an attempt to land it upright on its base. The complex movement of the liquid inside the bottle while it is in the air makes this task difficult to solve with traditional methods, whereas a reinforcement learning process makes it easier. After a 3-DOF robotic arm is taught how to throw the bottle, the robot searches for a better motion that succeeds at the task. Two reward functions are designed, and their learning results are compared. The finite difference method is used to obtain the policy gradient. This paper focuses on the process of designing an efficient reward function to improve the bottle flipping motion.
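The abstract states that the policy gradient is obtained with a finite difference method but gives no further detail here. As a minimal sketch of that general idea, and not the paper's implementation, the Python below estimates the gradient of a return with central differences and follows it by gradient ascent. The function names, the three-parameter policy, the step sizes, and the quadratic `toy_return` (standing in for the reward measured after a real throw) are all assumptions made for illustration.

```python
def finite_difference_gradient(rollout_return, theta, eps=1e-2):
    """Estimate dJ/dtheta by perturbing one policy parameter at a time."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        # Central difference: (J(theta + eps) - J(theta - eps)) / (2 * eps)
        grad.append((rollout_return(plus) - rollout_return(minus)) / (2.0 * eps))
    return grad


def toy_return(theta):
    # Hypothetical stand-in for one throw: the return is highest when the
    # parameters equal an arbitrary target chosen only for this example.
    target = (1.0, -0.5, 2.0)
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def improve(theta, steps=200, learning_rate=0.1):
    """Repeat gradient ascent on the return using the estimated gradient."""
    theta = list(theta)
    for _ in range(steps):
        grad = finite_difference_gradient(toy_return, theta)
        theta = [t + learning_rate * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(improve([0.0, 0.0, 0.0]))
```

On a physical robot each call to the return function would correspond to one or more real throws, so the number of rollouts per gradient estimate (two per parameter in this sketch) is the main cost to keep in mind.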