
Robot Vision to Audio Description Based on Deep Learning for Effective Human-Robot Interaction (KCI-indexed)

  • Language: KOR
  • URL: https://db.koreascholar.com/Article/Detail/366349
로봇학회논문지 (The Journal of Korea Robotics Society)
한국로봇학회 (Korea Robotics Society)
Abstract

For effective human-robot interaction, robots not only need to understand the current situational context well, but also need to convey that understanding to the human participant in an efficient way. The most convenient way for a robot to deliver its understanding to the human participant is to express it in natural language through voice. Recently, artificial intelligence for video understanding and natural language processing has advanced very rapidly, especially based on deep learning. Thus, this paper proposes a robot-vision-to-audio-description method using deep learning. The applied model is a pipeline of two deep learning models: one that generates a natural language sentence from robot vision, and one that generates voice from the generated sentence. We also conduct a real-robot experiment to show the effectiveness of our method in human-robot interaction.
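The abstract describes the method as a two-stage pipeline: a video-captioning model that turns robot-vision frames into a natural language sentence, followed by a speech-synthesis model that converts the sentence into voice. The following is a minimal Python sketch of that pipeline structure only; the class names, the example caption, and the silent placeholder audio are hypothetical stand-ins, not the models used in the paper.

```python
# Minimal sketch of the two-stage robot-vision-to-audio-description pipeline:
# (1) a video-captioning model turns robot-vision frames into a sentence,
# (2) a text-to-speech model turns that sentence into a waveform.
# All class and function names here are hypothetical placeholders.

from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class Caption:
    text: str


class VideoCaptioningModel:
    """Placeholder for the robot-vision-to-sentence model (e.g., an encoder-decoder over video frames)."""

    def describe(self, frames: List[np.ndarray]) -> Caption:
        # A real model would encode the frame sequence and decode a sentence.
        return Caption(text="a person is handing an object to the robot")


class TextToSpeechModel:
    """Placeholder for the sentence-to-waveform model (e.g., a neural TTS system)."""

    def synthesize(self, text: str, sample_rate: int = 22050) -> np.ndarray:
        # A real model would generate speech audio; here we return silence of
        # a plausible length (~0.4 s per word) just to keep the sketch runnable.
        n_samples = int(0.4 * sample_rate) * max(1, len(text.split()))
        return np.zeros(n_samples, dtype=np.float32)


def vision_to_audio_description(frames: List[np.ndarray]) -> np.ndarray:
    """Run the full pipeline: robot vision -> natural language -> speech."""
    caption = VideoCaptioningModel().describe(frames)
    waveform = TextToSpeechModel().synthesize(caption.text)
    return waveform


if __name__ == "__main__":
    # Dummy robot-vision input: 16 RGB frames of size 224x224.
    dummy_frames = [np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(16)]
    audio = vision_to_audio_description(dummy_frames)
    print(f"Generated {audio.shape[0]} audio samples for playback on the robot.")
```

The point of the sketch is the interface between the two stages: the captioning model's text output is the only thing passed to the speech model, which is what lets the two deep learning models be trained and swapped independently.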

Table of Contents
Abstract
 1. Introduction
 2. Deep Learning Models for Generating Natural Language Descriptions and Synthesized Speech
  2.1 Overall Model Architecture
  2.2 Natural Language Description Generation Model for Robot-Vision Video
  2.3 Input and Output Data of the Robot-Vision Video Description Generation Model
  2.4 Speech Synthesis Model
 3. Experiments
  3.1 Training Dataset, Hyperparameters, and Local Server Environment
  3.2 Speech Evaluation and Situation-Description Evaluation on Videos
  3.3 Real-Time Robot Experiment Setup
  3.4 Real-Time Robot Experiment Results
 4. Conclusion
 References
Authors
  • 박동건(Dept. of Computer Science and Engineering, Seoul National University of Science and Technology) | Dongkeon Park
  • 강경민(Dept. of Computer Science and Engineering, Seoul National University of Science and Technology) | Kyeong-Min Kang
  • 배진우(Dept. of Computer Science and Engineering, Seoul National University of Science and Technology) | Jin-Woo Bae
  • 한지형(Dept. of Computer Science and Engineering, Seoul National University of Science and Technology) | Ji-Hyeong Han (Corresponding author)