Mixed Oversampling Using BAGAN-GP and Oversampling Techniques (KCI-listed)

Korean title: BAGAN-GP와 오버샘플링 기법들을 이용한 혼합 오버샘플링

  • Language: KOR
  • URL: https://db.koreascholar.com/Article/Detail/438194
Journal of Society of Korea Industrial and Systems Engineering (한국산업경영시스템학회지)
Society of Korea Industrial and Systems Engineering (한국산업경영시스템학회)
Abstract

In manufacturing systems, good-product data are abundant, whereas defective-product data are scarce because they are difficult to obtain. One frequently used way to mitigate the problems caused by this data imbalance is data augmentation, which increases the number of samples in the minority class until it is comparable to that of the majority class. BAGAN-GP uses an autoencoder in the early stage of training to infer the distributions of the majority and minority classes and to initialize the weights of the GAN. To resolve the weight clipping problem, in which the weights concentrate at the boundaries of the clipping range, it applies a gradient penalty so that the weights are distributed appropriately within the range. Oversampling techniques such as SMOTE, ADASYN, and Borderline-SMOTE are linearity-based: they connect observations with a line segment and generate a new sample at a random point on that segment. BAGAN-GP, in contrast, does not exhibit this linearity because it generates data from the learned class distributions. To generate data with diverse characteristics while accounting for the rarity of defective data, two mixed oversampling techniques, MO1 and MO2, are proposed. Data are augmented with the proposed techniques and with existing techniques, and the results are compared by classifying the augmented data with MLP, SVM, and random forest classifiers. MO1 performs well in most cases, which is attributed to the more diverse augmentation obtained by using both an existing linearity-based oversampling technique and the distribution-based BAGAN-GP technique.
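
The line-segment interpolation that the abstract attributes to SMOTE-type methods can be illustrated with a minimal sketch. It assumes plain Euclidean nearest neighbors among the minority samples; the function name `interpolate_minority`, its parameters `k` and `seed`, and the sample data are hypothetical and are not taken from the paper. ADASYN and Borderline-SMOTE differ mainly in how they choose the base points, which this sketch does not model.

```python
import math
import random


def interpolate_minority(samples, n_new, k=2, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    (Illustrative helper; not the paper's implementation.)
    """
    if len(samples) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbor = rng.choice(others[:k])
        # Pick a random position along the segment base -> neighbor.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical two-feature minority (defective) samples.
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 3.0), (3.0, 2.5)]
new_points = interpolate_minority(minority, n_new=5)
```

Because every synthetic point is a convex combination of two existing samples, it can never fall outside the region already spanned by the minority class. This is the linearity that the abstract contrasts with the distribution-based generation of BAGAN-GP.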

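Table of Contents
1. Introduction
2. Oversampling Techniques
    2.1 Linear Oversampling: SMOTE, ADASYN, Borderline-SMOTE
    2.2 GAN
    2.3 BAGAN-GP
3. Proposed Techniques
4. Techniques Applied for Classification
    4.1 SVM
    4.2 RF
    4.3 MLP
5. Experiments
    5.1 Experimental Conditions
    5.2 Experimental Results
6. Conclusion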
References
Authors
  • Dongju Lee (이동주), Department of Industrial Engineering, Kongju National University (corresponding author)