Mixed Oversampling Using BAGAN-GP and Oversampling Techniques
In manufacturing systems, defective-product data are often scarce because they are difficult to obtain, whereas good-product data are abundant. One frequently used method to resolve the problems caused by this data imbalance is data augmentation, which increases the data of the minority class (the class with few observations) until its size is comparable to that of the majority class. BAGAN-GP uses an autoencoder in the early stage of training to infer the distributions of the majority and minority classes and to initialize the weights of the GAN. To resolve the weight-clipping problem, in which weights concentrate at the boundary of the clipping range, a gradient penalty is applied so that the weights are distributed appropriately within the range. Oversampling techniques such as SMOTE, ADASYN, and Borderline-SMOTE are linearity-based: they connect two observations with a line segment and generate new data by selecting a random point on that segment. In contrast, BAGAN-GP does not exhibit this linearity because it generates data from the learned class distributions. To generate data with diverse characteristics from rare defective data, two mixed oversampling techniques, MO1 and MO2, are proposed. Data are augmented with the proposed techniques, and classification performance with MLP, SVM, and random forest is compared against data augmented with the existing techniques. MO1 performs well in most cases, presumably because the data are augmented more diversely by combining the linearity-based oversampling techniques with the distribution-based BAGAN-GP.
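The linearity-based generation described above can be illustrated with a minimal sketch. The function below is a simplified, hypothetical SMOTE-style sampler (not the paper's implementation): for each synthetic point it picks a random minority observation, one of its k nearest minority neighbors, and a random point on the segment between them.

```python
import numpy as np

def smote_like_sample(minority, n_new, k=3, rng=None):
    """Generate synthetic points by linear interpolation between a minority
    observation and one of its k nearest minority-class neighbors."""
    rng = np.random.default_rng(rng)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbor
    neighbors = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)               # random minority observation
        j = rng.choice(neighbors[i])      # one of its k nearest neighbors
        lam = rng.random()                # random position on the segment
        synthetic.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.array(synthetic)

# usage: augment four minority observations with five synthetic points
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new = smote_like_sample(minority, n_new=5, k=2, rng=0)
```

Because every synthetic point is a convex combination of two real observations, the generated data lie on segments between existing points, which is exactly the linearity that distribution-based generators such as BAGAN-GP do not share.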