Search results: 3 items

2025.03 KCI-indexed; free for subscribing institutions, paid for individual members

eXplainable Artificial Intelligence Applied to Corporate Bankruptcy Prediction with Severely Imbalance Data

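Korean title (translated): A Study on the Application and Interpretation of a Corporate Bankruptcy Prediction Model Using Explainable Artificial Intelligence under Data Class Imbalance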

Jong Chul Yune, Dong Hyun Back

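Journal of the Society of Korea Industrial and Systems Engineering (한국산업경영시스템학회지), Vol. 48, No. 1, pp. 9-22, Society of Korea Industrial and Systems Engineering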

This study aims to improve the interpretability and transparency of forecasting results by applying an explainable AI technique to corporate default prediction models. In particular, it addresses two challenges: data imbalance and the asymmetric economic cost of forecast errors. To tackle these issues, predictive performance was analyzed using the SMOTE-ENN resampling technique and a cost-sensitive learning approach. The main findings are as follows. First, the four machine learning models used in this study (Logistic Regression, Random Forest, XGBoost, and CatBoost) produced significantly different evaluation results depending on the degree of asymmetry in forecast error costs between the imbalanced classes and on the performance metrics applied. Second, XGBoost and CatBoost showed good predictive performance across varying degrees of prediction cost asymmetry and across diverse evaluation metrics. In particular, XGBoost showed the smallest gap between the actual default rate and the default judgment rate, highlighting its robustness to class imbalance and prediction cost asymmetry. Third, SHAP analysis revealed that total assets, net income to total assets, operating income to total assets, financial liabilities to total assets, and the retained earnings ratio were the most influential factors in predicting defaults. The significance of this study lies in its comprehensive evaluation of the predictive performance of various ML models under class imbalance and cost asymmetry in forecast errors. Additionally, it demonstrates how explainable AI techniques can enhance the transparency and reliability of corporate default prediction models.
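
The cost-asymmetry idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the toy labels and scores, the function names `total_cost`, `best_threshold`, and `rate_gap`, and the cost values are all assumptions made for illustration. The sketch picks the decision threshold that minimizes total misclassification cost when a missed default (false negative) costs more than a false alarm, and then measures the gap between the predicted-default rate and the actual default rate.

```python
from typing import List, Tuple

# Toy data (hypothetical): 1 = default, 0 = non-default, with model scores.
Y_TRUE: List[int] = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
SCORES: List[float] = [0.05, 0.10, 0.15, 0.20, 0.25, 0.35, 0.40, 0.60, 0.30, 0.80]


def total_cost(y_true: List[int], scores: List[float], threshold: float,
               fn_cost: float, fp_cost: float = 1.0) -> float:
    """Sum misclassification costs; a missed default (FN) costs fn_cost."""
    cost = 0.0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 0:
            cost += fn_cost
        elif y == 0 and pred == 1:
            cost += fp_cost
    return cost


def best_threshold(y_true: List[int], scores: List[float],
                   fn_cost: float, fp_cost: float = 1.0) -> Tuple[float, float]:
    """Grid-search thresholds 0.00..1.00 and return (threshold, cost) at the minimum."""
    candidates = [i / 100 for i in range(101)]
    best = min(candidates,
               key=lambda t: total_cost(y_true, scores, t, fn_cost, fp_cost))
    return best, total_cost(y_true, scores, best, fn_cost, fp_cost)


def rate_gap(y_true: List[int], scores: List[float], threshold: float) -> float:
    """Predicted-default rate minus actual default rate at the given threshold."""
    predicted = sum(1 for s in scores if s >= threshold) / len(scores)
    actual = sum(y_true) / len(y_true)
    return predicted - actual


if __name__ == "__main__":
    for fn_cost in (1.0, 10.0):
        t, c = best_threshold(Y_TRUE, SCORES, fn_cost)
        print(fn_cost, t, c, rate_gap(Y_TRUE, SCORES, t))
```

On this toy data, raising the false-negative cost pushes the chosen threshold down, so more firms are flagged as defaults and the gap between the judged and actual default rates widens.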

KRW 4,600

2024.10 KCI-indexed; free for subscribing institutions, paid for individual members

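Korean title (translated): Application of Conditional Tabular Generative Adversarial Networks (GAN) to Resolve Class Imbalance of the Outcome Variable in Employment Big Data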

Application of Conditional Tabular Generative Adversarial Networks (GAN) for Addressing Class Imbalance in Nationwide Employment Big Data

변해원

Journal of the Korean Society of Mechanical Technology (한국기계기술학회지), Vol. 26, No. 5, pp. 911-926, Korean Society of Mechanical Technology

This study investigates using Conditional Tabular Generative Adversarial Networks (CT-GAN) to generate synthetic data for turnover prediction in large employment datasets. The effectiveness of CT-GAN is compared with Adaptive Synthetic Sampling (ADASYN), Synthetic Minority Over-sampling Technique (SMOTE), and Random Oversampling (ROS) using Logistic Regression (LR), Linear Discriminant Analysis (LDA), Random Forest (RF), and Extreme Learning Machines (ELM), evaluated with AUC and F1-scores. Results show that GAN-based techniques, especially CT-GAN, outperform traditional methods in addressing data imbalance, highlighting the need for advanced oversampling methods to improve classification accuracy in imbalanced datasets.
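
To make the baseline oversamplers named in this abstract concrete, here is a minimal sketch of random oversampling (ROS) and a simplified SMOTE-style interpolation. It is an illustration only, not the study's code and not the imbalanced-learn implementation; the function names `random_oversample` and `smote_sketch`, the default `k`, and the toy points are assumptions. CT-GAN is not shown because it requires a trained generative model.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def random_oversample(minority: List[Point], n_new: int, seed: int = 0) -> List[List[float]]:
    """ROS: duplicate randomly chosen minority samples."""
    rng = random.Random(seed)
    return [list(rng.choice(minority)) for _ in range(n_new)]


def smote_sketch(minority: List[Point], n_new: int, k: int = 3,
                 seed: int = 0) -> List[List[float]]:
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority neighbors (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


if __name__ == "__main__":
    minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 1.8), (1.2, 2.0)]
    print(random_oversample(minority, 3))
    print(smote_sketch(minority, 3))
```

ROS only repeats existing rows, while the SMOTE-style step creates new points on line segments between neighboring minority samples. Neither models the joint distribution of the features, which is the gap that GAN-based generators such as CT-GAN aim to fill.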

KRW 4,900

2022.12 KCI-indexed; free for subscribing institutions, paid for individual members

Simulated Annealing for Overcoming Data Imbalance in Mold Injection Process

Korean title (translated): Simulated Annealing for Resolving Data Imbalance in the Injection Molding Process

Dongju Lee

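Journal of the Society of Korea Industrial and Systems Engineering (한국산업경영시스템학회지), Vol. 45, No. 4, pp. 233-239, Society of Korea Industrial and Systems Engineering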

Injection molding is a process in which thermoplastic resin is heated to a fluid state, injected under pressure into the cavity of a mold, and then cooled in the mold to produce a product with the same shape as the cavity. It enables mass production and complex shapes, and factors such as resin temperature, mold temperature, injection speed, and pressure affect product quality. Data collected at the manufacturing site contain many records of good products but few records of defective products, resulting in serious data imbalance. To address this imbalance efficiently, undersampling, oversampling, and composite sampling are usually applied. In this study, two kinds of methods are applied: oversampling techniques such as random oversampling (ROS), the synthetic minority oversampling technique (SMOTE), and adaptive synthetic sampling (ADASYN), which increase the amount of minority-class data toward that of the majority class, and composite sampling, which uses both undersampling and oversampling. For composite sampling, SMOTE+ENN and SMOTE+Tomek were used. Artificial neural networks are used to predict product quality. Specifically, MLP and RNN are applied, and both require optimization of various parameters. In this study, we propose a simulated annealing (SA) technique that optimizes the choice of sampling method, the minority-class ratio used by the sampling method, and, as MLP and RNN parameters, the batch size and the number of hidden layer units. The existing sampling methods and the proposed SA method were compared using accuracy, precision, recall, and F1 score to demonstrate the superiority of the proposed method.
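
The search described in this abstract can be sketched as simulated annealing over a small discrete configuration space. This is a minimal illustration under stated assumptions, not the author's implementation: the candidate values in `SPACE`, the cooling schedule, and especially `toy_score` (a hypothetical stand-in for the validation F1 score that a real run would obtain by resampling the data and training an MLP or RNN) are invented for the example.

```python
import math
import random

# Hypothetical search space; the abstract names these dimensions, not the values.
SPACE = {
    "sampler": ["ROS", "SMOTE", "ADASYN", "SMOTE+ENN", "SMOTE+Tomek"],
    "minority_ratio": [0.2, 0.4, 0.6, 0.8, 1.0],
    "batch_size": [16, 32, 64, 128],
    "hidden_units": [8, 16, 32, 64],
}


def toy_score(cfg):
    """Hypothetical stand-in for a validation F1 score (higher is better)."""
    bonus = {"ROS": 0.00, "SMOTE": 0.03, "ADASYN": 0.02,
             "SMOTE+ENN": 0.05, "SMOTE+Tomek": 0.04}
    return (0.70 + bonus[cfg["sampler"]]
            - 0.10 * abs(cfg["minority_ratio"] - 0.8)
            - 0.02 * abs(math.log2(cfg["batch_size"]) - 5)
            - 0.02 * abs(math.log2(cfg["hidden_units"]) - 5))


def neighbor(cfg, rng):
    """Change one randomly chosen setting to a different allowed value."""
    new = dict(cfg)
    key = rng.choice(list(SPACE))
    new[key] = rng.choice([v for v in SPACE[key] if v != cfg[key]])
    return new


def simulated_annealing(score, steps=2000, t0=0.1, cooling=0.995, seed=0):
    """Maximize score over SPACE with geometric cooling; return (best_cfg, best_score)."""
    rng = random.Random(seed)
    current = {k: rng.choice(v) for k, v in SPACE.items()}
    best = current
    temp = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        delta = score(cand) - score(current)
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = cand
            if score(current) > score(best):
                best = current
        temp *= cooling
    return best, score(best)


if __name__ == "__main__":
    print(simulated_annealing(toy_score))
```

Treating the sampling method as one more dimension of the search space lets a single annealing loop tune the resampling choice and the network settings together. In practice each call to the score function is a full training run, so the number of steps would be far smaller than in this toy setting.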

KRW 4,000