This study aims to improve the interpretability and transparency of forecasting results by applying an explainable AI technique to corporate default prediction models. In particular, it addresses the challenges of data imbalance and the economic cost asymmetry of forecast errors. To tackle these issues, predictive performance was analyzed using the SMOTE-ENN resampling technique and a cost-sensitive learning approach. The main findings are as follows. First, the four machine learning models used in this study (Logistic Regression, Random Forest, XGBoost, and CatBoost) produced significantly different evaluation results depending on the degree of cost asymmetry between forecast errors across the imbalanced classes and on the performance metrics applied. Second, XGBoost and CatBoost showed good predictive performance under varying prediction cost asymmetry and across diverse evaluation metrics. In particular, XGBoost showed the smallest gap between the actual default rate and the predicted default rate, highlighting its robustness to class imbalance and prediction cost asymmetry. Third, SHAP analysis revealed that total assets, net income to total assets, operating income to total assets, financial liabilities to total assets, and the retained earnings ratio were the most influential factors in predicting default. The significance of this study lies in its comprehensive evaluation of the predictive performance of various machine learning models under class imbalance and cost asymmetry in forecast errors. It also demonstrates how explainable AI techniques can enhance the transparency and reliability of corporate default prediction models.
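The cost-sensitive learning idea underlying this abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's implementation; the snippet below only shows the generic Bayes-optimal decision rule under asymmetric error costs, where the cutoff on a model's predicted default probability shifts as the cost of a missed default (`c_fn`) grows relative to a false alarm (`c_fp`). The probabilities and the 9:1 cost ratio are illustrative assumptions.

```python
import numpy as np

def cost_threshold(c_fp, c_fn):
    # Bayes-optimal cutoff under asymmetric costs: predict "default"
    # when p * c_fn >= (1 - p) * c_fp, i.e. p >= c_fp / (c_fp + c_fn)
    return c_fp / (c_fp + c_fn)

# Hypothetical predicted default probabilities from any classifier
probs = np.array([0.05, 0.2, 0.4, 0.7])
t = cost_threshold(1.0, 9.0)       # a missed default assumed 9x as costly
preds = (probs >= t).astype(int)   # cutoff drops from 0.5 to 0.1
```

With symmetric costs the cutoff is 0.5 and only the last firm would be flagged; under the 9:1 asymmetry three of the four are flagged, which is why evaluation results vary so strongly with the assumed cost ratio.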
This study investigates the use of Conditional Tabular Generative Adversarial Networks (CT-GAN) to generate synthetic data for turnover prediction in large employment datasets. The effectiveness of CT-GAN is compared with Adaptive Synthetic Sampling (ADASYN), the Synthetic Minority Over-sampling Technique (SMOTE), and Random Oversampling (ROS), using Logistic Regression (LR), Linear Discriminant Analysis (LDA), Random Forest (RF), and Extreme Learning Machines (ELM) as classifiers, evaluated with AUC and F1-scores. Results show that GAN-based techniques, especially CT-GAN, outperform traditional methods in addressing data imbalance, highlighting the need for advanced oversampling methods to improve classification accuracy on imbalanced datasets.
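Of the baselines compared here, Random Oversampling (ROS) is the simplest: duplicate minority-class rows with replacement until the classes are balanced. The abstract does not give implementation details, so the following is a minimal numpy sketch (real studies typically use a library such as imbalanced-learn); the toy 5-vs-1 turnover labels are illustrative.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    # Random Oversampling (ROS): resample each class's rows with
    # replacement until every class matches the majority-class count.
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, n_max - idx.size, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]

X = np.arange(12).reshape(6, 2)
y = np.array([0, 0, 0, 0, 0, 1])   # 5 "stay" vs 1 "leave"
X_bal, y_bal = random_oversample(X, y)
```

ROS only repeats existing rows, which risks overfitting to the duplicated minority samples; SMOTE, ADASYN, and CT-GAN instead synthesize new points, which is the motivation for the comparison in this study.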
Handling imbalanced datasets in binary classification, especially in employment big data, is challenging, and traditional methods like oversampling and undersampling have limitations. This paper integrates TabNet and Generative Adversarial Networks (GANs) to address class imbalance: the generator creates synthetic samples for the minority class, and the discriminator, built on TabNet, ensures their authenticity. Evaluations on benchmark datasets show significant improvements in accuracy, precision, recall, and F1-score for the minority class, outperforming traditional methods. This integration offers a robust solution for imbalanced datasets in employment big data, leading to fairer and more effective predictive models.
The injection molding process is a process in which thermoplastic resin is heated into a fluid state, injected under pressure into the cavity of a mold, and then cooled in the mold to produce a product identical to the shape of the cavity. It enables mass production and complex shapes, and various factors such as resin temperature, mold temperature, injection speed, and pressure affect product quality. In data collected at manufacturing sites, there is abundant data on good products but little data on defective products, resulting in serious data imbalance. To resolve this imbalance efficiently, undersampling, oversampling, and composite sampling are usually applied. In this study, oversampling techniques such as random oversampling (ROS), the Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN), which amplify minority-class data to the size of the majority class, are applied, together with composite sampling that combines undersampling and oversampling; for composite sampling, SMOTE+ENN and SMOTE+Tomek were used. Artificial neural network techniques are used to predict product quality. Specifically, MLP and RNN models are applied, and their various parameters require optimization. We propose an SA technique that jointly optimizes the choice of sampling method, the minority-class sampling ratio, the batch size, and the number of hidden-layer units of the MLP and RNN. The existing sampling methods and the proposed SA method were compared using accuracy, precision, recall, and F1 score to demonstrate the superiority of the proposed method.
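The core of SMOTE, used both alone and inside the SMOTE+ENN and SMOTE+Tomek composites above, can be sketched compactly. This is not the paper's implementation but a minimal numpy illustration of the interpolation step: each synthetic defect sample is placed a random fraction of the way between a minority sample and one of its k nearest minority neighbours. The toy 2-D minority points are illustrative.

```python
import numpy as np

def smote_sample(X_min, n_new, k=3, seed=0):
    # Minimal SMOTE sketch: pick a minority sample, pick one of its
    # k nearest minority neighbours, and interpolate between them.
    rng = np.random.default_rng(seed)
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)             # exclude self-distance
    nn = np.argsort(d, axis=1)[:, :k]       # k nearest neighbours per row
    i = rng.integers(0, len(X_min), n_new)  # base sample indices
    j = nn[i, rng.integers(0, k, n_new)]    # one random neighbour each
    lam = rng.random((n_new, 1))            # interpolation weights in [0, 1)
    return X_min[i] + lam * (X_min[j] - X_min[i])

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                  [1.0, 1.0], [0.5, 0.5]])   # scarce "defective" samples
X_syn = smote_sample(X_min, n_new=20)
```

The composites then clean the result: ENN removes samples misclassified by their neighbours, and Tomek-link removal deletes borderline majority/minority pairs, which is why SMOTE+ENN and SMOTE+Tomek often yield cleaner class boundaries than SMOTE alone.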
The injection molding process heats thermoplastic resin into a fluid state, injects it under pressure into the cavity of a mold, and then cools it in the mold, producing a product identical to the shape of the cavity. It is a process that enables mass production and complex shapes, and various factors such as resin temperature, mold temperature, injection speed, and pressure affect product quality. In data collected at manufacturing sites, data on good products are abundant while data on defective products are scarce, causing severe data imbalance. To resolve this imbalance efficiently, undersampling, oversampling, and composite sampling are applied. This study uses oversampling techniques such as random oversampling (ROS), the Synthetic Minority Over-sampling Technique (SMOTE), and ADASYN, which amplify minority-class data to the size of the majority class, and applies data mining techniques to predict product quality.
Recently, not only traditional statistical techniques but also machine learning algorithms have been used to make more accurate bankruptcy predictions. However, the insolvency rate of companies dealing with financial institutions is very low, resulting in a data imbalance problem. Since data imbalance degrades the performance of artificial intelligence models, it must be addressed first. In addition, as artificial intelligence algorithms are increasingly used for precise decision-making, regulatory pressure to secure the transparency of AI models is growing, such as mandates to equip AI models with explanation functions. Therefore, this study presents guidelines for an eXplainable Artificial Intelligence-based corporate bankruptcy prediction methodology that applies the SMOTE technique and the LIME algorithm to solve the data imbalance and model transparency problems. The implications of this study are as follows. First, it was confirmed that SMOTE can effectively solve the data imbalance issue, a problem easily overlooked in predicting corporate bankruptcy. Second, through the LIME algorithm, the basis of the machine learning model's bankruptcy predictions was visualized, and improvement priorities were derived for financial variables that increase a company's likelihood of bankruptcy. Third, the scope of application of these algorithms in future research was expanded by confirming, through case application, the feasibility of using SMOTE and LIME.
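LIME's core idea, as used in this abstract, is a local weighted linear surrogate: perturb the instance being explained, query the black-box model, weight the perturbations by proximity, and fit a linear model whose coefficients serve as per-feature explanations. The sketch below is a generic numpy illustration of that idea, not the `lime` library or the paper's setup; the two-feature logistic black box, `scale`, and the kernel are illustrative assumptions.

```python
import numpy as np

def lime_explain(predict_fn, x, n=500, scale=0.5, seed=0):
    # Minimal LIME-style sketch: sample around x, weight by proximity,
    # fit a weighted linear surrogate, return its feature coefficients.
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n, x.size))   # local perturbations
    y = predict_fn(Z)                                  # black-box outputs
    w = np.exp(-np.linalg.norm(Z - x, axis=1) ** 2)    # proximity kernel
    A = np.hstack([Z, np.ones((n, 1))])                # add intercept column
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw.ravel(), rcond=None)
    return coef[:-1]                                   # per-feature weights

# Hypothetical black box: bankruptcy probability driven mostly by
# feature 0 (e.g. a debt ratio), weakly reduced by feature 1.
f = lambda Z: 1.0 / (1.0 + np.exp(-(3.0 * Z[:, 0] - 0.5 * Z[:, 1])))
weights = lime_explain(f, np.array([0.2, 0.1]))
```

For this firm the surrogate assigns a large positive weight to feature 0, matching the abstract's use of LIME to rank which financial variables most raise the predicted bankruptcy risk.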