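As climate change and the growing complexity of food supply chains diversify the pathways and patterns through which food hazards arise, the need for a preventive food safety management system capable of scientific prediction and preemptive intervention has come to the fore. This study analyzed the effects of climatic and environmental factors on food hazards, thereby identifying highly climate-sensitive hazards and deriving their predictability and key environmental factors. In addition, by comparing and analyzing operational cases of data-driven hazard prediction systems in Korea and abroad, it proposed development directions for the practical operation and role of the Food Hazard Prediction Center (식품위해예측센터). Through this study, we aim to provide a practical foundation and policy direction so that the Food Hazard Prediction Center can function as a strategic platform that makes food safety policy more scientific and intelligent, and can drive the transition to a prevention-centered management system.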
This study aims to improve the interpretability and transparency of forecasting results by applying an explainable AI technique to corporate default prediction models. In particular, the research addresses the challenges of data imbalance and the economic cost asymmetry of forecast errors. To tackle these issues, predictive performance was analyzed using the SMOTE-ENN imbalance sampling technique and a cost-sensitive learning approach. The main findings of the study are as follows. First, the four machine learning models used in this study (Logistic Regression, Random Forest, XGBoost, and CatBoost) produced significantly different evaluation results depending on the degree of asymmetry in forecast error costs between the imbalanced classes and on the performance metrics applied. Second, XGBoost and CatBoost showed good predictive performance across variations in prediction cost asymmetry and diverse evaluation metrics. In particular, XGBoost showed the smallest gap between the actual default rate and the default judgment rate, highlighting its robustness in handling class imbalance and prediction cost asymmetry. Third, SHAP analysis revealed that total assets, net income to total assets, operating income to total assets, financial liability to total assets, and the retained earnings ratio were the most influential factors in predicting defaults. The significance of this study lies in its comprehensive evaluation of the predictive performance of various ML models under class imbalance and cost asymmetry in forecast errors. Additionally, it demonstrates how explainable AI techniques can enhance the transparency and reliability of corporate default prediction models.
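The cost asymmetry discussed in the abstract above can be made concrete with the standard expected-cost decision rule. The sketch below is illustrative only and is not the study's implementation: the function names, the cost values, and the toy probabilities are all assumptions introduced here.

```python
def cost_optimal_threshold(cost_fn: float, cost_fp: float) -> float:
    """Probability threshold that minimizes expected misclassification cost.

    Flagging a firm as a default is cheaper in expectation when
    p * cost_fn > (1 - p) * cost_fp, i.e. when p > cost_fp / (cost_fp + cost_fn).
    """
    if cost_fn <= 0 or cost_fp <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def total_cost(y_true, p_default, threshold, cost_fn, cost_fp):
    """Sum the misclassification costs incurred at a given threshold."""
    cost = 0.0
    for y, p in zip(y_true, p_default):
        predicted = 1 if p > threshold else 0
        if y == 1 and predicted == 0:
            cost += cost_fn  # missed default (false negative)
        elif y == 0 and predicted == 1:
            cost += cost_fp  # false alarm (false positive)
    return cost


# Hypothetical labels (1 = default) and model probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 0]
p_default = [0.05, 0.10, 0.30, 0.20, 0.35, 0.15, 0.80, 0.02]

# Hypothetical costs: a missed default is five times as costly as a false alarm.
COST_FN, COST_FP = 5.0, 1.0

threshold = cost_optimal_threshold(COST_FN, COST_FP)
cost_at_half = total_cost(y_true, p_default, 0.5, COST_FN, COST_FP)
cost_at_optimum = total_cost(y_true, p_default, threshold, COST_FN, COST_FP)
```

With these toy numbers the threshold drops from 0.5 to 1/6, trading two false alarms for one avoided missed default. This illustrates why rankings of models can change as the assumed cost ratio changes.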
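This study aims to develop an artificial neural network (ANN) model that predicts concrete pavement temperature at different depths using data from Korean weather stations. Existing models based on heat balance equations require weather data for a specific site, which limits their general applicability. This study therefore uses an ANN to develop a general-purpose temperature prediction model based on weather station data, with the goal of building a model that can be applied across diverse regions and environmental conditions. The study uses hourly weather and temperature data from January 1, 2017 to December 31, 2018, with temperature data at depths of 0.05 m, 0.15 m, 0.25 m, 0.35 m, and 0.45 m as training data. The input variables are air temperature, wind speed, precipitation, humidity, sunshine duration, solar radiation, snow depth, cloud cover, and ground surface temperature. The core aim is to train the neural network on these diverse weather data and achieve higher accuracy than existing approaches. A nonlinear structure of the form O = W(I + (WI + B)) + B, extended from the conventional ANN structure O = WI + B, is applied to overcome the existing model's limited ability to capture nonlinear relationships. In addition, a linear multi-hidden-layer model and a nonlinear multi-hidden-layer model will each be developed and compared to assess the need for, and the generalization ability of, the nonlinear model. Finally, the two models will be compared using evaluation metrics such as mean squared error and mean absolute error to identify the most suitable model.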
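The two layer forms named in the abstract above, O = WI + B and O = W(I + (WI + B)) + B, can be compared with a small forward pass. This is a minimal sketch under stated assumptions: the helper names and the 2x2 toy weights are hypothetical, W is taken to be square so that WI + B has the same shape as I, and no activation function is applied because the abstract does not say where one is used.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def vec_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def linear_layer(W, I, B):
    """Conventional form: O = W I + B."""
    return vec_add(matvec(W, I), B)


def extended_layer(W, I, B):
    """Extended form: O = W (I + (W I + B)) + B.

    The inner term W I + B is added back onto the input before the
    same weights and bias are applied a second time.
    """
    inner = vec_add(I, linear_layer(W, I, B))
    return vec_add(matvec(W, inner), B)


# Hypothetical toy values chosen so the arithmetic is easy to follow.
W = [[0.5, 0.0],
     [0.0, 2.0]]
B = [1.0, -1.0]
I = [2.0, 3.0]

linear_out = linear_layer(W, I, B)      # W I + B
extended_out = extended_layer(W, I, B)  # W (I + (W I + B)) + B
```

Expanding the extended form gives W²I + WI + WB + B, so the weights enter quadratically even before any activation function is considered.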
This study develops a machine learning-based tool life prediction model using spindle power data collected from real manufacturing environments. The primary objective is to monitor tool wear and predict optimal replacement times, thereby enhancing manufacturing efficiency and product quality in smart factory settings. Accurate tool life prediction is critical for reducing downtime, minimizing costs, and maintaining consistent product standards. Six machine learning models (Random Forest, Decision Tree, Support Vector Regressor, Linear Regression, XGBoost, and LightGBM) were evaluated for their predictive performance. Among these, the Random Forest Regressor demonstrated the highest accuracy, with an R² value of 0.92, making it the most suitable for tool wear prediction. Linear Regression also provided detailed insights into the relationship between tool usage and spindle power, offering a practical alternative for precise predictions in scenarios with consistent data patterns. The results highlight the potential for real-time monitoring and predictive maintenance, significantly reducing downtime, optimizing tool usage, and improving operational efficiency. Challenges such as data variability, real-world noise, and model generalizability across diverse processes remain areas for future exploration. This work contributes to advancing smart manufacturing by integrating data-driven approaches into operational workflows and enabling sustainable, cost-effective production environments.
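For readers unfamiliar with the R² figure quoted above, the coefficient of determination can be computed as follows. This is a minimal sketch; the function name and the toy wear values are assumptions for illustration and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted tool wear values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2_example = r_squared(observed, predicted)
r2_perfect = r_squared(observed, observed)
r2_mean_only = r_squared(observed, [2.5, 2.5, 2.5, 2.5])
```

A model that always predicts the mean of the observations scores 0, and a perfect model scores 1, which gives a scale for reading a value such as 0.92.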
This study integrates TabTransformer and CTGAN for predicting job satisfaction among South Korean college graduates. TabTransformer handles complex tabular data relationships with self-attention, while CTGAN generates high-quality synthetic samples. The combined approach achieves an accuracy of 0.85, precision of 0.83, recall of 0.82, F1-score of 0.82, and an AUC of 0.88. Cross-validation confirms the model's robustness and generalizability with a mean accuracy of 0.85 and a standard deviation of 0.008. The integration of TabTransformer and CTGAN enhances predictive accuracy and model generalizability, providing valuable insights for employment policy and research.
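The classification metrics reported above are related by simple formulas, and the reported F1-score can be checked against the reported precision and recall. The sketch below assumes a binary confusion matrix; the function names and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check using the precision and recall quoted in the abstract.
reported_f1 = f1_from_precision_recall(0.83, 0.82)

# Hypothetical counts, used only to show how the metrics are derived.
metrics = classification_metrics(tp=82, fp=17, fn=18, tn=83)
```

The harmonic mean of 0.83 and 0.82 rounds to 0.82, which agrees with the F1-score stated in the abstract.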
PURPOSES : This study aimed to predict the number of future confirmed COVID-19 cases more accurately using public and transportation big data, and to suggest regional priorities for introducing major policies. METHODS : Prediction was performed using a long short-term memory (LSTM) model, which offers excellent prediction accuracy for time-series data. Random forest (RF) classification analysis was used to derive regional priorities and major influencing factors. RESULTS : Using the daily number of confirmed COVID-19 cases from January 26 to December 12, 2020, the LSTM artificial neural network predicted the daily number of confirmed cases expected in Gyeonggi Province on December 24 and 25 under different social distancing levels, with an accuracy of approximately 95.8%. In addition, the major influencing factors of COVID-19 were derived through random forest classification analysis, and based on the number of people, social distancing stage, and mask wearing, Bucheon, Yongin, and Pyeongtaek were identified as regions expected to be at high risk in the future. CONCLUSIONS : The results of this study can help predict pandemics such as COVID-19.
This study was conducted to develop a model for predicting the growth of kimchi cabbage using image data and environmental data. Kimchi cabbages of the ‘Cheongmyeong Gaual’ variety were planted three times, on July 11th, July 19th, and July 27th, at a test field located in Pyeongchang-gun, Gangwon-do (37°37′ N 128°32′ E, 510 m elevation), and data on growth, images, and environmental conditions were collected until September 12th. To select key factors for the kimchi cabbage growth prediction model, a correlation analysis was conducted using the collected growth data and meteorological data. Fresh weight was highly correlated with both growing degree days (GDD) and integrated solar radiation, with a correlation coefficient of 0.88. Additionally, fresh weight had significant correlations with the height and leaf area of kimchi cabbages, with correlation coefficients of 0.78 and 0.79, respectively. Canopy coverage was selected from the image data and GDD was selected from the environmental data based on previous research. A prediction model for the biomass, leaf count, and leaf area of kimchi cabbage was developed by combining GDD, canopy coverage, and growth data. Single-factor models, including quadratic, sigmoid, and logistic models, were created, and the sigmoid prediction model showed the best explanatory power in the evaluation. A multi-factor growth prediction model combining GDD and canopy coverage improved the coefficients of determination to 0.9, 0.95, and 0.89 for biomass, leaf count, and leaf area, respectively, compared with the single-factor prediction models. In validation of the developed model, the coefficient of determination between measured and predicted fresh weight was 0.91, with an RMSE of 134.2 g, indicating high prediction accuracy.
In the past, kimchi cabbage growth prediction was often based on meteorological or image data, which resulted in low predictive accuracy due to the inability to reflect on-site conditions or the heading up of kimchi cabbage. Combining these two prediction methods is expected to enhance the accuracy of crop yield predictions by compensating for the weaknesses of each observation method.
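The GDD input and a sigmoid-type single-factor model mentioned in the abstract above can be sketched as follows. The abstract does not give the base temperature, the functional forms, or any fitted parameters, so everything below is an assumption for illustration: GDD uses the common daily-mean method, the sigmoid is written in the familiar three-parameter logistic form, and all numeric values are hypothetical.

```python
import math


def growing_degree_days(daily_min_max, base_temp):
    """Accumulate GDD as the sum of max(0, (Tmin + Tmax) / 2 - Tbase) over days."""
    total = 0.0
    for t_min, t_max in daily_min_max:
        daily_mean = (t_min + t_max) / 2.0
        total += max(0.0, daily_mean - base_temp)
    return total


def sigmoid_growth(gdd, upper, rate, midpoint):
    """Sigmoid growth curve: upper / (1 + exp(-rate * (gdd - midpoint)))."""
    return upper / (1.0 + math.exp(-rate * (gdd - midpoint)))


# Hypothetical daily (Tmin, Tmax) pairs in degrees Celsius and base temperature.
temperatures = [(15.0, 25.0), (16.0, 26.0), (4.0, 6.0)]
BASE_TEMP = 5.0  # assumed value, not from the study

gdd_total = growing_degree_days(temperatures, BASE_TEMP)

# Hypothetical curve parameters: asymptote, growth rate, and inflection point.
UPPER, RATE, MIDPOINT = 3000.0, 0.01, 600.0
weight_at_midpoint = sigmoid_growth(MIDPOINT, UPPER, RATE, MIDPOINT)
weight_early = sigmoid_growth(gdd_total, UPPER, RATE, MIDPOINT)
```

The third day contributes nothing because its mean equals the base temperature, which shows how GDD ignores days too cool for growth. At the midpoint the curve returns half of the upper asymptote.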
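Under climate change, technologies for managing disasters and safety hazards such as abnormally high water temperatures, typhoons, floods, and droughts face continual demands for improvement. Sea surface temperature in particular is an important factor for rapidly analyzing phenomena around the Korean Peninsula such as summer red tide outbreaks and the appearance and disappearance of cold water masses along the east coast. This study therefore evaluated prediction performance by applying a statistical method and deep learning algorithms, so that sea surface temperature data can be actively used in the study of marine anomalies. The sea surface temperature data used for prediction were recorded at the Heuksando tidal station from 2018 to 2022. The conventional statistical ARIMA method, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) were used, together with an Attention LSTM that adds an attention mechanism to a Sequence-to-Sequence (s2s) structure to further improve LSTM performance. In the evaluation, the Attention LSTM model outperformed the other models, and hyperparameter tuning further improved sea surface temperature prediction performance.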
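The attention step that the abstract above adds to the sequence-to-sequence structure can be illustrated in isolation. The abstract does not specify the scoring function, so the sketch below assumes simple dot-product attention; the function names and the toy encoder and decoder states are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(encoder_states, decoder_state):
    """Dot-product attention: weight each encoder state by its similarity
    to the decoder state, then return the weights and the weighted sum."""
    scores = [sum(h * d for h, d in zip(state, decoder_state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Hypothetical hidden states for three past time steps and one decoder step.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 1.0]

weights, context = attention_context(encoder_states, decoder_state)
```

The third encoder state has the largest dot product with the decoder state and so receives the largest weight. In a forecasting model this lets the decoder emphasize the past time steps most relevant to the value being predicted.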
Artificial intelligence approaches such as machine learning and deep learning are now widely used to predict variations in water quality in various freshwater bodies. In particular, many researchers have tried to predict the occurrence of cyanobacterial blooms in inland waters, which pose a threat to human health and aquatic ecosystems. Therefore, the objectives of this study were to: 1) review studies on the application of machine learning models for predicting the occurrence of cyanobacterial blooms and their metabolites and 2) outline prospects for future studies on the prediction of cyanobacteria by machine learning models, including deep learning. In this study, a systematic literature search and review were conducted using SCOPUS, which is Elsevier’s abstract and citation database. The key results showed that deep learning models were usually used to predict cyanobacterial cells, while machine learning models focused on predicting cyanobacterial metabolites such as concentrations of microcystin, geosmin, and 2-methylisoborneol (2-MIB) in reservoirs. There was a distinct difference in the input variables used to predict cyanobacterial cells and metabolites. The application of deep learning models through the construction of big data may be encouraged to build accurate models to predict cyanobacterial metabolites.
PURPOSES : Because accidents occur frequently on icy roads at night, it would be advantageous to notify road managers and drivers about the most perilous areas. This would allow road managers to treat the icy roads with de-icing chemicals and enable drivers to be better prepared for potential hazards. Pavement temperature information is essential for identifying icy spots on the road. METHODS : With the goal of estimating nighttime pavement temperature on the National Highways in Korea using atmospheric data, the current study investigated a widely recognized forecasting method, the deep neural network (DNN). To achieve this objective, the input data for the models were gathered from the weather agency's website. The dataset comprised relative humidity, air temperature, and dew point temperature, as well as the differences in air temperature and humidity between two consecutive days. RESULTS : To assess the effectiveness of the DNN model, its estimates were compared with baseline pavement temperature data gathered through an infrared-based pavement temperature sensor installed in a highway patrol car. The results indicated that the DNN model achieved a mean absolute error (MAE) of 0.42 and a root mean square error (RMSE) of 0.62. In comparison, a conventional regression model yielded an MAE of 2.07 and an RMSE of 2.64. Thus, the DNN model outperformed the conventional regression model. CONCLUSIONS : Considering the increasing focus on preventive maintenance, these newly developed prediction models can be implemented proactively as a preventive measure against icing. This proactive approach has the potential to significantly improve traffic safety on winter roads.
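The input features and error metrics described in the abstract above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the dictionary keys, the function names, the feature order, and all sample values are hypothetical.

```python
import math


def build_features(today, yesterday):
    """Assemble one input vector: relative humidity, air temperature,
    dew point temperature, and day-over-day differences in air
    temperature and relative humidity."""
    return [
        today["rh"],
        today["air_temp"],
        today["dew_point"],
        today["air_temp"] - yesterday["air_temp"],
        today["rh"] - yesterday["rh"],
    ]


def mean_absolute_error(y_true, y_pred):
    """MAE: mean of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """RMSE: square root of the mean squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical weather records for two consecutive days.
yesterday = {"rh": 70.0, "air_temp": 1.0, "dew_point": -4.0}
today = {"rh": 80.0, "air_temp": -2.0, "dew_point": -5.0}
features = build_features(today, yesterday)

# Hypothetical measured and estimated pavement temperatures.
measured = [0.0, 0.0, 0.0, 0.0]
estimated = [1.0, -1.0, 3.0, -3.0]
mae_value = mean_absolute_error(measured, estimated)
rmse_value = root_mean_square_error(measured, estimated)
```

RMSE is never smaller than MAE and grows faster when a few errors are large, which is why the abstract's RMSE values exceed the corresponding MAE values for both models.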