Korea faces historically low fertility rates, a major social issue that affects the economy, the labor force, and national security. The government and local authorities are implementing a range of policies to address low fertility, but an effective strategy requires identifying the primary factors behind regional disparities. This study identifies the factors contributing to the regional gap in fertility rates and derives policy implications using machine learning and explainable artificial intelligence. It also examines the influence of media and public opinion on childbirth in Korea by incorporating news and online community sentiment, as well as sentiment fear indices, as independent variables. To model the relationship between regional fertility rates and these factors, the study employs four models: multiple linear regression, XGBoost, Random Forest, and Support Vector Regression. Support Vector Regression, XGBoost, and Random Forest significantly outperform linear regression, highlighting the value of machine learning models for capturing non-linear relationships among numerous variables. A factor analysis using SHAP is then conducted. The unemployment rate, regional gross domestic product per capita, women's participation in economic activities, the number of crimes committed, the average age at first marriage, and private education expenses significantly affect regional fertility rates. However, the degree to which each factor affects fertility may vary by region, suggesting the need for policies tailored to the characteristics of each region rather than policies based only on an overall ranking of factors.
In this study, machine learning models are proposed to predict the Vickers hardness of AlSi10Mg alloys fabricated by laser powder bed fusion (LPBF). A total of 113 usable datasets were collected from the literature. The hyperparameters of the machine learning models were tuned to select an accurate predictive model. The random forest regression (RFR) model outperformed support vector regression, artificial neural networks, and k-nearest neighbors. The variable importance and prediction mechanisms of the RFR model were examined using Shapley additive explanations (SHAP). Aging time had the greatest influence on the Vickers hardness, followed by solution time, solution temperature, layer thickness, scan speed, power, aging temperature, average particle size, and hatching distance. Detailed prediction mechanisms of the RFR model were analyzed using SHAP dependence plots.
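The Shapley attribution that underlies SHAP can be illustrated with a minimal sketch. The code below computes exact Shapley values for a single prediction by enumerating every feature coalition, with absent features replaced by a baseline value. It is not the procedure used in the study: `toy_model`, the instance, and the baseline are hypothetical stand-ins, and practical SHAP implementations rely on faster model-specific or sampling-based estimates with a background dataset rather than a single baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition take their value from `baseline`;
    features inside it take their value from `instance`.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical model (not from the study): one additive term plus
    # an interaction between the second and third features.
    return 100.0 + 2.0 * x[0] + x[1] * x[2]


phi = shapley_values(toy_model, [3.0, 2.0, 4.0], [0.0, 0.0, 0.0])
# phi is approximately [6, 4, 4]: the interaction term is split evenly,
# and the values sum to toy_model(instance) - toy_model(baseline) = 14.
```

Enumerating coalitions costs on the order of 2^n model evaluations, so this form is only practical for a handful of features; it is shown here because it makes the attribution rule explicit.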
The prediction of algal blooms is an important field of study in algal bloom management, and chlorophyll-a concentration (Chl-a) is commonly used to represent the status of an algal bloom. In recent years, advanced machine learning algorithms have been increasingly used for algal bloom prediction. In this study, XGBoost (XGB), an ensemble machine learning algorithm, was used to develop a model to predict Chl-a in a reservoir. Daily observations of water quality and climate data were used to train and test the model. In the first step of the study, the input data were clustered into two groups (low-value and high-value groups) based on the observed values of water temperature (TEMP), total organic carbon concentration (TOC), total nitrogen concentration (TN), and total phosphorus concentration (TP). For each of the four water quality items, two XGB models were developed using only the data in each clustered group (Model 1). The results were compared with the predictions of an XGB model developed using the entire dataset before clustering (Model 2). Model performance was evaluated using three indices, including the root mean squared error-observation standard deviation ratio (RSR). Model 1 improved performance for TEMP, TN, and TP, with RSR values of 0.503, 0.477, and 0.493, respectively, whereas the RSR of Model 2 was 0.521. In contrast, Model 2 performed better than Model 1 for TOC, where the RSR of Model 1 was 0.532. Explainable artificial intelligence (XAI) is an active field of research in machine learning. Shapley value analysis, a novel XAI algorithm, was also used for the quantitative interpretation of the performance of the XGB model developed in this study.
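The RSR index named above can be sketched as follows, assuming the commonly used definition (root mean squared error divided by the standard deviation of the observations); the abstract does not state the formula, and the function name and sample values here are hypothetical. Lower values indicate a better fit, and a model that always predicts the observed mean scores 1.

```python
import math


def rsr(observed, predicted):
    """RMSE divided by the standard deviation of the observations."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_obs = sum(observed) / len(observed)
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sq_dev = sum((o - mean_obs) ** 2 for o in observed)
    if sq_dev == 0:
        raise ValueError("observations have zero variance")
    # The 1/n factors in the RMSE and the standard deviation cancel out.
    return math.sqrt(sq_err / sq_dev)


# Hypothetical Chl-a observations and two candidate prediction series.
obs = [1.0, 2.0, 3.0, 4.0]
perfect = rsr(obs, [1.0, 2.0, 3.0, 4.0])       # 0.0 for a perfect fit
mean_only = rsr(obs, [2.5, 2.5, 2.5, 2.5])     # 1.0 when predicting the mean
```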
Recently, machine learning algorithms as well as traditional statistical techniques have been used to make more accurate bankruptcy predictions. However, the insolvency rate of companies dealing with financial institutions is very low, which results in a data imbalance problem. Because data imbalance degrades the performance of artificial intelligence models, the imbalance must be addressed before modeling. In addition, as artificial intelligence algorithms become more advanced for precise decision-making, regulatory pressure to secure the transparency of artificial intelligence models is gradually increasing, for example through requirements that such models include explanation functions. Therefore, this study aims to present guidelines for an eXplainable Artificial Intelligence-based corporate bankruptcy prediction methodology that applies the SMOTE technique and the LIME algorithm to address the data imbalance problem and the model transparency problem in predicting corporate bankruptcy. The implications of this study are as follows. First, it was confirmed that SMOTE can effectively resolve data imbalance, a problem that is easily overlooked in corporate bankruptcy prediction. Second, the LIME algorithm was used to visualize the basis of the machine learning model's bankruptcy predictions and to derive improvement priorities for the financial variables that increase a company's likelihood of bankruptcy. Third, the scope of application of these algorithms in future research was expanded by confirming, through a case application, the feasibility of using SMOTE and LIME.
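The oversampling idea behind SMOTE can be sketched in a few lines: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The code below is a simplified illustration, not the study's pipeline; the function name, parameters, and data are hypothetical, and a maintained library implementation would normally be used in practice.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != idx),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Hypothetical minority class (e.g. bankrupt firms) with two financial ratios.
bankrupt = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.0]]
new_samples = smote(bankrupt, n_synthetic=5)
```

Because every synthetic point lies between two real minority samples, the new data stays inside the region already occupied by the minority class instead of merely duplicating existing rows.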
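Amid growing public awareness of the dangers of malicious posts and comments, internet portal sites and social networking services are introducing functions that filter such posts and comments using AI. Because the specific criteria by which the AI filters content have not been disclosed, user backlash continues. The spread of AI filtering inevitably restricts users' freedom of expression and right to know. In particular, AI filtering differs from filtering by humans in that it is difficult to convey to people, in a way they can understand, the grounds on which a filtering result was reached. Explainable artificial intelligence (XAI) is a technology that provides users with explanations of a system's individual decisions and helps them understand the overall strengths and weaknesses of an AI system; research on it is being led by the U.S. Defense Advanced Research Projects Agency (DARPA). XAI is expected to become a means of earning users' trust in various fields and of building consensus for social acceptance. The EU General Data Protection Regulation (GDPR) contains provisions that serve as a basis for data subjects to demand an explanation of how an AI algorithm reaches its results. By providing for a right to explanation and a right to restrict automated decision-making, it has established a regulatory mechanism to guarantee the fundamental rights of data subjects. As a result, efforts to develop and design XAI have become an urgent task. Some critics argue that the GDPR's AI regulatory mechanism may hinder technological development. However, because errors in AI algorithms are always possible, there is a strong need to explain the grounds for decisions in order to secure the reliability of AI filtering, and legislation that allows people to demand the grounds for the results produced by AI algorithms can therefore be considered essential.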