As climate change and the growing complexity of food supply chains diversify the pathways and patterns through which food hazards occur, the need for a preventive food safety management system capable of scientific prediction and preemptive intervention has emerged. This study analyzed the effects of climatic and environmental factors on food hazards, identifying hazards with high climate sensitivity and deriving their predictability and key environmental variables. In addition, by comparing and analyzing the operation of domestic and international data-driven hazard prediction systems, we propose development directions for the practical operation and role of a food hazard prediction center. Through this study, we aim to provide an effective foundation and policy direction so that the food hazard prediction center can function as a strategic platform leading the scientification and intelligence of food safety policy and can drive the transition to a prevention-centered management system.
Structures compromised by a seismic event may be susceptible to aftershocks or subsequent events within a particular duration. Considering that the shape ratios of sections, such as the column shape ratio (CSR) and wall shape ratio (WSR), significantly influence the behavior of reinforced concrete (RC) piloti structures, it is essential to determine the most appropriate methodology for these structures. A seismic evaluation of piloti structures was conducted to measure seismic performance based on section shape ratios and inter-story drift ratio (IDR) standards. Diverse machine-learning models were trained and evaluated using the dataset, and the optimal model was chosen based on the performance of each model. The optimal model was employed to predict seismic performance by adjusting section shape ratios and output parameters, and a recommended approach for section shape ratios was presented. The optimal section shape ratios range from 1.0 to 1.5 for the CSR and from 1.5 to 3.33 for the WSR, regardless of the inter-story drift ratios.
This study uses data analysis to evaluate the impact of the European Union (EU)'s Digital Services Act (DSA) and the Brussels effect on the X (formerly Twitter) platform. The DSA demands stronger regulation of digital platforms and transparency in content moderation, and X has been improving its handling of illegal content and hate speech in response. Based on the DSA transparency reports published since 2023, this study analyzes content moderation efficiency by country and the performance of automated and manual review systems. To this end, statistical analysis using Python was applied after data collection and preprocessing. We also examine the enforcement differences that have arisen across European countries and the resulting problems, and present policy implications regarding the potential global diffusion of digital regulation.
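The kind of country-level comparison described above can be sketched with a small pandas aggregation. The table below is a toy stand-in: all figures, country codes, and column names are made-up assumptions, not values from the DSA transparency reports.

```python
import pandas as pd

# Illustrative toy records shaped like entries one might derive from
# transparency reports (all figures here are invented).
records = pd.DataFrame({
    "country": ["DE", "DE", "FR", "FR"],
    "review_type": ["automated", "manual", "automated", "manual"],
    "actions": [900, 100, 600, 400],
})

# Share of moderation actions handled automatically, per country.
totals = records.groupby("country")["actions"].sum()
automated = (records[records["review_type"] == "automated"]
             .set_index("country")["actions"])
automation_rate = (automated / totals).rename("automation_rate")

print(automation_rate)  # DE -> 0.9, FR -> 0.6
```

The same groupby pattern extends directly to comparing automated versus manual review performance across many countries and reporting periods.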
Handling imbalanced datasets in binary classification, especially in employment big data, is challenging. Traditional methods like oversampling and undersampling have limitations. This paper integrates TabNet and Generative Adversarial Networks (GANs) to address class imbalance. The generator creates synthetic samples for the minority class, and the discriminator, using TabNet, ensures authenticity. Evaluations on benchmark datasets show significant improvements in accuracy, precision, recall, and F1-score for the minority class, outperforming traditional methods. This integration offers a robust solution for imbalanced datasets in employment big data, leading to fairer and more effective predictive models.
To predict the process window of laser powder bed fusion (LPBF) for printing metallic components, the volumetric energy density (VED) has been widely calculated to control process parameters. However, because VED assumes that the process parameters contribute equally to the heat input, it still has limitations in predicting the process window of LPBF-processed materials. In this study, an explainable machine learning (xML) approach was adopted to predict and understand the contribution of each process parameter to defect evolution in Ti alloys during the LPBF process. Various ML models were trained, and the Shapley additive explanations method was adopted to quantify the importance of each process parameter. This study can offer effective guidelines for fine-tuning process parameters to fabricate high-quality products using LPBF.
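The Shapley attribution underlying SHAP can be illustrated exactly on a tiny surrogate. The sketch below enumerates all feature coalitions for a made-up porosity function of three LPBF parameters; the function, parameter names, and settings are illustrative assumptions, not the paper's trained model.

```python
import itertools
import math

# Toy surrogate: porosity as an invented function of LPBF parameters
# (laser power P, scan speed v, hatch spacing h) -- illustrative only.
def porosity(P, v, h):
    return 0.01 * v / P + 2.0 * h

baseline = {"P": 200.0, "v": 800.0, "h": 0.10}   # reference setting
point    = {"P": 300.0, "v": 1200.0, "h": 0.12}  # setting to explain

features = list(point)
n = len(features)

def value(subset):
    """Model output when only `subset` features take the explained values."""
    assign = {k: (point[k] if k in subset else baseline[k]) for k in features}
    return porosity(assign["P"], assign["v"], assign["h"])

# Exact Shapley values: average marginal contribution over all coalitions.
shap = {}
for i in features:
    others = [k for k in features if k != i]
    total = 0.0
    for r in range(len(others) + 1):
        for S in itertools.combinations(others, r):
            w = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                 / math.factorial(n))
            total += w * (value(set(S) | {i}) - value(set(S)))
    shap[i] = total

# Efficiency property: contributions sum to f(point) - f(baseline).
print(shap, sum(shap.values()))
```

In practice the SHAP library approximates these values efficiently for trained ML models; the brute-force enumeration above just makes the definition concrete.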
Truck no-show behavior has posed significant disruptions to the planning and execution of port operations. By delving into the key factors that contribute to truck appointment no-shows and proactively predicting such behavior, it becomes possible to make preemptive adjustments to port operation plans, thereby enhancing overall operational efficiency. Considering the data imbalance and the impact of the accuracy of each decision tree on the performance of the random forest model, a model based on the Borderline Synthetic Minority Over-Sampling Technique and Weighted Random Forest (BSMOTE-WRF) is proposed to predict truck appointment no-shows and to explore the relationship between no-shows and factors such as weather conditions, appointment time slot, the number of truck appointments, and traffic conditions. To illustrate the effectiveness of the proposed model, experiments were conducted on the available dataset from the Tianjin Port Second Container Terminal. The prediction accuracy of the BSMOTE-WRF model is shown to improve by 4-5% over logistic regression, random forest, and support vector machines. The importance ranking of factors affecting truck no-shows indicates that (1) the number of truck appointments during specific time slots has the highest impact on no-show behavior, while the congestion coefficient has the second-highest impact and its influence is also significant; (2) compared to the number of truck appointments and the congestion coefficient, the impact of severe weather on no-show behavior is relatively low, but it still has some influence; and (3) although the impact of appointment time slots is lower than that of the other factors, the influence of specific time slots on no-show behavior should not be overlooked. The BSMOTE-WRF model effectively analyzes the influencing factors and predicts truck no-show behavior in appointment-based systems.
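A minimal sketch of the BSMOTE-WRF idea, on synthetic data: a hand-rolled borderline oversampler stands in for the full Borderline-SMOTE algorithm, and scikit-learn's `class_weight="balanced"` stands in for the paper's per-tree weighting. The dataset and feature layout are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Toy imbalanced dataset standing in for appointment records
# (features might encode weather, time slot, bookings, congestion).
X_maj = rng.normal(0.0, 1.0, size=(300, 4))   # shows (class 0)
X_min = rng.normal(1.5, 1.0, size=(30, 4))    # no-shows (class 1)
X = np.vstack([X_maj, X_min])
y = np.array([0] * 300 + [1] * 30)

# Borderline-SMOTE idea: oversample only minority points whose
# neighborhood is mixed ("in danger"), interpolating toward minority peers.
k = 5
_, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X_min)
maj_frac = (y[idx[:, 1:]] == 0).mean(axis=1)  # drop self-neighbor (col 0)
danger = X_min[(maj_frac >= 0.5) & (maj_frac < 1.0)]

_, min_idx = (NearestNeighbors(n_neighbors=min(k, len(X_min)))
              .fit(X_min).kneighbors(danger))
synthetic = []
for i, row in enumerate(danger):
    for j in min_idx[i]:
        gap = rng.random()
        synthetic.append(row + gap * (X_min[j] - row))

if synthetic:
    X_res = np.vstack([X, np.array(synthetic)])
    y_res = np.concatenate([y, np.ones(len(synthetic), dtype=int)])
else:
    X_res, y_res = X, y

# Weighted random forest: class weights compensate residual imbalance.
clf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                             random_state=0).fit(X_res, y_res)
print(clf.score(X_res, y_res))
```

The imbalanced-learn library's `BorderlineSMOTE` provides a production-grade version of the oversampling step shown here.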
The importance of Structural Health Monitoring (SHM) in industry is increasing because various loads, such as earthquakes and wind, have a significant impact on the performance of structures and equipment. Estimating responses is crucial for the effective health management of these assets. However, using numerous sensors in facilities and equipment for response estimation poses economic challenges. Additionally, responses may be required at locations where sensors cannot be attached. Digital twin technology has garnered significant attention in industry as a means to address these challenges. This paper constructs a digital twin system utilizing a Long Short-Term Memory (LSTM) model to estimate responses in a pipe system under simultaneous seismic and arbitrary loads. The performance of the data-driven digital twin system was verified through a comparative analysis with experimental data, demonstrating that the constructed system successfully estimated the responses.
The construction industry stands out for its higher incidence of accidents in comparison to other sectors. A causal analysis of these accidents is necessary for effective prevention. In this study, we propose a data-driven causal analysis to find significant factors in fatal construction accidents. We collected 14,318 cases of structured and text data on construction accidents from the Construction Safety Management Integrated Information (CSI) system. For the variables in the collected dataset, we first analyze their patterns and correlations with fatal construction accidents through statistical analysis. In addition, machine learning algorithms are employed to develop a classification model for fatal accidents. The integration of SHAP (SHapley Additive exPlanations) allows for the identification of the root causes driving fatal incidents. The results reveal the significant factors and keywords that notably influence fatal accidents in construction contexts.
In recent decades, data has emerged as a core element of corporate management. Many organizations use data to make strategic decisions and respond actively to market changes. Against this background, this study examines the operation of data-driven decision-making organizations and the factors that influence them. In an era of continuous digital transformation, data-driven decision-making plays a critical role in improving organizational performance. However, research on the antecedents of data-driven decision-making organizations and on how data-driven decisions are actually made within firms remains scarce. This study hypothesizes that the degree of a firm's value-chain digitalization significantly affects the establishment of a data-driven decision-making organization, and tests this hypothesis with survey data from 1,059 employees of Korean firms. In addition, considering that talent with digital competencies, including data analysis skills, can serve as an important environmental condition for data-driven decision-making organizations, we hypothesize and statistically test the moderating effect of digital talent readiness on the relationship between value-chain digitalization and the establishment of a data-driven decision-making organization. The results broaden the understanding of how data-driven decision-making organizations are formed and operated, and offer useful insights into the process by which firms use data effectively in decision-making. From a practical perspective, the findings are expected to provide important implications for firms developing and implementing their data strategies.
PURPOSES : The objective of this study is to develop a data-driven pavement condition index that considers the traffic and climatic characteristics of Incheon city. METHODS : The Incheon Pavement Condition Index (IPCI) was proposed using a weighted-sum concept with standardization and the coefficient of variation for measured pavement performance data, such as crack rate, rut depth, and International Roughness Index (IRI). A correlation study with the National Highway Pavement Condition Index (NHPCI) and the Seoul Pavement Condition Index (SPI) was conducted to validate the accuracy of the IPCI. RESULTS : The equation for determining the IPCI was developed using standardization and the coefficient of variation for the crack rate, rut depth, and IRI collected in the field. The statistical analysis showed that the IPCI weight factor for the crack rate was twice as high as those for the rut depth and IRI. It was also observed that the IPCI correlated closely with the NHPCI and SPI, albeit with some degree of scatter. This correlation study indicates that the existing pavement condition indices do not consider the asymmetry of the original measured data. CONCLUSIONS : The proposed pavement condition index provides a value that reflects the characteristics of the original raw data measured in the field. The developed index can be used extensively to determine the timing and method of pavement repair, and to establish pavement maintenance and rehabilitation strategies in Incheon.
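The weighted-sum construction with standardization and coefficient-of-variation weights can be sketched as follows. The measurements are invented and the weighting scheme is a generic assumption, not the calibrated IPCI equation.

```python
import numpy as np

# Toy section-level measurements (made-up values): crack rate (%),
# rut depth (mm), IRI (m/km).
data = {
    "crack_rate": np.array([2.0, 8.0, 15.0, 4.0, 11.0]),
    "rut_depth":  np.array([3.0, 6.0, 9.0, 4.0, 7.0]),
    "iri":        np.array([1.5, 2.5, 4.0, 2.0, 3.0]),
}

# Coefficient-of-variation weights: more variable indicators get more weight.
cv = {k: v.std(ddof=1) / v.mean() for k, v in data.items()}
total_cv = sum(cv.values())
weights = {k: c / total_cv for k, c in cv.items()}

# Standardize each indicator (z-scores), then combine by weighted sum.
z = {k: (v - v.mean()) / v.std(ddof=1) for k, v in data.items()}
index = sum(weights[k] * z[k] for k in data)

print({k: round(w, 3) for k, w in weights.items()})
print(np.round(index, 3))  # one composite score per section
```

Standardizing before weighting keeps indicators with different units (percent, mm, m/km) on a common scale, which is the core of the weighted-sum concept described above.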
The purpose of this study was to analyze six English as a Foreign Language (EFL) learners' trajectories of discriminating near-synonyms in a data-driven learning task. Since learners find it considerably difficult to learn the subtle meaning differences of near-synonyms, corpus-based data-driven learning may provide an opportunity for them to tackle these difficulties. The study materials guided the learners to identify the differences between four pairs of near-synonyms, categorize the concordance lines based on their findings, and generalize the findings. The six participants showed notably different trajectories of discriminating near-synonyms. The qualitative analysis of the trajectories revealed a tendency for the intermediate learners to focus on the meanings and find the correct answer without knowing the core meaning, while the advanced learners moved further to attend to structural differences and sometimes tested their previous knowledge against the concordance data. This study implies the need for careful guidance, collaborative group work, and strategy teaching in data-driven learning tasks.
This paper proposes data-driven techniques to forecast the time point of water management for a water reservoir without measuring manganese concentration, using empirical data from Juam Dam for the years 2015 and 2016. When the manganese concentration near the water surface exceeds the criterion of 0.3 mg/L, water management measures should be taken. However, it is economically inefficient to measure manganese concentration frequently and regularly. Water turnover driven by differences in water temperature makes manganese on the reservoir floor rise to the surface and increases the manganese concentration near the surface. Manganese concentration and water temperature from the surface down to a depth of 20 m, at 5 m intervals, were time-plotted and exploratorily analyzed to show that water turnover could be used in place of manganese measurements to determine the time point of water management. Two models for forecasting the time point of water turnover were proposed and compared: a regression model of CR20, the consistency ratio of water temperature between the surface and the depth of 20 m, on lagged variables of CR20 and the first lag of maximum temperature; and a Box-Jenkins model of CR20 as ARIMA(2, 1, 2).
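The lagged-variable regression model can be sketched on a synthetic series. The data-generating process, lag order, and holdout split below are illustrative assumptions; the ARIMA(2, 1, 2) alternative would typically be fit with a time-series library such as statsmodels.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic daily series standing in for CR20 (consistency ratio of water
# temperature between the surface and 20 m) and daily max air temperature.
n = 200
max_temp = 20 + 8 * np.sin(np.arange(n) * 2 * np.pi / 365) + rng.normal(0, 1, n)
cr20 = np.empty(n)
cr20[0] = 0.9
for t in range(1, n):
    cr20[t] = 0.8 * cr20[t - 1] + 0.005 * max_temp[t - 1] + rng.normal(0, 0.02)

# Regress CR20 on its own lags and the first lag of max temperature,
# mirroring the lagged-variable model described above.
p = 2  # number of CR20 lags (illustrative choice)
X = np.array([np.r_[cr20[t - p:t], max_temp[t - 1]] for t in range(p, n)])
y = cr20[p:]

model = LinearRegression().fit(X[:-20], y[:-20])   # train on all but last 20 days
forecast = model.predict(X[-20:])                  # one-step-ahead forecasts
print("held-out R^2:", round(model.score(X[-20:], y[-20:]), 3))
```

A sustained drop in the forecast CR20 would flag an approaching water turnover, and hence the time point at which water management measures are needed.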
Recent headlines predict that artificial intelligence, machine learning, predictive analytics and other aspects of cognitive computing will be the next fundamental drivers of economic growth (Brynjolfsson & McAfee, 2017). We have evidenced several success stories in recent years, such as those of Google and Facebook, wherein novel business opportunities have evolved based on data-driven business innovations. Our directional poll among companies, however, reveals that at present only a few companies have the keys to successfully harness these possibilities. Even fewer companies seem to be successful in running profitable businesses based on data-driven business innovations. A company's capability to create data-driven business relates to its overall capability to innovate. Therefore, this research builds a conceptual model of barriers to data-driven business innovations and proposes that a deeper understanding of innovation barriers can help companies come closer to the possibilities that data-driven business innovations can enable. As Hadjimanolis (2003) suggests, the first step in overcoming innovation barriers is to understand them. Consequently, we identify technology-related, organizational, environmental, and people-related (i.e., attitudinal) barriers and examine how these relate to a company's capability to create data-driven business innovations. Specifically, technology-related barriers may originate from the company's existing practices and predominant technological standards. Organizational barriers reflect the company's inability to integrate new patterns of behavior into established routines and practices (Sheth & Ram, 1987). Environmental barriers refer to various types of hampering factors external to a company; because they are caused by the company's external environment, the company has relatively limited possibilities to influence and overcome them.
Attitudinal barriers are people-related perceptual barriers that can be studied at the individual level and, if necessary, separately for managers and employees (Hadjimanolis, 2003). Future research will seek to build an empirical model to examine how these different barriers relate to a company's capability to create business based on data-driven innovations.