This study aimed to improve the accuracy of road pavement design by comparing and analyzing various statistical and machine-learning techniques for predicting asphalt layer thickness, focusing on regional roads in Pakistan. The explanatory variables selected for this study included the annual average daily traffic (AADT), subbase thickness, and subgrade California bearing ratio (CBR) values from six cities in Pakistan. The statistical prediction models used were multiple linear regression (MLR), support vector regression (SVR), random forest, and XGBoost. The performance of each model was evaluated using the mean absolute percentage error (MAPE) and root-mean-square error (RMSE). The analysis results indicated that the AADT was the most influential variable affecting the asphalt layer thickness. Among the models, the MLR demonstrated the best predictive performance. While XGBoost had a relatively strong performance among the machine-learning techniques, the traditional statistical model, MLR, still outperformed it in certain regions. This study emphasized the need for customized pavement designs that reflect the traffic and environmental conditions specific to regional roads in Pakistan. This finding suggests that future research should incorporate additional variables and data for a more in-depth analysis.
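The two evaluation metrics named above can be sketched as follows. This is a minimal illustration using the standard textbook definitions of MAPE and RMSE; the function names and the thickness values are hypothetical and are not taken from the study.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (division by the actual value).
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def rmse(actual, predicted):
    """Root-mean-square error, in the same unit as the inputs."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))

# Hypothetical asphalt layer thicknesses (mm), for illustration only.
observed = [100.0, 120.0, 150.0, 200.0]
predicted = [110.0, 114.0, 150.0, 190.0]

print(f"MAPE = {mape(observed, predicted):.2f} %")
print(f"RMSE = {rmse(observed, predicted):.2f} mm")
```

MAPE is scale-free, so it allows comparison across cities with different typical thicknesses, whereas RMSE penalizes large individual errors more heavily.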
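If seasonal outlooks of major crop pest outbreaks could be issued one to two months or more in advance, farmers could make pest-management decisions more efficiently. In this study, the Moving Window Regression (MWR) technique was used to identify climate phenomena that have statistically significant teleconnections with pest occurrence in Korea. Because the occurrence and immigration of the brown planthopper unfold continuously across many regions over a long period, they are likely to be statistically associated with climate phenomena of a similar spatiotemporal scale, so the brown planthopper was chosen as the target pest. The response variable of the MWR analysis was the area of brown planthopper occurrence in Korea from 1983 to 2014, and the explanatory variables were 10 climate variables produced by 10 climate models, extracted by forecast lead time. In each finally selected MWR model, the climate variable for a specific period and region showed a statistically significant correlation with the annual area of brown planthopper occurrence. In conclusion, the MWR statistical models developed in this study are expected to enable seasonal outlooks of the brown planthopper that support preemptive responses according to the risk of outbreak in Korea.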
This research proposes a complementary methodology that integrates hypothesis testing and confidence interval models to identify both statistical difference and practical equivalence. The models developed in this study can be used in quality improvement processes such as the QC story 15 steps. Because the expressions of CI4LSD (Confidence Interval for Least Significant Difference) and CI4TOST (Confidence Interval for Two One-Sided Tests) are simple, quality practitioners can handle them efficiently. CI4TOST models can be applied as a complement when CI4LSD models are influenced by sample size and precision.
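The distinction between statistical difference and practical equivalence can be sketched with confidence intervals. The code below is a generic large-sample (z-based) illustration of the usual interval rules, not the paper's CI4LSD or CI4TOST expressions: a difference is declared when the 100(1 - alpha)% interval for the mean difference excludes zero, and equivalence is declared when the 100(1 - 2*alpha)% interval lies inside an equivalence margin. The function names, the margin, and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def diff_ci(x, y, level):
    """Large-sample (z-based) confidence interval for mean(x) - mean(y).

    Assumption: a normal quantile is used for simplicity; with small
    samples a t quantile would normally replace it.
    """
    d = mean(x) - mean(y)
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se

def is_different(x, y, alpha=0.05):
    """Statistical difference: the (1 - alpha) interval excludes zero."""
    lo, hi = diff_ci(x, y, 1 - alpha)
    return lo > 0 or hi < 0

def is_equivalent(x, y, margin, alpha=0.05):
    """Practical equivalence (TOST): the (1 - 2*alpha) interval lies
    entirely inside (-margin, +margin)."""
    lo, hi = diff_ci(x, y, 1 - 2 * alpha)
    return -margin < lo and hi < margin

# Hypothetical measurements from two processes, for illustration only.
line_a = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
line_b = [10.1, 10.0, 9.9, 10.2, 10.0, 9.8]
shifted = [v + 1.0 for v in line_a]

print(is_different(line_a, line_b), is_equivalent(line_a, line_b, margin=0.5))
print(is_different(line_a, shifted), is_equivalent(line_a, shifted, margin=0.5))
```

The two rules answer different questions, which is why the equivalence check is useful as a complement: with a very large sample, a negligible difference can still be flagged as statistically significant.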
This study is designed to prove the role and effect of ethics codes in professional societies, especially for scientists and engineers working in R&D project groups. The hypotheses of influence on ethical conduct within the sample groups are tested and ana
This research interprets the principles of sampling error design for quality statistics models such as hypothesis tests, interval estimation, control charts, and acceptance sampling. First, it discusses the design of the significance level according to the use of the hypothesis test and presents two ways of interpreting significance, those of Neyman-Pearson and Fisher. Second, it proposes the design of the confidence level for interval estimation using the Bayesian confidence set, the frequentist confidence set, and the fiducial interval. Third, it addresses the design of type I and type II errors for control charts, considering both productivity and customer claims. Finally, it considers the design of producer's risk with the operating characteristic curve, screening, and switching rules for purchasing and subcontracting.
Dependent models in quality statistics are classified as serially autocorrelated models, multivariate models, and dependent sample models. Of these, the dependent sample model is the most efficient in the time and cost needed to obtain samples. This paper proposes implementing parametric and nonparametric models in a production system depending on the demand pattern. Nonparametric models include distribution-free and asymptotically distribution-free techniques. Quality statistical models are classified along two dimensions: the number of dependent samples and the type of data. The type of data consists of nominal, ordinal, interval, and ratio data. The number of dependent samples is divided into 2 samples and more than 3 samples.
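Using an experimental cross population of the Yorkshire and Berkshire breeds, we investigated modes of inheritance related to the expression characteristics of quantitative trait loci (QTL). A total of 512 F offspring were produced from 65 mating combinations among F, and the recorded phenotypes were average daily gain (ADG), average backfat thickness (ABF), tenth-rib backfat thickness (TRF), loin eye area (LEA), and last-rib backfat thickness (LRF). Genotypes of individuals across three generations for 125 genetic markers (microsatellites) were analyzed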
Weather is the most influential factor for crop cultivation. Weather information for cultivated areas is necessary for forecasting the growth and production of agricultural crops. However, meteorological observations in cultivated areas are limited because weather equipment is not installed there. This study tested methods of predicting the daily mean temperature in onion fields using geostatistical models. Three models were considered: the inverse distance weighting method, a generalized additive model, and a Bayesian spatial linear model. Data were collected from the AWS (automatic weather system), ASOS (automated synoptic observing system), and an agricultural weather station between 2013 and 2016. To evaluate the prediction performance, data from AWS and ASOS were used as the modeling data, and data from the agricultural weather station were used as the validation data. The Bayesian spatial linear model performed better than the other models. Consequently, high-resolution maps of the daily mean temperature of Jeonnam were generated using all observed weather information.
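Of the three models, inverse distance weighting is the simplest to illustrate. The sketch below uses the standard textbook form, in which each station's weight is the reciprocal of its distance to the target raised to a power. The function name, the power of 2, the planar coordinates, and the station values are assumptions for illustration and do not come from the study.

```python
import math

def idw_predict(stations, target, power=2.0):
    """Inverse distance weighting prediction at a target point.

    stations: list of (x, y, value) tuples in planar coordinates.
    target:   (x, y) tuple in the same coordinate system.
    power:    distance exponent (2.0 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            # The target coincides with a station: return its observation.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (x in km, y in km, daily mean temperature in deg C).
stations = [(0.0, 0.0, 10.0), (3.0, 0.0, 20.0)]
print(idw_predict(stations, (1.0, 0.0)))
```

Unlike the regression-based models, this estimator uses only distances between sites, so it cannot account for covariates such as altitude.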
Agricultural meteorological information is an important resource that affects farmers' income, food security, and agricultural conditions. Such data are therefore used in various fields that are responsible for planning, enforcing, and evaluating agricultural policies. The meteorological information obtained from automatic weather observation systems operated by rural development agencies contains missing values owing to temporary mechanical or communication failures. Missing values are known to reduce the reliability and validity of a model. In this study, a hierarchical Bayesian spatio-temporal model is used to suggest replacements for missing values because the meteorological information is spatio-temporally correlated. The prior distribution is very important in the Bayesian approach, and the trace plot showed that the spatial decay parameter did not converge. A suitable spatial decay parameter was therefore estimated on the basis of the root-mean-square error (RMSE), computed from the differences between the predicted and observed values. Latitude, longitude, and altitude were considered as covariates. The estimated spatial decay parameters were 0.041 for the spatio-temporal model with latitude and longitude and 0.039 for the model with latitude, longitude, and altitude. The posterior distributions were stable after the spatial decay parameter was fixed. The RMSE, mean absolute error (MAE), mean absolute percentage error (MAPE), and bias were calculated for model validation. Finally, the missing values were generated using the independent Gaussian process model.
The relationship between urban spatial structures and GHG-AP integrated emissions was investigated by statistically analyzing data from the 25 administrative districts of Seoul. Urban spatial structures, for which data were obtained from the Seoul statistics yearbook, were classified into five categories: city development, residence, environment, traffic, and economy. They were further classified into 10 components: local area, population, number of households, residential area, forest area, park area, registered vehicles, road area, number of businesses, and total local taxes. GHG-AP integrated emissions were estimated based on the IPCC (Intergovernmental Panel on Climate Change) 2006 guidelines, the guideline for government greenhouse inventories, EPA AP-42 (Compilation of Air Pollutant Emission Factors), and preliminary studies. The statistical analysis indicated that GHG-AP integrated emissions were significantly correlated with urban spatial structures. The correlation analysis showed highly significant results for registered vehicles with GHG (r=0.803, p<0.01), forest area with AP (r=0.996, p<0.01), and park area with AP (r=0.889, p<0.01). The factor analysis identified three groups as the governing factors controlling GHG-AP emissions: the city and traffic categories, the economy category, and the environment category. The multiple regression analysis also indicated that the traffic and environment categories had the greatest influence on GHG-AP emissions. The 25 administrative districts of Seoul were clustered into six groups, each with similar characteristics of urban spatial structures and GHG-AP integrated emissions.
Three meteor-statistical forecasting models - the transfer function model, the time-series autoregressive model and the neural networks model - were tested to develop a daily forecasting model for Jejudo, where the need and demand for wind power forecasting have increased. All the meteorological observation sites in Jejudo were classified into 6 groups using a cluster analysis. Four pairs of observation sites among them, each pair having a strong wind speed correlation within the same meteorological group, were chosen for a model test. In the development of the wind speed forecasting model for Jejudo, it was confirmed that the goodness of fit of the forecast is enhanced by using not only the wind dataset at the objective site itself but also a wind dataset from the nearest site that has a strong wind speed correlation within the same group. The transfer function model and the neural network model were also confirmed to offer reliable predictions, with a similar level of goodness of fit.
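In this study, we propose a parameter regionalization technique that couples two multivariate statistical methods, principal component analysis and hierarchical cluster analysis, as a way to apply a semi-distributed rainfall-runoff model to ungauged basins. Seven basin characteristic factors (basin area, mean elevation, mean slope, forest area ratio, saturated soil moisture content, field capacity, and permanent wilting point) were extracted for 109 mid-sized basins, and the principal component analysis showed that the first and second components explain 82.11% of the total data. The first component is basin location, and the second component is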
The NO2 concentration characteristics of Busan metropolitan city were analysed statistically using hourly NO2 concentration data (1998~2000) collected from the city's air quality monitoring sites.
Four representative regions were selected among the air quality monitoring sites of the Ministry of Environment. Concentration data for NO2 and 5 air pollutants, together with data collected at AWS, were used.
Both a stepwise multiple regression model and an ARIMA model were adopted to predict NO2 concentrations, and their results were compared with the observed concentrations.
While the ARIMA model was useful for predicting the daily variation of the concentration, it was not satisfactory for predicting either rapid or seasonal variation.
The multiple regression model predicted NO2 concentrations better than the ARIMA model.
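This study has two aims: first, to introduce the efficiency of multilevel models in various empirical analyses, and second, to introduce a new statistic for measuring the predicted value of the upper level in multilevel analysis. The theoretical framework of multilevel models remedies the statistical problems (such as heteroscedasticity) of the widely used conventional single-level model and captures reality more concretely, and it is therefore expected to become established as a central framework for regional analysis. In this study, the efficiency of these multilevel models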