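This study addresses silt loading (sL), a key factor in evaluating fugitive dust concentration. To propose an efficient method for collecting dust deposited on road surfaces, we used experimental data collection and visualization to analyze how dust is distributed by location and where it can be collected most efficiently. Because sampling an entire road is impractical, the U.S. Environmental Protection Agency (EPA) specifies sampling locations based on an inter-intersection segment length of 2.4 km, or two samples for segments of 1 km or less. However, this guidance has limitations when applied to Korean conditions: the assumed spacing between intersections is too wide and the number of samples is too small. This study therefore aims to identify locations that can represent 25 m and 100 m road sections using a 3 m sampling technique (0.3 m × 10 passes) based on the 0.3 m length of the vacuum cleaner; sampling locations were selected through statistical and clustering analyses of the collected samples. To validate the sampling locations, the correlation between road silt loading and fugitive dust was evaluated quantitatively. Fugitive dust concentration as a function of sL was measured while driving at 50 km/h, in line with the urban speed limit, and changes in road dust concentration were analyzed quantitatively using GPS coordinates collected by the measurement vehicle. The results show that road dust concentration tends to increase with silt loading, which is attributed to more deposited dust being resuspended into the air; across all measured sections, sL and fugitive dust concentration were strongly correlated (correlation coefficient 0.76). In addition, K-means clustering was used to evaluate how variation in sL at each sampling point affects road dust concentration. The clustering indicated that three sampling locations within a 25 m section and five within a 100 m section yield representative values, consistent with the observed changes in fugitive dust concentration. This approach improved the reliability of road dust sampling and allowed road dust characteristics to be analyzed more accurately, and it is expected to help overcome the temporal and spatial limits of manual collection. The results can also serve as baseline data for future research on building fugitive dust measurement vehicles.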
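The correlation between silt loading and fugitive dust concentration reported above can be illustrated with a minimal sketch. It assumes the coefficient is a Pearson correlation, which the abstract does not state. The function name `pearson_r` and all sample values are hypothetical and are not data from the study, so the result is not expected to reproduce 0.76.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements (assumption, not study data)
silt_loading = [0.05, 0.08, 0.12, 0.20, 0.31, 0.45]  # sL per sampling point
dust_conc = [22.0, 30.0, 27.0, 41.0, 38.0, 60.0]     # dust concentration

r = pearson_r(silt_loading, dust_conc)
print(f"r = {r:.2f}")
```

A rank-based coefficient such as Spearman's would be a reasonable alternative if the relationship is monotonic but not linear.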
This study utilizes association rule learning and clustering analysis to explore the co-occurrence and relationships within ecosystems, focusing on the endangered brackish-water snail Clithon retropictum, classified as Class II endangered wildlife in Korea. The goal is to analyze co-occurrence patterns between brackish-water snails and other species to better understand their roles within the ecosystem. By examining co-occurrence patterns and relationships among species in large datasets, association rule learning aids in identifying significant relationships. Meanwhile, K-means and hierarchical clustering analyses are employed to assess ecological similarities and differences among species, facilitating their classification based on ecological characteristics. The findings reveal a significant level of relationship and co-occurrence between brackish-water snails and other species. This research underscores the importance of understanding these relationships for the conservation of endangered species like C. retropictum and for developing effective ecosystem management strategies. By emphasizing the role of a data-driven approach, this study contributes to advancing our knowledge on biodiversity conservation and ecosystem health, proposing new directions for future research in ecosystem management and conservation strategies.
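The co-occurrence analysis described above rests on the standard association rule measures of support, confidence, and lift. The sketch below computes them for a single rule. The function name `rule_metrics`, the site records, and the placeholder taxa `species_A` to `species_C` are hypothetical and are not taken from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    a = set(antecedent)
    c = set(consequent)
    n_a = sum(1 for t in transactions if a <= t)         # sites containing A
    n_c = sum(1 for t in transactions if c <= t)         # sites containing C
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # sites containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical survey sites; each set lists the taxa recorded at one site.
sites = [
    {"Clithon retropictum", "species_A", "species_B"},
    {"Clithon retropictum", "species_A"},
    {"species_A", "species_C"},
    {"species_B", "species_C"},
    {"Clithon retropictum", "species_A", "species_C"},
]

support, confidence, lift = rule_metrics(
    sites, ["species_A"], ["Clithon retropictum"]
)
print(support, confidence, lift)
```

A lift above 1 indicates that the two taxa are recorded together more often than independence would predict. A full rule-mining run would enumerate candidate itemsets (for example with Apriori) rather than score one hand-picked rule.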
PURPOSES : Local governments in Korea, including Incheon city, have introduced the pavement management system (PMS). However, the verification of the repair time and repair section of roads remains difficult owing to the non-existence of a systematic data acquisition system. Therefore, data refinement is performed using various techniques when analyzing statistical data in diverse fields. In this study, clustering is used to analyze PMS data, and correlation analysis is conducted between pavement performance and influencing factors.
METHODS : First, the clustering type was selected. Representative clustering types include K-means, mean shift, and density-based spatial clustering of applications with noise (DBSCAN). In this study, data purification was performed using DBSCAN. Because of the difficulty in determining a threshold for high-dimensional data, multiple clustering, which is a type of DBSCAN, was applied, and the number of clustering passes was set to two. Clustering for the surface distress (SD), rut depth (RD), and international roughness index (IRI) was performed twice using the number of frost days, the highest temperature, and the average temperature, respectively.
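To show how DBSCAN can act as a data refinement step, the sketch below implements plain single-pass DBSCAN and drops the points it labels as noise. It is not the two-pass "multiple clustering" variant described above. The values of `eps` and `min_pts` and the sample pairs are hypothetical.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself)
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_nbrs)
    return labels


# Hypothetical (frost days, surface distress) pairs; the last two are outliers.
points = [(10, 1.0), (11, 1.1), (12, 1.2), (13, 1.3), (14, 1.4),
          (30, 0.2), (12, 9.0)]
labels = dbscan(points, eps=1.5, min_pts=3)
refined = [p for p, lab in zip(points, labels) if lab != NOISE]
print(labels)
```

Because `eps` is a single distance threshold, features measured in different units would normally be standardized before clustering; that step is omitted here for brevity.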
RESULTS : The clustering result shows that the correlation between the SD and number of frost days improved significantly. The correlation between the maximum temperature factor and precipitation factor, which does not indicate multicollinearity, improved. Meanwhile, the correlation between the RD and highest temperature improved significantly. The correlation between the minimum temperature factor and precipitation factor, which does not exhibit multicollinearity, improved considerably. The correlation between the IRI and average temperature improved as well. The correlation between the low- and high-temperature precipitation factors, which does not indicate multicollinearity, improved.
CONCLUSIONS : The result confirms the possibility of applying clustering to refine PMS data and that the correlation among the pavement performance factors improved. However, when applying clustering to PMS data refinement, the limitations must be identified and addressed. Furthermore, clustering may be applicable to the purification of PMS data using AI.
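The purpose of this study is to cluster elementary school students by bone age, analyze the physique, physical fitness, and skeletal maturity of each cluster, and thereby provide baseline data for the balanced development of elementary school students. The subjects were 2,243 children aged 8 to 13. Skeletal maturity was calculated by taking X-ray films and applying the TW3 method score conversion table. Two physique components were measured using a stadiometer (Hanebio, Korea, 2021) and an Inbody 270 (Biospace, Korea, 2019). Physical fitness was measured with seven components: muscular strength (grip strength), balance (one-leg standing), agility (plate tapping), power (standing long jump), flexibility (sit and reach), muscular endurance (sit-ups), and cardiorespiratory endurance (shuttle run). Data were processed with SPSS PC/Program (Version 26.0) and the Britics Studio Tool, using K-means clustering, cross-tabulation analysis, and one-way ANOVA, with significance set at p < .05. The results are as follows. First, clustering on three skeletal maturity levels (delayed, average, advanced) showed that cluster 1 (delayed) was high in muscular strength, balance, and agility; cluster 2 (average) was low in flexibility; and cluster 3 (advanced) was high in muscular strength. Second, the analysis of physique differences across the clusters showed that height, weight, and body fat percentage were all highest in cluster 3 (advanced). Third, the analysis of physical fitness differences across the clusters showed that grip strength (left and right) was higher in cluster 3 (advanced), one-leg standing was higher in cluster 1 (delayed), and standing long jump was higher in cluster 3 (advanced).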
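The K-means step named above can be sketched with Lloyd's algorithm on a single feature. The sketch assumes one hypothetical feature, bone age minus chronological age in years, because the abstract does not list the exact clustering inputs. The function name `kmeans_1d`, the quantile-based initialization, and the sample values are illustrative assumptions.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars (k >= 2); centroids start at spaced quantiles."""
    ordered = sorted(values)
    # Deterministic start: minimum, evenly spaced interior points, maximum
    centroids = [ordered[round(i * (len(ordered) - 1) / (k - 1))]
                 for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centroid for each value
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: mean of each cluster (keep old centroid if empty)
        new = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new.append(sum(members) / len(members) if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels, centroids


# Hypothetical bone age minus chronological age, in years (not study data)
gap = [-1.5, -1.2, -1.0, 0.0, 0.1, -0.1, 1.1, 1.3, 1.6]
labels, centroids = kmeans_1d(gap, k=3)
print(labels, centroids)
```

With k = 3 the clusters can be read as delayed, average, and advanced maturity, after which group differences in physique and fitness could be tested with one-way ANOVA.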
A trend analysis of research articles in a field of knowledge is significant because observing change over time helps reveal the structural characteristics of the field and the future direction of research. We identified the structural characteristics and trends in text data (keywords) gathered from research articles, which is itself an important task in various research areas. The titles and keywords were crawled from research articles published from 2016 to 2018 in the Research Journal of the Costume Culture (RJCC), one of the representative Korean journals in the field of clothing and textiles. After extracting the English titles and keywords from 195 published articles, we transformed them into a 1-mode matrix. We used measures from network analysis (i.e., link, strength, and degree centrality) to evaluate meaningful patterns and trends in clothing and textile research. NodeXL was used to visualize the semantic network. This study observed changes in clothing and textile research trends. In addition to covering the core areas of the field, the subjects of research have diversified year by year and have continued to develop. The most studied area in articles published by the RJCC was fashion retailing/consumer psychology, while aesthetic/historic and fashion industry/policy studies were covered to a more limited extent. We observed that most of the studies reflecting the identity of RJCC share subject keywords to a significant extent.
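The 1-mode (keyword by keyword) matrix and the link strength and degree centrality measures mentioned above can be sketched as follows. The keyword lists are hypothetical, and the function names `cooccurrence` and `degree_centrality` are illustrative; the study itself used NodeXL.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count how often each unordered keyword pair appears in the same article."""
    pairs = Counter()
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1  # link strength of the pair
    return pairs


def degree_centrality(pairs):
    """Fraction of the other keywords each keyword is linked to."""
    nbrs = {}
    for a, b in pairs:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    n = len(nbrs)
    if n < 2:
        return {}
    return {k: len(v) / (n - 1) for k, v in nbrs.items()}


# Hypothetical keyword lists, one per article (not the RJCC data)
articles = [
    ["fashion retailing", "consumer psychology", "brand"],
    ["consumer psychology", "brand"],
    ["fashion retailing", "costume history"],
]
pairs = cooccurrence(articles)
centrality = degree_centrality(pairs)
print(pairs[("brand", "consumer psychology")], centrality["fashion retailing"])
```

The pair counts are the nonzero off-diagonal entries of the 1-mode matrix, so the same dictionary can feed a network visualization tool.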
Understanding the classification of malocclusion is a crucial issue in orthodontics. A sound classification helps clinicians diagnose, treat, and understand malocclusion and establish standards for well-defined classes of patients. Principal component analysis (PCA) and k-means algorithms have been emerging as data-analytic methods for cephalometric measurements because of their intuitive concepts and potential for application. This study analyzed the macro- and meso-scale classification structure and feature basis vectors of 1020 (415 male, 605 female; mean age, 25 years) orthodontic patients using statistical preprocessing, PCA, random matrix theory (RMT), and k-means algorithms. The RMT results show that seven principal components (PCs) are significant for feature extraction. Using k-means algorithms, 3 and 6 clusters were identified, and the axes PC1 to PC3 were determined to be significant for patient classification. The macro-scale classification corresponds to skeletal Class I, II, and III; PC1 represents the anteroposterior discrepancy between the maxilla and mandible and the mandibular position. PC2 and PC3 represent the vertical pattern and the maxillary position, respectively, and played significant roles in the meso-scale classification. In conclusion, the typical patient profile (TPP) of each class showed that the data-based classification corresponds with the clinical classification of orthodontic patients. This data-based study can provide insight into the development of new diagnostic classifications.
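One common way RMT is used to decide how many principal components are significant is to compare the eigenvalues of the correlation matrix with the Marchenko-Pastur upper edge, (1 + sqrt(p/n))^2 for p variables and n samples. The abstract does not state which RMT criterion was applied, so the sketch below is an assumption, and the eigenvalues and dimensions in the example are hypothetical.

```python
import math


def mp_upper_edge(n_samples, n_features):
    """Marchenko-Pastur upper edge for a correlation matrix of pure noise."""
    q = n_features / n_samples
    return (1.0 + math.sqrt(q)) ** 2


def count_significant(eigenvalues, n_samples, n_features):
    """Number of eigenvalues above the noise edge."""
    edge = mp_upper_edge(n_samples, n_features)
    return sum(1 for ev in eigenvalues if ev > edge)


# Hypothetical example: 100 samples, 25 variables, made-up eigenvalues
edge = mp_upper_edge(100, 25)
n_significant = count_significant([5.0, 2.3, 2.0, 0.5], 100, 25)
print(edge, n_significant)
```

Eigenvalues above the edge are unlikely to arise from uncorrelated noise alone, which is the usual justification for keeping the corresponding components.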
The purpose of this paper is to determine how the districts (Gu) of Seoul are related to one another based on apartment price trends. All data used in this paper come from public sources: Seoul apartment transaction data provided by the Ministry of Land, Infrastructure and Transport of Korea and apartment properties from NAVER's real estate service. To analyze the similarities between the price trends of apartments, this study uses the FastDTW algorithm, which is widely used in time series analysis. After computing the distance matrix with FastDTW, this study uses hierarchical clustering and the chi-squared test to compare the relationships among districts. The results show which districts in Seoul are similar and which are not.
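The distance matrix described above can be illustrated with exact dynamic time warping. Note that this sketch uses the classic quadratic-time DTW recurrence, not the FastDTW approximation used in the paper. The district names `Gu_A` to `Gu_C` and the price series are hypothetical.

```python
def dtw_distance(a, b):
    """Exact dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def distance_matrix(series):
    """Pairwise DTW distances keyed by (name, name)."""
    names = list(series)
    return {(p, q): dtw_distance(series[p], series[q])
            for p in names for q in names}


# Hypothetical normalized price trends per district (not the study data)
trends = {
    "Gu_A": [1.0, 1.0, 2.0, 3.0],
    "Gu_B": [1.0, 2.0, 3.0, 3.0],  # same shape as Gu_A, shifted in time
    "Gu_C": [3.0, 2.0, 1.0, 1.0],  # opposite trend
}
dist = distance_matrix(trends)
print(dist[("Gu_A", "Gu_B")], dist[("Gu_A", "Gu_C")])
```

The resulting matrix is the input a hierarchical clustering routine would need; districts whose trends differ only by a time shift end up close together.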
Forecasting box office performance after a film's release is very important for increasing profitability by reducing production and marketing costs. Analysis of psychological factors such as word of mouth and expert assessment is essential but hard to perform because such data are difficult to collect. Information technologies such as web crawling and text mining can help overcome this difficulty. Effective text mining requires categorization of objects. From this perspective, the objective of this study is to provide a framework for classifying films according to their characteristics. Data including psychological factors are collected from websites using web crawling. A cluster analysis is conducted to classify films, and a series of one-way ANOVAs is conducted to statistically verify the differences in characteristics among groups. The cluster analysis based on reviews and revenues shows that the films can be categorized into four distinct groups and that the differences in characteristics are statistically significant. The first group has high box office sales and a higher number of clicks on reviews than the other groups. The second group is similar to the first, but its reviews are longer and its box office sales are poor. The third group's audiences prefer documentaries and animations, and its numbers of comments and interests are significantly lower than those of the other groups. The last group's audiences prefer the crime, thriller, and suspense genres. Correspondence analysis is also conducted to match the groups with intrinsic characteristics of films such as genre, movie rating, and nation.
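The one-way ANOVA used above to check that cluster groups differ can be sketched by computing the F statistic directly. The function name `one_way_anova_f` and the review counts are hypothetical. The p-value, which requires the F distribution, is left out to keep the example free of external libraries.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical review counts for films in three clusters (not study data)
clusters = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(clusters)
print(f_stat, df_b, df_w)
```

A large F relative to the critical value for (df_between, df_within) indicates that at least one cluster mean differs from the others.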
In this paper, we consider curriculum mining as an application of process mining in the domain of education. The basic objective of curriculum mining is to construct a registration pattern model from logs of registration data. However, students' subject registration patterns are highly unstructured and complicated, yielding a so-called spaghetti model, because they contain many different cases and highly diverse behaviors. As a result, registration patterns are typically difficult to develop and analyze. In the literature, there was an effort to handle this issue by clustering on the features and behaviors of students. However, such data are generally not easy to obtain since they are private and qualitative. Therefore, in this paper, we propose a new curriculum mining framework that applies K-means clustering based on subject attributes to solve the problems caused by the unstructured process model. Specifically, we divide the subject attribute data into two parts: categorical and numerical. The categorical attributes are subject name, class classification, and research field, while the numerical attributes are ABEEK goal and semester information. For the categorical attributes, we suggest a method to quantify them using binarization. To choose the number of clusters for K-means clustering, we applied the elbow method using the R-squared value, which represents the proportion of variance explained by a given number of clusters. The performance of the suggested method was verified using a log of student registration data from 'A university' in terms of simplicity and fitness, which are the typical performance measures of a process model obtained in process mining.
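Two steps described above lend themselves to a short sketch: binarizing a categorical attribute, and computing the R-squared value that the elbow method tracks as the number of clusters grows. The sketch assumes R-squared is defined as 1 minus the within-cluster sum of squares over the total sum of squares. The subject rows, the cluster labels, and the function names are hypothetical, and the K-means fit that would produce the labels is omitted.

```python
def one_hot(values):
    """Binarize a categorical column into 0/1 indicator vectors."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


def r_squared(points, labels):
    """Share of total variance explained by a clustering: 1 - WSS / TSS."""
    dim = len(points[0])

    def centroid(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sq_dist(p, c):
        return sum((p[d] - c[d]) ** 2 for d in range(dim))

    overall = centroid(points)
    tss = sum(sq_dist(p, overall) for p in points)
    wss = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        c = centroid(members)
        wss += sum(sq_dist(p, c) for p in members)
    return 1.0 - wss / tss


# Hypothetical subjects: class classification (categorical) + semester (numeric)
classification = ["major", "major", "elective", "elective"]
semester = [1.0, 2.0, 7.0, 8.0]
points = [row + [s] for row, s in zip(one_hot(classification), semester)]

# Labels as a K-means run with k = 2 might assign them (assumed, not fitted)
r2 = r_squared(points, [0, 0, 1, 1])
print(r2)
```

Repeating the R-squared computation for k = 1, 2, 3, and so on, and picking the k after which the gain flattens, is the elbow rule. Numeric attributes would usually be rescaled first so that they do not dominate the binarized columns.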
Platform-based product family design is recognized as an effective method for achieving mass customization, a current market trend. To design a platform-based product family successfully, the key task is to define a good product platform, that is, to identify the common modules that will be shared across the product family. In this paper, clustering analysis using a dendrogram is proposed to capture the common modules of the platform. The clustering variables, covering both the marketing and engineering sides, are derived from the viewpoint of top-down product development. A case study of a cordless drill/drive product family is presented to illustrate the feasibility and validity of the overall procedure developed in this research.
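For efficient odor management, odors occurring in complaint areas must be classified and their sources analyzed. This requires representative odor patterns that characterize the odors occurring in complaint areas, together with the odor of each source. In this paper, to classify odors in complaint areas, we clustered odor data using the k-means algorithm. The resulting representative odor patterns were then compared for similarity with pre-measured odors of each source to classify the odors. In addition, considering that several odors may mix in the air, we used non-negative least squares to trace one or more responsible odor sources and their contributions. These results are expected to contribute to resolving odor-related complaints.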
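The source apportionment step above can be illustrated with non-negative least squares: the observed odor pattern b is modeled as a non-negative mix of source profiles, the columns of A. The sketch solves the problem with simple projected gradient descent, which is an illustrative choice and not necessarily the solver the authors used. The source profiles, the observed pattern, and the function name are hypothetical.

```python
def nnls_projected_gradient(A, b, iters=500):
    """Minimize ||A x - b||^2 subject to x >= 0 by projected gradient descent.

    A is a list of rows: one row per sensor channel, one column per source.
    """
    m, n = len(A), len(A[0])
    # 1 / trace(A^T A) is no larger than 1 / lambda_max, so the step is stable.
    step = 1.0 / sum(A[i][j] ** 2 for i in range(m) for j in range(n))
    x = [0.0] * n
    for _ in range(iters):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto x >= 0
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n)]
    return x


# Hypothetical profiles of three odor sources over four sensor channels
A = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.2, 0.2, 0.2],
]
# Observed pattern built as 0.7 * source 1 + 0.3 * source 2
b = [0.7, 0.3, 0.0, 0.2]

x = nnls_projected_gradient(A, b)
total = sum(x)
shares = [v / total for v in x]  # relative contribution of each source
print(x, shares)
```

In practice a library routine such as `scipy.optimize.nnls` would normally replace the hand-written loop; the non-negativity constraint is what keeps the estimated contributions physically meaningful.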
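As a preliminary step toward deriving a transfer function model between growing stock and forest management policies, this study performed principal component analysis to resolve multicollinearity among the forest projects that drive changes in growing stock. The data were annual time series for nine representative forest management policies over the 32 years from 1977 to 2008. The three extracted principal components together explained 91.4% of the total variance, which is quite high. The three components were conceptualized as new variables named sound forest management, pest and disease management, and forest fire occurrence.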
Traffic volume, which is representative of the live load on a bridge, is known to have a significant effect on the level of service (LOS) and the structural displacement of the bridge. In particular, if traffic volume during congested time periods on a toll road is dispersed, the road can be managed efficiently in terms of customer satisfaction and maintenance operations, for example through improved level of service and structural integrity. To do this, it is necessary to induce the dispersion of traffic during congested time periods, which requires a differential pricing system such as a congestion charge or time-of-day toll differences. In this study, we analyzed traffic aggregation by time period to subdivide and advance the uniform fare system of the Gwangan bridge.