
Hybrid Simulated Annealing for Data Clustering (KCI indexed)

  • Language: KOR
  • URL: https://db.koreascholar.com/Article/Detail/328156
Journal of Society of Korea Industrial and Systems Engineering (한국산업경영시스템학회지)
Society of Korea Industrial and Systems Engineering (한국산업경영시스템학회)
Abstract

Data clustering groups the patterns in a dataset according to a similarity measure, and it is one of the most important and difficult techniques in data mining. Clustering can be formally regarded as a particular kind of NP-hard grouping problem. The K-means algorithm, although popular and efficient, is sensitive to initialization and can become stuck in a local optimum because it is a hill-climbing clustering method. It is also not computationally feasible in practice, especially for large datasets and large numbers of clusters. A robust and efficient clustering algorithm that finds the global optimum rather than a local one is therefore needed, particularly now that large volumes of data are collected from many IoT (Internet of Things) devices. This paper proposes a new Hybrid Simulated Annealing (HSA) method that combines simulated annealing with K-means for non-hierarchical clustering of big data. Simulated annealing (SA) is useful for diversified search in a large search space, and K-means is useful for converged search in a predetermined search space. The proposed method balances intensification and diversification to find the globally optimal solution in big data clustering. The performance of HSA is validated by experiment and analysis on the Iris, Wine, Glass, and Vowel datasets from the UCI Machine Learning Repository, in comparison with previous studies. In our simulations, the proposed KSAK (K-means+SA+K-means) and SAK (SA+K-means) variants perform better than KSA (K-means+SA), SA, and K-means. The method significantly improves the accuracy and efficiency of finding the globally optimal clustering solution for complex, real-time, and costly data mining processes.
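The abstract names the SAK and KSAK pipelines but does not specify the objective function, the neighborhood move, or the cooling schedule. The sketch below is therefore only one plausible reading, not the authors' implementation. It assumes a sum-of-squared-errors objective, a Gaussian perturbation of one randomly chosen centroid as the SA move, and geometric cooling. The function names (`sse`, `kmeans`, `anneal`, `sak`, `ksak`) and all parameter values are hypothetical.

```python
import math
import random

def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    return sum(min(sum((p - c) ** 2 for p, c in zip(pt, ct)) for ct in centroids)
               for pt in points)

def kmeans(points, centroids, iters=50):
    """Lloyd's K-means: the intensification (hill-climbing) stage."""
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for pt in points:
            j = min(range(len(centroids)),
                    key=lambda i: sum((p - c) ** 2 for p, c in zip(pt, centroids[i])))
            clusters[j].append(pt)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        new = [[sum(col) / len(cl) for col in zip(*cl)] if cl else old
               for cl, old in zip(clusters, centroids)]
        if new == centroids:
            break
        centroids = new
    return centroids

def anneal(points, centroids, rng, t0=1.0, cooling=0.95, steps=300, sigma=0.5):
    """Simulated annealing over centroid positions: the diversification stage.
    Assumed move: Gaussian jitter of one centroid. Assumed schedule: geometric cooling."""
    current = [list(c) for c in centroids]
    cur_cost = sse(points, current)
    best, best_cost = [list(c) for c in current], cur_cost
    t = t0
    for _ in range(steps):
        cand = [list(c) for c in current]
        i = rng.randrange(len(cand))
        cand[i] = [x + rng.gauss(0.0, sigma) for x in cand[i]]
        cost = sse(points, cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cost <= cur_cost or rng.random() < math.exp((cur_cost - cost) / t):
            current, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = [list(c) for c in cand], cost
        t *= cooling
    return best  # best solution seen, never worse than the starting centroids

def sak(points, k, seed=0):
    """SA followed by K-means, starting from k randomly sampled points."""
    rng = random.Random(seed)
    init = rng.sample(points, k)
    return kmeans(points, anneal(points, init, rng))

def ksak(points, k, seed=0):
    """K-means, then SA, then K-means again."""
    rng = random.Random(seed)
    init = rng.sample(points, k)
    return kmeans(points, anneal(points, kmeans(points, init), rng))

if __name__ == "__main__":
    # Synthetic three-blob data for illustration only (not one of the UCI datasets).
    data_rng = random.Random(1)
    points = [(data_rng.gauss(cx, 0.3), data_rng.gauss(cy, 0.3))
              for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(20)]
    print("SAK  SSE:", sse(points, sak(points, 3)))
    print("KSAK SSE:", sse(points, ksak(points, 3)))
```

In this sketch `anneal` returns the best solution it has seen and K-means never increases the SSE, so `ksak` cannot end with a higher SSE than plain K-means run from the same initial centroids. That guarantee is a property of the sketch and says nothing about the accuracy figures reported in the paper.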

Authors
  • Sung-Soo Kim (김성수), Department of System & Management Engineering, Kangwon National University (corresponding author)
  • Jun-Young Baek (백준영), Department of System & Management Engineering, Kangwon National University
  • Beom-Soo Kang (강범수), Department of System & Management Engineering, Kangwon National University