PURPOSES : This preliminary study evaluated the K-means and mean-shift clustering techniques for preprocessing pavement management system (PMS) data, with the aim of improving the correlation between the dependent and independent variables of a pavement performance model. METHODS : The PMS data of Jeju Island were preprocessed using the K-means and mean-shift algorithms. For the K-means method, the elbow method and the silhouette score were used to determine the optimal number of clusters (K). For the mean-shift method, Scott’s rule of thumb and Silverman’s rule of thumb were used to determine the optimal bandwidth. RESULTS : For each clustering technique, the optimal cluster sets were selected for the rut depth (RD), annual average daily traffic (AADT), and annual maximum temperature (AMT), and their similarity to the original data was investigated. The improvement in the correlation between the dependent and independent variables was then evaluated by calculating the clustering score (CS). The K-means method improved the correlations of more variables with the dependent variable than the mean-shift method did, and it was therefore selected as the optimal clustering technique for preprocessing PMS data. When the AMT, a climate factor, was used as an independent variable in the K-means clustering method, the correlations of the variables related to high temperature (such as the annual temperature change, summer days, and heat wave days) were improved. CONCLUSIONS : This study confirmed that clustering methods are applicable to the preprocessing of PMS data. A future study that develops a model using clustering techniques may identify improvements over pavement performance prediction models developed using traditional statistical methods.
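The parameter-selection steps named in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation. The rut-depth values are synthetic, the function names are hypothetical, and the bandwidth formulas are the common one-dimensional Gaussian-kernel forms of Scott's and Silverman's rules, which the abstract does not spell out. The sketch computes the inertia used by the elbow method and the silhouette score for several values of K, then the two candidate bandwidths that a mean-shift run could use.

```python
import statistics

def kmeans_1d(x, k, iters=100):
    """Lloyd's algorithm on 1-D data; returns (labels, centers, inertia)."""
    xs = sorted(x)
    n = len(xs)
    # Deterministic start (an assumption): centers spread evenly over the sorted sample.
    centers = [xs[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def assign(cs):
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in x]

    for _ in range(iters):
        labels = assign(centers)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(x, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    labels = assign(centers)
    # Inertia: within-cluster sum of squared distances, the quantity plotted for the elbow method.
    inertia = sum((v - centers[lab]) ** 2 for v, lab in zip(x, labels))
    return labels, centers, inertia

def silhouette_score_1d(x, labels):
    """Mean silhouette coefficient using absolute difference as the distance."""
    clusters = {}
    for v, lab in zip(x, labels):
        clusters.setdefault(lab, []).append(v)
    scores = []
    for v, lab in zip(x, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # Mean distance to the other members of the same cluster.
        a = sum(abs(v - w) for w in own) / (len(own) - 1)
        # Smallest mean distance to any other cluster.
        b = min(sum(abs(v - w) for w in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def scott_bandwidth(x):
    """Scott's rule, 1-D form (assumed): h = 1.06 * sigma * n^(-1/5)."""
    return 1.06 * statistics.stdev(x) * len(x) ** (-1 / 5)

def silverman_bandwidth(x):
    """Silverman's rule, 1-D form (assumed): h = 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    spread = min(statistics.stdev(x), (q3 - q1) / 1.34)
    return 0.9 * spread * len(x) ** (-1 / 5)

# Synthetic rut-depth values in mm (illustrative only, not the Jeju Island PMS data).
rd = [2.0, 2.2, 2.4, 2.1, 5.0, 5.3, 5.1, 4.9, 9.0, 9.2, 8.8, 9.1]

inertias, silhouettes = {}, {}
for k in range(2, 6):
    labels, centers, inertia = kmeans_1d(rd, k)
    inertias[k] = inertia
    silhouettes[k] = silhouette_score_1d(rd, labels)

best_k = max(silhouettes, key=silhouettes.get)  # K with the highest silhouette score
h_scott = scott_bandwidth(rd)
h_silverman = silverman_bandwidth(rd)
print("best K by silhouette:", best_k)
print("candidate mean-shift bandwidths:", h_scott, h_silverman)
```

In practice the inertia values would be plotted against K to locate the elbow, and each candidate bandwidth would be passed to a mean-shift implementation to compare the resulting cluster sets.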