This study develops a model to determine the input rate of the coagulant (i.e., the chemical used in the coagulation and flocculation process) at an industrial water treatment plant, based on real-world data. To detect outliers in the collected data, a two-phase algorithm combining a standardization transformation with Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is applied. Both missing values and outliers are then replaced by linear interpolation. To determine the coagulant rate, several machine learning models are tested alongside linear regression. Among them, the random forest model trained on min-max scaled data performs best, with MSE, MAPE, R² and CVRMSE of 1.136, 0.111, 0.912, and 18.704, respectively. This study demonstrates the practical applicability of a machine learning based chemical input decision model, which can lead to smart management and response systems for clean and safe water treatment plants.
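The data-cleaning steps described above (standardization, DBSCAN noise detection, linear interpolation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it treats a single one-dimensional series, the function names and the `eps` and `min_samples` values are hypothetical, and the sample readings are invented for demonstration only.

```python
import statistics


def standardize(values):
    """Z-score transform; missing entries (None) are passed through."""
    present = [v for v in values if v is not None]
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)
    return [None if v is None else (v - mean) / std for v in values]


def dbscan_noise(z, eps=0.5, min_samples=3):
    """Return the indices that DBSCAN would label as noise in a 1-D series.

    A point is a core point if at least min_samples points (itself included)
    lie within eps of it. A point is noise if it is neither a core point nor
    within eps of a core point. eps and min_samples are assumed values.
    """
    idx = [i for i, v in enumerate(z) if v is not None]
    neighbours = {i: {j for j in idx if abs(z[i] - z[j]) <= eps} for i in idx}
    core = {i for i in idx if len(neighbours[i]) >= min_samples}
    return [i for i in idx if i not in core and not (neighbours[i] & core)]


def interpolate(values, bad):
    """Replace missing values and flagged indices by linear interpolation."""
    bad = set(bad)
    out = [None if (v is None or i in bad) else v for i, v in enumerate(values)]
    good = [i for i, v in enumerate(out) if v is not None]
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:        # gap at the start: carry the next valid value back
            out[i] = out[right]
        elif right is None:     # gap at the end: carry the last valid value forward
            out[i] = out[left]
        else:
            w = (i - left) / (right - left)
            out[i] = out[left] + w * (out[right] - out[left])
    return out


# Hypothetical coagulant readings with one missing value and one spike.
dose = [10.0, 10.5, None, 11.0, 42.0, 11.5, 12.0, 11.8]
noise = dbscan_noise(standardize(dose))
cleaned = interpolate(dose, noise)
```

In this sketch the spike at index 4 is isolated in standardized space and is therefore flagged as noise, after which it and the missing reading are both filled from their nearest valid neighbours.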
Peak load rate (i.e., maximum daily flow/average daily flow) has not been considered in industrial water demand planning in Korea to date; instead, the area unit method based on average daily flow has been applied to decide the capacity of industrial water treatment plants (WTPs). Designers of industrial WTPs have assumed that peak load would not exist if the operation rate of factories in industrial sites were close to 100%. However, peak load rates of 1.10~2.53 were calculated from daily water flow data from 2009 to 2014 for 9 industrial WTPs that have been in operation for more than 9 years (9-38 years). Furthermore, the average operation rate of the 9 industrial WTPs was less than 70%, which means the current area unit method tends to overestimate water demand. Therefore, to prevent further overestimation, it is not reasonable to consider peak load when calculating water demand under the current area unit method. However, for precise calculation of future industrial water demand, more precise data gathering for average daily flow and consideration of peak load rate are recommended.
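The two quantities discussed above can be illustrated with a short calculation. This is a sketch only: the peak load rate follows the definition given in the text, whereas the operation rate is assumed here to be average daily flow divided by design capacity, which the text does not state explicitly. The function names and flow values are hypothetical.

```python
from statistics import fmean


def peak_load_rate(daily_flows):
    """Maximum daily flow divided by average daily flow."""
    return max(daily_flows) / fmean(daily_flows)


def operation_rate(daily_flows, design_capacity):
    """Average daily flow divided by design capacity (assumed definition)."""
    return fmean(daily_flows) / design_capacity


# Hypothetical daily flows (m^3/day) for a plant designed for 100,000 m^3/day.
flows = [40_000, 50_000, 60_000, 90_000, 60_000]
plr = peak_load_rate(flows)           # 90,000 / 60,000
rate = operation_rate(flows, 100_000)  # 60,000 / 100,000
```

With these invented figures the plant shows a clear peak load even though it runs well below its design capacity, which is the situation the text describes for the surveyed WTPs.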