Simulator-based data augmentation and machine-learning modeling for predicting nitrogen concentrations in municipal wastewater treatment plant

Seojun Lee; Jaehak Lee; Youjung Jang; Heekyong Oh


KCI-listed article

Simulator-based data augmentation and machine-learning modeling for predicting nitrogen concentrations in municipal wastewater treatment plant


Language: Korean (KOR)
URL: https://db.koreascholar.com/Article/Detail/447607


Journal of the Korean Society of Water and Wastewater

Vol. 39, No. 6 (December 2025)
pp. 465-478

Korean Society of Water and Wastewater

Abstract

Smart technologies are increasingly required to improve the stability and efficiency of wastewater treatment plants (WWTPs). However, ensuring the reliability and continuity of measurements remains a challenge when building operational databases. An activated sludge model can serve as a digital twin of a WWTP and can generate data for a range of operating conditions even when influent characteristics remain unchanged. In this study, machine learning (ML) models for predicting nitrogen concentrations in a WWTP were developed by integrating measured data with simulator-based synthetic data. Gaseous N2O and aqueous NH4+ concentrations were measured in the aerobic reactor of an A2O process, and an operational database including operating parameters such as mixed liquor return (MLR, internal recycle) and return activated sludge (RAS, external return) was established and analyzed. Operational characteristics were analyzed from the acquired field data, and synthetic data under various operating conditions were generated using a simulator with the Sumo4N model. The two datasets were then integrated to perform data augmentation, compensating for the limited quantity of measured data. RAS, MLR, aeration rate, temperature, influent nitrogen load, and pH were selected as input variables, and ML models were developed to predict N2O and NH4+ concentrations in the aerobic reactor and the effluent TN concentration. Lasso Regression, Random Forest, k-Nearest Neighbors (k-NN), and Support Vector Regression (SVR) algorithms were trained and their performance was evaluated. The SVR algorithm performed best for every nitrogen component, and all developed models achieved high predictive performance with R² ≥ 0.75. These findings suggest that simulator-based data augmentation can support the development of ML models for integrated control of gaseous and aqueous nitrogen components.
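The augmentation and evaluation workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `toy_process` function is a hypothetical stand-in for both the plant and the Sumo4N simulator, all data are randomly generated, the feature names are illustrative labels for the six input variables, and a hand-written k-NN regressor (one of the four algorithms named above) is used only so that the example needs nothing beyond the Python standard library. The paper reports SVR as the best performer, which this sketch does not attempt to reproduce.

```python
import math
import random

# Illustrative labels for the six input variables named in the abstract.
FEATURES = ["ras", "mlr", "aeration", "temperature", "n_load", "ph"]


def toy_process(x, noise, rng):
    """Hypothetical response surface; a stand-in, NOT the Sumo4N model."""
    ras, mlr, aeration, temperature, n_load, ph = x
    y = 2.0 + 0.8 * n_load - 0.5 * aeration - 0.1 * temperature + 0.2 * mlr
    return y + rng.gauss(0.0, noise)


def make_rows(n, noise, rng):
    """Generate n (features, target) pairs with features drawn from [0, 1]."""
    rows = []
    for _ in range(n):
        x = [rng.uniform(0.0, 1.0) for _ in FEATURES]
        rows.append((x, toy_process(x, noise, rng)))
    return rows


def fit_scaler(rows):
    """Per-feature mean and standard deviation of the training rows."""
    cols = list(zip(*(x for x, _ in rows)))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(x, scaler):
    means, stds = scaler
    return [(v - m) / s for v, m, s in zip(x, means, stds)]


def knn_predict(train, scaler, x, k=5):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    q = scale(x, scaler)
    nearest = sorted(train, key=lambda row: math.dist(scale(row[0], scaler), q))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(train, test, k=5):
    scaler = fit_scaler(train)
    preds = [knn_predict(train, scaler, x, k) for x, _ in test]
    return r2_score([y for _, y in test], preds)


rng = random.Random(0)
measured = make_rows(30, noise=0.05, rng=rng)    # scarce "field" data (synthetic here)
synthetic = make_rows(300, noise=0.0, rng=rng)   # plentiful "simulator" data
test = make_rows(50, noise=0.05, rng=rng)        # held-out "field" data

augmented = measured + synthetic                 # data augmentation by merging
r2_measured_only = evaluate(measured, test)
r2_augmented = evaluate(augmented, test)
print(f"R2, measured only: {r2_measured_only:.3f}")
print(f"R2, augmented:     {r2_augmented:.3f}")
```

Evaluating on held-out measured rows only, rather than on a mix of measured and simulated rows, is a design choice of this sketch so that the score reflects agreement with field-like data; the abstract does not state how the authors split their data.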

Keywords

Machine learning; Simulator; Operational data augmentation; Nitrification; Municipal wastewater treatment plant

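Korean abstract
ABSTRACT
1. Introduction
2. Materials and Methods
    2.1 Nitrogen component analysis at the J wastewater treatment plant
    2.2 Simulator configuration
    2.3 Model parameter optimization
    2.4 Generation of simulation data for machine-learning model training
    2.5 Machine-learning algorithms
    2.6 Machine-learning modeling and performance evaluation
3. Results and Discussion
    3.1 Operational database analysis
    3.2 Calibrated model parameters and analysis of generated data
    3.3 Prediction model results and comparison by nitrogen component
    3.4 Importance analysis of model input variables
4. Conclusions
Acknowledgments
References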

Authors

Seojun Lee (Department of Environmental Engineering, University of Seoul)
Jaehak Lee (Department of Environmental Engineering, University of Seoul)
Youjung Jang (Department of Environmental Engineering, University of Seoul)
Heekyong Oh (Department of Environmental Engineering, University of Seoul), corresponding author
