Search results - koreascholar

2026.03 Free for subscribing certified institutions and individual members

콘크리트공학에서 데이터 증강을 위한 대규모 언어 모델: 바이오차 활용 시멘트 대체 머신러닝 연구

Large Language Models for Data Augmentation in Concrete Engineering: A Machine Learning Study on Biochar Cement Replacement

풀룬쇼아비데미바시루, 파우델아시쉬, 아이작파코야, 김승원, 박철우

한국도로학회 학술대회 발표논문 초록집 2026 한국도로학회 봄학술대회 발표논문 초록집 pp.75-76 한국도로학회

The application of machine learning in concrete technology has expanded rapidly, yet its reliability is often constrained by limited experimental data, heterogeneous testing conditions, and inconsistencies across published studies. This study investigates the integration of machine learning and synthetic data augmentation to predict the compressive strength of concrete incorporating biochar as a partial replacement for cement. An experimental dataset was compiled from peer-reviewed journal articles indexed in Web of Science, focusing on biochar-modified concrete mixtures. Input variables included cement content, fine and coarse aggregates, biochar dosage, water-to-binder ratio, superplasticizer content, and curing age, with compressive strength as the target variable. Extreme Gradient Boosting was adopted due to its strong performance on nonlinear tabular data. Model performance was evaluated using the mean absolute error (MAE), mean squared error (MSE), and coefficient of determination (R²), alongside five-fold cross-validation. Hyperparameter optimization was performed using Optuna. To address data scarcity, a synthetic dataset of 1000 samples was generated using ChatGPT. The large language model approach relied solely on natural language prompts: only feature definitions and the target variable were provided, without exposing the original data or implementing data generation algorithms. Three modeling strategies were examined. First, a model trained and tested solely on experimental data achieved a testing R² of approximately 0.91. Second, a model trained on synthetic data and evaluated exclusively on experimental data showed reduced generalization, achieving a testing R² of about 0.42, which indicates pronounced domain shift. Third, when synthetic and experimental data were combined through data augmentation and modeled jointly, a testing R² of 0.93 was achieved. These results show that LLM-based augmentation improved model performance.
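
The three evaluation metrics named in the abstract can be written out directly. The following is a minimal sketch, not the authors' code: the function names and the sample strength values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical compressive strengths in MPa (illustration only, not study data)
measured = [30.0, 40.0, 50.0]
predicted = [32.0, 38.0, 50.0]

print(mae(measured, predicted), mse(measured, predicted), r2(measured, predicted))
```

An R² near 1 means the residual error is small relative to the spread of the measured values, which is why a drop from roughly 0.91 to 0.42 signals a large loss of predictive power on the experimental test set.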

2025.12 KCI-listed. Free for subscribing certified institutions, paid for individual members

하수처리장의 질소 농도 예측을 위한 시뮬레이터 기반 데이터 증강과 머신러닝 구축

Simulator-based data augmentation and machine-learning modeling for predicting nitrogen concentrations in municipal wastewater treatment plant

이서준, 이재학, 장유정, 오희경

상하수도학회지 Vol.39 No.6 pp.465-478 대한상하수도학회

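Smart technologies are needed to improve the stability and efficiency of wastewater treatment plants, but building an operational database is hampered by the difficulty of securing reliable and continuous measurements. Activated sludge models serve as digital twins of wastewater treatment plants and can produce data for a variety of operating conditions even when the influent characteristics are identical. In this study, a machine learning model for predicting nitrogen concentrations in a wastewater treatment plant was built by integrating measured data with simulator-based synthetic data. Gas-phase N2O and liquid-phase NH4+ concentrations were measured in the aerobic tank of an A2O process, and an operational database including operating factors such as the internal and external recycle flow rates was built and analyzed. Operating characteristics were analyzed on the basis of the measured data, and the Sumo4N model was used to generate synthetic data under various operating conditions. The two datasets were then merged for data augmentation, compensating for the limited quantity of measured data. The external and internal recycle flow rates, aeration rate, temperature, influent nitrogen load, and pH were selected as input variables, and machine learning models were developed to predict N2O and NH4+ in the aerobic tank and the effluent TN concentration. Lasso Regression, Random Forest, k-NN, and SVR algorithms were applied for training, and their performance was evaluated. SVR showed the best performance for all nitrogen species, and all developed models achieved high predictive performance with R² ≥ 0.75. These results suggest that simulator-based data augmentation makes it feasible to build machine learning models for the integrated control of gas-phase and liquid-phase nitrogen.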
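
The idea of merging measured and simulator-generated rows before training can be sketched with k-NN, the simplest of the four algorithms listed in the abstract. This is an illustrative sketch only: the feature choice, the numbers, and the `knn_predict` helper are hypothetical, and a real model would scale the features and use all six input variables.

```python
import math


def knn_predict(train, query, k=3):
    """Average the targets of the k training rows closest to `query`.

    `train` is a list of (features, target) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical rows: (aeration rate, temperature) -> NH4+ concentration.
# Values are made up for illustration and are not from the study.
measured = [((1.0, 20.0), 5.0), ((2.0, 20.0), 3.0)]
simulated = [((1.5, 20.0), 4.0), ((3.0, 20.0), 1.0)]

# Data augmentation here is simply the union of both sources.
augmented = measured + simulated

query = (2.8, 20.0)  # an operating point outside the measured range
print(knn_predict(measured, query, k=1))   # nearest measured row only
print(knn_predict(augmented, query, k=1))  # a simulated row is closer
```

With only measured rows, the query falls back on the nearest available measurement; once simulated rows cover the wider operating range, the prediction is drawn from a closer operating condition.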

KRW 4,600

2025.10 KCI-listed. Free for subscribing certified institutions, paid for individual members

인공지능 데이터 증강과 환경 요인 분석을 통한 작물 표현형 예측 기법 연구

A Study on Crop Phenotype Prediction by Integrating Environmental Data Collection and AI-Based Data Augmentation Techniques

변성우, 최지호, 여욱현

생물환경조절학회지 Vol.34 No.4 pp.535-544 한국생물환경조절학회

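Global food security is increasingly threatened by climate change and population growth, and addressing it requires advanced breeding strategies that integrate genomics, phenomics, and artificial intelligence. This study aims to improve phenotype prediction accuracy in tomato breeding by using genotype data augmentation and semi-supervised learning. A total of 192 tomato lines were grown in a greenhouse, and genotype, phenotype, and environmental data were collected for five key traits: fruit weight, height, width, firmness, and sugar content. The proposed genotype data augmentation framework, based on a one-dimensional convolutional neural network, expands the original dataset and introduces a pseudo-labeling strategy to make effective use of unlabeled data. In addition, for environmental variables such as temperature and humidity, statistical features over the growing period were extracted and integrated into the model input so that cultivation conditions are reflected more realistically. Phenotype prediction was performed with a range of models, including tree-based and deep learning architectures, and performance was compared and evaluated across network structures. The experiments showed that genotype data augmentation improved prediction performance overall, with the largest gains in tree-based models such as LightGBM and CatBoost. Comparison experiments against recent deep learning models further confirmed the robustness of the proposed approach. These results show that the proposed method is an effective strategy for achieving practical performance gains even in data-limited breeding settings, and they suggest that it could develop into a scalable digital breeding framework through future integration with multi-omics and environmental data.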
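
Summarizing environmental time series into per-variable statistics, as the abstract describes, might look like the sketch below. The abstract does not say which statistics were used, so the choice of mean, min, max, and standard deviation is an assumption, as are the function names and the sample readings.

```python
from statistics import mean, pstdev


def summarize(series):
    """Collapse one time series into a few summary statistics (assumed set)."""
    return {
        "mean": mean(series),
        "min": min(series),
        "max": max(series),
        "std": pstdev(series),  # population standard deviation
    }


def environment_features(readings):
    """Flatten {variable: series} into {variable_stat: value} model inputs."""
    features = {}
    for name, series in readings.items():
        for stat, value in summarize(series).items():
            features[f"{name}_{stat}"] = value
    return features


# Hypothetical greenhouse readings over a growing period (illustration only)
readings = {
    "temperature": [20.0, 22.0, 24.0],
    "humidity": [60.0, 70.0, 65.0],
}
print(environment_features(readings))
```

The resulting fixed-length feature dictionary can be appended to each line's genotype features, regardless of how many sensor readings were logged during the growing period.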

KRW 4,000

2021.12 KCI-listed. Free for subscribing certified institutions, paid for individual members

위상 최적화를 위한 생산적 적대 신경망 기반 데이터 증강 기법

GAN-based Data Augmentation methods for Topology Optimization

이승혜, 이유진, 이기학, 이재홍

한국대공간건축 논문집 (formerly 한국공간구조학회지) Vol.21 No.4 pp.39-48 한국공간구조학회

In this paper, a GAN-based data augmentation method is proposed for topology optimization. In machine learning, the size of the dataset determines the accuracy and robustness of the trained neural network architectures, especially for supervised learning networks. Because insufficient data tends to lead to overfitting or underfitting, a data augmentation method is needed to increase the amount of data and reduce overfitting when training a machine learning model. In this study, a Generative Adversarial Network (GAN) is used to augment the topology optimization dataset. The generated dataset is compared with the original dataset.

KRW 4,000

2020.06 KCI-listed. Service ended (access restricted)

영상 내 물체 검출 및 분류를 위한 소규모 데이터 확장 기법

Data Augmentation Method of Small Dataset for Object Detection and Classification

김진용, 김은경, 김성신

로봇학회논문지 Vol.15 No.2 (Serial No.56) pp.184-189 한국로봇학회

This paper studies data augmentation for small datasets in deep learning. When training a deep learning model to recognize and classify non-mainstream objects, it is difficult to obtain a large amount of training data. This paper therefore proposes a data augmentation method using perspective transform and image synthesis. Because detecting the object area requires the object region to be saved for every training image, we devised a way to augment the data and save the object regions at the same time. To verify the proposed method, an experiment compared its classification accuracy against data augmented by the traditional method, with transfer learning used for model training. In the experiments, the model trained using the proposed method showed higher accuracy than the model trained using the traditional method.
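
Keeping the object region in sync with a perspective-transformed image amounts to pushing the box corners through the same 3x3 homography. The sketch below shows that step only; it is not the authors' implementation, and the matrix, box values, and function names are hypothetical.

```python
def apply_homography(h, point):
    """Map an (x, y) point through a 3x3 homography given as nested lists."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]  # projective scale factor
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def warp_box(h, box):
    """Warp a (x_min, y_min, x_max, y_max) box and return the
    axis-aligned box that encloses its four transformed corners."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    warped = [apply_homography(h, c) for c in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical transform: stretch x by 2, then shift by (10, 5)
H = [[2, 0, 10], [0, 1, 5], [0, 0, 1]]
print(warp_box(H, (0, 0, 10, 10)))
```

Applying the same matrix to both the image pixels and the box corners is what lets the augmented sample be saved together with a valid object region, without manual relabeling.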

2019.03 KCI-listed. Service ended (access restricted)

수중 소나 영상 학습 데이터의 왜곡 및 회전 Augmentation을 통한 딥러닝 기반의 마커 검출 성능에 관한 연구

Study of Marker Detection Performance on Deep Learning via Distortion and Rotation Augmentation of Training Data on Underwater Sonar Image

이언호, 이영준, 최진우, 이세진

로봇학회논문지 Vol.14 No.1 (Serial No.51) pp.14-21 한국로봇학회

In ground environments, mobile robot research uses sensors such as GPS and optical cameras to localize surrounding landmarks and estimate the robot's position. However, an underwater environment restricts the use of sensors such as optical cameras and GPS. Also, unlike on the ground, it is difficult to observe landmarks continuously for location estimation. For this reason, underwater research installs artificial markers to provide strong and lasting landmarks. When artificial markers are imaged with an underwater sonar sensor, different types of noise appear in the sonar image. This noise is one of the factors that reduce object detection performance. This paper aims to improve object detection performance through distortion and rotation augmentation of the training data. Object detection is performed using Faster R-CNN.
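
Rotation augmentation can be illustrated in its simplest form with right-angle rotations of a small image grid. The abstract does not state which angles or distortion models were used, so the restriction to 90-degree steps, the function names, and the toy image are all assumptions made for this sketch.

```python
def rotate90(image):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_variants(image):
    """Return the image at 0, 90, 180, and 270 degrees clockwise."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


# Toy 2x2 "sonar image" with distinct pixel values (illustration only)
image = [[1, 2],
         [3, 4]]

for variant in rotation_variants(image):
    print(variant)
```

Each training image yields four samples this way; any marker bounding boxes would need to be rotated by the same step to stay aligned with the pixels.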