This study investigates the use of Conditional Tabular Generative Adversarial Networks (CT-GAN) to generate synthetic data for turnover prediction in large employment datasets. CT-GAN is compared with three established oversampling methods, Adaptive Synthetic Sampling (ADASYN), Synthetic Minority Over-sampling Technique (SMOTE), and Random Oversampling (ROS), across four classifiers: Logistic Regression (LR), Linear Discriminant Analysis (LDA), Random Forest (RF), and Extreme Learning Machines (ELM). Performance is evaluated with AUC and F1-score. Results show that GAN-based techniques, especially CT-GAN, outperform the traditional methods in addressing data imbalance, highlighting the need for advanced oversampling methods to improve classification performance on imbalanced datasets.
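To make the baseline methods concrete, the following is a minimal sketch of how ROS and a SMOTE-style interpolation create new minority-class rows. It is not the study's implementation: the function names, the toy data, and the choice of `k` are assumptions made for illustration, and ADASYN and CT-GAN are not covered here.

```python
import random

def random_oversample(minority, n_new, rng):
    """ROS: duplicate existing minority rows, drawn with replacement."""
    return [list(rng.choice(minority)) for _ in range(n_new)]

def smote_like(minority, n_new, k, rng):
    """SMOTE-style: interpolate between a minority row and one of its
    k nearest minority neighbours (squared Euclidean distance)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy minority class with two numeric features (not study data).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
rng = random.Random(0)
ros_rows = random_oversample(minority, 6, rng)
smote_rows = smote_like(minority, 6, 2, rng)
```

ROS only repeats rows that already exist, whereas the interpolation step produces new points on line segments between minority neighbours. A generative model such as CT-GAN instead learns the joint distribution of the columns and samples from it.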
This study integrates TabTransformer and CTGAN to predict job satisfaction among South Korean college graduates. TabTransformer models complex relationships in tabular data with self-attention, while CTGAN generates high-quality synthetic samples. The combined approach achieves an accuracy of 0.85, precision of 0.83, recall of 0.82, F1-score of 0.82, and an AUC of 0.88. Cross-validation yields a mean accuracy of 0.85 with a standard deviation of 0.008, indicating that the model is robust and generalizes well. Integrating TabTransformer and CTGAN thus improves predictive accuracy and generalizability, and the results provide valuable insights for employment policy and research.
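As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. This is a sketch under one assumption: that the reported figures are single (not class-averaged) values, for which the identity holds exactly. The function name is hypothetical.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
f1 = f1_from_precision_recall(0.83, 0.82)
print(round(f1, 2))  # 0.82
```

The recomputed value agrees with the reported F1-score of 0.82 at two decimal places.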