PURPOSES : This study aimed to compare object detection performance across various analysis methods using point-cloud data collected from LiDAR sensors, with the goal of contributing to safer road environments. The findings provide essential information that enables automated vehicles to accurately perceive their surroundings and effectively avoid potential hazards. They also serve as a foundation for applying LiDAR sensors to traffic monitoring, thereby enabling the collection and analysis of real-time traffic data in road environments.

METHODS : Object detection was performed on the KITTI dataset, which consists of real-world driving environment data, using models based on different point-cloud processing methods: PointPillars for the voxel-based approach, Part-A2-Net for the point-based approach, and PV-RCNN for the combined point-and-voxel approach. The performance of each model was compared using the mean average precision (mAP) metric.

RESULTS : While all models exhibited strong performance, PV-RCNN achieved the highest scores across the easy, moderate, and hard difficulty levels, outperforming the other models in bounding box (Bbox), bird's eye view (BEV), and 3D object detection tasks. These results highlight PV-RCNN's ability to maintain high performance across diverse driving environments by combining the efficiency of the voxel-based method with the precision of the point-based method. The findings provide foundational insights not only for automated vehicles but also for traffic detection, enabling accurate detection of various objects in complex road environments. In urban settings, a model such as PV-RCNN may be more suitable, whereas in situations requiring real-time processing efficiency, the voxel-based PointPillars model could be advantageous; the results thus indicate which model is best suited to a given scenario.

CONCLUSIONS : The findings of this study help enhance the safety and reliability of automated driving systems by enabling vehicles to perceive their surroundings accurately and avoid potential hazards at an early stage. Furthermore, the use of LiDAR sensors for traffic monitoring is expected to optimize traffic flow by enabling the collection and analysis of real-time traffic data from road environments.
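Since the three detectors are compared with the mAP metric, a minimal sketch of an interpolated average precision computation may help fix the idea. The function below is illustrative only: it assumes detections have already been matched to ground-truth boxes at the class-specific IoU threshold, and its recall sampling only approximates the official KITTI evaluation protocol.

```python
import numpy as np

def average_precision(scores, matches, num_gt, num_points=40):
    """Interpolated average precision for one class (illustrative).

    scores  -- confidence of each detection
    matches -- 1 if the detection matched a ground-truth box at the
               class-specific IoU threshold, else 0
    num_gt  -- total number of ground-truth boxes (assumed > 0)
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    hits = np.asarray(matches, dtype=float)[order]
    tp = np.cumsum(hits)            # true positives, by descending score
    fp = np.cumsum(1.0 - hits)      # false positives, by descending score
    recall = tp / num_gt
    precision = tp / (tp + fp)

    # Average the maximum precision at evenly spaced recall levels;
    # the official KITTI evaluation differs in detail (e.g., R40 sampling).
    ap = 0.0
    for r in np.linspace(0.0, 1.0, num_points):
        p = precision[recall >= r]
        ap += (p.max() if p.size else 0.0) / num_points
    return ap
```

The mAP reported for each model is then the mean of this quantity over object classes (and, on KITTI, it is reported separately per difficulty level and per task, i.e., Bbox, BEV, and 3D).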
Face-related applications such as face recognition, facial expression recognition, driver state monitoring, and gaze estimation are among the most frequently performed tasks in human-robot interaction (HRI), intelligent vehicles, and security systems. In these applications, accurate head pose estimation is a key requirement. However, conventional methods have lacked the accuracy, robustness, or processing speed needed for practical use. In this paper, we propose a novel method for estimating head pose with a monocular camera. The proposed algorithm is based on a deep neural network for multi-task learning that operates on a small grayscale image. This network jointly detects multi-view faces and estimates head pose under challenging environmental conditions such as illumination changes and large pose variations. The proposed framework quantitatively and qualitatively outperforms state-of-the-art methods, achieving an average head pose estimation error of less than 4.5° in real time.
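The abstract describes a single network that jointly detects faces and regresses head pose from a small grayscale input. As a rough sketch of that multi-task pattern (not the authors' actual architecture, which the abstract does not specify), the following PyTorch code shows a shared convolutional backbone with a face-classification head and a pose-regression head trained with a weighted joint loss; all layer sizes, the loss form, and the weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskHeadPoseNet(nn.Module):
    """Illustrative multi-task network: a shared convolutional backbone
    on a small grayscale crop, one head for face/non-face classification,
    and one head for head pose (yaw, pitch, roll) regression.
    Layer sizes are assumptions, not the paper's architecture."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.face_head = nn.Linear(64, 2)   # face / non-face logits
        self.pose_head = nn.Linear(64, 3)   # yaw, pitch, roll (degrees)

    def forward(self, x):
        feat = self.backbone(x)
        return self.face_head(feat), self.pose_head(feat)

def multitask_loss(face_logits, pose_pred, face_label, pose_gt, lam=1.0):
    """Joint loss: classification plus pose regression on face samples.
    The weighting scheme is an assumption for illustration."""
    cls = F.cross_entropy(face_logits, face_label)
    mask = face_label == 1  # regress pose only on positive (face) samples
    if mask.any():
        reg = F.smooth_l1_loss(pose_pred[mask], pose_gt[mask])
    else:
        reg = pose_pred.sum() * 0.0  # keep the graph valid with no faces
    return cls + lam * reg
```

In practice, a multi-view face detector would also predict bounding-box offsets, typically over multiple scales; this is omitted here to keep the sketch minimal.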