🤖 AI Summary
To address the scarcity of real-world cyclist data and the limited pose and appearance diversity available for autonomous driving perception, this paper proposes the first controllable synthetic data generation framework based on 3D Gaussian Splatting (3DGS). Methodologically: (1) we design a parameterized joint 3D model of bicycle and rider, enabling fine-grained control over eight degrees of freedom; (2) we introduce a 3D keypoint-optimized inverse kinematics assembly strategy to ensure physically plausible rider placement; and (3) we integrate multi-view part-level synthesis, video-driven motion retargeting, and high-fidelity 3DGS rendering to generate dynamic, photorealistic, and pose-controllable cyclist sequences. Experiments demonstrate that the synthesized data significantly outperforms Stable Diffusion-based approaches on downstream tasks, including semantic segmentation and pose estimation, while enabling novel research directions such as fine-grained pose modeling and spatiotemporal human-vehicle interaction analysis.
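The summary states that the bicycle-rider model exposes eight controllable degrees of freedom but does not enumerate them. As a minimal sketch of what an 8-DoF pose-controllable parameterization could look like, the DoF names and semantics below are hypothetical placeholders, not the paper's actual parameterization:

```python
# Illustrative sketch only: the eight DoF names below are hypothetical
# placeholders for a pose-controllable articulated bicycle model; the
# paper does not enumerate its actual parameters here.
from dataclasses import dataclass, fields
import math

@dataclass
class BikePose8DoF:
    steering: float = 0.0      # fork/handlebar yaw (rad), hypothetical
    front_wheel: float = 0.0   # front wheel spin (rad)
    rear_wheel: float = 0.0    # rear wheel spin (rad)
    crank: float = 0.0         # crank/pedal-arm rotation (rad)
    pedal_left: float = 0.0    # left pedal pitch (rad)
    pedal_right: float = 0.0   # right pedal pitch (rad)
    lean: float = 0.0          # whole-bike roll (rad)
    kickstand: float = 0.0     # kickstand deploy angle (rad)

    def clamped(self) -> "BikePose8DoF":
        """Return a copy with every angle wrapped into [-pi, pi]."""
        wrap = lambda a: math.atan2(math.sin(a), math.cos(a))
        return BikePose8DoF(**{f.name: wrap(getattr(self, f.name))
                               for f in fields(self)})

# Usage: set a subset of DoF, wrap angles into a canonical range.
pose = BikePose8DoF(steering=0.3, crank=4.0).clamped()
```

A flat, named parameter vector like this is a common interface choice for composing articulated parts before rendering, since each DoF can be driven independently (e.g., by retargeted video motion).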
📝 Abstract
In Autonomous Driving (AD) perception, cyclists are considered safety-critical scene objects. Commonly used publicly available AD datasets typically contain large numbers of car and vehicle instances but few cyclist instances, usually with limited appearance and pose diversity. This scarcity of cyclist training data not only limits the generalization of deep-learning perception models for cyclist semantic segmentation, pose estimation, and crossing-intention prediction, but also hinders research on new cyclist-related tasks such as fine-grained cyclist pose estimation and spatio-temporal analysis of complex interactions between humans and articulated objects. To address this data scarcity, in this paper we propose a framework for generating synthetic dynamic 3D cyclist assets that can be used to produce training data for different tasks. Within this framework, we design a methodology for creating a new part-based, multi-view, articulated synthetic 3D bicycle dataset, which we call 3DArticBikes, and use it to train a 3D Gaussian Splatting (3DGS)-based reconstruction and image-rendering method. We then propose a parametric bicycle 3DGS composition model to assemble 8-DoF pose-controllable 3D bicycles. Finally, using dynamic information from cyclist videos, we build a complete synthetic dynamic 3D cyclist (a rider pedaling a bicycle) by re-posing a selectable synthetic 3D person and automatically placing the rider onto one of our new articulated 3D bicycles using a proposed 3D keypoint optimization-based inverse kinematics pose refinement. We present both qualitative and quantitative results comparing our generated cyclists against those from a recent Stable Diffusion-based method.
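The keypoint optimization-based inverse kinematics step described above can be illustrated with a minimal sketch: iteratively adjusting joint angles so a rider keypoint lands on the corresponding bicycle attachment point. This is not the paper's implementation; the 2D two-link leg, segment lengths, pedal target, and gradient-descent solver below are all simplifying assumptions:

```python
# Minimal sketch (not the paper's method) of keypoint-based IK refinement:
# move a rider's ankle keypoint onto a pedal keypoint by descending the
# squared keypoint distance over two leg joint angles (hip, knee).
# All lengths and targets are hypothetical.
import numpy as np

THIGH, SHIN = 0.45, 0.43  # hypothetical segment lengths (meters)

def ankle_pos(theta):
    """Forward kinematics: hip at origin, theta = (hip, knee) angles in rad."""
    hip, knee = theta
    knee_xy = np.array([THIGH * np.cos(hip), THIGH * np.sin(hip)])
    return knee_xy + np.array([SHIN * np.cos(hip + knee),
                               SHIN * np.sin(hip + knee)])

def refine_pose(target, theta0, lr=0.5, iters=200, eps=1e-5):
    """Gradient descent on squared keypoint distance (numerical gradient)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        base = np.sum((ankle_pos(theta) - target) ** 2)
        grad = np.zeros(2)
        for i in range(2):
            t = theta.copy()
            t[i] += eps
            grad[i] = (np.sum((ankle_pos(t) - target) ** 2) - base) / eps
        theta -= lr * grad
    return theta

pedal = np.array([0.2, -0.6])            # reachable pedal keypoint
theta = refine_pose(pedal, [-1.2, 0.4])  # start from a bent-leg guess
err = np.linalg.norm(ankle_pos(theta) - pedal)
```

In the paper's full pipeline this kind of refinement would operate on 3D keypoints of a re-posed synthetic person and an articulated bicycle; the sketch only conveys the optimization structure (forward kinematics plus a keypoint-distance objective).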