PennSyn2Real: Training Object Recognition Models without Human Labeling

Research paper by Ty Nguyen, Ian D. Miller, Avi Cohen, Dinesh Thakur, Shashank Prasad, Arjun Guru, Camillo J. Taylor, Pratik Chaudhari, Vijay Kumar

Indexed on: 23 Sep '20 · Published on: 21 Sep '20 · Published in: arXiv - Computer Science - Computer Vision and Pattern Recognition


Scalability is a critical problem in generating training images for deep learning models. We propose PennSyn2Real, a photo-realistic synthetic dataset of more than 100,000 4K images covering more than 20 types of micro aerial vehicles (MAVs), which can be used to generate an arbitrary number of training images for MAV detection and classification. Our data-generation framework combines chroma-keying, a mature cinematography technique, with a motion-tracking system, yielding artifact-free, curated, annotated images in which object orientation and lighting are controlled. The framework is easy to set up and applies to a broad range of objects, narrowing the gap between synthetic and real-world data. We demonstrate that CNNs trained on the synthetic data perform on par with those trained on real-world data in both semantic segmentation and object detection.
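The chroma-keying pipeline the abstract describes can be sketched in a few lines: segment the object from a green screen, composite it onto an arbitrary background, and derive an annotation (here, a bounding box) from the mask for free. This is a minimal illustrative sketch, not the paper's implementation; the green-dominance threshold, function names, and the toy frame are assumptions made for the example.

```python
import numpy as np

def chroma_key_mask(frame, g_thresh=1.3):
    """Foreground mask: pixels whose green channel dominates red and
    blue by factor g_thresh are treated as green-screen background.
    (Illustrative rule; the threshold value is an assumption.)"""
    r = frame[..., 0].astype(float)
    g = frame[..., 1].astype(float)
    b = frame[..., 2].astype(float)
    is_screen = g > g_thresh * np.maximum(r, b) + 1e-6
    return ~is_screen  # True where the object (foreground) is

def composite(frame, mask, background_img):
    """Paste the masked object onto an arbitrary background image,
    producing a new training image with known object pixels."""
    out = background_img.copy()
    out[mask] = frame[mask]
    return out

def bounding_box(mask):
    """Axis-aligned bounding box (x0, y0, x1, y1) of the foreground,
    usable directly as a detection label."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Tiny synthetic demo: a 10x10 green frame with a red 3x3 "object".
frame = np.zeros((10, 10, 3), dtype=np.uint8)
frame[..., 1] = 255                # green screen everywhere
frame[4:7, 2:5] = [200, 30, 30]    # the object
mask = chroma_key_mask(frame)
bg = np.full_like(frame, 128)      # gray target background
img = composite(frame, mask, bg)
print(bounding_box(mask))          # (2, 4, 4, 6)
```

Because the mask is exact, both segmentation labels (the mask itself) and detection labels (the box) come out of the compositing step with no human annotation, which is the property the dataset exploits.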