Abstract
Image segmentation, increasingly prevalent in computer vision, plays an essential part in object detection, tracking, and even virtual and augmented reality. Early segmentation methods that relied on hand-crafted features have quickly been superseded by deep learning algorithms. Nonetheless, deep learning algorithms are rarely applied to real-object segmentation because of a lack of ground-truth labels. This work introduces the use of 3D models to generate a segmentation training dataset. The system projects 3D models onto the 2D plane and merges the resulting 2D images with different backgrounds to obtain training images. Because the position of each object in the composite is known, ground-truth labels can be obtained automatically, without manual annotation. Experimental results indicate that the synthetic images can be used to train existing networks such as FCNs and DeepLab, and that the trained models achieve relatively accurate segmentation results on real images. Moreover, a modified model based on DeepLab-CRF-LargeFOV achieves more precise segmentation results by strengthening its localization and edge performance.
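The core data-generation step summarized above, compositing a projected object onto a background and reading the ground-truth mask directly from the render's alpha channel, could be sketched as follows. The `composite` function and the toy arrays are hypothetical illustrations under the assumption that the renderer emits an RGBA image with nonzero alpha where the object is; they are not the authors' code.

```python
import numpy as np

def composite(render_rgba, background_rgb):
    """Alpha-blend a rendered object onto a background image and
    derive the per-pixel segmentation mask at the same time."""
    alpha = render_rgba[..., 3:4] / 255.0               # per-pixel opacity
    image = (alpha * render_rgba[..., :3]
             + (1.0 - alpha) * background_rgb).astype(np.uint8)
    # The label comes for free: object pixels are exactly where alpha > 0.
    mask = (render_rgba[..., 3] > 0).astype(np.uint8)
    return image, mask

# Tiny synthetic example: a 4x4 render containing a 2x2 opaque red object.
render = np.zeros((4, 4, 4), dtype=np.uint8)
render[1:3, 1:3] = [255, 0, 0, 255]                     # fully opaque object
background = np.full((4, 4, 3), 30, dtype=np.uint8)     # uniform grey scene
img, mask = composite(render, background)
print(mask.sum())  # 4 object pixels labeled, with no manual annotation
```

In a full pipeline the background would be a real photograph and the render a projection of the 3D model at a sampled pose; the mask pairs with the composite image as a ready-made training example.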
