Abstract
Accurate monitoring of crop phenotypic traits is essential for efficient farm management and automation in agriculture. Multi-object tracking (MOT) and video instance segmentation (VIS) offer promising approaches to enhance agricultural robotic-vision systems, yet a major limitation is the scarcity of high-quality spatial-temporal datasets. In this paper, we introduce BUP-ST20, a novel weakly labelled spatial-temporal dataset for sweet pepper tracking and segmentation captured on a robotic platform. The dataset is generated by leveraging still-image annotations and a neural radiance field approach (PAg-NeRF) to automatically obtain consistent object semantics and identities across video sequences. BUP-ST20 contains 16,240 images from 275 sequences, with weak labels for training and validation and human-annotated ground truth for evaluation. We describe how this pseudo-labelling approach can be adapted to any robotic platform with the required inputs, greatly reducing the annotation effort needed to create datasets for agriculture and horticulture. Using BUP-ST20, we evaluate state-of-the-art MOT approaches and propose two novel tracklet matching criteria that improve robustness under skipped frames and low-frame-rate cameras. When the frame rate is reduced to approximately 1 frame per second, our offline MOT-based matching criteria improve performance by 19.63 absolute points, underscoring their validity as a tracklet aggregation technique in this scenario. Our experiments demonstrate the effectiveness of the dataset for benchmarking MOT and VIS techniques in the agricultural domain, and highlight challenges such as occlusion, shape variation, and the limitations of weak labelling. BUP-ST20 serves as a valuable resource for further advances in robotic crop monitoring and agricultural automation, while demonstrating how future weakly labelled datasets can be created using robotic platforms.
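To make the tracklet-aggregation idea concrete, the sketch below illustrates one plausible offline matching criterion: two tracklets are merged when the motion-extrapolated last box of the earlier tracklet overlaps the first box of the later one across the frame gap. This is a minimal, hypothetical Python illustration under assumed box and frame representations; the abstract does not specify the paper's two actual criteria, and the `should_merge` rule, linear-motion assumption, and IoU threshold here are illustrative placeholders only.

```python
# Hypothetical sketch of offline tracklet aggregation across a frame gap.
# Not the paper's actual matching criteria (unspecified in the abstract);
# this shows one common approach: motion-extrapolated IoU matching.
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

@dataclass
class Tracklet:
    frames: List[int]  # frame indices, ascending
    boxes: List[Box]   # one box per frame

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def extrapolate(t: Tracklet, target_frame: int) -> Box:
    """Linearly extrapolate a tracklet's last box to a later frame,
    using the velocity estimated from its final two observations."""
    if len(t.boxes) < 2:
        return t.boxes[-1]
    dt = t.frames[-1] - t.frames[-2]
    gap = target_frame - t.frames[-1]
    last, prev = t.boxes[-1], t.boxes[-2]
    velocity = [(l - p) / dt for l, p in zip(last, prev)]
    return tuple(l + v * gap for l, v in zip(last, velocity))

def should_merge(a: Tracklet, b: Tracklet, iou_thresh: float = 0.3) -> bool:
    """Merge criterion (assumed): b starts after a ends, and a's
    motion-extrapolated box overlaps b's first box above a threshold."""
    if b.frames[0] <= a.frames[-1]:
        return False
    predicted = extrapolate(a, b.frames[0])
    return iou(predicted, b.boxes[0]) >= iou_thresh

if __name__ == "__main__":
    a = Tracklet(frames=[0, 1], boxes=[(10, 10, 30, 30), (12, 10, 32, 30)])
    b = Tracklet(frames=[5], boxes=[(20, 10, 40, 30)])  # reappears after a gap
    print(should_merge(a, b))  # True: extrapolated box overlaps b's first box
```

In a frame-skipped or low-frame-rate setting, a detection gap of several frames is expected rather than exceptional, which is why extrapolating motion across the gap (rather than requiring frame-to-frame continuity) is the natural design choice for such an aggregation step.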
