GMT: Guided Mask Transformer for Leaf Instance Segmentation

📅 2024-06-24
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Fine-grained single-leaf instance segmentation in plants remains challenging due to high morphological similarity among leaves, large scale variations, severe occlusion, and scarcity of annotated training data. Method: We propose a Guided Mask Transformer that innovatively models leaf spatial distribution priors as learnable positional guidance functions integrated into the Transformer decoder, enabling geometry-aware embedding space disentanglement. Coupled with dynamic mask attention and few-shot adaptation training, it forms an end-to-end segmentation framework. Contribution/Results: Evaluated on three public plant datasets, our method consistently outperforms state-of-the-art approaches, particularly improving segmentation accuracy and instance discrimination robustness for small-scale leaves. It establishes a new paradigm for agricultural phenotyping—achieving high precision while significantly reducing reliance on labor-intensive pixel-level annotations.

Technology Category

Application Category

📝 Abstract
Leaf instance segmentation is a challenging multi-instance segmentation task, aiming to separate and delineate each leaf in an image of a plant. Accurate segmentation of each leaf is crucial for plant-related applications such as the fine-grained monitoring of plant growth and crop yield estimation. This task is challenging because of the high similarity (in shape and colour), great size variation, and heavy occlusions among leaf instances. Furthermore, the typically small size of annotated leaf datasets makes it more difficult to learn the distinctive features needed for precise segmentation. We hypothesise that the key to overcoming the these challenges lies in the specific spatial patterns of leaf distribution. In this paper, we propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor. These spatial priors are embedded in a set of guide functions that map leaves at different positions into a more separable embedding space. Our GMT consistently outperforms the state-of-the-art on three public plant datasets. Our code is available at https://github.com/vios-s/gmt-leaf-ins-seg.
Problem

Research questions and friction points this paper is trying to address.

Plant Leaf Recognition
Image Segmentation
Agricultural Yield Estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Guided Mask Transformer
spatial distribution
leaf image segmentation
🔎 Similar Papers
No similar papers found.
F
Feng Chen
School of Engineering, University of Edinburgh, United Kingdom
S
S. Tsaftaris
School of Engineering, University of Edinburgh, United Kingdom
M
M. Giuffrida
School of Computer Science, University of Nottingham, United Kingdom