ForestMamba: Sparse Mamba with Geometry-guided Queries for 3D Forest Point Cloud Segmentation

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses semantic and instance segmentation of forest LiDAR point clouds, which is made difficult by massive data volumes, uneven sampling density, crown overlap, and geographic variability. The authors propose ForestMamba, which integrates forest structural priors into a Mamba-based architecture. The method introduces a vertical-priority sparse voxel serialization strategy, a geometric query initialization mechanism guided by a multi-scale canopy height model (CHM), and a query decoder that fuses local neighborhood features with a dual-path Mamba structure. While maintaining linear computational complexity, the approach improves tree separation in complex forested areas. Experiments across seven forest sites show that it consistently outperforms existing methods in both semantic and instance segmentation, with 3× faster inference and 2.3× lower GPU memory consumption than Transformer-based methods.
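To make the idea of vertical-priority serialization concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function name `vertical_priority_order` and the choice to sort by column (x, y) and then by height (z) are assumptions made for illustration, and the paper's slab grouping may differ.

```python
import numpy as np


def vertical_priority_order(voxels: np.ndarray) -> np.ndarray:
    """Return indices that order integer voxel coordinates (N, 3) column by column.

    Hypothetical helper: voxels sharing the same (x, y) column become
    contiguous in the sequence, sorted bottom to top along z.
    """
    x, y, z = voxels[:, 0], voxels[:, 1], voxels[:, 2]
    # np.lexsort treats the LAST key as the primary key, so this sorts
    # by x first, then y, and finally z within each (x, y) column.
    return np.lexsort((z, y, x))


# Five occupied voxels from a sparse grid (illustrative data only).
voxels = np.array([[1, 0, 2], [0, 0, 1], [1, 0, 0], [0, 0, 0], [0, 1, 5]])
order = vertical_priority_order(voxels)
sequence = voxels[order]  # voxels in the order a sequence model would read them
```

With an ordering like this, voxels along one trunk or crown column sit next to each other in the sequence, which is the kind of vertical coherence the summary describes.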

📝 Abstract

AI-based semantic and instance segmentation of terrestrial and drone LiDAR point clouds is emerging as a transformative approach for converting the complex 3D structure of forests into actionable information for forest monitoring and biodiversity assessment. However, forest LiDAR scenes remain highly challenging due to their large data volumes, irregular sampling density, overlapping and complex canopy structure, and geographic variability. Existing methods based on sparse convolutions or Transformers achieve promising results but suffer from two key limitations: (i) the quadratic complexity of attention scales poorly to large forest scenes, and (ii) generic context modeling does not exploit forest structural priors, limiting tree separation in complex regions. To address these challenges, we propose ForestMamba, a structure-aware method that incorporates forest-specific priors into feature encoding, query generation, and query refinement, while replacing quadratic attention with linear-time state-space modeling. First, we introduce a sparse encoder with vertical-priority slab serialization that organizes sparse voxels into vertically coherent sequences for efficient long-range context modeling. Second, we propose a geometry-guided query initialization strategy based on an on-the-fly multi-scale Canopy Height Model (CHM), where canopy maxima provide ecologically meaningful query seeds, supplemented by Farthest Point Sampling (FPS) to cover understory trees. Third, we design a Mamba-based query decoder that combines local kNN voxel aggregation with a spatial dual-path Mamba for query refinement with linear computational complexity. Extensive experiments across seven forest regions demonstrate that ForestMamba consistently outperforms existing baselines in both segmentation tasks, while achieving 3 times faster inference and 2.3 times lower GPU memory usage than Transformer-based methods.
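The query initialization step can be illustrated with a small NumPy sketch. It is a simplified, single-scale approximation under stated assumptions, not the paper's code: the function names, the `cell` and `min_height` parameters, the 8-neighbour local-maximum test, and the sample points are all hypothetical.

```python
import numpy as np


def chm_local_maxima(points, cell=1.0, min_height=2.0):
    """Rasterize a canopy height model and return local maxima as (x, y, z) seeds."""
    origin = points[:, :2].min(axis=0)
    ij = np.floor((points[:, :2] - origin) / cell).astype(int)
    h, w = (ij.max(axis=0) + 1).tolist()
    chm = np.full((h, w), -np.inf)
    # Keep the highest return per grid cell.
    np.maximum.at(chm, (ij[:, 0], ij[:, 1]), points[:, 2])
    padded = np.pad(chm, 1, constant_values=-np.inf)
    is_max = chm >= min_height  # ignore empty cells and low vegetation
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di or dj:
                neighbour = padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
                is_max &= chm >= neighbour  # ties are kept as separate seeds
    i, j = np.nonzero(is_max)
    centers = origin + (np.stack([i, j], axis=1) + 0.5) * cell
    return np.column_stack([centers, chm[i, j]])


def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from all chosen points."""
    chosen = [start]
    dist = np.linalg.norm(points - points[start], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)


# Two tall crowns, their shoulders, and one low understory point (illustrative).
points = np.array([
    [0.5, 0.5, 10.0],
    [1.5, 0.5, 6.0],
    [5.5, 5.5, 8.0],
    [4.5, 5.5, 5.0],
    [3.5, 2.5, 1.0],
])
seeds = chm_local_maxima(points)              # canopy-top query seeds
extra = farthest_point_sampling(points, k=2)  # indices of supplementary seeds
```

In this toy scene the CHM maxima find the two crowns but miss the 1 m understory point, which FPS then picks up. A multi-scale variant could repeat `chm_local_maxima` with several `cell` sizes and merge the results; how the paper merges scales and balances CHM seeds against FPS seeds is not stated in the abstract.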

Problem

Research questions and friction points this paper is trying to address.

forest point cloud segmentation

LiDAR

structural priors

quadratic complexity

tree separation

Innovation

Methods, ideas, or system contributions that make the work stand out.

ForestMamba

state-space model

geometry-guided query