Geometric Coastline Localization using Vision-Language Models

📅 2026-06-09

🤖 AI Summary

Coastline extraction is usually treated as pixel-level segmentation, which is at odds with operational monitoring practice, where coastlines are defined by geomorphic proxies such as the vegetation line, dune toe, or cliff edge. This work instead models the coastline as a geometric polyline and makes that geometry the learning objective. Built on the GeoChat-7B/LLaVA-1.5 architecture, the proposed CoastlineVLM-7B jointly performs coastline presence detection, proxy-type classification, and polyline localization. It is developed on the New Zealand Coastal Change Dataset (NZCCD) and high-resolution aerial imagery from Land Information New Zealand (LINZ), and compared against segmentation baselines under strict one-pixel boundary supervision. Relative to those baselines, the model reduces the Hausdorff distance from 37.74 m to 31.84 m and the Earth Mover's Distance from 21.12 m to 17.32 m. The authors also report that geometry-based metrics are better suited than IoU for assessing coastline localization quality.

📝 Abstract

Coastline detection in remote sensing imagery is commonly formulated as a pixel-wise segmentation problem, where the final coastline is extracted from a predicted mask through post-processing. This formulation relegates coastline geometry, the primary representation used in coastal change analysis, to a secondary artifact rather than the learning objective. In practice, coastlines are defined by geomorphic proxies such as vegetation lines, dune toes, or cliff edges, rather than an instantaneous land-water boundary often used in pixel-based segmentation approaches. In this work, we revisit coastline extraction from a representation perspective and formulate the task as geometric boundary localization. We use the New Zealand Coastal Change Dataset (NZCCD) and high-resolution aerial imagery from Land Information New Zealand (LINZ) to develop CoastlineVLM-7B, a vision-language model (VLM) built on the GeoChat-7B/LLaVA-1.5 architecture that jointly performs coastline presence detection, proxy-type classification, and coastline grounding. The model directly predicts a coastline as a polyline rather than a dense segmentation mask. We evaluate CoastlineVLM-7B against segmentation baselines under strict one-pixel boundary supervision. Results show that geometry-based metrics are more suitable for assessing coastline localization quality than pixel-overlap metrics such as Intersection over Union (IoU). CoastlineVLM-7B improves global geometric alignment with reference coastlines, reducing Hausdorff distance from 37.74 m to 31.84 m and Earth Mover's Distance from 21.12 m to 17.32 m. These results indicate that output representation is a critical design choice in coastline extraction, and that geometry-oriented learning, combined with the semantic reasoning capabilities of vision-language models, aligns well with how coastlines are defined and evaluated in operational coastal monitoring.
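The contrast between pixel-overlap and geometry-based metrics can be illustrated with a small sketch. This is not the paper's evaluation code: the helper names (`densify`, `hausdorff`, `one_pixel_mask`, `iou`), the toy polylines, and the use of pixel units are all assumptions made for illustration. It compares two hypothetical predictions against a one-pixel-wide reference coastline, one offset by 2 pixels and one by 20 pixels.

```python
import math

def densify(polyline, step=1.0):
    """Resample a polyline (list of (x, y) vertices) at roughly `step` spacing."""
    pts = [(float(x), float(y)) for x, y in polyline]
    out = [pts[0]]
    for (ax, ay), (bx, by) in zip(pts, pts[1:]):
        n = max(math.ceil(math.hypot(bx - ax, by - ay) / step), 1)
        for k in range(1, n + 1):
            t = k / n
            out.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return out

def hausdorff(p, q):
    """Symmetric Hausdorff distance between two sampled point sets."""
    def directed(a_pts, b_pts):
        # Largest distance from a point in a_pts to its nearest point in b_pts.
        return max(min(math.dist(a, b) for b in b_pts) for a in a_pts)
    return max(directed(p, q), directed(q, p))

def one_pixel_mask(points):
    """Rasterize sampled points to a set of pixel coordinates (one-pixel boundary)."""
    return {(round(x), round(y)) for x, y in points}

def iou(a, b):
    """Intersection over Union of two pixel sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical toy data, in pixel units.
ref = densify([(0, 10), (50, 10)])    # reference coastline
near = densify([(0, 12), (50, 12)])   # prediction offset by 2 px
far = densify([(0, 30), (50, 30)])    # prediction offset by 20 px

for name, pred in [("near", near), ("far", far)]:
    overlap = iou(one_pixel_mask(pred), one_pixel_mask(ref))
    print(name, overlap, hausdorff(pred, ref))
# near 0.0 2.0
# far 0.0 20.0
```

Under one-pixel supervision both predictions score an IoU of zero, even though one is ten times closer to the reference, whereas the Hausdorff distance separates them. Multiplying the pixel distances by the imagery's ground sample distance would give values in metres; that conversion factor is not stated in the abstract, so it is left out here.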

Problem

Research questions and friction points this paper is trying to address.

coastline detection

geometric representation

remote sensing

vision-language models

coastal change analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

geometric boundary localization

vision-language model

coastline extraction