Matrix representation and GPU-optimized parallel B-spline computing

📅 2025-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: B-spline computation on large-scale CAD models is slow on CPUs, and existing GPU approaches, which largely port CPU algorithms directly, fail to exploit GPU parallelism because B-spline evaluation is inherently irregular and recursive.

Method: The paper proposes a matrix-oriented B-spline modeling framework tailored to GPU architectures. A full-matrix representation reformulates recursive B-spline operations as structured matrix additions and multiplications. On top of this, the paper designs a co-optimized memory access and task scheduling strategy that incorporates custom CUDA kernels, shared-memory optimization, and dynamic tiling.

Contribution/Results: Experiments show a speedup of approximately 100× over CPU-based implementations and direct GPU ports for core operations, including B-spline inversion and projection, addressing a key performance bottleneck in large-scale CAD modeling.

📝 Abstract

B-spline modeling is fundamental to CAD systems, but the evaluation and manipulation algorithms currently in use were developed decades ago, specifically for CPU architectures. While still effective for many applications, these algorithms become increasingly inadequate as CAD models grow more complex, for example in large-scale assemblies and microstructures. GPU acceleration offers a promising solution, but most existing GPU B-spline algorithms simply adapt their CPU counterparts without accounting for the mismatch between the unstructured, recursive nature of B-splines and the structured nature of GPU kernels, and so fail to fully leverage GPU capabilities. This paper presents a novel approach that transforms B-spline representations into regular matrix structures, reducing all evaluation and manipulation computations to matrix addition and multiplication and thus aligning them better with GPU architecture. By combining this matrix representation with GPU-optimized task scheduling and memory access patterns, the paper demonstrates significant performance improvements in the key B-spline operations of inversion and projection. Experimental results show an improvement of about two orders of magnitude in computational speed compared to existing methods.
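To make the contrast between recursive and matrix-based evaluation concrete, the sketch below evaluates one uniform cubic B-spline segment in two ways: with the Cox-de Boor recursion, and as a product of a power vector, a fixed 4×4 basis matrix, and the control points. This uses the classic textbook basis matrix for the uniform cubic case only; it is an illustration of the general idea, not the paper's full-matrix representation, and the names `eval_matrix` and `eval_recursive` are hypothetical.

```python
# Classic uniform cubic B-spline basis matrix. Rows correspond to the powers
# (1, u, u^2, u^3); columns correspond to the four control points of a segment.
M = [
    [ 1 / 6,  4 / 6,  1 / 6, 0.0],
    [-3 / 6,  0.0,    3 / 6, 0.0],
    [ 3 / 6, -6 / 6,  3 / 6, 0.0],
    [-1 / 6,  3 / 6, -3 / 6, 1 / 6],
]


def eval_matrix(u, ctrl):
    """Evaluate the segment at local parameter u in [0, 1] as [1 u u^2 u^3] * M * P."""
    powers = [1.0, u, u * u, u ** 3]
    weights = [sum(powers[r] * M[r][c] for r in range(4)) for c in range(4)]
    dim = len(ctrl[0])
    return tuple(sum(weights[c] * ctrl[c][k] for c in range(4)) for k in range(dim))


def cox_de_boor(i, p, t, knots):
    """Recursive Cox-de Boor definition of the basis function N_{i,p}(t)."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    d2 = knots[i + p + 1] - knots[i + 1]
    if d1 > 0:
        left = (t - knots[i]) / d1 * cox_de_boor(i, p - 1, t, knots)
    if d2 > 0:
        right = (knots[i + p + 1] - t) / d2 * cox_de_boor(i + 1, p - 1, t, knots)
    return left + right


def eval_recursive(u, ctrl):
    """Same segment via the recursion, on uniform knots 0..7 (valid for u in [0, 1))."""
    knots = list(range(8))
    t = 3 + u  # the segment spanned by four control points lives on [3, 4)
    weights = [cox_de_boor(i, 3, t, knots) for i in range(4)]
    dim = len(ctrl[0])
    return tuple(sum(weights[c] * ctrl[c][k] for c in range(4)) for k in range(dim))


ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
print(eval_matrix(0.5, ctrl), eval_recursive(0.5, ctrl))
```

The recursive version branches on knot spans and calls itself, while the matrix version is a fixed sequence of multiply-adds with no data-dependent control flow, which is the property that suits GPU kernels.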

Problem

Research questions and friction points this paper is trying to address.

Optimizing B-spline computation for modern GPU architectures

Transforming B-spline representations into matrix structures

Improving performance in B-spline inversion and projection operations
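For readers unfamiliar with the last item: inversion recovers the parameter of a point known to lie on the curve, and projection finds the parameter of the curve point closest to an arbitrary query point. The sketch below shows one common way to do both on a single uniform cubic segment, using a coarse sample search followed by Newton iteration on the orthogonality condition C'(u) · (C(u) − q) = 0. This is a generic textbook scheme written under those assumptions, not the paper's algorithm, and the helper names (`project`, `combine`, `power_rows`) are hypothetical.

```python
# Uniform cubic B-spline basis matrix (rows: powers 1, u, u^2, u^3).
M = [
    [ 1 / 6,  4 / 6,  1 / 6, 0.0],
    [-3 / 6,  0.0,    3 / 6, 0.0],
    [ 3 / 6, -6 / 6,  3 / 6, 0.0],
    [-1 / 6,  3 / 6, -3 / 6, 1 / 6],
]


def power_rows(u):
    """Power vectors for the position, first derivative, and second derivative."""
    return (
        [1.0, u, u * u, u ** 3],
        [0.0, 1.0, 2 * u, 3 * u * u],
        [0.0, 0.0, 2.0, 6 * u],
    )


def combine(powers, ctrl):
    """Compute powers * M * ctrl, i.e. a point or derivative vector on the segment."""
    weights = [sum(powers[r] * M[r][c] for r in range(4)) for c in range(4)]
    return [sum(weights[c] * ctrl[c][k] for c in range(4)) for k in range(len(ctrl[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(q, ctrl, samples=16, iters=20, tol=1e-12):
    """Return the parameter u in [0, 1] of the segment point closest to q."""

    def dist2(u):
        c = combine(power_rows(u)[0], ctrl)
        return sum((ci - qi) ** 2 for ci, qi in zip(c, q))

    # Coarse search gives Newton a starting point near the global minimum.
    u = min((k / samples for k in range(samples + 1)), key=dist2)
    for _ in range(iters):
        p0, p1, p2 = power_rows(u)
        c, d1, d2 = combine(p0, ctrl), combine(p1, ctrl), combine(p2, ctrl)
        r = [ci - qi for ci, qi in zip(c, q)]
        f = dot(d1, r)                # derivative of half the squared distance
        df = dot(d2, r) + dot(d1, d1)
        if abs(df) < 1e-15:
            break
        u_new = min(1.0, max(0.0, u - f / df))  # clamp to the segment
        converged = abs(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u


ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
on_curve = combine(power_rows(0.3)[0], ctrl)
print(project(on_curve, ctrl))      # inversion: query point lies on the curve
print(project((2.0, 5.0), ctrl))    # projection: query point lies off the curve
```

Every quantity inside the Newton loop is a small matrix-vector product, so many query points can in principle be processed in lockstep, one per thread.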

Innovation

Methods, ideas, or system contributions that make the work stand out.

Matrix representation for B-spline computations

GPU-optimized parallel task scheduling

Efficient memory access patterns for GPUs
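As a rough illustration of why a matrix formulation helps with scheduling and memory access, the sketch below evaluates a whole batch of parameter values on one uniform cubic segment as two dense matrix products. The coefficient matrix M · P is computed once per segment, after which every output row is an independent, identically shaped dot product. This is plain Python standing in for what would be a GPU kernel; the layout and the function names `matmul` and `evaluate_batch` are assumptions made for illustration and are not taken from the paper.

```python
# Uniform cubic B-spline basis matrix (rows: powers 1, u, u^2, u^3).
M = [
    [ 1 / 6,  4 / 6,  1 / 6, 0.0],
    [-3 / 6,  0.0,    3 / 6, 0.0],
    [ 3 / 6, -6 / 6,  3 / 6, 0.0],
    [-1 / 6,  3 / 6, -3 / 6, 1 / 6],
]


def matmul(A, B):
    """Dense matrix product of two row-major nested sequences."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def evaluate_batch(us, ctrl):
    """Evaluate the segment at every u in us; returns one point per row."""
    coeff = matmul(M, ctrl)                          # 4 x dim, once per segment
    U = [[1.0, u, u * u, u ** 3] for u in us]        # n x 4, one row per sample
    return matmul(U, coeff)                          # n x dim


ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
for point in evaluate_batch([k / 4 for k in range(5)], ctrl):
    print(point)
```

Because each row of the result depends only on its own row of `U` and the shared 4 × dim coefficient block, a GPU implementation could assign one thread per row and keep the coefficient block in fast on-chip memory.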

🔎 Similar Papers

UKAN: Unbound Kolmogorov-Arnold Network Accompanied with Accelerated Library

2024-08-20 · arXiv.org · Citations: 3
