AI Summary
This work investigates how sensory language relates to stylistic features and proposes the SLIM-LLMs framework, which feeds low-dimensional stylistic representations into nonlinear models for sensory language prediction. Methodologically, it uses Reduced-Rank Ridge Regression (R4) to extract compact, interpretable, low-rank latent representations from LIWC-derived features (rank r = 24 instead of the full r = 74), then integrates them into a lightweight nonlinear predictive architecture. Experiments across five text genres show that SLIM-LLMs match the performance of full-scale language models while using up to 80% fewer parameters, which improves computational efficiency and interpretability. The low-dimensional stylistic encoding retains the stylistic information needed for prediction across genres, giving an efficient and interpretable approach to sensory language modeling under resource constraints.
Abstract
Sensorial language, the language connected to our senses (vision, sound, touch, taste, smell, and interoception), plays a fundamental role in how we communicate experiences and perceptions. We explore the relationship between sensorial language and traditional stylistic features, like those measured by LIWC, using a novel Reduced-Rank Ridge Regression (R4) approach. We demonstrate that low-dimensional latent representations of LIWC features (r = 24) capture stylistic information for sensorial language prediction effectively relative to the full feature set (r = 74). We introduce Stylometrically Lean Interpretable Models (SLIM-LLMs), which model non-linear relationships between these style dimensions. Evaluated across five genres, SLIM-LLMs with low-rank LIWC features match the performance of full-scale language models while reducing parameters by up to 80%.
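
To make the reduced-rank idea concrete, the following is a minimal sketch of reduced-rank ridge regression in NumPy. It assumes the textbook closed-form estimator: fit an ordinary ridge regression, then project the coefficients onto the leading right singular vectors of the fitted values on the ridge-augmented design. The abstract does not spell out the exact R4 procedure, so this may differ from what the authors implement. The function name `reduced_rank_ridge`, the penalty `lam`, and the synthetic data (10 style features, 8 targets, rank 3) are illustrative assumptions, not values from the paper.

```python
import numpy as np


def reduced_rank_ridge(X, Y, rank, lam):
    """Sketch of reduced-rank ridge regression (hypothetical helper).

    X: (n, p) feature matrix, e.g. LIWC-style features.
    Y: (n, q) multivariate targets.
    Returns the rank-constrained coefficients (p, q) and the
    (p, rank) projection that maps features to latent dimensions.
    """
    p = X.shape[1]
    # Full-rank ridge solution: (X'X + lam*I)^{-1} X'Y
    B_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)
    # Fitted values on the augmented design [X; sqrt(lam) * I]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    Y_hat_aug = X_aug @ B_ridge
    # Leading right singular vectors span the best rank-r target subspace
    _, _, Vt = np.linalg.svd(Y_hat_aug, full_matrices=False)
    V_r = Vt[:rank].T                      # (q, rank)
    latent_map = B_ridge @ V_r             # (p, rank)
    B_rr = latent_map @ V_r.T              # (p, q), rank <= `rank`
    return B_rr, latent_map


# Synthetic demonstration (made-up data, not from the paper)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
B_true = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 8))  # true rank 3
Y = X @ B_true + 0.1 * rng.normal(size=(200, 8))

B_rr, latent_map = reduced_rank_ridge(X, Y, rank=3, lam=1.0)
Z = X @ latent_map  # low-dimensional latent representation, shape (200, 3)
```

In this sketch, `Z` plays the role of the low-rank stylistic representation: a downstream nonlinear model would take the `rank` columns of `Z` as input in place of all original features. Setting `rank` equal to the number of targets recovers the plain ridge solution.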