AI Summary
Existing cross-domain models rely solely on raw time-domain representations for token aggregation and struggle to capture global and frequency-domain characteristics. This work proposes FLaG, a module that combines frequency-domain analysis with latent-query attention and gating: tokens are first mapped into the frequency domain via the real-valued FFT, spectral components are then aggregated using learnable latent queries, and finally a channel-wise gating mechanism reconstructs an enhanced time-domain representation for downstream pooling. FLaG is a plug-and-play aggregation mechanism applicable across modalities, and it reveals a spectral pattern in which low-frequency components contribute most overall while higher-frequency ones are more sample-specific. Experiments show the clearest performance gains on ESM2-8M antimicrobial peptide prediction and CIFAR-100 image classification, with competitive results on the IMDB and GLUE text benchmarks. Multidimensional probing confirms FLaG's sensitivity to, and effective exploitation of, spectral structure.
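The pipeline described above (real FFT, latent-query aggregation, channel-wise gate, reconstruction, pooling) can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function name `flag_sketch`, the use of spectral magnitudes as attention features, the averaging over queries, the sigmoid gate, and the final mean pooling are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def flag_sketch(tokens, queries, w_gate, b_gate):
    """Hypothetical FLaG-style aggregation (illustrative assumptions only).

    tokens:  (T, D) real token representations
    queries: (Q, D) latent queries (learned in a real model)
    w_gate:  (D, D) gate weights, b_gate: (D,) gate bias
    """
    T, D = tokens.shape
    # 1) Real FFT along the token axis: (T, D) -> (T // 2 + 1, D), complex.
    spec = np.fft.rfft(tokens, axis=0)
    # 2) Latent queries attend over frequency bins (assumption: magnitudes as features).
    mag = np.abs(spec)
    attn = softmax(queries @ mag.T / np.sqrt(D), axis=-1)   # (Q, F)
    summary = (attn @ mag).mean(axis=0)                     # (D,)
    # 3) Channel-wise gate in (0, 1), one value per channel (assumption: sigmoid).
    gate = 1.0 / (1.0 + np.exp(-(summary @ w_gate + b_gate)))
    # 4) Reweight the spectrum per channel and return to the time domain.
    enhanced = np.fft.irfft(spec * gate[None, :], n=T, axis=0)  # (T, D)
    # 5) Final pooling over tokens (assumption: mean pooling).
    pooled = enhanced.mean(axis=0)
    return enhanced, pooled, gate
```

In this simplified form the gate is constant across frequency bins, so it only rescales each channel. A gate that also varies by frequency band would be needed to reweight low and high bands differently.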
Abstract
Token aggregation is a common bottleneck in models that map token representations to sample-level predictions, yet most pooling methods operate only in the original token domain. We propose FLaG, a plug-in aggregation module that transforms token representations with the real FFT, summarizes spectral components with learnable latent queries, applies a channel-wise gate, and reconstructs enhanced time-domain tokens for final pooling. We evaluate FLaG on antimicrobial peptide (AMP) activity prediction with ESM2, image classification with ResNet18 on CIFAR-10 and CIFAR-100, and text classification with RoBERTa on IMDB and GLUE. FLaG achieves its clearest gains on the ESM2-8M AMP tasks and on CIFAR-100, while remaining competitive with strong text baselines on IMDB and GLUE. We then probe its behavior in the AMP setting with band knockouts, gate summaries, residue perturbations, latent-query readouts, and structure-proxy stratification. We find that low-frequency bands contribute the most overall, while the remaining higher-band pattern is more sample-specific. The gate acts as a broadly shared spectral reweighting stage, whereas the cross-attention patterns are sample-specific with mild query-wise differentiation. Higher-helix peptides exhibit stronger average spectral sensitivity in both bacteria. The supplementary materials, source code, and data are released at https://www.healthinformaticslab.org/supp/ and https://github.com/Kewei2023/AMPCliff/tree/FLaG.
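A band-knockout probe of the kind mentioned above can be sketched as follows: zero one contiguous band of rFFT bins, transform back, and measure how much a model output changes. The helper names `knock_out_band` and `band_sensitivity`, the equal-width band split, and the scalar `predict` callable are hypothetical choices for illustration, not the authors' protocol.

```python
import numpy as np

def knock_out_band(tokens, lo, hi):
    """Zero rFFT bins lo..hi-1 along the token axis, then return to the time domain."""
    T = tokens.shape[0]
    spec = np.fft.rfft(tokens, axis=0)   # new array; the input is not modified
    spec[lo:hi] = 0
    return np.fft.irfft(spec, n=T, axis=0)

def band_sensitivity(tokens, predict, n_bands):
    """Absolute change in predict(tokens) when each band is removed.

    predict is any callable mapping a (T, D) array to a scalar
    (a stand-in for a trained model; an assumption of this sketch).
    """
    n_bins = tokens.shape[0] // 2 + 1
    # Equal-width split of the frequency bins into n_bands bands.
    edges = np.round(np.linspace(0, n_bins, n_bands + 1)).astype(int)
    base = predict(tokens)
    return np.array([
        abs(base - predict(knock_out_band(tokens, a, b)))
        for a, b in zip(edges[:-1], edges[1:])
    ])
```

A larger value for band 0 than for the remaining bands would correspond to the low-frequency dominance reported in the abstract. For a real model, `predict` would wrap the forward pass on the perturbed token representations.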