Agora | Research Hub


Do small language models generate realistic variable-quality fake news headlines?

Aug 30, 2025

Austin McCutcheon

🏛️ Lakehead University

This study investigates the capability of small language models (SLMs) to generate multi-tiered, high-fidelity fake news headlines under explicit prompting and their evasion potential against existing detection methods. Using controlled prompt engineering, we systematically evaluate 24,000 fake headlines generated by 14 SLMs, employing DistilBERT-based and ensemble classifiers for both quality grading and authenticity classification. Results show that SLMs reliably follow instructions to produce both high- and low-quality fake headlines; however, their outputs exhibit statistically significant semantic and stylistic divergence from authentic news headlines. Crucially, state-of-the-art detectors achieve only 35.2%–63.5% accuracy in identifying these SLM-generated fakes, revealing critical robustness gaps. This work constitutes the first systematic empirical analysis demonstrating the controllability, quality tunability, and detection vulnerability of SLMs in disinformation generation—providing foundational evidence and methodological guidance for developing resilient content safety mechanisms.

#Controlled prompt engineering for headline generation #DistilBERT and bagging classifiers for quality detection #Evaluating 14 small language models' fake news capabilities


Fault Tree Synthesis from Knowledge Graphs

Aug 29, 2025

Manzi Aimé Ntagengerwa

🏛️ University of Twente | Radboud University

In complex equipment fault diagnosis, domain expertise is difficult to formalize, and manual fault tree construction is inefficient. Method: This paper proposes a knowledge graph–based approach for automated fault tree synthesis. It introduces a lightweight, semantically rich knowledge graph representation that enables semi-automatic extraction of failure logic relationships from unstructured documents (e.g., maintenance manuals) and structured/functional models. Leveraging hierarchical modeling and semantic reasoning, the method generates fault trees in a fully structured manner—without requiring historical fault data, relying solely on engineering knowledge. Contribution/Results: The synthesized fault trees are inherently interpretable and accurately capture system-level failure propagation paths. Experimental validation on the Lycoming O-320 aircraft engine demonstrates substantial improvements in diagnostic modeling efficiency and engineering applicability.

#Knowledge graph format for fault trees #Structural and functional knowledge conceptual model #Synthesizing fault trees from system knowledge


Items Proxy Bridging: Enabling Frictionless Critiquing in Knowledge Graph Recommendations

Sep 30, 2025

Huanyu Zhang

Existing critique-based recommendation methods rely on dedicated models to construct user–keyword mappings, suffering from poor generalizability and catastrophic forgetting during multi-turn critique due to continuous parameter updates. Method: We propose a generic item-proxy mechanism that automatically translates user critiques on keywords into optimization objectives over items—requiring no architectural modification to baseline models and enabling plug-and-play integration into mainstream collaborative filtering frameworks. Furthermore, we design a forgetting-robust regularization strategy to mitigate performance degradation induced by multi-step parameter updates. Contribution/Results: In knowledge graph–enhanced recommendation scenarios, our approach enables real-time, frictionless multi-turn interactive optimization. Extensive experiments across multiple benchmark datasets demonstrate significant improvements in both recommendation stability and accuracy, while effectively alleviating catastrophic forgetting.

#Anti-forgetting regularizer mitigates catastrophic forgetting problem #Items proxy bridges users and keyphrases for critiquing #Universal plugin for knowledge graph recommender models


Re-envisioning Euclid Galaxy Morphology: Identifying and Interpreting Features with Sparse Autoencoders

Oct 27, 2025

John F. Wu

🏛️ Space Telescope Science Institute | Johns Hopkins University | University of Toronto

This study addresses the challenge of efficiently identifying and interpreting morphology-related, human-interpretable features in Euclid Q1 galaxy images—features that extend beyond the Galaxy Zoo decision-tree framework and are embedded within pre-trained neural networks. Method: We propose a feature disentanglement approach based on sparse autoencoders (SAEs), jointly leveraging the supervised Zoobot model and self-supervised masked autoencoding (MAE) to extract unambiguous, semantically meaningful galaxy morphology representations from Euclid Q1 data. Contribution/Results: Compared to conventional dimensionality-reduction methods (e.g., PCA), SAE-learned features exhibit significantly higher alignment with Galaxy Zoo labels and uncover novel, previously undefined astronomical structural patterns. The released MAE model achieves superhuman image reconstruction performance. To our knowledge, this work constitutes the first systematic effort to mine interpretable galaxy morphology features from pre-trained vision models, establishing a new paradigm for intelligent analysis of astronomical imagery.
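The summary above does not give the SAE architecture or hyperparameters, so the following is only a minimal sketch of a generic sparse autoencoder: a ReLU encoder, a linear decoder, and a reconstruction loss with an L1 sparsity penalty. All names (`sae_forward`, `sae_loss`, `l1_coeff`) and sizes are hypothetical, and the random vector merely stands in for an embedding taken from a pretrained model such as Zoobot or an MAE.

```python
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode x into non-negative codes z, then reconstruct x_hat from z."""
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    z = [max(0.0, p) for p in pre]  # ReLU keeps the codes non-negative
    x_hat = [r + b for r, b in zip(matvec(W_dec, z), b_dec)]
    return z, x_hat

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    """Squared reconstruction error plus an L1 penalty that encourages sparse codes."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(c) for c in z)
    return recon + l1_coeff * sparsity

random.seed(0)
d_in, d_code = 4, 12  # hypothetical sizes; the code layer is overcomplete
x = [random.gauss(0, 1) for _ in range(d_in)]  # stand-in for a model embedding
W_enc = [[random.gauss(0, 0.1) for _ in range(d_in)] for _ in range(d_code)]
W_dec = [[random.gauss(0, 0.1) for _ in range(d_code)] for _ in range(d_in)]
b_enc = [0.0] * d_code
b_dec = [0.0] * d_in

z, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, x_hat, z)
```

In a trained SAE, each entry of `z` would ideally correspond to one interpretable feature; here the weights are random, so the sketch only illustrates the shapes and the loss terms.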

#SAEs align better with Galaxy Zoo labels than PCA #SAEs find interpretable features beyond human classification #Sparse Autoencoders identify features from neural networks


Energy efficiency analysis of Spiking Neural Networks for space applications

May 16, 2025

P. Lunghi

🏛️ Politecnico di Milano | European Space Agency

Spacecraft onboard intelligence faces stringent energy constraints, necessitating energy-efficient neural network architectures. Method: This paper introduces the first hardware-agnostic spiking neural network (SNN) energy-efficiency evaluation framework tailored for spacecraft platforms, focusing on time-encoded SNNs applied to remote sensing scene classification (EuroSAT). We propose an analytically tractable architecture-parameter–energy mapping model that quantifies SNNs’ intrinsic temporal sparsity in terms of actual on-chip energy consumption, enabling cross-platform energy prediction—including for BrainChip Akida AKD1000. Contribution/Results: Empirical evaluation demonstrates that time-encoded SNNs achieve substantially lower theoretical energy consumption than conventional artificial neural networks (ANNs). The proposed model yields bounded prediction error for on-chip energy consumption on Akida under EuroSAT, validating its feasibility for high-energy-efficiency deployment in spaceborne AI systems.
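As a rough illustration of why temporal sparsity lowers theoretical energy, the sketch below uses a common first-order estimate from the SNN literature: an ANN pays one multiply-accumulate (MAC) per connection, while an SNN pays one accumulate (AC) per connection only when the input neuron spikes. This is not the paper's model. The per-operation energies are widely quoted 45 nm estimates used here as assumptions, and the layer sizes and spike rates are invented for the example.

```python
E_MAC_PJ = 4.6  # assumed energy per multiply-accumulate, in picojoules
E_AC_PJ = 0.9   # assumed energy per accumulate, in picojoules

def ann_energy_pj(connections_per_layer):
    """Every connection performs one MAC per inference."""
    return E_MAC_PJ * sum(connections_per_layer)

def snn_energy_pj(connections_per_layer, spikes_per_neuron):
    """Each connection performs one AC per input spike.

    spikes_per_neuron[i] is the average number of spikes per input neuron
    of layer i during one inference; with time-to-first-spike coding each
    neuron fires at most once, so these values stay at or below 1.
    """
    ops = sum(c * r for c, r in zip(connections_per_layer, spikes_per_neuron))
    return E_AC_PJ * ops

# Hypothetical two-layer network and spike rates.
connections = [1_000_000, 500_000]
rates = [0.1, 0.2]

ann_pj = ann_energy_pj(connections)
snn_pj = snn_energy_pj(connections, rates)
ratio = ann_pj / snn_pj
```

The estimate ignores memory access, leakage, and encoding overhead, which is one reason a hardware-aware model is needed to predict actual on-chip consumption.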

#Develops hardware-agnostic metric for energy analysis #Focuses on temporal coding to maximize sparsity #Uses Spiking Neural Networks for energy efficiency


'The Boring and the Tedious': Invisible Labour in India's Gig-Economy

Apr 24, 2025

Pratyay Suvarnapathaki

🏛️ International Institute of Information Technology | IIIT Hyderabad

This study uncovers hidden labor precarity faced by food-delivery riders on Indian platforms (e.g., Swiggy, Zomato): prolonged waiting times and high-frequency, repetitive UI interactions induce “digital discomfort,” while opaque algorithmic governance and gamified incentive structures further erode worker autonomy. Drawing on in-depth interviews with 14 riders and integrating HCI analysis with critical algorithmic labor theory, the research identifies riders’ self-organized, strategic coping practices. It proposes a rider-centered GUI automation intervention framework designed to reduce interaction friction without compromising labor agency. The findings advance Global South HCI scholarship toward a “worker empowerment–first” paradigm and provide empirically grounded, actionable design principles for labor-respectful digital interfaces.

#Analyzing waiting time and repetitive UI interactions #Rethinking HCI for worker autonomy in Global South #Worker-centered GUI automation reduces digital discomfort


Mimosa: A Language for Asynchronous Implementation of Embedded Systems Software

Mar 04, 2025

Nikolaus Huber

🏛️ Uppsala University | Université Grenoble Alpes | University of Regensburg

To address the challenges of modeling time-triggered processes communicating via FIFOs in asynchronous reactive embedded systems—and the lack of side-effect support in conventional synchronous dataflow languages—this paper proposes Mimosa, a two-layer domain-specific language integrating time-triggered semantics with asynchronous coordination. Methodologically, Mimosa extends the Lustre dataflow syntax with explicit side-effect semantics and establishes a unified formal semantics framework comprising a process layer (textual rewriting) and a coordination layer (graphical rewriting). A prototype toolchain supports parsing, semantic interpretation, and simulation-based verification. Experimental evaluation demonstrates that Mimosa enables rigorous modeling of asynchronous temporal behaviors with side effects, achieving substantially greater expressiveness and semantic verifiability compared to existing synchronous dataflow languages.

#Extends Lustre with side-effectful computations #Formal semantics for process and coordination layers #Mimosa: asynchronous reactive systems language


What Did I Learn? Operational Competence Assessment for AI-Based Trajectory Planners

Oct 01, 2025

Michiel Braat

🏛️ Netherlands Organisation for Applied Scientific Research

AI-based trajectory planners for autonomous driving exhibit high operational risk and poor interpretability in unknown or under-trained scenarios. Method: This paper proposes a knowledge graph–based framework for explainable capability assessment. It models driving data as a structured knowledge graph to explicitly represent scene semantics in human-understandable form, and dynamically evaluates model proficiency across sub-scenarios via subgraph querying, scene complexity quantification, and dataset coverage analysis. Contribution/Results: Experiments on the NuPlan dataset demonstrate that the method effectively identifies high-risk driving situations where the planner is insufficiently trained, significantly enhancing the interpretability of evaluation outcomes and deployment trustworthiness. By grounding assessment in semantically meaningful, queryable structures, the approach establishes a novel paradigm for robustness verification of safety-critical AI systems.

#Estimating competence via coverage and complexity #Modeling driving data as knowledge graphs #Querying sub-scene configurations in datasets


On the mean-variance problem through the lens of multivariate fake stationary affine Volterra dynamics

Apr 01, 2026

This study addresses the continuous-time Markowitz mean–variance portfolio selection problem in a multidimensional affine Volterra market characterized by unbounded stochastic coefficients, non-Markovian dynamics, and non-semimartingale structure—settings where classical methods fail. For the first time in such a rough framework, the authors construct stochastic factor solutions to a Riccati-type backward stochastic differential equation and derive closed-form expressions for both the optimal feedback strategy and the efficient frontier via a multidimensional Riccati–Volterra equation. By integrating affine Volterra processes with numerical simulations of the rough Heston model, the approach reveals the pronounced impact of volatility roughness and stochastic correlation on optimal investment decisions, thereby providing a novel analytical toolkit for non-Markovian mean–variance optimization.
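For orientation, the two standard ingredients named above can be written out as follows. This is a generic textbook-style statement, not the paper's exact setting: the terminal wealth $X_T^{\pi}$, target return $c$, and the parameters $\lambda, \theta, \nu, H$ are notation assumed here. The first line is the classical Markowitz problem; the second is the usual rough Heston variance process, a Volterra square-root equation with a fractional kernel.

```latex
\begin{aligned}
&\min_{\pi}\ \operatorname{Var}\!\left(X_T^{\pi}\right)
  \quad \text{subject to} \quad \mathbb{E}\!\left[X_T^{\pi}\right] = c, \\[4pt]
&V_t = V_0 + \int_0^t K(t-s)\,\lambda\,(\theta - V_s)\,ds
        + \int_0^t K(t-s)\,\nu\sqrt{V_s}\,dW_s,
  \qquad K(t) = \frac{t^{H-1/2}}{\Gamma\!\left(H+\tfrac12\right)},\quad H \in \left(0,\tfrac12\right).
\end{aligned}
```

Because the kernel $K$ is singular at zero for $H < 1/2$, $V$ is neither Markovian nor a semimartingale, which is why classical dynamic-programming arguments do not apply directly.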

#affine Volterra processes #backward stochastic differential equations #mean-variance portfolio selection


Automated radiotherapy treatment planning guided by GPT-4Vision

Jun 21, 2024

Sheng Liu

🏛️ Stanford University

Radiotherapy treatment planning is time-consuming, highly subjective, and requires iterative trade-offs among conflicting clinical objectives. This paper introduces GPT-RadPlan, an automated framework that integrates a multimodal large language model (GPT-4V) as an agent directly into the clinical radiotherapy planning workflow. Without model fine-tuning, it performs end-to-end inverse planning optimization via clinical-protocol-driven in-context learning and explicit dosimetric constraint modeling. Its key contribution is zero-shot multimodal reasoning, jointly interpreting medical images and structured clinical protocols, in place of conventional task-specific supervised training. Evaluated on prostate and head-and-neck cancer cases, GPT-RadPlan produces plans satisfying 100% of clinical dosimetric constraints, with better target coverage and organ-at-risk sparing than manual plans, matching or exceeding expert-level quality.

#API links GPT-RadPlan to treatment system #Automated planning via in-context learning #GPT-4V integrates radiation oncology knowledge
