SynPAT: A System for Generating Synthetic Physical Theories with Data

📅 2025-05-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Physical law discovery suffers from a lack of high-quality, controllable benchmarks, hindering model training and rigorous evaluation. To address this, we propose the first unified framework for generating synthetic physical theories with full controllability: it formalizes axiom systems using first-order logic, enforces logical consistency via symbolic constraint solving, and introduces a progressive noise injection mechanism to generate realistic, noisy observational data. The framework supports both full-theory fitting and sub-theory discovery tasks. It integrates axiom verification and data fidelity assessment modules to ensure theoretical soundness and empirical plausibility. We generate multiple scalable synthetic theories and corresponding datasets, and conduct benchmark evaluations across three state-of-the-art symbolic regression systems. Results demonstrate significant improvements in robustness, interpretability, and theory discovery capability, establishing a new standard for evaluating physics-informed symbolic learning methods.

📝 Abstract
Automated means for discovering new physical laws of nature, starting from a given background theory and data, have recently emerged and are proving to have great potential to someday advance our understanding of the physical world. However, the fact that there are relatively few known theories in the physical sciences has made the training, testing, and benchmarking of these systems difficult. To address these needs, we have developed SynPAT, a system for generating synthetic physical theories, each comprising a set of consistent axioms together with noisy data that are either good fits to the axioms or good fits to a subset of the axioms. We give a detailed description of the inner workings of SynPAT and its various capabilities. We also report on our benchmarking of three recent open-source symbolic regression systems using our generated theories and data.
Problem

Research questions and friction points this paper is trying to address.

Generating synthetic physical theories for testing AI systems
Providing consistent axioms and noisy data for benchmarking
Evaluating symbolic regression systems with synthetic theories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates synthetic physical theories with axioms
Produces noisy data fitting axioms or subsets
Benchmarks three recent open-source symbolic regression systems
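
As a rough illustration of the idea of "noisy data fitting axioms", the sketch below samples inputs, evaluates a consequence of a toy axiom set, and perturbs the result with Gaussian noise. It is not SynPAT's implementation: the function `generate_noisy_data`, the toy axioms (F = m*a and W = F*d), the variable ranges, and the relative-noise model are all assumptions made for this example.

```python
import random

def generate_noisy_data(law, ranges, n_points, rel_noise, seed=0):
    """Sample inputs uniformly, evaluate `law`, and perturb the output.

    law:       function mapping a dict of variable values to a target value
    ranges:    dict of variable name -> (low, high) sampling interval
    rel_noise: std. dev. of Gaussian noise, relative to |target|
    (All names here are hypothetical, chosen for illustration.)
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_points):
        point = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        exact = law(point)
        noisy = exact + rng.gauss(0.0, rel_noise * abs(exact))
        rows.append((point, exact, noisy))
    return rows

# Toy "theory" (an assumption, not taken from the paper):
# axioms F = m*a and W = F*d, with derived consequence W = m*a*d.
def consequence(p):
    return p["m"] * p["a"] * p["d"]

ranges = {"m": (1.0, 10.0), "a": (0.5, 5.0), "d": (1.0, 20.0)}
data = generate_noisy_data(consequence, ranges, n_points=100, rel_noise=0.01)
```

A symbolic regression system under test would then receive only the sampled variables and the noisy target, and would be scored on whether it recovers the consequence.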
Jonathan Lenchner
IBM T.J. Watson Research Center
Computational Complexity, Combinatorial & Computational Geometry, Robotics
Joao Goncalves
IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
L. Horesh
IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Karan Srivastava
University of Wisconsin-Madison, Madison, WI, USA