PyMilo: A Python Library for ML I/O

📅 2024-12-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Machine learning model serialization faces challenges in reliability, security, and interpretability—hindering model reuse and production deployment. To address these, we propose a transparent, non-executable serialization paradigm for ML models, based on abstract syntax tree (AST) parsing and structured JSON encoding. Our approach integrates metadata extraction and operator-level semantic reconstruction to achieve lossless, human-readable, execution-free serialization with zero runtime dependencies. The method supports round-trip export across environments for scikit-learn (≥98% of models), XGBoost, and LightGBM, ensuring 100% deployment compatibility. Empirical validation in financial and healthcare domains confirms zero deserialization vulnerabilities. While serialized artifacts are ~40% larger than pickle-based counterparts, our solution guarantees full auditability, deterministic behavior, and strict security control—enabling safe, transparent, and portable model exchange without compromising fidelity or operational safety.

📝 Abstract
PyMilo is an open-source Python package that addresses the limitations of existing Machine Learning (ML) model storage formats by providing a transparent, reliable, and safe method for exporting and deploying trained models. Current formats, such as pickle and other binary formats, suffer from significant reliability, safety, and transparency issues. In contrast, PyMilo serializes ML models in a transparent, non-executable format, enabling straightforward and safe model exchange while also facilitating the deserialization and deployment of exported models in production environments. The package aims to provide a seamless, end-to-end solution for exporting and importing pre-trained ML models, which simplifies the model development and deployment pipeline.
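To make the idea of a transparent, non-executable format concrete, here is a minimal sketch using only the Python standard library. It is not PyMilo's implementation or API: `TinyLinearModel`, `export_model`, `import_model`, and the JSON layout are all hypothetical names chosen for illustration. The point it shows is that the exported artifact contains only data, and the importer rebuilds objects from an allow-list of known types instead of executing anything stored in the file, which is what pickle-style loading can do.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TinyLinearModel:
    """Hypothetical stand-in for a trained model (not a PyMilo or scikit-learn class)."""
    coef: list
    intercept: float

    def predict(self, rows):
        # Plain dot product plus intercept for each input row.
        return [
            sum(c * x for c, x in zip(self.coef, row)) + self.intercept
            for row in rows
        ]


def export_model(model):
    """Serialize the model to human-readable JSON text that holds only data."""
    payload = {"type": type(model).__name__, "params": asdict(model)}
    return json.dumps(payload, indent=2, sort_keys=True)


# Allow-list of types the importer is willing to rebuild (an assumption of this sketch).
_KNOWN_TYPES = {"TinyLinearModel": TinyLinearModel}


def import_model(text):
    """Rebuild a model from JSON without running any code stored in the file."""
    payload = json.loads(text)
    cls = _KNOWN_TYPES.get(payload["type"])
    if cls is None:
        raise ValueError(f"unknown model type: {payload['type']!r}")
    return cls(**payload["params"])


original = TinyLinearModel(coef=[0.5, -2.0], intercept=1.0)
exported_text = export_model(original)   # readable, diffable, auditable
restored = import_model(exported_text)   # round trip back to an object
```

Because the artifact is plain JSON, it can be inspected or diffed before it is loaded, and an unrecognized `type` value is rejected rather than resolved dynamically.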
Problem

Research questions and friction points this paper is trying to address.

Model Storage
Model Sharing
Machine Learning Deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

PyMilo
model storage
secure sharing
AmirHosein Rostami
University of Toronto, Open Science Lab
Sepand Haghighi
Researcher, Open Science Lab
Machine Learning, Neural Network, Open Source, Innovation
Sadra Sabouri
University of Southern California
HCI, NLP, LLM, SE
A. Zolanvari
University of Groningen, Open Science Lab