Alberto Ballesteros-Rodríguez, Miguel Ángel Sicilia & Elena García-Barriocanal. madmpy: A Python library for creating and validating Data Management Plans
SoftwareX. 2025
"Automatic creation and validation of machine-actionable Data Management Plans". - Alberto Ballesteros-Rodríguez & Miguel Ángel Sicilia
Summary:
Data Management Plans (DMPs) are documents that describe the data used and produced during a research project. Machine-actionable DMPs (maDMPs) are plans written in computer-readable formats, designed to support the automation of data-generation processes in scientific research. The madmpy Python package creates and validates maDMPs that follow any version of the RDA DMP Common Standard. These plans can be written in JSON format or built programmatically. The package also supports institution- or domain-specific extensions and additional validations that adhere to the standard. The library serves as a building block for research data engineering workflows and promotes data management and accountability through the use of structured DMPs.
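To make "machine-actionable" concrete, the following is a minimal sketch of the kind of structural check that can be run on a plan written in JSON. It uses only the Python standard library and does not use or reproduce the madmpy API. The function name check_madmp, the list of fields treated as required (an illustrative subset, not the full RDA DMP Common Standard schema) and the example values are all assumptions made for illustration.

```python
import json

# Illustrative subset of top-level "dmp" fields. This is an assumption for the
# sketch, not the full RDA DMP Common Standard schema nor madmpy's rule set.
ASSUMED_REQUIRED_FIELDS = ("title", "created", "modified", "dmp_id", "contact", "dataset")


def check_madmp(text):
    """Return a list of problems found in a JSON-encoded maDMP (empty if none)."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    # The plan is expected to live under a single top-level "dmp" object.
    dmp = document.get("dmp") if isinstance(document, dict) else None
    if not isinstance(dmp, dict):
        return ["missing top-level 'dmp' object"]

    problems = [
        f"missing field: dmp.{name}"
        for name in ASSUMED_REQUIRED_FIELDS
        if name not in dmp
    ]
    # Datasets are modelled as a list, one entry per dataset in the plan.
    if "dataset" in dmp and not isinstance(dmp["dataset"], list):
        problems.append("dmp.dataset must be a list")
    return problems


# Hypothetical plan with made-up identifiers and contact details.
example = json.dumps({
    "dmp": {
        "title": "Example plan",
        "created": "2025-01-01T00:00:00Z",
        "modified": "2025-01-01T00:00:00Z",
        "dmp_id": {"identifier": "https://example.org/dmp/1", "type": "url"},
        "contact": {"name": "A. Researcher", "mbox": "a.researcher@example.org"},
        "dataset": [],
    }
})

print(check_madmp(example))
print(check_madmp('{"dmp": {"title": "Incomplete plan"}}'))
```

Because the plan is structured data rather than free text, a check like this can run automatically, for example whenever a plan is updated. A real validator would go further and check the complete schema for the chosen version of the standard, together with any institution- or domain-specific extensions.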
Why do you highlight this publication?
Funding agencies increasingly require data management plans, yet preparing them can involve a considerable administrative burden. madmpy is an open-source library for creating and validating Data Management Plans based on the Research Data Alliance (RDA) DMP Common Standard. It turns these static, compliance-focused documents into machine-actionable artefacts that can be integrated into real research workflows, providing mechanisms that help data align with the FAIR principles from the beginning of the project. This leads to greater reproducibility of research studies and makes it easier to meet funders' requirements. In addition, standardising DMPs in a machine-readable format fosters cross-institutional collaboration and data reuse.
Publication commented by:
Alberto Ballesteros Rodríguez & Miguel Ángel Sicilia
BIOMEDICAL DATA SCIENCE AND ENGINEERING group
Data Science Central Support Units
IRYCIS