Data Science


Miguel Ángel Sicilia Urbán


Diego Cárdenas Cuadrado

Jorge Lázaro Bailón

Samuel Santos Benito


Hospital Building, Floor -3, right side

Head of Unit: msicilia(ELIMINAR)

REDCap services: redcap(ELIMINAR)

The unit was created in January 2021 under the supervision of the R+D+i Office of the Fundación para la Investigación Biomédica del Hospital Universitario Ramón y Cajal, the management body of the IRYCIS. Its main mission is to extract knowledge from both structured and unstructured information by establishing homogeneous criteria for the registration, digitization and protection of clinical data for research at the IRYCIS. The unit also manages and exploits this information in order to establish and consolidate a transversal line of research in data mining, Artificial Intelligence and process mining in the Big Data environment.

  • Software equipment
    • Microsoft Power BI, Metabase and Apache Superset
    • ProM/Disco (process mining).
    • Python scientific computing stack
    • R scientific computing stack
    • Other data engineering tools
  • Noteworthy
    • Research experience (as of 07/25/2023). Google Scholar data: 7176 citations; h-index = 43; i10-index = 176.
    • Scopus: 8266687800
    • ORCID:
    • Professor of Computer Languages and Systems in the Department of Computer Science at the University of Alcalá.
  • Service portfolio



    Data Science

    Design and creation of ad hoc standardized clinical data collection applications (REDCap).
    Creation of ad hoc standardized mobile tools (MyCap) for collecting PREMs (Patient-Reported Experience Measures) and PROMs (Patient-Reported Outcome Measures).
    Standardization, harmonization and uploading of existing databases to the REDCap environment.  
    Database preprocessing (cleaning, integration, transformation and reduction).  
    Exploitation of medical data through visualization and Business Intelligence techniques: creation of interactive dashboards.

    Scheduling of updates and communication between standardized data collection tools and interactive dashboards.


    Data mining in healthcare repositories aimed at the creation of predictive systems.

    Process mining: discovering, monitoring and improving real processes by extracting knowledge from available event logs.  
    Development of Decision Support Systems based on clinical practice guidelines.  
    Teaching capacity  
    Ad hoc training for different levels:
    • Interactive dashboards (Power BI, others).
    • Process mining (ProM or Disco).
    • Data analytics (R, Python).
    • Standardized data collection (REDCap).
    • Data science and data engineering tools in general.