May 13, 2026 • Iker Ceballos

Training cancer models across 197 hospitals without moving a record

Acuratio's Federated Learning Module plugs into the SHARE health data space's Eclipse connectors and turns 197 GECP hospitals into a training fabric for lung-cancer prognosis and survival models.

Cancer research has a data-sharing problem. The records that would most accelerate prognosis and survival models (longitudinal patient data, treatment outcomes, multi-modal imaging) sit inside individual hospitals that legally cannot move them. Centralising them is rarely an option, and the cost of that constraint is steep: lung cancer alone is responsible for more than 23,000 deaths a year in Spain, and the practical obstacle to better prognostic models has never been the science but access to the data.

SHARE — the Spanish Health Data Space for Cancer Research — is the federated answer to that problem. Coordinated by Hospital Universitario Puerta de Hierro Majadahonda (HUPHM) and funded by the Spanish Ministry for Digital Transformation and Public Function through the EU NextGeneration-EU plan, it connects clinical, genomic, quality-of-life, and ambulatory-monitoring records on 30,000+ lung-cancer patients across the 197 hospitals of the Spanish Lung Cancer Group (GECP).

Acuratio is the federated-learning partner on SHARE. Our contribution is the Federated Learning Module, which plugs into each hospital’s Eclipse Dataspace Components (EDC) connector and turns the dataspace into a training fabric without moving a single clinical record.

The architecture is a Master–Slave system that lives natively inside the dataspace. A Federated Learning Slave runs alongside every hospital’s EDC connector, executing training inside the hospital’s network boundary and reading local data through the CLARIFY Data Proxy. A Federated Learning Master discovers participating nodes via the federated catalog, validates each participant’s credentials (DID:WEB identifiers and Verifiable Credentials, exchanged over the Dataspace Protocol, DSP, and the Decentralized Claims Protocol, DCP), distributes the global model, and aggregates the Slaves’ updates round after round until convergence. Only model parameters travel; every authorisation and access event is logged for audit.
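To make the round structure concrete, here is a minimal sketch of one common aggregation scheme, sample-weighted federated averaging (FedAvg). It is an illustration under stated assumptions, not the module's actual code: the article does not say which aggregation rule the Master uses, and every name below (`SiteUpdate`, `local_train`, `federated_average`, `run_rounds`) is hypothetical. The toy model is a single-weight linear fit, standing in for a real prognosis model.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# Toy stand-in for a local clinical record: (feature x, target y).
Sample = Tuple[float, float]


@dataclass
class SiteUpdate:
    """What a Slave sends back: parameters and a sample count, never records."""
    site_id: str
    weights: List[float]
    num_samples: int


def local_train(global_weights: Sequence[float], data: Sequence[Sample],
                lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Slave side (hypothetical): fit y = w * x by gradient descent on local data."""
    w = global_weights[0]
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return [w]


def federated_average(updates: Sequence[SiteUpdate]) -> List[float]:
    """Master side (assumed FedAvg): average parameters weighted by sample count."""
    if not updates:
        raise ValueError("no updates to aggregate")
    total = sum(u.num_samples for u in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0].weights)
    if any(len(u.weights) != dim for u in updates):
        raise ValueError("all sites must report the same number of parameters")
    return [
        sum(u.weights[i] * u.num_samples for u in updates) / total
        for i in range(dim)
    ]


def run_rounds(site_data: Dict[str, List[Sample]], rounds: int = 10) -> List[float]:
    """Distribute the global model, collect updates, aggregate, and repeat."""
    global_weights = [0.0]
    for _ in range(rounds):
        updates = [
            SiteUpdate(site_id, local_train(global_weights, data), len(data))
            for site_id, data in site_data.items()
        ]
        global_weights = federated_average(updates)
    return global_weights
```

In this sketch the only values that cross the site boundary are the contents of `SiteUpdate`. Weighting by `num_samples` keeps a small hospital from pulling the global model as hard as a large one; a real deployment would also need the credential checks and audit logging described above, which are omitted here.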

On top of the FL primitives we ship precompiled Rust binaries that implement Private Set Intersection, private identifier matching, and decryption routines over shared identifiers — the building blocks of cross-institution cohort analysis where even patient IDs are sensitive.
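For readers unfamiliar with Private Set Intersection, the idea can be sketched with a classic Diffie-Hellman-style construction. This is an assumption for illustration only: the article does not say which PSI protocol the Rust binaries implement, the sketch is written in Python rather than Rust for brevity, all function names are hypothetical, and the 127-bit modulus is a toy parameter that is far too small for real use. Both parties are simulated in one process.

```python
import hashlib
import secrets
from math import gcd
from typing import List, Set

# Toy modulus: the Mersenne prime 2^127 - 1. NOT a secure parameter choice.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P - 1] via SHA-256."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret() -> int:
    """Pick a secret exponent coprime to P - 1 so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
        if gcd(k, P - 1) == 1:
            return k


def blind(values: List[int], secret: int) -> List[int]:
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def private_set_intersection(ids_a: List[str], ids_b: List[str]) -> Set[str]:
    """Return the identifiers of party A that party B also holds."""
    secret_a, secret_b = new_secret(), new_secret()

    # Each party blinds its own hashed identifiers before sending them.
    a_once = blind([hash_to_group(i) for i in ids_a], secret_a)
    b_once = blind([hash_to_group(i) for i in ids_b], secret_b)

    # Each party applies its own secret to the other's blinded values.
    # B returns A's values in the original order so A can map them back.
    a_twice = blind(a_once, secret_b)
    b_twice = set(blind(b_once, secret_a))

    # H(x)^(a*b) is the same whichever party blinded first, so equal
    # doubly blinded values indicate a shared identifier.
    return {i for i, t in zip(ids_a, a_twice) if t in b_twice}
```

Because exponentiation commutes, the two parties can compare doubly blinded values without either one seeing the other's raw identifiers. A production protocol would use a vetted group and library, shuffle the values B sends, and run the two halves on separate machines.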

Integration tests covering the CLARIFY Data Proxy, the EDC connector, and the Acuratio Federated Learning Master already pass end-to-end. The next phase scales the pattern from one provider to a federated set of GECP hospitals acting as both providers and consumers, setting a concrete precedent for Spain’s emerging National Health Data Space (ENDS) and for federated AI under the European Health Data Space.

The SHARE case study has the full architectural detail.