Limited access to medical datasets, due to regulations that protect patient data, is a major hindrance to the development of machine learning models for computer-aided diagnosis tools based on medical images. Distributed learning is an alternative to training machine learning models on centrally collected data that circumvents data-sharing restrictions. The main idea of distributed learning is to train models remotely at each medical center rather than collecting the data in a central database, thereby avoiding data sharing between centers and model developers. In this work, we propose a travelling model that performs distributed learning for biological brain age prediction using morphological measurements of different brain structures. We specifically investigate the impact of non-identically distributed data across collaborators on the performance of the travelling model. Our results, based on a large dataset of 2058 magnetic resonance imaging scans, demonstrate that transferring the model weights between the centers more frequently achieves results (mean age prediction error = 5.89 years) comparable to a central learning implementation (mean age prediction error = 5.93 years) trained on the data from all sites hosted together at a central location. Moreover, we show that our model does not suffer from catastrophic forgetting, and that the data distribution matters less than the number of times the model travels between collaborators.
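The travelling-model protocol described above can be sketched in a few lines: a single model visits each center in turn, is updated locally on that center's data, and only its weights ever leave a site. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation; the synthetic per-center data, the feature count, and `n_rounds` are hypothetical, and scikit-learn's `SGDRegressor.partial_fit` stands in for whatever local training step the paper uses.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)

def make_center(age_low, age_high, n=200, n_features=10):
    """Simulate one center's cohort. Hypothetical morphological
    measurements are loosely correlated with age."""
    ages = rng.uniform(age_low, age_high, size=n)
    X = rng.normal(size=(n, n_features)) + 0.05 * ages[:, None]
    return X, ages

# Non-identically distributed collaborators: each center holds
# subjects drawn from a different age range.
centers = [make_center(20, 40), make_center(40, 60), make_center(60, 80)]

# Travelling model: the model is passed from center to center and
# updated locally; raw data never leaves a site.
model = SGDRegressor(random_state=0)
n_rounds = 20  # how many times the model travels across all centers
for _ in range(n_rounds):
    for X, y in centers:
        model.partial_fit(X, y)  # local update at this center

# Evaluate mean absolute age-prediction error on pooled held-out data.
X_test, y_test = make_center(20, 80, n=300)
mae = np.abs(model.predict(X_test) - y_test).mean()
print(f"mean age prediction error: {mae:.2f} years")
```

The outer loop over `n_rounds` corresponds to the number of times the model travels between collaborators, which the abstract identifies as the key factor; increasing it lets each center repeatedly correct the drift introduced by the others' non-identical distributions.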