AI can’t solve the human factor

For all its advantages, federated learning – which splits up large data sets into smaller pieces for multiple parties to feed into their deep-learning models – will take a lot of time to implement in the medical domain, Peter de With suspects.

Peter de With

Two months ago, I visited the SPIE Medical Conference in San Diego. I like to go to such conferences periodically since they offer a snapshot of what’s new and what’s becoming popular and successful. And because about half of the attendees are healthcare professionals, it’s also an opportunity for us engineers to communicate directly with the experts who end up using our gear. There was a lot to discuss, and my group contributed no fewer than seven papers and posters.

One particularly interesting talk was about federated learning. A relatively new kid on the block in the broad spectrum of deep-learning approaches, it involves breaking down large data sets into smaller pieces and spreading these out over multiple organizations – in my world, those would be medical and research centers. At first glance, this seems like a great idea because dividing the effort among multiple participants will reduce the computational and storage burden for each partner.

However, this is only part of the story. With less data per partner, no participating institute will reach the performance and robustness that could be achieved if all the partners’ data were combined. Therefore, the next step is to grant one participant a central role in the project and have it accumulate the partial data sets and process them into a new model. The resulting model will outperform the first-generation partial models because it can be trained with more data.

The increased amount of data comes from participants offering up their share of the data to the central party. Indeed, there’s a possible gain in performance (and thus a benefit) here, as long as the parties pass their data on to the central party. However, it should be kept in mind that training and testing performance is largely bounded by the total amount of data. In other words, the best possible performance is only obtained when at least one party collects all the data.

For the crucial second part of the project, it’s assumed that individual institutes are willing to share their data with the central party. This is the difficult part because most healthcare institutes have to abide by the GDPR rules on patient data, which some countries apply very strictly. Sharing requires long-term cooperation between the central party and several participating institutes. In practice, it takes years of working together to build up the trust that such intense joint efforts demand. This is a complex process that’s largely shaped by the people driving the collaboration between project partners.

Further complications arise if one or more partners are from industry. It’s hard to imagine that a group of healthcare institutes would pile up their data in the systems of an industrial party without very strict agreements, which again would take years to establish and would require government involvement or oversight to safeguard the positions of the patients involved. A few years ago, several companies, including IBM and Apple, tried to take on this central role, but their efforts failed in the end – and not because of a lack of knowledge in building data sets and modeling applications.

Hence, for healthcare applications, this federated learning concept has intrinsic limitations that stem from the human factor and privacy rules, neither of which can be denied or circumvented. These issues will evidently dominate any implementation.

There was also an interesting presentation from a Chinese group of companies and institutes taking an alternative route. The leading company collected large data sets from hospitals and turned them into low-performance models and error-prone patient reports. Because of the imperfections, institutes were asked to improve the patient reports manually. They could also use the performance-constrained disease models to correct the outcomes in turn. Although it saves time, I seriously doubt whether this is the best approach to building trust in the collaboration between the partners. The human factor and interpersonal relationships among institutes will remain crucial in healthcare for years to come.