Research project: Can artificial intelligence improve medical diagnostics?

Soon AI should support doctors in their work | © roentgenbild_IMAGO YAY Images_xadam121

Applications based on artificial intelligence (AI) have already found their way into many areas of life. Healthcare is also undergoing a digital transformation in which numerous areas of application for AI systems are being discussed and researched with growing interest. One promising application area for AI systems is medical diagnostics: algorithms are being trained with large amounts of data to detect diseases based on medical images such as X-rays.

Such AI algorithms already exist for specific use cases, for example in radiology or dermatology. The quality of these AI-supported diagnostics has been and is being validated in scientific studies, in which the algorithms achieve performance comparable to that of physicians. So far, however, such AI algorithms have rarely been used in practice. It is therefore still unclear how AI systems can help to optimize the diagnostic quality of medical professionals in their daily work. Whether these AI systems will indeed be helpful does not depend solely on the quality of the algorithms.

How medical staff interact with the system and their attitude toward the technology also play an important role. To date, there has been little research into how healthcare workers deal with advice from AI algorithms. Nor do we yet know to what extent the potential benefits of the technology will materialize.

An international and interdisciplinary research team led by Eva Lermer, senior researcher at the Center for Leadership and People Management at LMU, is investigating the interaction between humans and AI technology. The team is trying to answer questions such as: What are the attitudes and expectations of medical staff towards the use of AI technology in healthcare? How do recommendations from an AI influence the diagnostic decisions of medical professionals and the quality of those decisions? How must AI advice be presented to optimally support staff in their decisions?

The project is funded for four years by the Volkswagen Foundation and brings together researchers with expertise in psychology, medicine, and computer science. The core team consists of scientists from LMU, the Massachusetts Institute of Technology (MIT), the University of Toronto, and the University Hospital Regensburg. The project findings are expected to help develop novel AI-based systems (so-called clinical decision support systems) that improve human-machine interaction in diagnostics.

Previous findings about acceptance of AI systems

In a preliminary study, members of the project team investigated whether the source (AI algorithm vs. human radiologist) and accuracy (correct vs. incorrect) of advice affected physician behavior. In the study, published in npj Digital Medicine, the researchers analyzed whether these two factors influenced how participants rated the quality of the advice. They further examined whether the source and accuracy of the advice affected the diagnostic performance of the participating physicians.

Physicians with high task expertise rated the quality of the AI advice significantly lower than the human advice, regardless of whether it was correct or incorrect. However, the source of the advice ultimately had no effect on diagnostic performance. In contrast, diagnostic performance was highly dependent on whether the advice was correct or not, regardless of the expertise of the physicians who participated in the study. This shows that participants relied heavily on the advice regardless of its accuracy. These results thus suggest that current AI technology can lead to an improvement in the quality of diagnoses only if the algorithms are highly efficient and virtually immune to error. However, today's AI algorithms are not free of errors, so users should not trust their advice blindly. Rather, they should use it only as a supplement to their own judgment.

The project team would like to investigate how AI advice should be presented in order to provide the best possible support for medical staff and thus optimize diagnostic quality in hospitals. In doing so, it is essential to create acceptance for AI systems without creating blind trust. "In the first study, the advice of the system was presented together with the X-ray image based on which the diagnosis had to be made," explains Susanne Gaube, a postdoctoral researcher in the team at LMU and first author of the study. "It would be conceivable, for example, to present the AI advice only after a diagnosis has been made - as feedback, so to speak." In addition to the timing of the presentation, a whole series of other factors will be investigated. The results of the experiments will then serve as the basis for developing a user interface for an AI application. In the final step, the researchers plan to test the optimized system in a natural setting.

Not limited to medical applications

The applicability of the research project is not limited to medical practice. AI systems are already in use in various contexts today or will be available in the near future. Another promising application might be human resources: CVs could be automatically screened by algorithms to identify potentially suitable candidates. However, it should be noted that the algorithms are only as good as the data sets with which they have been trained. Biases that already exist in the training data, such as preference for one gender or discrimination against populations with a migration background, will be further amplified by the algorithms. It is therefore essential that users of AI systems learn how the advice is generated so that they can identify systematic problems and errors and react to them appropriately.

"With our project, we want to do both applied research and basic research. It is important to us to find universally valid approaches with which we can increase the benefits of AI systems for society," explains Eva Lermer.
