Microsoft Says Its AI Medical Tool is 4x Better Than Doctors at Diagnosing Patients
By Crystal Lindell
Microsoft is making some bold claims about the medical diagnosis tool it’s developing using artificial intelligence (AI). The company claims it is four times more accurate than a group of experienced physicians and can “solve medicine’s most complex diagnostic challenges.”
Specifically, Microsoft’s AI Diagnostic Orchestrator, or MAI-DxO for short, was able to correctly diagnose 85% of complex medical cases published in the New England Journal of Medicine (NEJM).
By comparison, when the company asked 21 practicing physicians from the US and UK to look at the same medical cases and provide a diagnosis, the human doctors were only accurate 20% of the time.
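For readers who want to check the headline number, the "four times" figure follows from the two accuracy rates reported above. The short Python sketch below only reproduces that arithmetic; the variable names are illustrative, and the only inputs are the 85% and 20% figures cited in this article.

```python
# Accuracy figures as reported in the article (not independently verified).
ai_accuracy = 0.85         # MAI-DxO on the NEJM cases
physician_accuracy = 0.20  # 21 practicing physicians on the same cases

# Relative comparison: how many times more often the AI was correct.
ratio = ai_accuracy / physician_accuracy

# Absolute comparison: the gap in percentage points.
gap_points = (ai_accuracy - physician_accuracy) * 100

print(f"Accuracy ratio: {ratio:.2f}x")
print(f"Gap: {gap_points:.0f} percentage points")
```

By these figures the ratio is a little over four, which is where the "4x" in the headline comes from.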
In a demonstration video, Microsoft showed that MAI-DxO was able to order medical tests and provide the estimated financial costs for each test. It was then able to evaluate the test results and arrive at the correct diagnosis, even if the diagnosis was for an incredibly rare disease.
“Our MAI-DxO orchestrator can handle some of the world’s toughest diagnoses with higher accuracy and lower costs. It puts us on the path to medical superintelligence - a big step towards better, more accessible care for all,” Microsoft said.
The company said it began testing its medical AI diagnosis systems with the United States Medical Licensing Examination, which is the same exam that physicians must pass to practice medicine in the United States. The test is a standardized assessment of clinical knowledge and decision making.
But the fact that it was a standardized test made it too easy for AI. Microsoft said its orchestrator was able to get near-perfect scores within just three years.
"These tests primarily rely on multiple-choice questions, which favor memorization over deep understanding," the company said. "By reducing medicine to one-shot answers on multiple-choice questions, such benchmarks overstate the apparent competence of AI systems and obscure their limitations."
To make its evaluations more challenging, Microsoft turned to having its AI evaluate the real-life cases published in the NEJM.
MAI-DxO was configured to operate within different sets of cost constraints – just like in the real world when a patient’s care may be determined by what kind of health insurance they have, if any. That’s an important feature because without financial constraints, the orchestrator might default to ordering every possible test – regardless of cost, delays in care, or patient discomfort.
Microsoft also found that MAI-DxO delivered both higher diagnostic accuracy and lower overall costs than physicians or any other model it tested.
“AI [could] reduce unnecessary healthcare costs,” the company said. “This kind of reasoning has the potential to reshape healthcare.”
Microsoft also touched on something that many patients with complex health challenges already know: the medical system is often overly reliant on siloed medical specialists. That’s another area where the company sees AI potentially improving patient care.
With general practitioners treating a wide array of conditions and specialists focused on a single area of expertise, the hope is that AI would essentially be able to pull medical knowledge from both.
While MAI-DxO seems to excel at tackling the most complex diagnostic challenges, Microsoft says further testing is needed to assess its performance on more common, everyday health conditions.
Microsoft also acknowledged that the clinicians in its study worked without access to colleagues, textbooks, or even AI – all tools that they may have in their day-to-day clinical practice. This was done to enable a fair comparison to raw human performance, but it also means that it’s unclear just how well AI actually competes against real-world physicians.
MAI-DxO is not available for commercial use yet. Microsoft said they need to do more testing to evaluate its reliability, safety, and efficacy. That could take about a decade.
“It’s pretty clear that we are on a path to these systems getting almost error-free in the next 5-10 years. It will be a massive weight off the shoulders of all health systems around the world,” Mustafa Suleyman, chief executive of Microsoft AI, told The Guardian.