ESC 2022: Ultrasound AI outperforms human clinicians in randomized, blinded study

An artificial intelligence program developed by researchers at Stanford University was able to streamline ultrasound exams of the heart—to the point where trained cardiologists couldn’t tell whether the initial assessments came from a machine. 

In a novel blinded and randomized controlled trial, clinicians at Cedars-Sinai Medical Center began by performing echocardiograms on patients as usual, with the sonographer logging an initial reading of the heart’s left ventricular ejection fraction, a measure of the cardiac muscle’s strength in pumping blood out to the rest of the body.

However, video from half of the recorded ultrasound scans was routed into an AI program that calculated those measurements automatically. Those findings were then mixed in with the sonographer-led reports, which were prepared by technicians with an average of more than 14 years’ experience, before all of them were presented to trained cardiologists for final review.

The study found that the specialists made fewer corrections to the AI-generated reports than to the sonographers’ reports. Out of nearly 3,500 total echocardiograms, only 16.8% in the AI group saw substantial changes, versus 27.2% in the sonographer group.

The AI’s instant readings were also closer to the mark, on average. They largely came within 2.8 percentage points of the corrected ejection fraction readings—compared to 3.8 points among the human clinicians, who had spent a median of about two minutes per patient to come to their conclusions. 

The study’s results were highlighted at the annual meeting of the European Society of Cardiology in Barcelona, Spain, which this year placed a special focus on cardiac imaging.

“There has been much excitement about the use of AI in medicine, but the technologies are rarely assessed in prospective clinical trials,” said the study’s presenter, David Ouyang, M.D., a cardiologist and researcher at Cedars-Sinai’s Smidt Heart Institute.

“Currently, FDA 510(k) clearances or European CE marks do not require prospective clinical trials,” Ouyang said during his ESC presentation. “We did an analysis last year, and across all of medicine for cleared AI technologies, 98% only used retrospective data, 75% were single-site, none were randomized, and none were blinded.”

“Blinding prevents bias either to or for the artificial intelligence,” he added. 

The participating cardiologists were also asked to guess whether the echocardiogram reports, including tracings that outlined the shape of the heart's left ventricle, came from a human sonographer or an AI program. They chose correctly only about a third of the time, answering incorrectly in about 25% of cases and saying they were too uncertain to make a guess in the remaining 43%.

“We learned a lot from running a randomized trial of an AI algorithm, which hasn’t been done before in cardiology,” Ouyang said. “First, we learned that this type of trial is highly feasible in the right setting, where the AI algorithm can be integrated into the usual clinical workflow in a blinded fashion. Second, we learned that blinding really can work well in this situation.” 

“What this means for the future is that certain AI algorithms, if developed and integrated in the right way, could be very effective at not only improving the quality of echo reading output but also increasing efficiencies in time and effort spent by sonographers and cardiologists by simplifying otherwise tedious but important tasks.”

James Thomas, M.D., a professor of cardiology at Northwestern University, was invited onstage following the presentation to discuss the study’s findings.

“How can we prove value in daily clinical practice? That’s really where you have to have a blinded clinical trial,” Thomas said. “And I congratulate you for showing that it appears to be significantly improved over sonographers laboriously tracing [ventricle boundaries in ultrasound scans]—there will be a gold medal in your honor from the sonographer community if they could stop doing all of that.” 

Still, further studies evaluating the AI’s performance at multiple health centers and on different pieces of hardware will be needed, and regulatory clearances would be necessary before integrating the computer’s helping hand into today’s echocardiogram systems, Thomas said.

“Can we look forward to a day when there’s an app store for AI algorithms, some of which have been cleared by the FDA?” he asked.

Stanford’s AI developers have previously worked with Cedars-Sinai cardiologists to develop programs that can help detect rare and often overlooked heart conditions. 

Earlier this year, in a study published in JAMA Cardiology, their ultrasound-reading algorithm showed it could flag cases of hypertrophic cardiomyopathy, where the heart muscle thickens, and cardiac amyloidosis, where the walls of the heart become stiff with protein deposits. Both conditions make it difficult for the heart to pump blood properly but may not always display noticeable symptoms.

That study also showed that the program, by examining subtle differences within each frame of a cardiac ultrasound video, could identify high-risk patients more accurately than clinical experts could.