The way we currently report human performance systematically underestimates it, making AI look better than it is.
Reports that CT scanning may be better than PCR testing for COVID-19 are flawed and almost certainly wrong.
Since the CheXNet paper came out in November 2017, I have been communicating with the author team. I'm finally ready to review the paper. Some of the things I found out surprised me.
I just wanted to do a quick follow-up to my recent blog post, which discussed the performance metrics I think might be appropriate for use in medical AI studies. One thing I didn't cover was the reason we might want to use multiple metrics, or the philosophy behind choosing the ones I did. So today, … Continue reading The philosophical argument for using ROC curves
Deep learning research in medicine is a bit like the Wild West at the moment; sometimes you find gold, sometimes a giant steampunk spider-bot causes a ruckus. This has derailed my series on whether AI will be replacing doctors soon, as I have felt the need to focus a bit more on how to assess … Continue reading Do machines actually beat doctors? ROC curves and performance metrics