Most GenAI chatbots are vulnerable to hallucinations when they’re fed fictional medical information

The news: GenAI models can easily be influenced to perpetuate false health facts when they’re fed made-up medical terms and information, per a new Mount Sinai study published in Nature last week.

Digging into the details: Researchers gave six popular chatbots, including ChatGPT and DeepSeek-R1, 300 fictional medical scenarios that mirrored clinical notes. The chatbots returned inaccurate information between 50% and 82% of the time.

Researchers tested the chatbots in two ways:

  • First, they only gave the chatbots scenarios with made-up details, such as a fictional lab test or medical condition.
  • Then, they added a warning note that some of the information might be inaccurate. Adding the warning reduced the average hallucination rate from 66% to 44%.

Zooming out: While GenAI chatbots are prone to perpetuating medical misinformation, they're also providing fewer disclaimers on health information, per a study cited in MIT Technology Review.
