Will doctors start using ChatGPT next?
The conversational AI bot ChatGPT is getting a lot of attention and promises to change how we write, search the web, and learn new things.
ChatGPT's latest success? It came close to passing the US Medical Licensing Exam (USMLE), a notoriously difficult test that takes 300 to 400 hours of study and covers everything from basic science to bioethics.
The USMLE is really three exams in one, and ChatGPT's ability to answer its questions well suggests that AI bots like this could one day be useful in medical education and perhaps even in making specific diagnoses.
“ChatGPT performed at or near the passing threshold on all three exams without any special training or reinforcement,” the researchers wrote in their paper for PLOS Digital Health.
“In its explanations, ChatGPT also showed a high level of consistency and insight.”
ChatGPT is a type of artificial intelligence called a large language model, or LLM. These LLMs are like the big brother of the predictive-text feature on your phone: they are built to generate written answers, guessing which words belong together in a phrase by analysing vast amounts of sample text with complex algorithms.
That’s a bit of an oversimplification, but the point is that ChatGPT doesn’t really “know” anything; by reading a huge amount of web content, it can produce phrases that sound plausible on almost any topic.
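As a rough illustration of the idea (and not ChatGPT's actual architecture, which uses neural networks over vast corpora), a toy predictor can guess the next word purely from counts of which word followed which in some sample text — the sentences below are made up for the demo:

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which in sample text.
# This is a drastic simplification of what an LLM does, but it shows the
# core idea of picking the statistically likeliest continuation.
sample_text = (
    "the patient has a fever the patient has a cough "
    "the doctor ordered a test the doctor ordered a scan"
)

follows = defaultdict(Counter)
words = sample_text.split()
for current_word, next_word in zip(words, words[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word`, or None if unseen."""
    candidates = follows.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

print(predict_next("patient"))  # -> has
print(predict_next("ordered"))  # -> a
```

A model like this only ever echoes patterns in its training text — which is also why, at a vastly larger scale, an LLM's output can sound plausible without the model "knowing" whether it is true.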
The key phrase here is “sounds plausible,” though. Depending on how those word probabilities fall, the AI can seem uncannily smart or reach the most ridiculous conclusions.
For the study, researchers from the startup Ansible Health used USMLE sample questions and checked that the answers were not indexed on Google. That way, they could be confident ChatGPT would generate novel answers from what it had learned rather than recall ones it had already seen.
Put to all three exams, ChatGPT scored between 52.4% and 75% (around 60% is usually needed to pass). The researchers also found that 88.9% of its responses contained at least one significant insight, defined as something “new, not obvious, and clinically valid.”
In a press release, the study’s authors said, “Getting a passing score on this notoriously hard expert exam without any human reinforcement is a big step forward in the development of clinical AI.”
ChatGPT was also very consistent in its answers, and it could even explain its reasoning. It also outperformed PubMedGPT, a bot trained exclusively on medical literature, which got only 50.3% right.
It is important to remember that some of the data ChatGPT was trained on is wrong. Ask the bot directly and it will say that more work is needed to make LLMs trustworthy. David Nield of Science Alert thinks it is unlikely to replace doctors and nurses any time soon.
But there is no doubt much more to learn about how these bots analyse and synthesise knowledge from the internet, especially as they keep improving. They might end up helping doctors and nurses do their jobs better rather than taking their places.
The researchers wrote, “These results suggest that large language models might be able to help with medical education and maybe even clinical decision-making.”