
ChatGPT passes neurology exam with 85% score
OpenAI's large language model GPT-4 correctly answered 85% of the questions on the American Board of Psychiatry and Neurology exam; the average human score is 73.8%.
The study was conducted by a group of German researchers from a university hospital and cancer centre in Heidelberg.
For comparison, the earlier GPT-3.5 scored only 66.8%. Both models, however, performed poorly on questions requiring higher-order reasoning.
Experts say the results justify using language models in clinical neurology after ‘some modifications’.
“We regard our study more as a proof of concept of LLM capabilities. Further development, and perhaps even specific fine-tuning, is still required to make language models suitable for clinical neurology,” said Dr. Varun Venkataramani, the study’s lead author.
In July, OpenAI released a new ChatGPT plugin that can analyse data, generate Python code, create graphs and solve mathematical problems. In one demonstration, the chatbot was used to debunk the ‘Flat Earth’ theory.
Earlier, researchers at Stanford and the University of California published a study claiming that OpenAI’s latest chatbot models had begun performing worse after interacting with real users.
In August, analysts reported that AI systems solve CAPTCHAs about 15% more successfully than humans.