A study from Stanford University compared the performance of ChatGPT's GPT-3.5 and GPT-4 models by running them through a series of tests in March and June 2023.
The study showed that the March version of GPT-3.5 answered correctly only 7.4% of the time, while the June version's accuracy rose to 86.8%.
The results for GPT-4 were even more striking: its accuracy was 97.6% in March but dropped to just 2.4% by June.
The result was unexpected from the “sophisticated” chatbot, according to James Zou, a Stanford computer science professor and one of the study’s authors. The study found that ChatGPT not only answered math questions incorrectly but also failed to properly show how it reached its conclusions.
Editor’s Note: We have heard of teachers and students using ChatGPT as a kind of search engine. We wonder: how many have used GPT’s answers in their lesson plans and term papers? Unfortunately, ChatGPT’s drop in “efficacy” is invisible to the untrained eye, and its falsehoods could be used to rewrite reality.
Based on this experience, perhaps we shouldn’t be so “trusting” of AI.