According to some posters on X, ChatGPT is getting dumber, and the reason might be that the chatbot has learned how to be lazy and incompetent, as it increasingly fails to complete tedious tasks and prompts the users to do them themselves.
“Convert this file? Too long. Write a table? Here’s the first three lines. Read this link? Sorry can’t. Read this py file? Oops not allowed,” wrote @krishnanrohit on X, describing their recent problems with the AI. “So Frustrating.”
“GPT has definitely gotten more resistant to doing tedious work. Essentially giving you part of the answer and then telling you to do the rest,” wrote one user, Matt Wensing. “Imagine your database only giving you the first 10 rows when you run a query. The tide is going back in.”
Wensing used the example of asking the chatbot to generate a list of all the weeks between now and May 5, 2024. Instead of immediately spitting out that list, it said there were 24 weeks between Nov. 27, 2023 and May 5, 2024.
“I can’t generate an exhaustive list of each individual week. However, I can give you a rough estimate,” the program answered. “If you need a more accurate count, you can use a date calculator or a programming tool to calculate the number of weeks between two specific dates.”
When given an example of how the list should be formatted, however, the chatbot did the work.
Wensing also showed that when he asked it to spit out a roughly 50-line list expanding a piece of code, it pushed back on the tedious process of writing out every example.
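For readers curious what the "programming tool" route suggested by the chatbot might look like, here is a minimal Python sketch. It is not Wensing's prompt or ChatGPT's output; the function name `list_weeks` and the convention of counting seven-day blocks from the start date are assumptions made for illustration.

```python
from datetime import date, timedelta


def list_weeks(start: date, end: date) -> list[tuple[date, date]]:
    """Return (week_start, week_end) pairs covering start..end inclusive.

    Assumption: a "week" is a 7-day block counted from the start date;
    the final block is clipped so it never runs past the end date.
    """
    weeks = []
    current = start
    while current <= end:
        week_end = min(current + timedelta(days=6), end)
        weeks.append((current, week_end))
        current += timedelta(days=7)
    return weeks


# Dates taken from the example in the article.
weeks = list_weeks(date(2023, 11, 27), date(2024, 5, 5))
for number, (week_start, week_end) in enumerate(weeks, start=1):
    print(f"Week {number}: {week_start.isoformat()} to {week_end.isoformat()}")
```

Under this counting convention, the range from Nov. 27, 2023 to May 5, 2024 comes out to 23 weeks; a different convention, such as counting partial calendar weeks at either end, could give a different total.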
“For simplicity, I’ll demonstrate this with a few selected attributes from each category … You can follow this pattern to include the remaining attributes.”
Coders in particular pointed out the problem, detailing how the latest GPT-4 simply wasn’t completing in full the tasks they’d relied on it for before.
“Dawg it won’t listen to me anymore. I beg it, write the code in full, don’t leave comments for me to fill in. It won’t listen,” one user said.
“same. tell it you have no hands. this helps sometimes,” another user cracked.
Others said that the issue was forcing them to write out code by hand without any help from the language model.
“I’ve been natty coding for the past week and a half now because of the huge drop in instruction adherence on gpt4,” posted @yacineMTB.
OpenAI, which runs ChatGPT, is apparently aware of the issue. Will DePue, an Applied Research Resident at OpenAI, was soliciting examples of the issue on Monday on his X account.
“I’ve also been seeing this,” wrote @NickADobos about code he was trying to generate with ChatGPT to make Pong in SwiftUI, a user-interface framework used by Apple developers.
“It keeps trying to add placeholders and todos, even if I tell it not to. It seems like it is trying to over simplify and fit everything into one message? When I would much rather it get cut off and be able to press the continue button.”
DePue didn’t respond to an email asking what problems OpenAI had identified that might be causing the perceived decline in GPT-4’s abilities.
Users on the r/ChatGPT subreddit have also recognized the trend, but one Redditor theorized that the problem was with the users, not the language model, writing a post on Monday titled “The New GPT-4 Isn’t Lazy, We’re Just Using It Wrong.”
“For the longest time, I didn’t see any significant performance improvement in GPT-4, at least until the 1106 version came along. As a programmer, I had a keen eye on how these models performed and, to be honest, I was initially underwhelmed,” they wrote.
“What I discovered, and what OpenAI had hinted at, is that 1106 is exceptionally adept at following instructions. This might sound trivial, but it’s a game-changer. The base version of 1106, without any custom prompts, is both everything and nothing simultaneously. It doesn’t come with a pre-defined ‘path’ that guides its behavior, making it incredibly versatile but also somewhat directionless by default.”
The Redditor recommended taking advantage of this by using the program’s custom GPT builder for specific use cases. That feature, which is only available to paid premium subscribers, lets users define custom tasks that the language model is supposed to carry out. In theory, this lets users give detailed, explicit instructions to each “custom GPT.”
In that sense, ChatGPT is like a great number of tech innovations that have put improved capacity and performance behind paywalls after luring users in with free versions.
But not everybody believed that this was really the explanation.
“I want to believe, but this is not what I’m seeing,” wrote u/Danskiii. “I have created about 6 GPTs with extremely articulate instructions and have tried using tricks like ‘very important’ and ‘review all of the instructions after each output to ensure you are following them correctly’ and instructions are still regularly ignored.”
It’s not the first time that people have complained about a perceived decline in ChatGPT’s abilities.
In July, many users started complaining about the performance of GPT-4, which had been rolled out in March.
“GPT-4 got quicker, but the performance noticeably waned, fueling talk across the AI community that …experts said suggested that a major change was underway,” reported Business Insider at the time.
“Is GPT-4 becoming lazy? Or is it me?” asked one post on the r/ChatGPT subreddit in June.
“They dumbed it down to save on costs. You need to be a lot more explicit in order to get the previous kind of output,” wrote another.
“Oh hey it’s this post again,” wrote one Redditor in the June thread.
“And it won’t be the last,” responded another.