Developers Say GPT-5 Is a Mixed Bag

Some developers say they’ve had largely positive experiences with GPT-5 so far. Jenny Wang, an engineer, investor, and creator of the personal styling agent Alta, told WIRED the model appears to be better at completing complex coding tasks in one shot than other models. She compared it to OpenAI’s o3 and 4o, which she uses frequently for code generation and straightforward fixes “like formatting, or if I want to create an API endpoint similar to what I already have,” Wang says.

In her tests of GPT-5, Wang says she asked the model to generate code for a press page for her company’s website, including specific design elements that would match the rest of the site’s aesthetic. GPT-5 completed the task in one take, whereas in the past, Wang would have had to revise her prompts during the process. There was one significant error, though: “It hallucinated the URLs,” Wang says.

Another developer, who spoke on the condition of anonymity because their employer didn’t authorize them to speak to the press, says GPT-5 excels at solving deep technical problems.

The developer’s current hobby project is writing a programmatic network analysis tool, one that would require code isolation for security purposes. “I basically presented my project and some paths I was considering, and GPT-5 took it all in and gave back a few recommendations along with a realistic timeline,” the developer explains. “I’m impressed.”

A handful of OpenAI’s enterprise partners and customers, including Cursor, Windsurf, and Notion, have publicly vouched for GPT-5’s coding and reasoning skills. (OpenAI included many of these remarks in its own blog post announcing the new model.) Notion also shared on X that GPT-5 is “fast, thorough, and handles complex work 15 percent better than other models we’ve tested.”

But within days of GPT-5’s release, some developers were weighing in online with complaints. Many said that GPT-5’s coding abilities seemed behind the curve for what was supposed to be a state-of-the-art, ultra-capable model from the world’s buzziest AI company.

“OpenAI’s GPT-5 is very good, but it seems like something that would have been released a year ago,” says Kieran Klassen, a developer who has been building an AI assistant for email inboxes. “Its coding capabilities remind me of Sonnet 3.5,” he adds, referring to an Anthropic model that launched in June 2024.

Amir Salihefendić, founder of the startup Doist, said in a social media post that he’s been using GPT-5 in Cursor and has found it “pretty underwhelming” and that “it’s especially bad at coding.” He said the release of GPT-5 felt like a “Llama 4 moment,” referring to Meta’s AI model, which had also disappointed some people in the AI community.

On X, developer Mckay Wrigley wrote that GPT-5 is a “phenomenal everyday chat model,” but when it comes to coding, “I will still be using Claude Code + Opus.”

Other developers describe GPT-5 as “exhaustive”—at times helpful, but often irritating in its long-windedness. Wang, who was pleased overall with the frontend coding project she assigned to GPT-5, says that she did notice that the model was “more redundant. It clearly could have come up with a cleaner or shorter solution.” (Kapoor points out that the verbosity of GPT-5 can be adjusted, so that users can ask it to be less chatty or even do less reasoning in exchange for better performance or cheaper pricing.)
