Strengths, Weaknesses & What Lies Ahead

What do we think of it so far?
Following on from my earlier anticipation post, here’s a grounded reflection on GPT-5 — what it delivers, where it stumbles, and why it matters for users seeking both power and nuance.
GPT-5: Honest First Impressions
An Enthusiastic Review
Mindstream praised GPT-5 as a significant leap forward, noting features like Agent AI, enhanced coding capabilities with reasoning, and seamless integration of text, images, spreadsheets, and audio. They celebrated its super-versatility — agile modes (everyday, Thinking, Pro), long-term memory, real-time app and web integration, and auto-model switching.
Their only critique? A humorous misstep: GPT-5 confidently answered that there was “one ‘b’ in ‘Banana Bread’”, a reminder that even advanced models still have shaky moments.
That kind of quirk aligns with my own experience — GPT-5 tackled everyday topics with ease, but struggled with object-count puzzles. Still, Mindstream captured well the sense of “AI that finally feels like a digital intern with real competence.”
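For the record, the letter count that tripped up the model is easy to confirm outside the chat window. Here is a minimal sketch in Python; the helper name `count_letter` is my own and purely illustrative:

```python
def count_letter(text: str, letter: str) -> int:
    """Count how often a letter appears in text, ignoring case."""
    return text.lower().count(letter.lower())


print(count_letter("Banana Bread", "b"))  # prints 2
```

Ignoring case matters here: both b's in the phrase are capitals, so a case-sensitive search for a lowercase "b" would find none.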
Taking a Step Back
Honest first impressions of GPT-5 mean listening to the good and then the not-so-good. GPT-5 shines technically, but not all reviews match Mindstream’s optimism.
- The Verge’s headline put it bluntly: GPT-5 “failed the hype test.” The piece pointed out that despite strong benchmarks (especially in coding), users found GPT-5 “less emotionally engaging” and less articulate in creative writing, and that the improvements felt incremental rather than revolutionary. OpenAI even reintroduced GPT-4o due to user backlash.
- TechRadar echoed these frustrations, listing four key complaints:
  - Short, sanitised responses.
  - Rigid, unimaginative thinking.
  - Flat personality, lacking warmth.
  - Loss of model choice, angering users who enjoyed switching.
- Wired noted that CEO Sam Altman himself admitted the rollout was “bumpy,” with auto-routing sometimes producing worse results than earlier models.
Criticism vs. Reality: Tone, Warmth and Confidence
Some reviewers say GPT-5 feels “less warm” or “flatter” than earlier models. In my daily use, I haven’t found that to be the case. If anything, GPT-5 adjusts tone more flexibly than ever — it can be warm, formal, cautious, or breezy depending on what you ask for.
The difference seems to be about defaults. GPT-5 often starts in a more concise, neutral style, and if someone doesn’t know they can nudge the tone, they may interpret that as a loss of “personality.”
What I’ve noticed instead is a different challenge: GPT-5 can sometimes sound very confident, even when it’s wrong. That’s not unique to version 5 — but it’s especially noticeable in areas where AI is weaker, like visual puzzles or object-counting tasks. We humans don’t mind an honest mistake, but a mistake delivered with full confidence can be hard to take.
A practical fix? Ask for a more cautious, less certain tone when you suspect the model might struggle. For example:
📝 Try This Prompt
If you want GPT-5 to be cautious rather than over-confident, copy and paste this into your AI chat:
Answer this carefully and explain where you could be wrong.
If there’s any uncertainty, show me the possible alternatives rather than one confident guess.
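If you send prompts from a script rather than a chat window, the same instruction can be appended automatically. This is a small sketch that only builds the text; the names `CAUTIOUS_SUFFIX` and `make_cautious_prompt` are hypothetical, and how you pass the result to a model depends on your own setup.

```python
# The cautious-tone instruction from the prompt above, kept in one place.
CAUTIOUS_SUFFIX = (
    "Answer this carefully and explain where you could be wrong.\n"
    "If there’s any uncertainty, show me the possible alternatives "
    "rather than one confident guess."
)


def make_cautious_prompt(question: str) -> str:
    """Return the question followed by the cautious-tone instruction."""
    return f"{question.strip()}\n\n{CAUTIOUS_SUFFIX}"


print(make_cautious_prompt("How many times does the letter b appear in 'Banana Bread'?"))
```

Keeping the instruction in a single constant means you can adjust the wording once and have every question pick it up.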
Reflections:
I’ll admit: my own experience mirrors some of the critical feedback. GPT-5 is undeniably strong in structured tasks and logic — yet its performance in that object-count puzzle reminded me that even the most advanced models can falter on simple, detail-oriented prompts.
Still, the technical potential remains compelling: GPT-5’s coding benchmark results are among the best seen so far, its real-world coding performance is strong, and some users have called it “unequivocally the best coding model in the world.”
Summary Table: Strengths vs. Trade-offs
| Strengths (Mindstream + other reviews) | Trade-offs & User Friction |
|---|---|
| Agent AI, multi-modal reasoning, auto-model switching, memory | Shortened, sanitised responses; less creativity or warmth |
| Outstanding coding performance and logic-heavy tasks | Loss of model variety; auto-routing sometimes degrades results |
| General improvements in accuracy, cost, and integrated workflows | Rollout issues and “shaky moments” in unexpected places |
Final Take: Balanced, Forward-Looking — and Realistic
- Yes, GPT-5 delivers significant technical enhancements — especially in coding, multi-modality, and agentic workflows.
- But the emotional tone and creative breadth some users cherished took a hit. The rollout missteps and the forced replacement of beloved models stirred real user backlash.
- How to evolve? The key is learning from both the strengths and the weaknesses. That means:
  - Bringing back more choice for users,
  - Allowing answers to be friendlier or more detailed when needed,
  - And making it clearer when the system switches between versions, so people always know what they’re getting.
So, if you’re a developer or need powerful logic tools, GPT-5 could be a game-changer. If you’re a creative user or valued the conversational warmth of GPT-4o, it may feel colder, at least for now.