AI · 28 March 2024 · 3 min read

Claude 3 vs GPT-4: a practitioner's comparison for business use

Anthropic launched the Claude 3 model family in March 2024. After several weeks of testing alongside GPT-4, here's a practical comparison for people who use these tools for real work.

by Matt Roberts

Anthropic released the Claude 3 model family on 4 March 2024: Haiku, Sonnet, and Opus, each at a different capability and cost tier. Opus is their most capable model and benchmarks competitively with GPT-4 on a range of standard evaluations.

I've been running both side by side on real work tasks for the past few weeks. Here's how it breaks down.

Quick context on where I'm coming from

I'm not an AI researcher. I'm an IT professional who uses these tools for writing and editing, PowerShell and Python scripting, technical research and summarisation, customer communication drafting, and general problem-solving. My comparison is practical, not academic.

Where Claude 3 Opus has an edge

Nuance in writing tasks

For anything requiring careful tone (a difficult email, a proposal that needs to land exactly right, communication that has to balance honesty with diplomacy), I've found Opus marginally better than GPT-4. It seems to pick up more reliably on the implicit requirements in how I describe a task.

Long document handling

Claude 3 Opus has a 200,000-token context window; GPT-4 Turbo has 128,000 tokens. In practice, both are more than sufficient for most tasks, but for genuinely long documents (full product specifications, lengthy contracts, extended code reviews), Claude's additional headroom has mattered a few times.
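A rough way to see which side of that line a document falls on is to estimate its token count before pasting it in. The sketch below is only that: the four-characters-per-token ratio is an assumed rule of thumb for English text, not either vendor's tokenizer, and the dictionary keys, function names, and reply budget are labels and values made up for illustration.

```python
# Rough context-window fit check.
# ASSUMPTION: ~4 characters per token. Real counts depend on the tokenizer
# and on the content (code and non-English text often tokenize differently).
CHARS_PER_TOKEN = 4

# Window sizes as quoted in this post. The keys are plain labels,
# not API model identifiers.
CONTEXT_WINDOWS = {
    "Claude 3 Opus": 200_000,
    "GPT-4 Turbo": 128_000,
}


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, reply_budget: int = 4_000) -> dict[str, bool]:
    """Report, per model, whether the text plus room for a reply fits.

    reply_budget is a hypothetical allowance for the model's answer.
    """
    needed = estimate_tokens(text) + reply_budget
    return {name: needed <= window for name, window in CONTEXT_WINDOWS.items()}


if __name__ == "__main__":
    document = "x" * 600_000  # stand-in for a very long contract
    print(estimate_tokens(document))  # 150000
    print(fits(document))  # {'Claude 3 Opus': True, 'GPT-4 Turbo': False}
```

Anything that lands close to a limit deserves a proper count with the vendor's own tooling, since the estimate can easily be off by a wide margin.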

Following complex instructions

When I give Claude a prompt with multiple constraints ("write this at this length, in this tone, for this audience, avoiding these topics, structured like this"), it tends to adhere to the full set of constraints more consistently than GPT-4, which sometimes drops one of several requirements.
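One way to make "dropped a requirement" less of a gut feeling is to write the constraints down once and check the ones that can be checked mechanically. The sketch below assumes a hypothetical `Brief` class invented for this example; it only tests length and banned topics, because tone and audience fit still need a human read.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container for the constraints in a writing prompt."""

    task: str
    max_words: int
    tone: str
    audience: str
    avoid: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Spell out every constraint on its own line."""
        lines = [
            self.task,
            f"Length: no more than {self.max_words} words.",
            f"Tone: {self.tone}.",
            f"Audience: {self.audience}.",
        ]
        if self.avoid:
            lines.append("Do not mention: " + ", ".join(self.avoid) + ".")
        return "\n".join(lines)

    def dropped_constraints(self, reply: str) -> list[str]:
        """Return the mechanically checkable constraints the reply broke."""
        problems = []
        if len(reply.split()) > self.max_words:
            problems.append("length")
        lowered = reply.lower()
        for topic in self.avoid:
            if topic.lower() in lowered:
                problems.append(f"mentions {topic}")
        return problems


if __name__ == "__main__":
    brief = Brief(
        task="Write an update email about the delayed migration.",
        max_words=20,
        tone="calm",
        audience="non-technical customers",
        avoid=["pricing"],
    )
    print(brief.to_prompt())
    print(brief.dropped_constraints("We are sorry for the delay. Pricing is unchanged."))
```

Sending the same `to_prompt()` text to both models and running `dropped_constraints` over each reply gives a simple tally of which one keeps to the brief more often.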

Where GPT-4 has an edge

Code generation

For scripting tasks (PowerShell, Python, Graph API calls), GPT-4 remains my preference. The output tends to be closer to idiomatic, production-ready code. Claude's code output is good, but occasionally over-verbose or structured in ways that feel slightly academic.

Plugin and tool ecosystem

ChatGPT with GPT-4 has a broader plugin ecosystem and a code interpreter, both of which add useful capability for certain tasks. Claude.ai's interface is clean but more limited in these integrations.

Familiarity and predictability

I've been using GPT-4 for over a year. I know how to prompt it. I know where it struggles. That familiarity has real value in a working context.

Where I've ended up

These models are genuinely close in capability for the kind of work I do. The gap between them is smaller than the gap between either of them and GPT-3.5. Choosing between them for a specific use case is a matter of marginal differences, not transformative ones.

My current practice: GPT-4 is my default, including for scripting, and I switch to Claude 3 Sonnet for writing-heavy tasks and long document work.

If you haven't tried Claude 3 and you're a regular GPT-4 user, it's worth at least a few weeks of parallel testing. The differences are real, even if they're not dramatic. And the model landscape is moving fast enough that "I'm settled on X" is a position you should revisit regularly.
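For anyone who wants the parallel testing to be a bit more systematic, here is a minimal sketch of a side-by-side log. It assumes each model is wrapped in a plain function that takes a prompt and returns a reply; the function names and the stand-in models are hypothetical, and connecting them to a real API is deliberately left out.

```python
import csv
from typing import Callable

# ASSUMPTION: each "model" is any function from prompt text to reply text.
Model = Callable[[str], str]


def run_side_by_side(prompts: list[str], models: dict[str, Model]) -> list[dict[str, str]]:
    """Send every prompt to every model and collect one row per prompt."""
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt}
        for name, ask in models.items():
            row[name] = ask(prompt)
        rows.append(row)
    return rows


def save_csv(rows: list[dict[str, str]], path: str) -> None:
    """Write the rows to a CSV file for later review; does nothing if empty."""
    if not rows:
        return
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Stand-in models so the sketch runs without any API access.
    fake_models = {
        "model_a": lambda prompt: prompt.upper(),
        "model_b": lambda prompt: prompt[::-1],
    }
    results = run_side_by_side(["draft a status update"], fake_models)
    print(results)
```

Reviewing the saved rows at the end of each week makes it easier to see where the differences actually show up, instead of relying on memory.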

#claude-3 #gpt-4 #anthropic #openai #model-comparison