Key insights of today’s newsletter:
This week Anthropic launched Claude 3.5 Sonnet and it is outperforming all the other state-of-the-art language models on common benchmarks.
Not only is the model more powerful, it’s also cheaper and faster to use, which is great news for those building generative AI applications.
The chat interface of Claude also got revamped. New features like Projects and Artifacts are looking to drastically improve the user experience.
↓ Go deeper (6 min read)
Have you met Claude yet? Claude is the AI chatbot of Anthropic and the name of the model that powers it. Anthropic, for those who don’t know, is one of the biggest if not the biggest rival of OpenAI. This week they announced their newest model: Claude 3.5 Sonnet. You can talk to it here.
For the longest time Claude wasn’t available in the Netherlands, but last week I got to try it for the first time. And boy, it’s impressive.
Better, faster, and cheaper
The new model is as smart as, or even smarter than, their heavyweight model Claude 3 Opus. It also outperforms competitor models from OpenAI, Google, and Meta on various benchmarks.
Of course, benchmarks are just benchmarks. They tell us something useful, but shouldn’t be conflated with intelligence in the broadest sense of the word.
I’d argue the most impressive thing about this release is the model’s speed and inference cost. Basically, Anthropic has built a very capable model that is both quicker and cheaper to use.
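To make "cheaper" a bit more concrete, here is a rough back-of-the-envelope sketch. The per-token prices are an assumption based on Anthropic's published list prices around launch ($3 / $15 per million input / output tokens for Claude 3.5 Sonnet, versus $15 / $75 for Claude 3 Opus), so check the current pricing page before relying on them. The model keys, the `request_cost` helper, and the token counts are hypothetical and only there for illustration.

```python
# Assumed list prices in USD per million tokens (verify against current pricing).
PRICES = {
    "claude-3-opus": {"input": 15.00, "output": 75.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a single request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request: a 2,000-token prompt with a 500-token answer.
opus_cost = request_cost("claude-3-opus", 2_000, 500)
sonnet_cost = request_cost("claude-3.5-sonnet", 2_000, 500)

print(f"Opus:   ${opus_cost:.4f}")
print(f"Sonnet: ${sonnet_cost:.4f}")
print(f"Opus costs {opus_cost / sonnet_cost:.1f}x as much per request")
```

Under these assumed prices the same request costs about a fifth as much on the new model, which adds up quickly for anyone running thousands of requests a day.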
Recently, price and speed improvements have become a bit of a trend, with GPT-4o, Gemini Flash, and now Claude. The focus on optimization rather than huge jumps in raw capabilities could indicate we’re hitting a point of diminishing marginal returns.
Artifacts and projects
But wait, there’s more! AI chatbot Claude, powered by the new model, got some upgrades too.
The most talked-about feature is probably Artifacts. It’s basically a side panel next to your chat window, where Claude can write and run code for you. What you end up with is an assistant that can not just talk, but also create tiny web applications. And for anyone who isn’t a programmer, that’s really, really cool.
Finally, Anthropic is introducing Projects; you can read the full announcement here.
Projects is a way to customize your own version of Claude, similar to OpenAI’s custom GPTs (but announced without the big fanfare or “app store for GPTs”). It allows you to ground Claude in your internal knowledge, be it style guides, codebases, or transcripts.
In addition, you can define custom instructions for each Project, telling Claude to adopt a certain tone of voice or assume a specific role or persona. As a conversation designer, I cannot wait to try this out myself.
It’s all about the vibes
Going off the benchmarks and the vibes online, Claude is clearly the new frontrunner. This is a major win for Anthropic and the timing couldn’t be better: OpenAI just dropped GPT-4o and is unlikely to release a newer, more capable model before the one currently in training, which may or may not turn out to be GPT-5.
In terms of user experience, people also seem appreciative of the efforts made by the Anthropic team to make Claude more fun and intuitive to use. When it comes to adoption, the importance of UX design is often overlooked.
At the same time, these upgrades aren’t the silver bullet for the systemic issues that have plagued language models: hallucinations, brittle reasoning, and cognitive biases like sycophancy. So, the general advice still holds true: use them wisely and at your own risk. Results may vary.
Get in touch 📥
Shoot me an email at jurgen@cdisglobal.com.
When I want to read clear, straight-to-the-point analyses with original perspectives on updates in the field of LLMs, and on everything related to the 'behind the scenes' of AI in general, Teaching computers how to talk is undoubtedly among my choices. I had been curious for a long time to know how Anthropic and its new model were attracting so much attention. Thanks for this analysis.
Artifacts may be the most interesting element here, as there has been so little UX innovation with generative AI thus far. This is surprising to say given the significantly improved performance and the fact that GPT-4o, which amazed so many people, has a legitimate rival along many dimensions. We now have very good general purpose models from OpenAI, Anthropic, and Google at increasingly competitive price points. The performance/quality improvements have expanded the practical use cases. As price falls further and enterprises get past some of their internal hang-ups, we are likely to see adoption acceleration.