A week in Generative AI: Claude 4, Google I/O & io
News for the week ending 25th May 2025
I’m starting to see a pattern. Twice a year, in May and November, we have weeks like this in AI where a huge number of new models, products, and capabilities are dropped by most of the leading frontier AI firms. I think it’s partly because we seem to have settled into a 6-monthly cadence of new capability releases that move things forward (Apple take note, an annual WWDC won’t cut it anymore, things are moving too quickly!) and partly because each of the frontier AI firms wants to disrupt and one-up the others. This is certainly the case with OpenAI and Google!
So this week we’ve seen Anthropic launch their next-generation models, Claude 4 Opus and Sonnet; Google hold their annual I/O conference, where they dropped a lot of new products; and OpenAI announce the acquisition of Jony Ive’s io design firm. Not much to cover then!
There’s also been plenty of news and reports on the Ethics front which I’ve sprinkled throughout my coverage of the big announcements. It’s also been reported that SAG-AFTRA have complained about the late James Earl Jones’ voice being used for Darth Vader in Fortnite (which he gave permission for in his will) and a report that AI could account for nearly half of datacentre power usage ‘by end of year’.
There are lots of Long Reads worth checking out this week too. Stratechery have a great piece on the challenges coming with the Agentic Web and how it will break the ad-funded economic model of the internet. There’s also a must-read from Ethan Mollick for anyone facing challenges with the organisational adoption of generative AI.
It’s been a big, important week in AI for lots of reasons, so take your time with this one!
Anthropic launches Claude 4 Opus & Sonnet
Ever since Anthropic released Claude 3.5 Sonnet back in June last year we’ve been waiting for this release. Sonnet was always the ‘middle’ model in the Claude family and we never got the 3.5 release of their larger, more capable Opus model. That changed this week with the release of Claude 4 Opus and Sonnet.
These two new models double down on Anthropic’s specialty in coding and organisational use, as well as their high-taste personality and creative writing skills. The headline feature of Claude 4 Opus is its ability to perform long-running tasks that require thousands of steps and take hours to accomplish. I don’t think this is as relevant now as it will be in six months’ time, once Anthropic (and others) have built out the frameworks and tools Opus will need to perform long-running tasks. There’s only so much time it can spend searching and researching on the internet without taking more meaningful actions!
Whilst Opus is the headline model, and the most capable at coding, it’s Claude 4 Sonnet that will do most of the heavy lifting when it comes to writing code as it’s cheaper and faster. I’ve been testing it out this weekend via Claude Code and I’ve seen significant improvements in practical everyday coding use over Claude 3.7 Sonnet. In all honesty, I was about to take the plunge and upgrade my ChatGPT subscription to the $200 per month Pro tier so that I could start using Codex, but I’m now planning on sticking with Claude Code - Claude 4 Sonnet is great! The new integration of Claude Code with VS Code is also particularly useful.
So, I’d say this has been a great launch from Anthropic that doubles down on their strengths and lays the groundwork for future agentic capabilities that will allow the models to perform long-running tasks once the scaffolding has been built out for them to do so. Below is some more coverage and commentary about this week’s release if you want to read more:
Anthropic’s new Claude 4 AI models can reason over many steps
A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4 AI model
Anthropic’s latest flagship AI sure seems to love using the ‘cyclone’ emoji
Anthropic’s new AI model turns to blackmail when engineers try to take it offline
Google I/O
Google announced a huge amount of new products and features at I/O this year. Similar to last year, below are my highlights:
Veo 3 is Google DeepMind’s answer to Sora and some of the other independent text-to-video labs. But this is the first time we’ve seen video and audio generation combined into one product. This is a great step forward, and deservedly got much of the social media attention, but it’s still only able to generate short clips so has limited real-world application. You also need a $250 per month Google AI Ultra subscription to access it.
Agent Mode hasn’t been released yet but is described as a new experimental capability where you can state an objective and Gemini will then use lots of different capabilities together (such as search, deep research, and integrations with Google apps) to achieve the objective. This means it will have to plan and manage complex multi-step tasks from start to finish with little human input. This sounds great, and I’m glad they’ve labeled it as experimental as I’m not sure we quite have the models to be able to achieve this level of sophistication yet. We probably will by the end of the year though.
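The plan-and-act pattern described above can be sketched as a simple loop. This is a purely hypothetical illustration of the general “agent mode” shape - the model repeatedly picks a tool until it declares the objective complete; the `decide()` interface and tool names here are invented, as Google hasn’t published how Agent Mode works internally:

```python
# Hypothetical sketch of an "agent mode" style loop. The decide() step and
# the tool names are invented for illustration only - Agent Mode's real
# internals are not public.

def run_agent(objective, decide, tools, max_steps=10):
    """Repeatedly ask the model for the next tool call until it says 'done'."""
    history = []
    for _ in range(max_steps):
        action, arg = decide(objective, history)
        if action == "done":
            break
        result = tools[action](arg)  # e.g. search, deep research, a Google app
        history.append((action, arg, result))
    return history

# Stub "model" and tools, just to show the shape of the loop:
def decide(objective, history):
    # Search once, then declare the objective complete.
    return ("done", None) if history else ("search", objective)

tools = {"search": lambda q: f"results for {q!r}"}
print(run_agent("find two flats to view this weekend", decide, tools))
```

In a real system the `decide()` step is the expensive part - a model call that has to plan across steps and recover from failed tool calls, which is exactly the sophistication I’m not sure current models reliably have yet.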
AI Mode is the search product Google would love to build if they weren’t so worried about destroying their traditional search business. AI Mode allows users to ask follow-up questions, deeply research, and compare products. It’s very much a reaction to ChatGPT’s growing search capabilities and Perplexity. AI Mode is only in the US for now and isn’t the default search experience, so it’s unclear how many users it will have. It has also raised quite a bit of controversy already, with news publishers calling it ‘theft’. It’s worth checking out Ben Thompson’s (Stratechery) commentary on this, where he goes into how the agentic web will largely break the ad-supported model of the internet.
Personalised Smart Replies will probably be the most practically useful new feature that Google announced for most people. It’s a vast improvement on Gmail’s smart replies: it can now write in your tone/style and also bring in information from across Google’s suite of productivity tools to help.
Project Aura is Google’s answer to Meta’s AI glasses that they partnered with Ray-Ban on. This seems to be quickly becoming the next big tech battleground and I’m sure we’ll see a similar pattern play out with Apple joining the party late, but ultimately cracking the design and user experience. Although with the current state of Apple Intelligence, they may need to partner on the AI smarts (which would be a very un-Apple like thing to do).
Here are some other good commentaries and coverage of everything Google announced at I/O, worth a read through if you want more detail on the implications of this week’s announcements:
Jony Ive to lead OpenAI’s design work following $6.5B acquisition of his company
Well, I did not see this one coming - colour me both excited and simultaneously sceptical!
There have been plenty of reports over the last 12 months of Jony and Sam working together to develop some AI hardware, and it’s clear that OpenAI is on course to become the next big consumer platform. I think what Sam is aiming for is something of a Google/Apple hybrid - a huge multi-billion user digital platform combined with a huge hardware user base and subscription/services business.
I (wrongly, along with many others) assumed that Jony was only providing consultancy, advice, and concepts for OpenAI’s hardware device. Turns out that he and his team are not only going to design and build that hardware device but, now that they’ve been acquired, will also become OpenAI’s internal design team across everything the company does.
The general vibe is that the new device they’re working on is screen-less and isn’t a wearable or smart glasses. This is what makes me sceptical, as I don’t believe the UI/UX solution we’re looking for with AI is screen-less, and I don’t think any screen-based device will replace or live alongside the smartphone - it’s here to stay.
But, OpenAI are (currently) the biggest and most successful AI company there is, and Jony and his team are the most aesthetically (IMHO) and commercially (objectively) successful group of designers that we’ve ever seen. The two coming together is incredibly exciting, and I’m actually more excited about what they can do to move the basic chat user interface on (which is sorely needed) and something a little bit longer term….
I don’t think Jony would have sold io to OpenAI just to make a personal AI device that may/may not be just like the failed Humane AI Pin and/or the Rabbit R1. My money is on this being the start of OpenAI reviving their robotics ambitions. I don’t think we’ll hear much about this until next year, but I think robotics is something that would get Jony re-invigorated and give him something new to really sink his teeth into.
Here’s to hoping!
Tech CEOs are using AI to replace themselves
This week both Klarna’s and Zoom’s CEOs used AI avatars to replace themselves during their quarterly earnings calls. This feels like a bit of a turning point.
Firstly, it’s worth commenting on the quality of the AI avatars, which is incredibly impressive. However, I’ve spent a bit of time with this technology myself, and you can tell that whilst the video of Siemiatkowski (Klarna) is AI generated, the voiceover is not. AI avatar technology is notoriously bad at accented English voices, and it’s clear that his voice was dubbed over the top of the video as the lip sync is a bit off. When you compare that with Yuan’s (Zoom) video, you can hear the deadpan effect on the voice, which is a clear giveaway that it is 100% AI generated.
Whilst both companies are trying to prove something by using AI avatars (Klarna has been notoriously bullish on AI, and Zoom is showing off its own technology), the fact that they’re being used on something as serious (and regulated) as an earnings call is telling. On one hand, the AI avatar was used for the opening remarks, which are always scripted in advance, so it makes some sense. On the other hand, these are the leaders of publicly listed companies sending a message that it’s OK for an AI avatar to stand in for you during an important call.
Genuinely not sure how I feel about this one.
AI Ethics News
Are Character AI’s chatbots protected speech? One court isn’t sure
AI could account for nearly half of datacentre power usage ‘by end of year’
Long Reads
Stratechery - The Agentic Web and Original Sin
One Useful Thing - Making AI Work: Leadership, Lab, and Crowd
Bloomberg - Anthropic Is Trying to Win the AI Race Without Losing Its Soul
The Guardian - The rise, fall and spectacular comeback of Sam Altman
Ignorance.ai - SEO for AI: A look at Generative Engine Optimization
Simon Willison - I really don’t like ChatGPT’s new memory dossier
“The future is already here, it’s just not evenly distributed.”
William Gibson