A week in Generative AI: Gemini, Sonnet & Lyria
News for the week ending 22nd February 2026
We’ve had a week of small releases from the big players, with Gemini Pro 3.1 and Sonnet 4.6 both introduced. Google also released Lyria 3, its music-generating model, and Anthropic released an interesting research study into Agent Autonomy.
In Web 4.0 news, Perplexity has decided it doesn’t think advertising in AI platforms works after all, and in Ethics News we have a now-recurring theme around how AI impacts mental health, and how companies are “AI-washing” mass layoffs.
In Long Reads, Ethan Mollick has an update to his regular Guide to Which AI to Use in the Agentic Era and there’s a good article from Benedict Evans on How will OpenAI compete?
Google’s new Gemini Pro model has record benchmark scores - again
On Thursday Google bumped its Gemini Pro model up to version 3.1, and it looks like it’s a big step up in capability. Google is highlighting its reasoning capabilities and says that it is “a smarter, more capable baseline for complex problem-solving”.
Anthropic releases Sonnet 4.6
A small bump in model number for Anthropic’s Sonnet (mid-tier) model, which is the default for free and pro plan users. It doubles the context window to 1m tokens, which is a big help for coding tasks, and Anthropic claim that early testers preferred it to their flagship Opus 4.5 model.
Anthropic are squarely aiming the release of Sonnet 4.6 at enterprise users, highlighting its improvements in Computer Use and automating “economically valuable office tasks”.
It’s becoming increasingly clear that Anthropic is focused on building out the infrastructure that enterprises will need to fully embrace agentic coworkers in the near future, and their Sonnet 4.6 model is a big stepping stone in that direction.
Google introduces Lyria 3
This latest music-generating model from Google looks much more steerable and sophisticated than previous offerings and is one of the few features available in Gemini that ChatGPT doesn’t have. I wonder if we’ll see OpenAI add music generation to its platform this year?
TBH, I’m not sure how I feel about music-generating models, in the same way I’m unsure about image- and video-generating models. I think they will be amazing tools and great for creative workers to quickly play with new ideas and mock things up, but I desperately don’t want them to replace the skills, craft, and artistry that exist in our world. This is probably how people felt about painting when cameras were invented, and what we saw was an evolution of painting and the birth of a new art form in photography. I hope we see the same sort of thing play out this time too.
Anthropic’s Agent Autonomy study
Really interesting insights here from Anthropic on how people are using Claude Code. Unsurprisingly it’s mostly used for software engineering, but as time goes on and people have found other uses for it, we’re seeing new use cases pop up, which I suspect was the catalyst for Anthropic to release Cowork, which is powered by Claude Code.
It’s also interesting to see how long Claude Code works for before stopping (c. 40 minutes, at a 99.9% success rate). For anyone who’s seen me talk about METR’s research that measures AI agents’ ability to complete long tasks, this is a great comparison point. I usually present the 80th percentile graph (at an 80% success rate - see below) which shows how AI agents have been improving exponentially over the past few years in this regard.
I think we’re soon going to be seeing agents that can successfully (99.9% reliable) complete software engineering tasks for over an hour, and we’ll soon start to see that trickle out to other types of tasks like the ones Anthropic highlighted in their research.
Web 4.0
AI Ethics News
Mind launches inquiry into AI and mental health after Guardian investigation
There’s a New Term for Workers Freaking Out Over Being Replaced by AI
Claims that AI can help fix climate dismissed as greenwashing
Race for AI is making Hindenburg-style disaster ‘a real risk’, says leading expert
A roadmap for evaluating moral competence in large language models
Altman and Amodei share a moment of awkwardness at India’s big AI summit
Long Reads
One Useful Thing - A Guide to Which AI to Use in the Agentic Era
Stratechery - Thin Is In
Benedict Evans - How will OpenAI compete?
New York Times - The A.I. Disruption Has Arrived, and It Sure Is Fun
“The future is already here, it’s just not evenly distributed.”
William Gibson







