A week in Generative AI: Claude 3, Apple WWDC, & The AI Mirror Test

News for the week ending 31st March 2024

Mar 31, 2024

cyberpunk news presenter, male, long hair, behind desk, facing camera, in clean pixel art style, neon colors --ar 16:9

A slower AI news cycle this week with no major new announcements. Still, Claude 3 now tops the Chatbot Arena Leaderboard, Apple announced an AI-focused WWDC, OpenAI released some new Sora videos made with professional filmmakers and artists, and an interesting AI mirror test was shared on Twitter.

A screenshot of the LMSYS Chatbot Arena leaderboard showing Claude 3 Opus in the lead against GPT-4 Turbo, updated March 26, 2024.

Claude 3 surpasses GPT-4 on Chatbot Arena for the first time

This is great to see - the first time that another model has bested GPT-4 in the Chatbot Arena since the arena was launched nearly a year ago in May 2023. The Chatbot Arena is the best comparison of chatbots as it pits them against each other head-to-head and gives each one an Elo rating, like the ratings used in competitive chess.

Claude 3 Opus beats the latest GPT-4 model by only a couple of points, so this reinforces my view that Claude 3 is a “GPT-4 class” model and doesn’t represent the next generation of models we’re likely to see towards the end of this year. It’s rumoured that OpenAI will release a GPT-4.5 this summer, so it will be interesting to see if that retakes the crown!
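To put a gap of a couple of points in context, here is a minimal Python sketch of the textbook Elo formulas. The ratings used are made-up illustrative numbers rather than real leaderboard values, the K-factor of 32 is an assumption, and the Chatbot Arena's actual rating computation may differ from this simplified version.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (roughly, win probability) of A against B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one head-to-head result.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a tie.
    k is the K-factor; 32 is a common default and an assumption here.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating is conserved.
    return rating_a + delta, rating_b - delta


# Hypothetical ratings two points apart (not the real leaderboard numbers).
leader, runner_up = 1202.0, 1200.0
win_chance = expected_score(leader, runner_up)  # roughly 0.503
```

Under these assumptions, a two-point lead means the leader is expected to win only about 50.3% of head-to-head votes, which is why such a small gap reads as "same class" rather than a generational jump.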

Source

Apple WWDC 2024 promises to be (A)bsolutely (I)ncredible

On Monday, Apple announced their Worldwide Developers Conference (WWDC) right on time. Greg Joswiak, Apple’s SVP of Marketing, teased on Twitter that the event would be “absolutely incredible”, a not-so-subtle hint at what the focus of the event will be.

I’m excited for this - Apple are rumoured to be announcing the most radical overhaul of iOS since it was launched, which, coupled with a big focus on AI, could mean we’re going to see the first mobile operating system truly built for the AI era. I don’t think Apple will go that far, but hopefully what we’ll see is a significant stepping stone in that direction.

Source

Sora: first impressions

OpenAI released the first few short films that it’s created using Sora with some professional filmmakers and artists. None of these videos will blow you away, and there’s nothing here that couldn’t be created with other technologies, but they were probably made in a fraction of the time and at a fraction of the cost.

Source

The AI Mirror Test

This is a really interesting little test that Josh Whiton did last week. His write-up on Twitter is worth a read and he makes a good point: even if AI isn’t truly self-aware, it’s very good at closely replicating self-awareness. And in practice, is there really any difference?

Source

AI Ethics News

Long Reads

MIT Technology Review - Large language models can do jaw-dropping things. But nobody knows exactly why.

“The future is already here, it’s just not evenly distributed.”
William Gibson
