A week in Generative AI: Udio, ChatGPT and Humane
News for the week ending 14th April 2024
The ChatGPT moment for music generation has arrived with the public launch of Udio, I was lucky enough to attend Meta’s AI Day in London, and we saw the release of three new LLMs in 24 hours. Oh, and Humane’s AI Pin finally launched, and it has not been well received. It’s a bit of a stinker by all accounts…
Udio is the ChatGPT moment for GenAI music
This week seems to be the ChatGPT moment for GenAI music, with the public launch of Udio, a music creation app that allows you to generate music in your favourite styles with intuitive and powerful text-prompting.
The launch immediately overwhelmed Udio’s servers, which led to the company posting an amusingly musical post on Twitter announcing that their website was down.
I haven’t been able to play around with Udio much yet, but I’ve seen lots of examples shared and most commentators seem to be incredibly impressed with its capabilities. Worth checking out!
Meta AI Day London
I was lucky to attend Meta's AI Day in London earlier in the week where the future of AI at Meta was brought to life.
Joelle Pineau, VP of AI Research, kicked off the day by highlighting the incredible strides Meta has made in AI, with AI now removing 95% of harmful content before it even reaches users. The commitment to innovation, an open approach via tools like PyTorch and Llama, and the emphasis on responsible innovation set the tone for the day.
There was then an AI panel discussion, moderated by the insightful Dr A-Marie, who brilliantly hosted Meta's top brass including Chris Cox, Yann LeCun, Joelle Pineau, and Nick Clegg. The panelists delved deep into how AI is being integrated into Meta's products, used by billions of consumers globally, and the importance of open source AI development.
The day then finished with a keynote presentation from Yann LeCun where he presented his vision for Objective-Driven AI Systems. The key takeaways from this were that there are some fundamental capabilities missing from today's auto-regressive LLMs:
- They are not grounded in the real world
- They are not up-to-date
- They are not good at reasoning or planning
In Yann LeCun's words: LLMs suck!
What's needed to deliver general artificial intelligence is an Objective-Driven AI system that is fundamentally different from LLMs. These systems have a world model, built from multimodal sensory data, that is constantly learning in the same way humans do (it's a bit more complicated than that!).
Here’s a link to a version of Yann's talk on YouTube as well as his excellent interview last month with Lex Fridman for those who want to learn more.
Google announces lots of AI tools and services at Google Cloud Next ‘24
Lots of news from Google this week from their Cloud Next event. In his introductory video, Sundar Pichai emphasised the deep investments in AI infrastructure that they’ve been making for the last 10 years. Google undoubtedly has the biggest global AI infrastructure, one of the key battlegrounds between the frontier AI companies.
After a surprising preview earlier in the year, Google launched Gemini 1.5 Pro, which has a huge 1M-token context window. Google also talked a lot about grounding their GenAI models with Google Search, and how they enable this in enterprises with retrieval-augmented generation (RAG).
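The core idea of RAG can be shown in a few lines: retrieve the passages most relevant to the user's question, then stuff them into the prompt so the model answers from grounded context. Below is a minimal, illustrative sketch, not Google's implementation; the bag-of-words "embedding" and cosine ranking stand in for a real vector index, and the LLM call is left as a placeholder prompt.

```python
# Minimal RAG sketch: toy retriever + prompt assembly (illustrative only).
from collections import Counter
import math

def vectorise(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by similarity to the query and keep the top k."""
    qv = vectorise(query)
    return sorted(documents, key=lambda d: cosine(qv, vectorise(d)), reverse=True)[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Ground the model by placing retrieved passages into the prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Gemini 1.5 Pro has a 1 million token context window.",
    "Udio is a music generation app.",
    "PyTorch is an open source ML framework.",
]
print(build_prompt("What is the context window of Gemini 1.5 Pro?", docs))
```

In production the retriever would be an embedding model plus a vector database (or, in Google's grounding case, Google Search), but the prompt-assembly step is essentially this.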
Google also shared a lot of demos of Gemini integrated with Google Workspace, Gemini Code Assist, and Looker, amongst others. This is where the huge context window of Gemini 1.5 Pro really shines, as it’s able to include a huge amount of data when answering questions.
ChatGPT gets a mysterious upgrade
Within 24 hours this week, Google released Gemini 1.5 Pro, Mistral released their latest mixture of experts model, and OpenAI announced a mysteriously vague update to ChatGPT, which is now powered by their latest GPT-4 Turbo model.
This isn’t the expected release of GPT-4.5 but does return OpenAI to the top of the Chatbot Arena Leaderboard, having lost their number one spot to Anthropic’s Claude 3 a month ago.
The new release was described on Twitter by various OpenAI employees simply as ‘smarter’, ‘more pleasant to use’, or ‘majorly improved’. No other details were shared.
I think as the market matures we’ll start to see more point releases (e.g. GPT-4.1) as opposed to just major (GPT-5) and semi-major (GPT-4.5) releases, which will hopefully come with more details on the improvements made.
The good, the bad, and the Humane Pin
Humane’s AI Pin is the first genuine standalone AI device to launch, and unfortunately it’s not good. The reviews have been scathing, with lots of bugs and head-scratching as to what the point of it really is.
I’m not sure the future of AI hardware is a separate device from a mobile phone as I think most AI use cases can be delivered by new mobile phone software that we’ll start to see a glimpse of at Apple’s WWDC this June. Let’s see…
AI Ethics News
‘Time is running out’: can a future of undetectable deepfakes be avoided?
Meta - Our Approach to Labeling AI-Generated Content and Manipulated Media
OpenAI and Google reportedly used transcriptions of YouTube videos to train their AI models
A new bill wants to reveal what’s really inside AI training data
OpenAI fires key researchers for allegedly leaking information
Long Reads
Stratechery - Gemini 1.5 and Google's Nature
One Useful Thing - What just happened, what is happening next
“The future is already here, it’s just not evenly distributed.”
William Gibson