A week in Generative AI: Computer Use, Apple Intelligence & Spirit LM
News for the week ending 27th October 2024
There are three big things in generative AI this week that I think are worth paying attention to. The first is the set of announcements from Anthropic, who updated their flagship Claude 3.5 Sonnet model to a new version and released Computer Use and an Analysis Tool. The second is Apple’s announcement that they’ll be releasing iOS 18.1 tomorrow, which will include their first Apple Intelligence features, before we see some of the bigger features released in iOS 18.2 in December. Lastly, Meta released Spirit LM, their take on Advanced Voice Mode, which brings emotionally expressive voice generation to the open source community.
In Ethics News, thousands of artists have signed a letter warning of the risks AI poses to the creative industries, OpenAI has hired its first Chief Economist to help them understand the economic impact of AI, and there’s a lawsuit blaming Character.ai for the death of a 14-year-old boy.
I also encourage you all to check out Ethan Mollick's great coverage of Claude’s new Computer Use feature in the Long Reads section.
Enjoy!
Anthropic introduces computer use, a new Claude 3.5 Sonnet, and an analysis tool
It’s been a big week for Anthropic with some big announcements and releases:
New Claude 3.5 Sonnet
First off, I’m not sure why Anthropic decided against updating the version number of Claude and instead just called this model ‘new’ Claude 3.5 Sonnet. For my money, this update does merit a bump in version number (not to 4, but maybe a 3.7?!) as it represents significant progress in both reasoning and coding (amongst others). It’s not o1 levels of reasoning, but it’s great progress, and these improved reasoning capabilities are a big part of what powers Computer Use.
Computer Use
This is an incredibly interesting feature, and it points towards how models like Claude will become capable of performing tasks in the future, not just answering questions. Essentially, Computer Use is a step towards Claude being able to use computers just like a human does: looking at the screen to decide what to do, then clicking and typing as appropriate.
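To make that look, decide, act loop a bit more concrete, here is a rough sketch in Python. It is purely illustrative and is not Anthropic's demo code or API. The FakeScreen class, the decide function and the run_agent function are all hypothetical names I've made up, and the "model" is a hard-coded stub rather than a real call to Claude.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    target: str = ""   # which on-screen element the action applies to
    text: str = ""     # text to enter, for "type" actions


@dataclass
class FakeScreen:
    """Hypothetical stand-in for a real desktop: one search box, one button."""
    search_text: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would capture pixels; here we return the state directly.
        return {"search_text": self.search_text, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type" and action.target == "search_box":
            self.search_text = action.text
        elif action.kind == "click" and action.target == "search_button":
            self.submitted = True


def decide(observation: dict, goal: str) -> Action:
    """Hypothetical stub for the model: read the screen, pick the next action."""
    if observation["submitted"]:
        return Action("done")
    if observation["search_text"] != goal:
        return Action("type", target="search_box", text=goal)
    return Action("click", target="search_button")


def run_agent(screen: FakeScreen, goal: str, max_steps: int = 10) -> int:
    """Observe, decide, act, repeat. Returns the number of actions taken."""
    for step in range(max_steps):
        action = decide(screen.screenshot(), goal)
        if action.kind == "done":
            return step
        screen.apply(action)
    return max_steps


screen = FakeScreen()
steps = run_agent(screen, "weather in London")
print(steps, screen.log)
```

In this toy version every single action needs a fresh look at the screen and a fresh decision. If the real feature works in a broadly similar way, each of those steps would presumably be a full round trip to the model.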
I’ve played around with the demo code that Anthropic have provided and Computer Use is an interesting concept, but this is very early proof of concept work, far from a fully fledged product feature. To start, it’s very slow at interacting with a computer, which makes you wonder why you would use this feature in the first place. I guess the use case is you ask Claude to go and do a task for you and then come back later to see how it’s done?
The next big issue I found with Computer Use is that it just can’t navigate all the privacy pop-ups, cookie notices, and terms and conditions that interrupt the browsing experience so much outside of the US. This feature obviously hasn’t been too extensively tested or fine-tuned across the EU and this could be a show stopper in the future.
The last observation I have is that whilst this is an interesting proof of concept, and allows Claude to interact with sites/services/apps that don’t have APIs, it’s currently too slow and unreliable. It’s obvious to me that a much better solution will be for sites/services/apps that want AI agents (or digital companions as I like to call them) to interact with them to build out better APIs. This is something I covered extensively in my Beyond Chatbots post on Integrations. This might be impractical for many sites/services/apps in the short term (which is why Computer Use exists), but I believe that as digital companions become more commonplace, competitive pressures will drive businesses to develop better APIs for them.
Analysis Tool
The new analysis tool is available for all Claude users in feature preview and essentially brings Claude closer to ChatGPT’s data analysis capabilities. It looks very slick and well designed. I haven’t had a chance to test it yet, but it’s great to see Claude catching up with some of ChatGPT’s capabilities. Next on my wishlist is memories!
Apple to release iOS 18.1 with the first ‘Apple Intelligence’ features
Tomorrow Apple will be releasing iOS 18.1, which will see the debut of some of their new Apple Intelligence features. These will include Writing Tools, Notification Summaries, and Clean Up in the Photos app.
This marks the start of a slow, deliberate roll out of Apple Intelligence features over the next 6-9 months, which Craig Federighi explains in the video above. I like this approach, which will hopefully allow Apple to avoid some of the issues other big tech companies have had when quickly releasing new generative AI features, and help build consumers’ confidence in the technology.
The bigger ticket items, such as Image Playground, Genmoji, and Siri with ChatGPT, will be coming in the next release, iOS 18.2, which will likely come out in mid-December.
Spirit LM: Meta's AI division paves the way for its own Advanced Voice Mode
If you’ve seen one of my talks on how generative AI technologies will influence the future of marketing, you’ll know that I’m big on the idea of ‘Emotional Experience’ becoming a new marketing discipline over the next few years.
There are two versions of Spirit LM that Meta has released: a basic version and an expressive version. The expressive version adds pitch, style and emotional expression to the voice model, which it can use to maintain the mood of its speech output.
Spirit LM is a great example of how increasingly advanced voice models are able to express emotions to the end user. I think that this capability, coupled with models’ ability to read and understand the emotions of users, will usher in a whole new suite of tools that will improve how we interact with technology.
AI Ethics News
Thom Yorke and Julianne Moore join thousands of creatives in AI warning
OpenAI disbands another team focused on advanced AGI safety readiness
His daughter was murdered. Then she reappeared as an AI chatbot.
ByteDance intern fired for planting malicious code in AI models
Long Reads
One Useful Thing: When you give a Claude a mouse
Ignorance.ai: AI's invisible instructions
“The future is already here, it’s just not evenly distributed.”
William Gibson
What do you think about Character.ai’s part in someone’s death? Does this put the developers of Character.ai in the spotlight, or does it highlight the danger of AI as a whole?