A week in Generative AI: OpenAI, 2024 & Robots
News for the week ending 5th January 2025
Welcome back and hello 2025! After such a frantic end to 2024, it’s no surprise that it’s been a slow start to 2025 on the GenAI news front. However, a few things shared right at the end of last year are worth commenting on, and there have also been a couple of great reviews of 2024 from Simon Willison and Jim Fan that I wanted to share.
On the Ethics front, there’s a report estimating that Microsoft will spend a whopping $80bn on AI infrastructure this year alone, and research looking at how AI may soon manipulate people’s online decision-making. Being in the advertising industry, I’d argue that started happening a long time ago, but that’s a post for another time!
In Long Reads there’s also a great article from Francois Chollet, the creator of the ARC-AGI benchmark, on o3’s amazing performance against it. I’ve also shared Anthropic’s article on how to build effective agents, which I’m sure we’ll see much more of this year!
OpenAI confirms plans to become a for-profit company
OpenAI’s move to a for-profit company has long been rumoured and on the cards. I’m a little cynical about the timing of the announcement - the news was dropped between Christmas and New Year - and it’s certainly a controversial decision.
On one hand, I do understand the reasoning: developing cutting-edge AI capabilities is an incredibly expensive business, and pivoting to a for-profit company will allow OpenAI to raise far more investment than it can under its current corporate structure, which is a bit of a mess.
OpenAI’s new corporate structure will be a public benefit corporation, which can make a profit but has a stated mission to ‘produce a public benefit‘. However, for me this is a little loose, and I think a new approach to corporate structures needs to be worked out for large frontier AI companies that are capital-hungry but also have the potential to severely disrupt society if they succeed in their mission.
Simon Willison’s Things we learned about LLMs in 2024
This is a fantastic post from Simon Willison that does a great job of summarising everything we learnt about GenAI technology throughout 2024. Below are some of the highlights:
The GPT-4 Barrier - we went into the start of the year with no one outside of OpenAI able to build a model with the capabilities of GPT-4. We exited the year with around 70 models that rank higher than the original GPT-4, some of which can run locally on a laptop.
Costs crashed - Not only did capabilities across the board improve, but prices dropped significantly too, with today’s state-of-the-art models being 12x cheaper than the state-of-the-art models of a year ago. This also means the environmental impact of these models has been greatly reduced (although there are still lots of challenges on this front).
Multimodal is the future - We barely had multimodal GenAI models in 2023 and we still haven’t seen their full capabilities or worked out all the amazing use cases they will have. Models that can see, hear, and speak in real time will deliver a fundamental shift in how we interact with technology. Everyone should give the voice and live video features of ChatGPT and Gemini a try - it’s science fiction come to life.
Freemium access is likely dead - We have probably seen the last of free access to state-of-the-art models as the capital costs of building them start to bite. There was a wonderful time this year where anyone could access the best models for free, with usage limits, but that seems to have passed, which is very sad.
‘Agents’ haven’t happened yet - The term ‘agents‘ is badly defined and needs more nuance (this is one of the reasons why I wrote a whole series of posts about Digital Companions), and while we saw a lot of promise throughout 2024, the technology just isn’t reliable enough yet to be widely deployed. Reliability is probably at around 80% for frontier agentic models, but it likely needs to reach 99% before these systems can be publicly released and consumers can start to trust them.
Reasoning models are the new frontier - Models like o1, o3, and others are the new kids on the block, and they are where we will see the most (and fastest) progress this year. They are the models that will close the reliability gap to 99% and power true ‘agent’-like experiences for consumers, as well as drive new approaches to scientific research.
Education is more important than ever - There is a huge gap in knowledge and experience between those that regularly use GenAI models and those that don’t. I fear this will become even more of an issue as free access to state-of-the-art models disappears and a big gap opens up between what’s available to most people (for free) and what’s available to those that can, or are willing to, pay.
There’s lots of other great commentary from Simon in his post, so I highly encourage you to check it out!
Jim Fan’s Thoughts on 2024
Jim Fan (of NVIDIA fame) also published his reflections on 2024, which I wanted to cover here. Below are some of the highlights:
Robot Hardware - we’ve seen amazing progress in robotics in 2024 and we are highly likely to be the last generation of humanity to grow up without advanced robotics being everywhere. This will have profound implications.
Computer Hardware - compute has continued to scale at pace and shows no signs of slowing down, and quantum computing is quickly becoming more practical, which means we’re not going to see a slowdown in processing power anytime soon.
World Models - the Sora and Veo text-to-video models are the start of a new type of model that learns how the real, physical world works. This will lead to more intelligent models, an acceleration in robotics, and the rise of incredibly realistic interactive experiences.
Language Models - the user interfaces for frontier LLMs hugely lag behind the capabilities of the models themselves. A huge amount of UI/UX design work is needed to fully unlock the potential of cutting-edge GenAI technology (a sentiment also echoed by Simon Willison in his post).
Jim then ends with a great quote from Edward O. Wilson, a Harvard professor and Pulitzer Prize-winning author, that I really have to share:
“The real problem of humanity is the following: We have Palaeolithic emotions, medieval institutions and godlike technology”
Again, all of Jim’s thoughts are well worth a read and are split into three parts on LinkedIn below - check them out:
Unitree B2-W extreme testing
This impressive video was released at the end of last year by Unitree, a Chinese robotics company. Their B2-W looks incredibly rugged, versatile, and strong enough to carry a person! It also shows amazing balance, both on all four limbs and on just two of them.
This is a great demo that shows a slightly different side of robotics from the usual humanoid robot demos that have been doing the rounds over the last year - it’s great to see a robot that can move at speed and perform some impressive manoeuvres!
AI Ethics News
Microsoft to spend $80 billion in FY’25 on data centers for AI
Meta is killing off its own AI-powered Instagram and Facebook profiles
OpenAI failed to deliver the opt-out tool it promised by 2025
AI tools may soon manipulate people’s online decision-making, say researchers
ChatGPT search tool vulnerable to manipulation and deception, tests show
How AI revolution could help benefits appeals and landlord disputes
Long Reads
Francois Chollet - OpenAI o3 Breakthrough High Score on ARC-AGI-Pub
Simon Willison - Things we learned about LLMs in 2024
Jim Fan - Thoughts on 2024
Anthropic - Building effective agents
“The future is already here, it’s just not evenly distributed.”
William Gibson