Darwin Gödel Machines, the Gentle Singularity and V-JEPA 2 – Live and Learn #70

Welcome to this edition of Live and Learn. This time with a novel design for self-improving AI coding agents, an essay by Sam Altman about how AGI is already here, an interactive world-generation model by Odyssey, and Meta's new V-JEPA 2 world model. As always, I hope you enjoy this edition of Live and Learn.
✨ Quote ✨
If you want to do anything interesting, and probably weird, you will immediately be besieged, from within and without, by tedious people and tired ideas.
– Nikhil Suresh (source)
Links
The Gentle Singularity by Sam Altman. According to Sam Altman, the singularity has begun. With current techniques and LLMs, we have kick-started the path to superintelligence, and so far it hasn't been as disruptive as science fiction novels told us it would be. It's a gentle singularity. Yet the pace is about to pick up, and we are going to see even more capable models very soon. I am not sure how much of this is marketing and stoking the hype cycle, but in the end, the last few years have indeed been quite wild in terms of the capabilities our computers have gained. Let's hope that future change keeps being that "gentle".
Darwin Gödel Machines by Sakana AI. The research that Sakana AI has been putting out lately has been nothing short of nuts. First their AI Scientist, then the invention of Continuous Thought Machines, and now this. They are not yet a household name like OpenAI, Google, or Anthropic, but I can totally see a future where they "accidentally" stumble into a new paradigm that leads to a rapid takeoff scenario. Darwin Gödel Machines might be the first glimpse of such a paradigm shift. Sakana AI essentially built a self-improving coding agent that uses an evolutionary algorithm to search through different adaptations of its own source code. The new code is generated by a foundation model, like Claude 4 or Gemini 2.5. The Darwin Gödel Machine then evaluates those adaptations and recursively folds the best ones back into itself, increasing its problem-solving capabilities by adding code for better tooling, workflows for prompting the foundation model, and so on. This produces an ever-improving lineage of coding agents, where each generation has slightly better SWE-bench scores. The scary part is that this system is showing real signs of recursive self-improvement, but also many of the problematic behaviours that the AI safety community has been warning about for years. Sure, they only run this in secure sandboxed environments, strictly disconnected from the internet, but I wonder whether it is still possible for such a piece of self-improving code + LLM to "escape" through some flaw in the security. Let's hope this doesn't change the gentleness of the takeoff that Sam Altman seems to believe in.
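To make that loop a bit more concrete, here is a tiny toy sketch of how I picture the core of it. All names and the stand-ins for the foundation model and the benchmark are mine, not Sakana's; the real system rewrites a full coding agent and scores each child on SWE-bench inside a sandbox.

```python
import random

def propose_modification(source_code: str) -> str:
    """Stand-in for the foundation model (e.g. Claude or Gemini) rewriting the
    agent's own source code; here it just appends a pretend improvement."""
    return source_code + f"\n# tweak {random.randint(0, 9999)}"

def evaluate(source_code: str) -> float:
    """Stand-in for running the modified agent on a coding benchmark like SWE-bench."""
    return random.random() + 0.01 * source_code.count("# tweak")

def darwin_goedel_loop(seed_code: str, generations: int = 20):
    # Archive of (agent source, score). The real system keeps an open-ended
    # archive rather than only the single best agent, so that "stepping stone"
    # agents survive and can be branched from later.
    archive = [(seed_code, evaluate(seed_code))]
    for _ in range(generations):
        parent_code, _ = random.choice(archive)          # pick a parent to mutate
        child_code = propose_modification(parent_code)   # self-modification via the LLM
        child_score = evaluate(child_code)               # empirical validation
        archive.append((child_code, child_score))        # keep it for future branching
    return max(archive, key=lambda pair: pair[1])        # best agent found so far

best_code, best_score = darwin_goedel_loop("def solve(task): ...")
print(f"best score so far: {best_score:.3f}")
```

The interesting part is the shape of the loop, propose, evaluate, keep an open-ended archive: the agent doesn't have to prove that a modification is an improvement, it just has to demonstrate it empirically.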
Interactive Real Time AI Video by Odyssey. This company first built a portable 3D scanning setup with cameras that reminded me of Death Stranding and then used it to collect a lot of real-world 360° video data. They then used this dataset to train a novel world model. This world model can now generate interactable video, i.e. you can control the direction the camera moves, and the AI generates the video as you move around. For now, this looks like a toy, like a "glitchy dream", especially compared to the quality of other video models like Veo 3, but the fact that it is interactive and can produce real-world scenes, not just gameplay scenes like Muse or Genie, is interesting. The people at Odyssey claim that this will eventually become an entirely new form of entertainment. Something else, something between videos and video games, the first version of the holodeck. Let's see ^^
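If you squint, the core loop behind something like this is probably just: read the user's camera input, feed it to the world model together with the frames so far, get the next frame back, render, repeat. A toy sketch of that loop, with a dummy stand-in for the model (purely my guess at the general shape, not Odyssey's actual system):

```python
import numpy as np

def predict_next_frame(history, action, rng):
    """Dummy stand-in for the learned world model: shifts the last frame a little
    in the direction of the action and adds noise, just to make the loop concrete."""
    shift = {"forward": (1, 0), "turn_left": (0, -1), "turn_right": (0, 1)}[action]
    prev = history[-1]
    nxt = np.roll(prev, shift, axis=(0, 1)) + 0.05 * rng.standard_normal(prev.shape)
    return np.clip(nxt, 0.0, 1.0)

rng = np.random.default_rng(0)
history = [rng.random((64, 64, 3))]                       # starting frame, toy resolution

for step in range(30):                                    # in the real thing: a real-time render loop
    action = "forward" if step % 2 == 0 else "turn_left"  # stand-in for live camera input
    frame = predict_next_frame(history, action, rng)      # model conditions on history + action
    history.append(frame)                                 # the generated video so far
```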
V-JEPA 2 by Meta. Meta has announced their newest world model, which can anticipate and predict actions from videos. It can analyze footage and understand what is going on: the causal relationships and the physics at play. From this understanding, V-JEPA 2 can even extrapolate what will happen next. This is very useful for developing more advanced robotics, and Meta made the weights of the model openly accessible. What they are aiming for is to provide a basic understanding of the physical world and how it behaves. Something that every 5-year-old has, but that is hard to build into AI. Things like object permanence, the expectation of how gravity works, or how heavy things are based on what they look like, and so on. All these physical intuitions that we often take for granted are what make robotics hard and lead to Moravec's Paradox. With V-JEPA 2, Meta has moved a little closer to the goal of giving robots a robust physical understanding of the real world from video data alone. Meta has more information in their paper and their blog post announcement, too.
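The core trick of JEPA-style models, as far as I understand it, is that the prediction happens in representation space rather than pixel space: the model has to guess the embedding of what comes next, not every pixel of it. A heavily simplified toy sketch of that objective (made-up shapes and modules, not Meta's actual architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

embed_dim = 256
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, embed_dim))         # context encoder
target_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, embed_dim))  # in practice an EMA copy of the encoder
predictor = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.GELU(), nn.Linear(embed_dim, embed_dim))

past_frames = torch.randn(8, 3, 16, 16)    # a batch of observed (toy) frames
future_frames = torch.randn(8, 3, 16, 16)  # the frames that come next

pred = predictor(encoder(past_frames))     # predicted embedding of the future
with torch.no_grad():
    target = target_encoder(future_frames) # target embedding, no gradients flow here

loss = F.mse_loss(pred, target)            # the loss lives in latent space, not pixel space
loss.backward()
```

Not having to reconstruct every pixel is presumably a big part of why this kind of model can focus on the "what happens next" physics instead of the texture of every leaf.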
🌌 Travel 🌌
Everything keeps breaking on my trip: my phone, my smart glasses, my drone batteries... and most of all my bicycle 🙈 One day I am climbing up into the beautiful páramo of Ecuador, the next I have to stop for a couple of days because the aluminum of my luggage rack has snapped. Not to mention the constant barrage of flat tires I experience. But I fix things and then get back into the mountains, climbing up and up through the beautiful landscapes. But it's never more than a couple of days before the next thing breaks. And then it's back down again, hitchhiking in a jeep and taking a bus to the nearest bigger town, trying to get it fixed. Then I get sick and have to rest for a week, lying in bed all day with a sore throat and a fever. It feels like I am not cycling at all, but hey, better to be sick here than camping somewhere on a mountaintop.
But I'm finally better now and will continue on the TEMBR, and I look forward to climbing up into the mountains once again 😊 There's a beautiful section with windswept landscapes waiting at the top, and I'm excited to finally get there 🙃 Let's hope that I am done with repairs for the next month and can cross Ecuador without any further problems. There might even be the option to cycle with another traveler for a couple of days, and the thought of like-minded company makes me happy ^^
And the landscapes and encounters on the road make it all worth it in the end. Pictures can't capture the gargantuan scale of the scenery. Neither can they capture the feeling of pride and accomplishment and sheer bliss that I feel when looking back down into the valley, following the winding road with my gaze and thinking to myself: "wow, I cycled up all this way, how incredible".
🎶 Song 🎶
#2 by Nils Frahm
That's all for this time. I hope you found this newsletter useful, beautiful, or even both!
Have ideas for improving it? As always please let me know.
Cheers,
– Rico
Subscribe to Live and Learn 🌱
Join the Live and Learn Newsletter to receive updates on what happens in the world of AI and technology every two weeks on Sunday!