🕓 3 min ✏️ Published on September 8, 2024

Diffusion Models as Game Engines, 100M Context Windows and AlphaProteo – Live and Learn #50

Welcome to this edition of Live and Learn. This time with a way to distribute the training of AI models across many people and machines over the internet, a new model that can design proteins that bind to specific target sites, and the announcement of gargantuan context windows. As always, I hope you enjoy this edition of Live and Learn!

✨ Quote ✨

The solar panel is wizardry manifest. It literally prints energy from free shit that falls out of the sky.

– Ben James (source)

🖇️ Links 🖇️

100M Context Window by Magic. There is a new approach out there called LTM (Long-Term Memory) models. These models are developed by a company that is literally called "Magic". The idea is to use smaller models with a different architecture to enable much bigger context windows. With more of the relevant material held directly in context, the model can represent the facts in it better, which makes it more useful and more factually accurate. Holding bigger context windows without consuming too much VRAM at inference time is very challenging, though, and important enough for the future of LLMs that companies like Magic are worth watching. I like how they describe themselves in their hiring section: "We are 23 people (+ 8000 H100s) and are hiring more Engineers and Researchers to accelerate our work and deploy upcoming models." If a startup mentions how many H100s they have, I think they are serious about their work, and I am excited to see what they will do in the future.
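To get a feel for why VRAM is the bottleneck: in a standard transformer, the cache of keys and values kept around at inference time grows linearly with the number of tokens in context. Here is a rough back-of-the-envelope sketch in Python. The model dimensions are made up for illustration (they are not Magic's, whose architecture is different and not described here), and the formula only covers a plain attention KV cache stored in 16-bit precision.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of a plain transformer KV cache for one sequence.

    The leading factor 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes 16-bit precision (fp16 / bf16).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical mid-sized model, chosen only for illustration.
config = dict(n_layers=32, n_kv_heads=8, head_dim=128)

H100_BYTES = 80 * 10**9  # 80 GB of memory per H100

for tokens in (8_000, 128_000, 100_000_000):
    size = kv_cache_bytes(seq_len=tokens, **config)
    print(f"{tokens:>11,} tokens: {size / 2**30:10.1f} GiB "
          f"(~{size / H100_BYTES:.2f} H100s of memory)")
```

Even for this modest made-up model, the cache for a 100M token context would not fit on a single GPU by a wide margin, which is why a different architecture is needed to make such context windows practical.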

Distributed Training Runs for Giant Open Source LLMs by NousResearch. In this open-sourced preliminary report, NousResearch proposes a novel approach for distributing training runs across many different GPUs connected over the internet. This makes it possible to scale training across many individual, less powerful machines, which would help to spread training costs among many people collaborating. It could lead to a world where people pooling their compute are able to produce bigger and better models than any single tech company could on its own. I am excited to see where this goes and hope that open-source AI models might become the norm in the future because of developments like this.

AlphaProteo by DeepMind. DeepMind built on their work on AlphaFold to create a model that can design novel proteins that bind to specific sites on existing target proteins. This is a big step forward for AI-assisted drug discovery because people can now use models like this to target specific proteins found in pathogens and attack or disable them with new medical compounds.

Replit Agent by Replit. Replit has announced their automated software engineer called "Replit Agent". It's a wrapper around LLMs like GPT-4o, and it can read and write code, browse the web, and create and deploy full-stack applications. It first makes a step-by-step plan of what to do and then executes it, writing code and troubleshooting issues automatically. At any point in time, you can go in and help it get unstuck or give further directions. The docs for it can be found over here.

Diffusion Models are Real-Time Game Engines by Google Research. In this work, researchers have used diffusion-based models to "run" Doom. Every frame is produced not by a traditional rendering pipeline but by a diffusion model "dreaming" up the next frame as the user plays. While this is only a proof of concept for now, the idea is still very interesting to watch, because at some point in the future diffusion models dreaming up customized, reactive worlds might become the state of the art. Then, instead of continuing to code and optimize game and physics engines, you would only have a neural network dreaming up the entire simulated game world on the fly. Whether this will ever happen remains to be seen, but research like this suggests it is a potential option.
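To make the "dreaming up every next frame" idea a bit more concrete, here is a toy sketch of such a loop in Python. It is not the method from the Google Research work: `dream_next_frame` is a hypothetical stand-in for a trained diffusion model, and the "screen" is just one row of 8 pixels. What it does show is the overall shape: each new frame is generated from recent frames and player actions, starting from noise, and is then fed back in as input for the next step.

```python
import random
from collections import deque

WIDTH = 8          # toy "screen": a single row of 8 pixels
CONTEXT = 4        # how many past frames and actions the model sees
DENOISE_STEPS = 4  # number of refinement steps per frame


def dream_next_frame(frames, actions, rng):
    """Hypothetical stand-in for a trained diffusion model.

    A real model would learn what the next frame looks like. Here we
    simply pretend the world scrolls by the latest action (-1, 0 or +1).
    """
    last, shift = frames[-1], actions[-1]
    target = [last[(i - shift) % WIDTH] for i in range(WIDTH)]

    frame = [rng.random() for _ in range(WIDTH)]  # start from pure noise
    for step in range(DENOISE_STEPS):
        blend = (step + 1) / DENOISE_STEPS        # move from noise to target
        frame = [(1 - blend) * f + blend * t for f, t in zip(frame, target)]
    return frame


def play(player_actions, seed=0):
    """Run the loop: every shown frame comes from the model, not a renderer."""
    rng = random.Random(seed)
    frames = deque([[1.0] + [0.0] * (WIDTH - 1)], maxlen=CONTEXT)
    actions = deque(maxlen=CONTEXT)
    shown = []
    for action in player_actions:
        actions.append(action)
        frame = dream_next_frame(list(frames), list(actions), rng)
        frames.append(frame)  # the generated frame becomes future input
        shown.append(frame)
    return shown


if __name__ == "__main__":
    for frame in play([1, 1, 0, -1]):
        print("".join("#" if pixel > 0.5 else "." for pixel in frame))
```

There is no game state anywhere in this loop except the frames themselves, which is the interesting part of the idea: the "engine" is whatever the model has learned about how frames follow from frames and actions.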

🌌 Traveling 🌌

I spent the last two weeks in the forests of Northern Portugal in a place called the Garden. I think much of the magic of the place cannot be captured in pictures. It's about the people and the connections formed here. The cuddles, the live music, the food, the conversations. But the nature around here is still pretty beautiful, and I have some pictures to share.

🎶 Song 🎶

Slow Soul by Sticky Fingers

YouTube Music | Spotify

That's all for this time. I hope you found this newsletter useful, beautiful, or even both!

Have ideas for improving it? As always, please let me know.

Cheers,

– Rico

Subscribe to Live and Learn 🌱

Join the Live and Learn Newsletter to receive updates on what happens in the world of AI and technology every two weeks on Sunday!
