2084: Is sleep important and some notes on Generative AI for Games
Briefly looking at some interesting models out there for video games, and sleep
It is 4h20 in the morning because I’ve been working late, but at least I’m mostly done. It’s interesting how easy it is to work later and later, especially if you’re doing interesting work. I was working on a problem set for Math 526, Differentiable Manifolds, which is a great class if you like abstraction so high that you get vertigo. Thinking about covectors is interesting though.
After going through too many definitions to count, I’ve finally realized what they are: basically two different ways of defining differentials. A vector is the tangent to a curve, how a curve changes. It is essentially the differential of a function y: R → M. A covector is the other way around - it is a nice way to think about how a multivariate function changes with regard to its vector input, or in other words the differential of f: M → R. These are so symmetrical that it’s easy to see why they’re dual and so closely related - in fact, at each point they live in dual vector spaces, and the tangent and cotangent bundles they form are both just vector bundles.
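As a rough sketch of those two definitions in symbols (the notation here is an assumption and may differ from the one used in class; the curve called y above is written γ, and it is taken to pass through the point p at time 0):

```latex
% Tangent vector at p: the velocity of a curve through p, acting on functions
\gamma : \mathbb{R} \to M, \quad \gamma(0) = p, \qquad
v = \gamma'(0) \in T_pM, \qquad
v(f) = \left.\frac{d}{dt}\right|_{t=0} f(\gamma(t))

% Covector at p: the differential of a function, acting on tangent vectors
f : M \to \mathbb{R}, \qquad
df_p \in T_p^*M, \qquad
df_p(v) = v(f) = (f \circ \gamma)'(0)
```

Both lines boil down to the same number, the ordinary derivative of f ∘ γ: R → R at 0, which is one way to see why the two objects pair up so neatly.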
But anyway, working on this took quite a few hours, so I was wondering how common it is to work this late. At Yale, and at most high-intensity jobs and places, you’re busy the whole day, so the night is the only time you have to work on projects which require uninterrupted focus. Paul Graham, in one of his essays, talks about manager vs maker time: managers can operate on 30-minute schedules since organization and social meetings don’t take long stretches of undivided attention, while programming and most creative crafts take hours and hours of dedication and focus to do properly. You can have 10 meetings in a day, but a program can take an afternoon. If you have some aspects of both in your life, where you have to do a lot of short meetings, which is what classes are in a sense, and you also have projects which require long-term focus and commitment, it’s too easy to work later and later. I’m not sure that this is really a solvable problem, especially if you’re the type of person who would take on so much work. Fundamentally, the night owl setup is a reaction to this tension between the two types of work.
Talking about video game development especially, there are a lot of late nights in that. There’s a recently released model called DreamFusion, which can convert text into 3D models. They use a 2D text-to-image model in such a way that spatial consistency is preserved, and then lift that to 3D by repeatedly refining a randomly initialized 3D model till it’s consistent. It’s genuinely impressive in the models it generates, and it also shows how consistency, the original bugbear of text to image, has become better and better. Now we just need nicer tools.
Now, beyond 3D models, animation also has its own text-to-animation models. Motion Diffusion, developed at Tel Aviv University, can now convert a text description into a rather good animation. It’s smooth and not too bad, and shows the power of diffusion models - and this has only taken a few months, so think about what will be done in 2 years.
Making things more realistic will also be easier, as a video on my favorite channel, Two Minute Papers, shows. Intel managed to convert game footage in real time into a much more realistic simulacrum. It’s genuinely impressive. There’s also an interesting video about an AI that could learn how to do physics simulation just from watching videos.
This of course raises an interesting question: while these models are all limited as is, what if in the future games were completely procedurally generated by these AI models? Imagine a game where the visuals are produced by an AI, the sound by an AI, the physics by an AI, and the animation by an AI, with an AI performing overall control and dynamics. It’ll essentially be an entirely new game every time you play, and you could do whatever you want in the game, since whatever you create will be newly made by the AI in response to your desires. Procedural generation on steroids. Imagine an RPG that never ends, or a shooter with infinitely large and real levels that react dynamically to whatever you do. Worlds so realistic that they beat reality. If VR ever takes off, imagine a VR world that dynamically responds to whatever you do, in which you genuinely can do whatever you want. I think that all previous sci-fi imaginings of this have thought too small. The potential is massive. In 2084, we might not even understand how people could have played such limited games as we do today.