AI-enabled Cloud Gaming
post by samuelshadrach (xpostah) · 2025-01-18T11:56:10.037Z · LW · GW
This is a link post for http://samuelshadrach.com/?file=/raw/english/unimportant/ai_enabled_cloud_gaming.md
Contents
- AI-enabled Cloud gaming
- Latency limits of human body
- Internet bandwidth limits
- Latency limits of computers
- Effects on IT ecosystem
2025-01-18
AI-enabled Cloud gaming
AI-enabled cloud gaming seems like one of the hardest applications to run in the cloud rather than locally. However, I expect it will get done within 10 years.
If you're a game developer you might want to work on this.
Latency limits of human body
- Video output - Most people can't distinguish individual frames in video above 90 frames per second (~10 ms / frame)
- Audio output - Some audio engineers on reddit find 10 ms latency when playing music digitally to be noticeable but acceptable.
- Keyboard + mouse input - Human motor reaction times are generally estimated above 100 ms. Upper bound on nerve conduction velocity is around 120 m/s, so covering 1 metre of neurons from hand to brain requires at least ~8 ms. Anticipating inputs and reacting to them can lower response time (often happens in games).
- End to end - Many cloud gamers have reported on reddit that <10 ms latency is where FPS and other action-heavy games feel as fast as playing them offline.
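The arithmetic behind the frame-time and nerve-conduction figures above can be checked with a minimal sketch. The function names are hypothetical, and the inputs (90 fps, 1 metre, 120 m/s) are taken directly from the list; nothing here models perception itself.

```python
def frame_time_ms(fps: float) -> float:
    """Time available per frame at a given frame rate, in milliseconds."""
    return 1000.0 / fps


def conduction_time_ms(distance_m: float, velocity_m_per_s: float) -> float:
    """Minimum one-way signal travel time along a nerve, in milliseconds."""
    return 1000.0 * distance_m / velocity_m_per_s


if __name__ == "__main__":
    # 90 fps is the frame rate quoted in the list above.
    print(f"Frame time at 90 fps: {frame_time_ms(90):.1f} ms")
    # 1 m hand-to-brain path at the 120 m/s upper bound on conduction velocity.
    print(f"Nerve conduction over 1 m: {conduction_time_ms(1.0, 120.0):.1f} ms")
```

Because 120 m/s is an upper bound on velocity, the conduction time this prints is a lower bound; slower nerve fibres would take longer.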
Internet bandwidth limits
- Streaming 24x7 video requires a lot more bandwidth than text/image/audio.
- 1 Gbps fiber connection (with no upload/download cap) is becoming increasingly popular in the US, which is more than sufficient to stream UHD 90 fps video.
- Streaming 3D content directly is not possible though. VR headset-based use cases might (?) still prefer streaming 3D content over the rendered 2D output; I haven't studied VR well enough.
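As a rough check on the bandwidth claim, the sketch below computes the uncompressed bitrate of a UHD 90 fps stream and the compression ratio needed to fit it into a 1 Gbps link. The resolution (3840x2160) and 24 bits per pixel are assumptions not stated in the post, and the function names are hypothetical.

```python
def raw_bitrate_gbps(width: int, height: int, fps: float,
                     bits_per_pixel: int = 24) -> float:
    """Uncompressed video bitrate in gigabits per second."""
    return width * height * fps * bits_per_pixel / 1e9


def required_compression_ratio(raw_gbps: float, link_gbps: float) -> float:
    """How many times the raw stream must shrink to fit the link."""
    return raw_gbps / link_gbps


if __name__ == "__main__":
    # Assumption: UHD means 3840x2160 at 24 bits per pixel.
    raw = raw_bitrate_gbps(3840, 2160, 90)
    ratio = required_compression_ratio(raw, link_gbps=1.0)
    print(f"Raw UHD 90 fps stream: {raw:.1f} Gbps")
    print(f"Compression needed for a 1 Gbps link: {ratio:.1f}x")
```

Under these assumptions the raw stream does not fit in 1 Gbps, so the claim depends on video compression. The ratio required is modest compared with what video codecs typically achieve, though encoding and decoding add their own latency, which this sketch does not model.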
Latency limits of computers
- Input/output device latency - 1 ms latency (1000 Hz) has been achieved on keyboards and mice, and gamers generally feel that latency at this level is not detectable.
- Game engine, 3D rendering latency - I don't know much about this, but it seems doable for most games today in under 1 ms? It depends a lot on the exact application though; there are definitely lots of 3D apps that can't be built under a 10 ms latency constraint.
- Network roundtrip latency - <10 ms has already been achieved on consumer fiber connections in many US cities, and there's no fundamental reason paying customers can't get this in cities across the world. Light travelling from Netherlands to California (9000 km) one-way takes about 30 ms in vacuum, or about 45 ms in fibre, where light travels at roughly two-thirds of its vacuum speed. In practice roundtrip latency is reported around 100 ms (50 ms one-way). As long as the closest datacentre is well within 1000 km (where the round trip in fibre is about 10 ms), there's no physical limitation on achieving <10 ms via fibre connection.
- AI inference time - As per Jacob Steinhardt, [forward passes can be significantly parallelised](https://bounded-regret.ghost.io/how-fast-can-we-perform-a-forward-pass/). 1 frame of video can be generated in under 1 ms, assuming you build an ASIC just for that specific model.
- AI inference cost - This is the biggest bottleneck. Diffusion models use maybe 0.1-1.0 PFLOP for 100 sampling steps, for one frame. At 90 fps that's roughly 10-100 PFLOP per second of video generation. For 1 second per second output, you need a GPU cluster with 10-100 PFLOP/s. H200 is 4 PFLOP/s fp8 rentable at $2/hour. Assuming Epoch AI scaling laws of FLOP/s/dollar doubling every 2.5 years, we should get 16x more FLOP/s/dollar in 10 years, so roughly 64 PFLOP/s (on the order of 100 PFLOP/s) rentable at $2/hour.
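The propagation-delay and inference-cost estimates in this list are simple enough to recompute. The sketch below does so from the figures quoted above (9000 km, 1000 km, 0.1-1.0 PFLOP per frame, 90 fps, 4 PFLOP/s at $2/hour, doubling every 2.5 years). The fibre speed of two-thirds of c is an assumption, and all function names are hypothetical.

```python
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBRE_FRACTION_OF_C = 2.0 / 3.0   # assumption: roughly typical for optical fibre


def one_way_ms(distance_km: float, fraction_of_c: float = 1.0) -> float:
    """One-way propagation delay in milliseconds, ignoring routing and queuing."""
    return 1000.0 * distance_km / (C_VACUUM_KM_PER_S * fraction_of_c)


def required_pflop_per_s(pflop_per_frame: float, fps: float) -> float:
    """Compute throughput needed to generate video in real time."""
    return pflop_per_frame * fps


def growth_factor(years: float, doubling_years: float) -> float:
    """Multiplicative improvement after `years` at a fixed doubling time."""
    return 2.0 ** (years / doubling_years)


if __name__ == "__main__":
    print(f"9000 km one-way, vacuum: {one_way_ms(9000):.1f} ms")
    print(f"9000 km one-way, fibre:  {one_way_ms(9000, FIBRE_FRACTION_OF_C):.1f} ms")
    print(f"1000 km round trip, fibre: {2 * one_way_ms(1000, FIBRE_FRACTION_OF_C):.1f} ms")

    low = required_pflop_per_s(0.1, 90)
    high = required_pflop_per_s(1.0, 90)
    print(f"Real-time generation needs {low:.0f}-{high:.0f} PFLOP/s")

    # 4 PFLOP/s at $2/hour today, projected 10 years out at a 2.5-year doubling time.
    factor = growth_factor(10, 2.5)
    print(f"Improvement in 10 years: {factor:.0f}x")
    print(f"Projected throughput at $2/hour: {4.0 * factor:.0f} PFLOP/s")
```

The propagation numbers are physical lower bounds only; real routes are longer than the great-circle distance and add switching delay, which is consistent with the larger latencies reported in practice. The cost projection is a straight extrapolation of the doubling time and carries all the uncertainty of that assumption.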
Effects on IT ecosystem
- If this application can be done on cloud, then almost any application can be done on cloud
- Cybersecurity and user control will be the only reasons to do things locally, performance will no longer be a reason. Financial incentives to build anything in favour of security or user control are a lot weaker than the incentives in favour of higher performance. Big Tech will no longer need to fund open source software for performance-based reasons, hence open-source software could lag behind.
- Client devices could also change. The end state of this vision is that 99.9% of people own machines with a touchscreen (keyboard+monitor) and a network card (but no CPU, no disk, no RAM), and it is not practical to do anything unless you submit the job to a server. (This will probably be a Big Tech server, unless small cloud providers are able to compete on getting low-latency connections with ISPs, who have inherent network effects.) See the example of mobile being more locked down than desktop, but having more users. It is possible to live without a phone, but you lose access to jobs, friendships, etc. and are at a disadvantage relative to everyone else.
- This incentive structure makes it technically less challenging for the NSA (or its equivalent in your country) to get 99% surveillance over people's thoughts. As of today they need to backdoor lots of devices, routers and cables, and send whatever is useful back to their servers. This might be possible technically but requires more developer time and coordination/coercion of intermediaries to pull off.
- Incentives push in the direction of them using this data for political purposes and also leaking the data itself.