Why does Argus use sharding to build fully on-chain game infrastructure?


Original title: How I Learned to Stop Worrying & Love Execution Sharding

Video link:

Speaker: Scott Sunarto (@smsunarto) at Research Day

Article edited and compiled by: Justin Zhao (@hiCaptainZ)

Hi, I'm Scott (@smsunarto), founder of Argus Labs (@ArgusLabs_). Today, I'm going to talk about a topic we haven't touched on in a while. With rollups now the mainstream approach, we don't discuss execution sharding as much as we do data sharding. So, let's revisit this somewhat neglected topic: execution sharding.

This will be an easy talk. I know you've been hearing complex concepts all day, so I'll try to keep this discussion as practical as possible. I also prepared a fitting slide design for this presentation.

For those who don't know me, the funny thing is, I'm known on Twitter as an anime girl. I also missed my college graduation just to be here, much to the dismay of my parents. Currently, I am the founder of Argus Labs. We see ourselves as a gaming company, not an infrastructure or cryptocurrency company. One of my biggest pet peeves is that everyone in crypto gaming wants to build tools, but no one wants to create content or apps. We need more apps that users can actually use.

Previously, I co-created Dark Forest (@darkforest_eth) with my smart friends Brian Gu (@gubsheep) and Alan Luo (@alanluo_0). Brian is now running 0xPARC (@0xPARC) and he's a lot smarter than me.

Today's discussion will focus on execution sharding, but in a context where most people are not used to discussing it. We usually discuss execution sharding in the context of Layer 1, such as Ethereum sharding or Near sharding. But today, I want to change the context a bit. Let's think about what sharding would look like in a rollup environment.

The basic questions here are why a game company would build its own rollup, and what we can learn from World of Warcraft about designing rollups. Additionally, we'll explore how the design space for rollups goes far beyond what exists today.

To answer these questions, let's go back to 2020, when the idea for Dark Forest was first conceived. We asked ourselves: what if we created a game where every game action was an on-chain transaction? The premise was ridiculous back then, and it still is to many people today. But it was an interesting hypothesis, so we built it, and Dark Forest was born.

Dark Forest is a fully on-chain space exploration MMORTS game built entirely on Ethereum and powered by zk-SNARKs. Back in 2020, ZK was not as popular as it is today, and there was hardly any documentation. The only available documentation for Circom was Jordi Baylina's (@jbaylina) Google Docs. Despite the challenges, we learned a lot along the way, and Dark Forest is the embodiment of those learnings.

Dark Forest turned out to be a bigger experiment than we expected. We had over 10,000 players, trillions of gas spent, chaos in the game, and people backstabbing each other on-chain. The most fascinating thing about Dark Forest and on-chain gaming is their platform-like nature. By making a game fully on-chain, you open up a design space for emergent behavior, allowing people to build smart contracts that interact with the game, as well as alternative clients and game modes, such as Dark Forest Arena and GPU miners.

However, with great power comes great responsibility. When we launched Dark Forest on xDai, now known as Gnosis Chain, we ended up filling the entire block space of the chain. This made the chain basically unusable for anything else, including DeFi, NFTs, or any other activity on xDai.

So what now? Have we reached a dead end? Will fully on-chain games never become a reality? Or are we going to go back to making games where only small JPEGs are on-chain and convince people that money grows on trees? The answer is that this is software, and we can make software do new things. Many of us have a very rigid view of blockchains and rollups, as if there isn't much room for improvement. But I disagree. We can experiment and find new possibilities.

We asked ourselves a question: if we were to design a blockchain from scratch for games and only games, what would it look like? We need high throughput, so we need to scale both reads and writes. Most blockchains are designed around writes. Transactions per second (TPS) is the metric people brag about, but in reality, reads are just as important. How do you know where a player is if you can't read from a blockchain node? This was actually the first bottleneck we found in existing blockchain constructions.

Dark Forest had a problem where full nodes were heavily used and I/O exploded, because we needed to read so much data from the on-chain state. This resulted in thousands of dollars in server costs, which the xDai team generously covered for us. However, this is not sustainable in the long term. We need high throughput not only for transactions written per second, but also for reads, meaning fetching data from the blockchain itself.

We also need a horizontally scalable blockchain to avoid the noisy neighbor problem. We don't want one popular game to suddenly congest the chain and bring everything else to a halt. We also need flexibility and customizability so that we can modify the state machine to suit the game. This includes having a game loop, making it self-executing, and so on.

Last but not least, we need a high tick rate. For those who are not familiar with the architecture of online games, this may sound a bit vague. Ticks are the atomic unit of time in a game world. In the context of blockchains, blocks are the atomic unit of time. In games, it is ticks. The two coincide when you build a fully on-chain game: the block production rate of your blockchain is the tick rate of the game itself.

Therefore, what we need is a blockchain with high throughput, horizontal scalability, flexibility and customizability, and a high tick rate. Those are the requirements for a blockchain designed from scratch for games.

If you have a higher tick rate, meaning more blocks per second, the game will feel more responsive. Conversely, if your tick rate is low, the game will feel sluggish. One key thing to remember is that if blocks are delayed, you will experience noticeable lag in the game. This is a bad experience. If you've ever dealt with angry players yelling at their computers after losing a game, you know that's an absolutely terrible situation.

Today's rollups typically produce one block per second, which is equivalent to one tick per second. If we want to have cooler games, we need higher tick rates. For example, Minecraft, a simple pixel art game, runs at 20 ticks per second. We're still a long way from building a game as responsive as Minecraft.
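To make the tick-rate numbers concrete, here is a minimal Python sketch of the per-tick time budget and of a loop in which one block equals one tick. The function names and the toy "count each player's actions" state are illustrative assumptions, not part of any Argus code.

```python
def tick_budget_ms(ticks_per_second: int) -> float:
    """Time available to process a single tick (block), in milliseconds."""
    return 1000.0 / ticks_per_second


def run_ticks(state: dict, inputs_per_tick: list[list[str]]) -> dict:
    """Apply queued player inputs one tick at a time.

    Hypothetical toy game: every input simply increments a per-player counter.
    Inputs that arrive during a tick are only applied when that tick runs,
    which is why a low tick rate feels sluggish.
    """
    for tick, inputs in enumerate(inputs_per_tick, start=1):
        for player in inputs:
            state[player] = state.get(player, 0) + 1
        state["tick"] = tick
    return state


if __name__ == "__main__":
    for rate in (1, 20):
        print(f"{rate} ticks/s -> {tick_budget_ms(rate):.0f} ms per tick")
```

At one block per second, an input can wait up to a full second before it affects the game; at 20 ticks per second, the whole tick, including all game logic, has to fit in 50 ms.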

A possible solution is to deploy our own rollup. While that appears to fix the problem, it doesn't actually address the root cause. For example, you'll have higher write throughput, but not quite at the level that games need. Of course, if your game has a hundred players, this will be enough. However, if you want to build a game that requires higher throughput, you hit very strict constraints because of the way I/O is done in current implementations.

On the read side, you don't really get a performance gain. You still need to rely on indexers. You don't really get horizontal scalability either. If you spin up a new rollup to horizontally scale your game, you break your existing smart contract ecosystem. Player-deployed marketplaces will not work with the other chains you launch to scale your game. This creates a lot of problems.

Finally, a high tick rate, in blocks per second, is still a challenge. Even if we push as hard as we can, we may get two blocks per second, maybe three, but that is about as far as these blockchains can go, because operations such as re-marshalling consume a lot of compute cycles.

To address this problem, we looked back to the late 1990s and early 2000s, when online games like MMOs were just emerging. They had a concept called sharding. This is not a new concept; it has been around for a long time. The word "sharding" that we use in database architecture actually traces back to Ultima Online, which was the first to use the word "shard" to describe its different servers.

So, how does sharding in games work? It's not a one-size-fits-all solution. It's a tool in the toolbox, and how you adapt it to your game will vary from case to case. The first sharding construction is what I like to call location-based sharding. A good mental model is to imagine a Cartesian coordinate system divided into four quadrants, each with its own game shard. Every time you want to cross into another shard, you send a message to that shard saying "hey, I want to move there" and you are teleported to it, leaving your previous player body behind. By doing this, you distribute the server's workload across multiple physical instances, rather than forcing one server to do all the computation for the entire game world. The second construction is more popular nowadays. It's called multiverse sharding, where you have multiple game instances that mirror each other. You can choose whichever shard you want to join, and it's load balanced by default so that no single server gets overcrowded.
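The two constructions can be sketched in a few lines of Python. This is only an illustration of the mental model from the talk: the quadrant numbering, the function names, and the "least crowded mirror" policy are assumptions made for the example.

```python
def quadrant_shard(x: float, y: float) -> int:
    """Location-based sharding: map a world coordinate to one of four quadrant shards."""
    if x >= 0 and y >= 0:
        return 0
    if x < 0 and y >= 0:
        return 1
    if x < 0 and y < 0:
        return 2
    return 3


def move(old_pos: tuple[float, float], new_pos: tuple[float, float]) -> tuple[int, bool]:
    """Return the destination shard and whether a cross-shard handoff is needed."""
    old_shard = quadrant_shard(*old_pos)
    new_shard = quadrant_shard(*new_pos)
    return new_shard, old_shard != new_shard


def least_loaded_shard(population: dict[str, int]) -> str:
    """Multiverse sharding: place a joining player on the least crowded mirror instance."""
    return min(population, key=population.get)


if __name__ == "__main__":
    print(move((1.0, 1.0), (-2.0, 3.0)))  # crossing quadrants needs a handoff
    print(least_loaded_shard({"shard-a": 10, "shard-b": 3, "shard-c": 7}))
```

In the location-based case the expensive part is the handoff when `move` reports a shard change; in the multiverse case shards never hand players to each other, so the only decision is where to place a player when they join.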

Now, the key question is, how do you bring this concept to rollups? That's why we created World Engine. World Engine is our flagship infrastructure, basically an opinionated shared sequencer designed for rollups. It is different from, and better suited to our needs than, many of the shared sequencer designs we've seen in the past few talks. We optimize for two things. First, throughput. Second, we want to ensure that no locks block the runtime, so that tick rate and block time are as efficient as possible. It is therefore asynchronous by default, and the sequencer is designed around partial ordering rather than forcing total ordering (where each transaction has to happen after another).

There are two key components. First, we have an EVM base shard, which is like a pure EVM chain on which players can deploy smart contracts, compose with games, create marketplaces with taxes, and so on. It's like a normal chain, right? Something like one block per second, just enough for you to do all your typical DeFi and marketplaces.

The secret ingredient is that we also have a game shard, which is essentially a mini-blockchain designed as a high-performance game server. We have a bring-your-own-implementation interface so that you can customize this shard to your liking. You can build your own shards and plug them into the base shard. You only need to implement a set of standard interfaces, much like ABCI in Cosmos, which you may be familiar with. You basically implement a similar specification to bring your own shards into the World Engine stack.
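As a rough illustration of what "bring your own implementation" could mean, the Python sketch below defines a small abstract interface and one toy shard that satisfies it. The method names (`submit`, `tick`, `query`) and the `CounterShard` example are invented for this sketch; the talk does not spell out World Engine's actual interface, and this is not it.

```python
from abc import ABC, abstractmethod


class GameShard(ABC):
    """Hypothetical minimal interface that a custom game shard would implement."""

    @abstractmethod
    def submit(self, tx: dict) -> None:
        """Queue a transaction for the next tick."""

    @abstractmethod
    def tick(self) -> int:
        """Process all queued transactions and return the new block height."""

    @abstractmethod
    def query(self, key: str):
        """Read from the shard's own state, without an external indexer."""


class CounterShard(GameShard):
    """Toy implementation: each transaction adds points to a player's score."""

    def __init__(self) -> None:
        self.pending: list[dict] = []
        self.state: dict[str, int] = {}
        self.height = 0

    def submit(self, tx: dict) -> None:
        self.pending.append(tx)

    def tick(self) -> int:
        for tx in self.pending:
            player = tx["player"]
            self.state[player] = self.state.get(player, 0) + tx["amount"]
        self.pending.clear()
        self.height += 1
        return self.height

    def query(self, key: str):
        return self.state.get(key)
```

Anything that implements the same three methods could be swapped in, which is the point of the analogy with ABCI: the surrounding stack only depends on the interface, not on how a particular shard stores state or runs its game logic.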

The key here is that we get a high tick rate that existing constructions cannot currently achieve. This is where I want to introduce Cardinal. Cardinal is World Engine's first game shard implementation. It uses an Entity-Component-System (ECS) with a data-oriented architecture. This allows us to parallelize the game and increase the throughput of game computation. It has a configurable tick rate of up to 20 ticks per second. For the blockchain folks here, that's 20 blocks per second.
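For readers who have not met ECS before, here is a minimal Python sketch of the pattern: entities are just IDs, components are plain data stored per type, and systems are functions that run over every entity holding the components they need. The `World`, `Position`, and `Velocity` names are illustrative assumptions and do not reflect Cardinal's actual API.

```python
from dataclasses import dataclass


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


class World:
    """Stores components grouped by type, keyed by entity ID (data-oriented layout)."""

    def __init__(self) -> None:
        self.next_id = 0
        self.components: dict[type, dict[int, object]] = {}

    def spawn(self, *comps) -> int:
        """Create an entity from a set of component instances and return its ID."""
        eid = self.next_id
        self.next_id += 1
        for comp in comps:
            self.components.setdefault(type(comp), {})[eid] = comp
        return eid

    def query(self, *types):
        """Yield (entity_id, components...) for entities that have every requested type."""
        stores = [self.components.get(t, {}) for t in types]
        for eid in sorted(set.intersection(*(set(s) for s in stores))):
            yield (eid, *(s[eid] for s in stores))


def movement_system(world: World) -> None:
    """A system: runs once per tick and touches only Position and Velocity data."""
    for _, pos, vel in world.query(Position, Velocity):
        pos.x += vel.dx
        pos.y += vel.dy
```

Because each system declares which component types it reads and writes, systems that touch disjoint component types are candidates for running in parallel within a tick, which is one way an ECS can raise the throughput of game computation.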

We can also geolocate it to reduce latency. For example, you might have a sequencer in the US, and then players in Asia have to wait 300 milliseconds for their transactions to reach the sequencer. This is a huge problem in games, because 300 ms is a long time. If you try to play an FPS game with 200 ms of lag, you're basically dead.

Another key point for us is that it is self-indexing. We no longer need external indexers. We don't need separate frameworks to cache game state. This also allows us to build more real-time games, without the latency that comes from an indexer still trying to catch up with the sequencer's blocks.

We also have a plugin system that allows people to parallelize ZK verification and so on. The best part, at least for me, is that you can write your code in Go. It is no longer necessary to use Solidity to make your game work. If you've ever tried to build a blockchain game in Solidity, you know it's a nightmare.

However, the key point of our shard construction is that you can build anything as a shard. There is basically an infinite design space for what a shard can be.

If you don't like writing your game code in Go, you can choose other options. For instance, we are working on a Solidity game shard that will let you implement games in Solidity while retaining many of the benefits of Cardinal. You can also create an NFT minting shard with its own mempool and ordering construction, addressing the noisy neighbor problem that arises during mints. You can even create a game identity shard and use NFTs to represent game identities, so that you can easily trade a game identity as an NFT instead of sharing private keys.

This is the high-level architecture, and I won't go into too much detail today due to time constraints. Crucially, we allow EVM smart contracts to compose with game shards through custom precompiles. We created a wrapper around Geth to allow communication between them, which opens up a lot of design space in both directions. We are asynchronous by default and can interoperate seamlessly and composably between shards without locks.

Our shared sequencer is different in that it does not use the atomic-bundle construction of other shared sequencers, which prioritizes total ordering. That construction requires a locking mechanism and causes problems like blocking the main thread, which leads to erratic tick rates and block times, and the result is lag in the game. It also imposes limits on per-shard block times and requires various cryptoeconomic mechanisms to prevent denial of service. There's also a big problem that I haven't seen addressed in many shared sequencer constructions: if you have different shards that depend on each other and they deadlock, how do you resolve it? With an asynchronous design, this is not a problem, because every shard does what it needs to do, fires off its messages, and moves on.
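The asynchronous, lock-free interaction described here can be sketched as message passing between shards that tick independently. The Python below is only a toy model under stated assumptions: the `Shard` class, the `forward_to` field, and the `deliver` router are hypothetical names, and it says nothing about how World Engine actually routes messages.

```python
from collections import deque


class Shard:
    """A shard that processes its own inbox each tick and never waits on another shard."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: deque = deque()
        self.outbox: list[tuple[str, dict]] = []
        self.balances: dict[str, int] = {}
        self.height = 0

    def tick(self) -> None:
        # Only handle messages that were already in the inbox when the tick began.
        for _ in range(len(self.inbox)):
            msg = self.inbox.popleft()
            account = msg["account"]
            self.balances[account] = self.balances.get(account, 0) + msg["amount"]
            if "forward_to" in msg:
                # Fire and forget: queue a message for another shard, take no lock.
                self.outbox.append(
                    (msg["forward_to"], {"account": account, "amount": msg["amount"]})
                )
        self.height += 1


def deliver(shards: dict[str, Shard]) -> None:
    """Hypothetical router: move outgoing messages into destination inboxes."""
    for shard in shards.values():
        for dest, msg in shard.outbox:
            shards[dest].inbox.append(msg)
        shard.outbox.clear()


if __name__ == "__main__":
    shards = {"game": Shard("game"), "evm": Shard("evm")}
    shards["game"].inbox.append({"account": "alice", "amount": 5, "forward_to": "evm"})
    shards["game"].tick()   # game shard applies the reward and emits a message
    deliver(shards)
    shards["game"].tick()   # game shard keeps ticking at its own pace
    shards["evm"].tick()    # EVM shard picks the message up on its next block
    print(shards["evm"].balances, shards["game"].height, shards["evm"].height)
```

The trade-off is visible in the sketch: the game shard never stalls waiting for the EVM shard, so its tick rate stays steady, but the effect on the EVM shard only lands on a later block rather than atomically in the same one.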

In fact, atomic bundles across rollups or shards are usually not necessary. For our use case, we don't need anything that requires atomic bundles, nor do we think it is something we should design our rollup around. This also enables many other interesting features. For example, each game shard could use a different DA layer from the base shard. You could have the base shard push data to Ethereum while the game shard pushes data to Celestia, or to something like a data availability committee. You can also reduce the hardware requirements for running a full node, because you can run the base shard's Geth full node separately, without running the game shard node, which makes it easier to integrate with services like Alchemy.

To sum up, I want to be honest here: a lot of people claim their construction solves every problem, but we don't. We think our construction works for us, but it might not work for your use case. It is unrealistic to assume that our construction will work for everyone. For us, it fits our needs, offering high throughput, horizontal scalability, flexibility, and high tick rates, but it isn't a cure for cancer. If you are building a DeFi protocol that requires synchronous composability, then this construction may not be for you.

In general, I really believe in the concept of a human-centric blockchain architecture. By designing around specific user roles and use cases, you can make better tradeoffs, rather than trying to solve everyone's problems. A renaissance era has arrived, in which everyone can design their own rollup to meet their own specific needs instead of relying on a general-purpose solution. I think we should embrace this Cambrian explosion. Don't build rollups like one-size-fits-all layer 1s, because they are not meant to solve the same problems at all. I'm personally looking forward to seeing more people explore the rollup design space for specific use cases. For example, what would a rollup specifically designed for asset exchange look like? Would it be intent-based? What would a rollup specifically designed for on-chain CLOBs (central limit order books) look like? With that, I'll hand the mic over to MJ. Thanks for having me.

English Version: