Technical Update: April 15, 2018
Let’s get technical, technical. (ahem) Excuse me, got carried away. With Radical Heights being in X-TREME Early Access, the Boss Key team was prepared to shake hands with new issues that may arise once we released the game into the hands of players. Closed tests were put in place to try to sniff these out before the game’s release on Tuesday, April 10, but going from a small pool to an ocean of players can sometimes introduce new obstacles for any game.
The team worked nearly around the clock to put the sleeper hold on the issues that arose, and in keeping with transparency in Early Access development, one of our Rad engineers wanted to share some information with you about the technical issues the team ran into and what fixes were applied.
Now your host for this article, Boss Key Technical Director, Romain Dura!!!!! Take it away, Romain (hands microphone).
Launching a multiplayer game and keeping everything online and stable can be quite challenging, especially with a small team. We launched Radical Heights in X-TREME Early Access five days ago, and wanted to be transparent about the various technical issues we've run into.
The biggest problem that players have experienced is the Infinite Loading Screen of Death. This was due to several underlying issues.
The matchmaking server is in charge of grouping players together and telling the game client which game server to connect to in order to start a match. After leaving a match, players could get stuck when they queued again. The first issue discovered was that the game client was sometimes sending matchmaking a message indicating a network disconnection instead of an event indicating that the player had deliberately left the match from the Pause menu. Matchmaking was configured to preserve player slots in games, in case players crashed or got briefly disconnected and tried to rejoin. When these players queued for a new match, the matchmaking server told their clients to connect to the same game server they had just left, as if they were trying to rejoin after a disconnect. The game server then kicked those players because they had already died in that match, and the game became stuck on the loading screen instead of returning to the main menu.
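To make the rejoin behavior described above concrete, here is a minimal sketch of slot preservation. It is a toy model, not the actual Radical Heights matchmaking code: the `Matchmaker` class, the `LeaveReason` values, and the server names are all hypothetical and exist only for illustration.

```python
from enum import Enum, auto


class LeaveReason(Enum):
    DISCONNECTED = auto()  # crash or network drop: keep the slot for a rejoin
    DELIBERATE = auto()    # player chose to leave from the Pause menu


class Matchmaker:
    """Toy model of slot preservation; every name here is hypothetical."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.reserved = {}   # player id -> server holding a preserved slot
        self.next_index = 0  # round-robin cursor for fresh assignments

    def on_player_left(self, player, server, reason):
        if reason is LeaveReason.DISCONNECTED:
            # Preserve the slot so the player can rejoin the same match.
            self.reserved[player] = server
        else:
            # Deliberate leave: release any preserved slot.
            self.reserved.pop(player, None)

    def find_match(self, player):
        # A preserved slot wins over a fresh assignment.
        if player in self.reserved:
            return self.reserved[player]
        server = self.servers[self.next_index % len(self.servers)]
        self.next_index += 1
        return server


mm = Matchmaker(["gs-1", "gs-2"])
first = mm.find_match("p1")

# The client misreports a Pause-menu exit as a network disconnection...
mm.on_player_left("p1", first, LeaveReason.DISCONNECTED)
# ...so queuing again sends the player back to the match they already died in.
misreported = mm.find_match("p1")

# With the correct event, the slot is released and a different server is assigned.
mm.on_player_left("p1", misreported, LeaveReason.DELIBERATE)
correct = mm.find_match("p1")
```

In this sketch the only difference between the broken and the healthy path is the reason attached to the leave event, which is why a single wrong message type from the client was enough to send players back to a match that would immediately kick them.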
To avoid this behavior without having to patch the game with around 8000 concurrent live players on the second day, we attempted to disable the functionality that preserves disconnected player slots in matches by toggling a live configuration switch. Unfortunately, this configuration hadn't been widely tested under load, and it caused all ending matches to never reset properly. That starved the pool of available game servers and left thousands of players stuck in the matchmaking queue for about 30 minutes (starting at A on the chart).
We reverted the switch change, but that did not release all stuck game servers. At this point we had to fully reset the matchmaking server and all game servers. This led to the loss of the Play button in the menu for a few minutes, and things returned to normal after that (B on the chart).
The next attempt was to force matchmaking to treat all match disconnection events as deliberate, via a code change. We deployed the matchmaking server update, which was successful in avoiding the bug where players were sent again to their previous match server.
Unfortunately, there were still reports of players being stuck on the loading screen. Deep sigh.
Despite the previous fix, there was still an issue where players wouldn't be removed from their match on the matchmaking server if they appeared to still be connected to their game server. This, too, could make the matchmaking server force clients to reconnect to their previous server. The source of the problem was a race condition where the network message indicating that the player had disconnected from the game server arrived after the event to deliberately leave the match, instead of before. This happened because a built-in delay meant to handle this specific case was misconfigured. We corrected the live delay configuration, and this problem was resolved.
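The ordering problem above can be illustrated with a small deterministic sketch. It assumes the built-in delay works by holding the leave event for a short time before processing it, which is one plausible reading of the description and not a confirmed detail. The function name, its parameters, and the millisecond values are hypothetical.

```python
def removed_from_match(leave_at_ms, disconnect_at_ms, leave_delay_ms):
    """Return True if the matchmaker removes the player from the match.

    Assumed model: the leave event is processed leave_delay_ms after it
    arrives, and the player is only removed if the game server has already
    reported the disconnect by then. Otherwise the player still "appears
    connected" and stays attached to the old match.
    """
    processed_at_ms = leave_at_ms + leave_delay_ms
    return disconnect_at_ms <= processed_at_ms


# The disconnect notice arrives 400 ms after the leave event.
leave_at_ms, disconnect_at_ms = 10_000, 10_400

# Delay too short (misconfigured): the leave is processed first and ignored.
too_short = removed_from_match(leave_at_ms, disconnect_at_ms, leave_delay_ms=0)

# Delay long enough: the disconnect is seen first and the player is removed.
long_enough = removed_from_match(leave_at_ms, disconnect_at_ms, leave_delay_ms=1_000)
```

Under these assumptions the fix is purely a configuration change: nothing about the event handling differs between the two calls except the length of the delay.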
UNFORTUNATELY, later during the day we were made aware of more reports from players stuck on the loading screen. DEEP sigh.
It was then discovered that some clients were assigned a correct server, but did not appear to be connected to it. This particular server was one of the new game servers that we added to meet the increasing number of concurrent players. We found that one of the new game servers had been set up with a firewall configuration rule that blocked all network traffic, instead of allowing specific game ports. This prevented players from actually connecting to the assigned game server and loading into the match, leaving them permanently stuck on the loading screen. This affected thousands of players on the afternoon of Thursday, April 12th.
We corrected the firewall configuration rule at 6:10pm EST, and players could load into the game.
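As a rough illustration of the difference between the two firewall configurations, the sketch below evaluates a first-match rule list with a default deny. It is not the syntax of any real firewall product, and the `Rule` type, the `is_allowed` helper, and the port range are hypothetical; the actual game ports are not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str             # "allow" or "deny"
    ports: Optional[range]  # None matches every port


def is_allowed(rules, port):
    """First matching rule decides; traffic matching no rule is denied."""
    for rule in rules:
        if rule.ports is None or port in rule.ports:
            return rule.action == "allow"
    return False


# Hypothetical port range standing in for the game ports.
GAME_PORTS = range(7777, 7788)

# Misconfigured server: a single rule that blocks all traffic.
misconfigured = [Rule("deny", None)]

# Intended setup: allow the game ports, deny everything else.
intended = [Rule("allow", GAME_PORTS), Rule("deny", None)]

blocked_game_traffic = is_allowed(misconfigured, 7777)
allowed_game_traffic = is_allowed(intended, 7777)
other_traffic = is_allowed(intended, 22)
```

A check like this, run against each newly provisioned server before it joins the pool, is one way a blanket deny rule could be caught early.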
Steam, The Next To Last loading bug
We received more reports of players stuck on the loading screen, AND some players were unable to log in to the game at all. There are no words.
We quickly discovered the source of the issue was wide-scale Steam maintenance affecting many of their services, including authentication for clients and game servers. The Steam services came back online within several minutes.
MOAR issues, omg
After all this, we still received reports of players stuck on the loading screen. Dark times.
We identified that some servers in the OCE (Sydney) and SEA (Singapore) regions were stuck waiting for map travel, an operation that happens at the end of a match to reset the game state. This was due to a combination of factors. Some game servers seemed to be frequently disconnected from the matchmaking server while reloading the map at the end of a match. On top of that, the matchmaking server had an incorrect configuration where it waited for a confirmation that the map reload was complete before letting clients join the server. This happened specifically when a match ended with 8 connected players (players in games and spectators). We corrected the configuration to always force a full reset of the match state on the matchmaking server.
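The stuck-server condition can be sketched as a tiny state machine on the matchmaking side. This is a simplified model under stated assumptions: the `ServerRecord` class, the state names, and the `force_reset` flag are hypothetical, and the 8-player trigger mentioned above is not modeled.

```python
class ServerRecord:
    """Matchmaker-side view of one game server (hypothetical model)."""

    def __init__(self):
        self.state = "in_match"


def on_match_end(record, force_reset):
    # With force_reset, the matchmaker resets the match state immediately.
    # Without it, the matchmaker waits for the server to confirm map travel.
    record.state = "open" if force_reset else "awaiting_map_travel"


def on_reload_confirmed(record):
    # Only reaches the matchmaker if the server stays connected during reload.
    if record.state == "awaiting_map_travel":
        record.state = "open"


def can_accept_players(record):
    return record.state == "open"


# Old configuration: the server drops its connection while reloading,
# the confirmation never arrives, and the server stays unavailable.
stuck = ServerRecord()
on_match_end(stuck, force_reset=False)
stuck_accepts = can_accept_players(stuck)

# Same configuration, but the confirmation does arrive.
healthy = ServerRecord()
on_match_end(healthy, force_reset=False)
on_reload_confirmed(healthy)
healthy_accepts = can_accept_players(healthy)

# Corrected configuration: no confirmation is needed at all.
fixed = ServerRecord()
on_match_end(fixed, force_reset=True)
fixed_accepts = can_accept_players(fixed)
```

The design choice in the corrected path is to stop depending on a message that can be lost: once the match ends, the matchmaker no longer needs anything from a server that may be briefly disconnected.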
As of Saturday, April 14th, 5:00pm EST, we haven't received any new reports of players stuck on the loading screen.
Thanks for your patience and support everyone!
Technical Director, Boss Key Productions
Thanks, Romain, for providing our contestants with some technical transparency! The team is blown away by the support and number of contestants that have joined us during launch week. We are very excited to see what the future has in store. The team is continuing to work on more fun improvements for The Dome and we will be sure to share more with you soon. Be sure to follow us on social to stay up-to-date on news and announcements about the game.
Go forth and reach for those Radical Heights!