Restart improvement – Fire and Ice Grid.
Today we announce an upgrade to the grid. Inworld residents now get a warning 20 minutes before the region they are in restarts, followed by a further message every minute until the 20 minutes run out. Giving residents enough time to log out improves their experience and lets them move to another area of the grid while the simulator restarts.
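As an illustration of the warning sequence, here is a minimal bash sketch that generates one notice per minute of the warning period. The function names and the wording of the messages are assumptions made for this example, not the text our scripts actually send.

```shell
#!/usr/bin/env bash
# Sketch only: function names and message wording are hypothetical.
WARNING_MIN=20   # from the post: residents are warned 20 minutes ahead

# Print the notice text for a given number of minutes remaining.
notice_text() {
    local remaining=$1
    if [ "$remaining" -eq 1 ]; then
        echo "This region restarts in 1 minute. Please log out or move to another region."
    else
        echo "This region restarts in ${remaining} minutes. Please log out or move to another region."
    fi
}

# Print every notice in order, from 20 minutes remaining down to 1.
countdown_notices() {
    local remaining
    for (( remaining = WARNING_MIN; remaining >= 1; remaining-- )); do
        notice_text "$remaining"
    done
}

countdown_notices
```

In a real script each notice would be followed by a 60-second sleep before the next one is sent.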
Automating the process allows us to minimise downtime when a server restarts. No intervention by admins is needed: the grid restarts with the server. If a server restarts at an unexpected time, the grid handles it without waiting for an admin.
Every other day we take an OAR backup, as described in Grid backups and oar files, using the automatic backup module in Opensimulator. Opensim measures the time between backups from the startup time of the simulator. If multiple simulators back up at the same time, it causes a resource bottleneck. When a server starts up, all regions start up immediately. Fifteen minutes later, rolling restarts begin, with a thirty-minute wait between each restart. Because each simulator then has a different startup time, the wait spreads out the OAR backups and avoids the bottleneck.
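The staggering can be shown with a little arithmetic. This bash sketch computes when each simulator's rolling restart begins, in minutes after the server starts. It assumes the thirty-minute wait is measured from the start of one restart to the start of the next, and the simulator names sim1 to sim3 are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch only: simulator names and the start-to-start timing are assumptions.
INITIAL_DELAY_MIN=15   # from the post: rolling restarts begin 15 minutes after startup
GAP_MIN=30             # from the post: 30-minute wait between restarts

# Print the restart offset (minutes after server start) for the Nth simulator,
# counting from 0.
restart_offset() {
    local index=$1
    echo $(( INITIAL_DELAY_MIN + index * GAP_MIN ))
}

# Print a schedule, one line per simulator name passed as an argument.
print_schedule() {
    local i=0 sim
    for sim in "$@"; do
        printf '%s restarts at +%d min\n' "$sim" "$(restart_offset "$i")"
        i=$(( i + 1 ))
    done
}

print_schedule sim1 sim2 sim3
```

With three simulators the restarts land at 15, 45 and 75 minutes after server start, so their backup timers end up half an hour apart.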
Inworld notices use the ‘alert’ command in the console. A combination of systemd, tmux and bash shell scripting automates everything. Full details are available in Automated Opensim Startup And Shutdown.
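A rough sketch of how a script might pass a notice to a simulator console running inside tmux follows. The session name sim1, the DRY_RUN switch and the function name are assumptions for illustration, and the exact form of the console line (the word alert followed by the message) is also assumed rather than taken from our scripts. With DRY_RUN left at its default of 1, nothing is sent and the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch only: session name, DRY_RUN switch and console syntax are assumptions.
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to really send keys to tmux

# Type an alert line into the console of the given tmux session.
send_alert() {
    local session=$1 message=$2
    local console_line="alert ${message}"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would send to tmux session '${session}': ${console_line}"
    else
        # -l sends the text literally; Enter is sent separately as a key press.
        tmux send-keys -t "$session" -l "$console_line"
        tmux send-keys -t "$session" Enter
    fi
}

send_alert sim1 "This region restarts in 20 minutes."
```

Keeping the tmux call in one small function lets the same script be tried out safely before it is pointed at a live console.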
Restart Improvement – Fire And Ice Grid – What comes next.
A keep-alive system is currently in the pipeline. When complete, it will automatically detect and fix a crash of a simulator or of Robust.
Did you write this post before?
We did publish these details previously; however, that blog post was the only loss from the attack on our servers. We will not be putting this post out on social media; it is here only as a replacement.