F&I Grid – We Are Sorry
F&I Grid assets are fixed! First, though, allow us to offer a heartfelt apology. We are sorry that this mess happened and that it has taken so long to put right. We believed that we were more than adequately backed up and could fix any corruption as soon as it occurred. Unfortunately, this was not the case. In December 2021, an issue began corrupting the Grid’s assets, and the problem did not become evident until more than three months later. By then, all of our clean backups had gone.
Sara is the only server admin for the Fire And Ice Grid. When we became aware of the asset issues, she was in the final stages of her university degree. Fixing the Grid required an extended period of time dedicated to the servers, and she could not commit to that until the end of May without putting four years of study at risk.
So once again, we offer a heartfelt apology to our residents.
The rest of this blog post details what our residents should expect as they return to the Fire And Ice Grid. It then explains what we are doing to help our residents recover from the lost assets. Finally, it explains what happened and what we are doing to prevent it from happening again. The last section contains more technical detail than we usually post on this blog. We hope that we can address the questions of users at all levels.
What To Expect – F&I Grid Assets are Fixed
We will be opening logins for residents between the 5th of June 2022 and the 12th of June 2022. When you return, you will find that your inventory, friends lists and groups are empty. Landowners will have their regions restored with no broken assets; however, how much of each region (SIM) we could restore varies significantly. While this initially sounds awful, please keep reading to find out how we are helping.
How We Are Helping
We have moved all our users’ original avatars to our BETA Grid. By logging into the BETA Grid, users will see who was on their friends lists and which groups they were in. They will also be able to see the items they had in their inventory. This information can help to reinstate friendships and groups on the primary Grid. Finally, it may be possible for region owners to restore more of their regions over time.
On the primary Grid, your friends lists will be empty.
Fixing Local Friends (Fire And Ice Grid Accounts)
Look at your friends list on the BETA Grid and invite the same people to be friends on the primary Grid.
Fixing Hypergrid Friends
All avatar UUIDs are the same as they were. Consequently, you will still get online and offline notices from Hypergrid friends even when they do not show on your friends list. To re-establish the friendship, the foreign avatar must first remove you from their friends list. After they have deleted the old friendship, you can send a new invite.
IMPORTANT: If you travel to another grid and make the friendship there, you must accept the friendship again after you return to Fire And Ice. If your friend comes to the Fire And Ice Grid to remake the friendship, they must delete the old one before coming. Then, after they accept the invite at Fire And Ice, they must go back to their home grid and accept it again.
Noteworthy: If you accept multiple friendship invites while on another grid, you may only get one of them when you return to Fire And Ice. You will get another invite each time you relog, so keep relogging until you have them all. If one of your friends comes to Fire And Ice and adds multiple avatars to their friends list, the same will apply to them when they return to their home grid.
On the BETA Grid, you will see all your groups. If you made a group, make it again on the primary Grid and match the settings. If the group is not one you made, ask for an invite to join it again after its owner has remade it.
For a group hosted on another grid, first visit that grid and remove the group from your groups list while you are there. After it has been removed, ask for another invite.
Fixing Inventories – F&I Grid Assets are Fixed
Inventories are the most problematic area to restore. We cannot make an inventory backup from your avatar on the BETA Grid and then apply it to the primary Grid. Doing so would risk adding thousands of broken assets back into the primary Grid, and every damaged asset we restored could potentially affect every Grid user. Consequently, it is simply not possible. Please do not ask for this, as refusal may offend. The technical details behind this policy are below.
Items purchased on the Kitely marketplace or similar services should be available for re-delivery. However, we know that many have refused to re-deliver until we fixed our asset issues. If you meet any objections, please refer whoever objects to this blog post or ask them to contact us directly, and we will confirm the current status with them.
There is no way to determine which items contain malformed mesh without rezzing every item in an inventory. Items that are boxed up also need unpacking before their contents can be rezzed. The BETA Grid gives users who wish to do that work a place to do it. We suspect that in many cases it is simpler and quicker to replace items. However, we are aware that not everything is easy to replace.
Landowners can use their own regions for this work, and other users can use the BETA Grid sandbox. Landowners can then request an OAR of their SIM on the BETA Grid. If it’s clean, we will load it into an empty area on the primary Grid so they can pick their items up.
Similarly, a user without land can use the sandbox on the BETA Grid to work on their inventory until it’s clean. We will then provide an empty region for them to rez their items on. Once the items are rezzed, we will move the region to an area on the primary Grid so they can pick the items up again.
Thanks to Caribia Zsun, the Fire And Ice shopping mall has almost all the vendors it had previously. Those vendors were meticulously checked to ensure they were not delivering malformed mesh, so the mall is a great place to start replacing missing items. Beyond this, enjoy the shopping experience. If you have items in an inventory on another grid, you will be able to pass those items back to your Fire And Ice avatar as usual.
Fixing Regions (Further fixes) – F&I Grid Assets are Fixed
Landowners’ original regions, including all the broken items, are available on the BETA Grid. Unfortunately, because of time constraints, we had to be ruthless when removing items with damaged assets. A damaged object frequently looks perfectly OK without a detailed inspection. Broken things fall into two broad categories.
- Missing Assets
- Malformed Mesh
Missing assets may be in the prim contents, such as a script or an animation. The missing asset can also be a child prim, which may be tiny, or something in the contents of a child prim. The only way to identify which part has a missing asset is to go through the broken object prim by prim.
Malformed mesh is when the item cannot be seen but is still in the region.
On request, landowners will get a list of all broken items on their BETA Grid region. Using this list, they can go through and try to fix their things on the BETA Grid. This may allow them to rescue many more items. We are sorry we couldn’t do this work for you. However, the time it would have taken makes it unrealistic.
Why Erase Friends and Groups?
We have essentially re-created the entire grid from the ground up. Assets were not the only service we ran separately, and we were already aware that logout requests were not always saved in the database. Given the level of disruption, it seemed prudent to start clean so that any related problems are resolved now, rather than risk leaving something unknown behind.
Accessing The Beta Grid – F&I Grid Assets are Fixed
To access the BETA Grid, you need to add its URI to the grid list in your viewer. Please do not expect the same performance on the BETA Grid as on the primary Grid; the server has a much lower specification. No Hypergrid travel is possible from the BETA Grid, and all exports are off.
If you are not sure how to do this, please follow one of the video guides below. Please note that they were made for signing up to the primary Grid, so when you add the grid URI, use the URI for the BETA Grid given just above.
Why Can’t Inventory items from the Beta Grid be added to the primary Grid?
The assets use a system that minimises the amount of data transferred between grids and stored on their hard drives. The system generates a unique code (HASH) when someone uploads an item. The code then becomes the file name.
When an item passes between avatars or grids, only one file is used. If a user makes a change, it stores the changes and references the original file. When an item passes from one Grid to another, the receiving grid checks to see if it already has an asset file with the same HASH name. If it does, rather than requesting the transfer, it just links the user to the one it already has saved.
For the last few months, many items that work in other grids didn’t work in Fire And Ice. This is because the file for that item already existed on our servers, so instead of getting a new copy, the Grid used the one already saved. Unfortunately, if that file is corrupt, the item will never work.
Every user afterwards will only ever get access to that same file. If it’s broken, it’s broken for everyone, forever. Since many items are components of other items, the result can be awful. On our figures, roughly four inventory items break for every asset that breaks.
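To make the reuse behaviour above concrete, here is a minimal sketch of a hash-named store. It is an illustration only, not OpenSimulator's code: the `AssetStore` class, its method names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib


class AssetStore:
    """Toy store where the hash of the data becomes the file name (hypothetical)."""

    def __init__(self):
        self.files = {}  # hash name -> stored bytes (stands in for files on disk)

    def upload(self, data: bytes) -> str:
        name = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumption
        # Only store the data if no file with this hash name exists yet.
        self.files.setdefault(name, data)
        return name

    def receive(self, name: str, fetch_from_sender) -> bytes:
        # If a file with this hash name is already held, reuse it and skip
        # the transfer entirely, even if the local copy is damaged.
        if name not in self.files:
            self.files[name] = fetch_from_sender()
        return self.files[name]


good = b"mesh data for a chair"
home = AssetStore()
name = home.upload(good)

# Simulate corruption of the stored file on the receiving grid.
home.files[name] = b"\x00garbage"

# A clean copy offered by another grid is never fetched.
result = home.receive(name, fetch_from_sender=lambda: good)
is_still_broken = result != good
print(is_still_broken)  # True
```

Because the lookup is by name only, the damaged local copy wins every time, which is why a corrupt file stays broken for every user who later receives the same item.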
Can It Happen Again?
The asset issues of the last six months cannot happen again. We have adjusted our backup system to keep data for longer. Additionally, we have changed our upgrade path to ensure caches cannot mask problems after future upgrades. More details on this are below.
What happened – How did it go so wrong? – F&I Grid Assets are Fixed
Anyone not interested in the technical details should stop reading here. If you are still reading, we assume you know how to configure OpenSimulator and have some coding experience.
FS-Assets – Race condition, causing asset corruption and missing database entries
In December 2021, we upgraded the Fire And Ice Grid to OpenSimulator 0.9.2 (the first official release of 0.9.2). Until this point, we had always run separate Robust services for assets, map, grid user and everything else. Unfortunately, after the December update, a race condition became apparent between the Hypergrid asset service (part of "everything else") and the separated fs-asset service.
We suspect it is due to a change in OpenSimulator but cannot confirm this at this stage. We are sure that this configuration had worked for two years before December 2021. It is possible that the race condition was always there and got worse. Every corrupt asset file we have manually inspected was modified after December 2021.
The race condition meant that both asset services were attempting to write to the same file simultaneously. This caused one of the two services to crash frequently and the assets themselves to become corrupt. In addition, both services were altering the fsassets table in the database. As a result, some assets are no longer in the fsassets table, and some are corrupt in the directory structure.
We do not know precisely how many corrupt assets there are; however, we know that 60k assets referred to in the inventory items table are missing from the fsassets table. Those 60k assets affect 250k items in users’ inventories. This does not account for malformed meshes, nor for items no longer in user inventories.
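A count like this can be obtained by comparing the two tables. The sketch below shows the idea on a tiny in-memory SQLite database; it is not the query we ran. The table and column names (`fsassets.id`, `inventoryitems.assetID`, `inventoryitems.inventoryID`) are a simplified, assumed schema, and a production grid would normally use a different database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fsassets (id TEXT PRIMARY KEY);
    CREATE TABLE inventoryitems (inventoryID TEXT PRIMARY KEY, assetID TEXT);
""")
# Made-up sample rows: assets a3 and a4 have no entry in the asset table.
conn.executemany("INSERT INTO fsassets VALUES (?)", [("a1",), ("a2",)])
conn.executemany(
    "INSERT INTO inventoryitems VALUES (?, ?)",
    [("i1", "a1"), ("i2", "a2"), ("i3", "a3"), ("i4", "a3"), ("i5", "a4")],
)

# Inventory items whose asset has no row in the asset table.
rows = conn.execute("""
    SELECT i.inventoryID, i.assetID
    FROM inventoryitems AS i
    LEFT JOIN fsassets AS a ON a.id = i.assetID
    WHERE a.id IS NULL
    ORDER BY i.inventoryID
""").fetchall()

missing_assets = {asset_id for _, asset_id in rows}
affected_items = len(rows)
print(len(missing_assets), affected_items)  # 2 3
```

With the figures in this post, 250k affected items across 60k missing assets works out at a little over four inventory items per missing asset.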
Why did the backups not help?
To our detriment, we did not clear the simulator caches at the time of the upgrade. The broken assets didn’t become apparent for many months, until the caches started to expire. Many of the issues were further masked by our users’ viewer caches. Only after we asked our users to clean their inventories ready for a restore did we realise the issue was ongoing: inventories that had already been cleaned were showing new missing assets. By this time, several months had passed. We keep a multi-level backup system, but the oldest snapshots are removed after 90 days. All of our clean backups had gone before we became aware of the problem.
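The masking effect can be shown with a toy read-through cache. This is a sketch under stated assumptions, not the simulator's cache: the `CachedAssets` class is hypothetical, time is passed in as a day number to keep the example deterministic, and the 90-day expiry mirrors the three-month setting we used at the time.

```python
class CachedAssets:
    """Read-through cache with a fixed expiry, driven by a caller-supplied day."""

    def __init__(self, backing: dict, ttl_days: int):
        self.backing = backing
        self.ttl_days = ttl_days
        self.cache = {}  # key -> (value, day it was cached)

    def get(self, key: str, today: int):
        entry = self.cache.get(key)
        if entry is not None and today - entry[1] < self.ttl_days:
            return entry[0]  # served from cache; backing store not consulted
        value = self.backing.get(key)  # None if the asset has gone missing
        self.cache[key] = (value, today)
        return value


store = {"chair": "ok"}
assets = CachedAssets(store, ttl_days=90)

assets.get("chair", today=0)   # day 0: cached while the asset is healthy
del store["chair"]             # day 1: the asset is lost in the backing store

seen_on_day_60 = assets.get("chair", today=60)  # cache still hides the loss
seen_on_day_91 = assets.get("chair", today=91)  # cache expired, loss is visible
print(seen_on_day_60, seen_on_day_91)  # ok None
```

In this toy model the loss only surfaces once the cached copy expires, which matches the months-long delay we experienced before the damage showed.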
Mitigation Measures – F&I Grid Assets are Fixed
Single Instance Robust
Currently, Fire And Ice uses a single Robust instance that encompasses all its services. While we would like to return to separating services, this is not possible until the race condition is tracked down.
New Upgrade Procedure and Cache Settings
Firstly, for future OpenSimulator version upgrades, we will manually clear all simulator caches to ensure they cannot mask problems. Additionally, we have adjusted the expiry rate on the cache files: they now expire after a single month rather than the previous three months. Finally, as admins, we will clear our own viewer caches as well as the simulator caches. As a result, we will see any issues much sooner.
Improved Backup Regime
Previously, Fire And Ice kept four copies of all assets and databases for three months. Each of those backups had a daily snapshot, allowing us to return to any day in the last three months. Instead of keeping four identical copies of the daily backup, we will now split them up.
The first backup will be a daily snapshot that we keep for 30 days. The second will be a weekly snapshot that we store for 90 days. The third will be a monthly snapshot that we keep for 12 months. Finally, the fourth will be a second copy of the daily snapshot for added peace of mind.
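The tiered retention described above can be expressed as a small rule. This is a minimal sketch, not our backup tooling: the function name `keep_snapshot` is hypothetical, and treating Sunday snapshots as the weekly tier and first-of-month snapshots as the monthly tier is an assumption made for the example.

```python
from datetime import date


def keep_snapshot(taken: date, today: date) -> bool:
    """Return True if a snapshot taken on `taken` is still retained on `today`."""
    age = (today - taken).days
    if age < 0:
        return False  # snapshot dated in the future; nothing to keep
    if age <= 30:
        return True  # daily tier: every snapshot is kept for 30 days
    if age <= 90 and taken.weekday() == 6:
        return True  # weekly tier: Sunday snapshots (assumed) for 90 days
    if age <= 365 and taken.day == 1:
        return True  # monthly tier: first-of-month snapshots (assumed), 12 months
    return False


today = date(2022, 6, 12)
recent = keep_snapshot(date(2022, 6, 1), today)           # within 30 days
weekly = keep_snapshot(date(2022, 4, 3), today)           # a Sunday, 70 days old
midweek = keep_snapshot(date(2022, 4, 5), today)          # a Tuesday, 68 days old
six_months_old = keep_snapshot(date(2021, 12, 1), today)  # first of the month
too_old = keep_snapshot(date(2021, 6, 1), today)          # older than 12 months
print(recent, weekly, midweek, six_months_old, too_old)
```

Under these assumed rules, a first-of-month snapshot from about six months back is still retained, whereas under the old flat three-month window it would already have been removed.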