Age of Valor
Ongoing Database Issue
Posted: Sat Feb 17, 2018 9:29 pm
Red Squirrel
AoV Owner
Joined: 13 Dec 2006
Posts: 8163
Location: Ontario, Canada




Looks like the issue we've been facing for the past few days may be related to a failing hard drive in one of the RAID arrays. I've ordered a new drive. That particular RAID array is not actually the one that the server VM is on, but I think when it gets stuck it causes the whole file server to block, which then affects other stuff too. This is just a guess though, as the issue actually looks more like a network issue, so it's kinda weird. But there is a drive failing, so may as well tackle that and see if it helps. Has to be done anyway.

As a side note, I do have a code issue with the SQL system, as it should not be corrupting the way it is, even with DB write issues. The system was designed so that situations like this do not actually cause database corruption and the pending data simply waits until the DB is available again. So I will have to look at that.
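
To illustrate the intended "pending data waits until the DB is available" behavior, here is a minimal Python sketch. It is not the shard's actual code: `PendingWriteQueue` and `write_fn` are hypothetical names, and it assumes the write call raises `ConnectionError` when the database cannot be reached.

```python
from collections import deque


class PendingWriteQueue:
    """Buffer writes in memory and flush them only while the DB is reachable."""

    def __init__(self, write_fn):
        # write_fn is a hypothetical callable that persists one record and
        # raises ConnectionError when the database cannot be reached.
        self._write_fn = write_fn
        self._pending = deque()

    def enqueue(self, record):
        """Queue a record; nothing touches the database yet."""
        self._pending.append(record)

    def flush(self):
        """Try to write queued records in order; return how many were written."""
        written = 0
        while self._pending:
            record = self._pending[0]  # peek, do not remove yet
            try:
                self._write_fn(record)
            except ConnectionError:
                break  # DB unavailable: keep the record and try again later
            self._pending.popleft()  # remove only after a successful write
            written += 1
        return written

    def __len__(self):
        return len(self._pending)
```

A record only leaves the queue after its write has succeeded, so an outage in the middle of a flush leaves the remaining data queued rather than lost.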

I am reluctant to increase the frequency of the backups for now as it would simply put more strain on the server, so I will just keep the backups at the same rate while I wait for the replacement drive to arrive. Shard DB backups run several times per day already.

If by chance you play and do something such as get an artifact and are worried about the crash happening, just shoot me a PM and I can run a backup manually.
_________________

my blog

Honk if you love Jesus, text if you want to meet Him!
Posted: Sat Feb 24, 2018 4:20 am
Red Squirrel

The new drive came in, so I will be replacing the failing one in the next few days.
Posted: Tue Feb 27, 2018 5:31 am
Red Squirrel

Drive replaced and RAID rebuilt.

Will give it a few days to see whether or not that fixes it. I have a feeling it's not the cause, but we'll see. Had to be done anyway.
Posted: Mon Mar 05, 2018 2:24 am
Red Squirrel

This may possibly be solved, but I'm not too sure yet. It's related to an overall issue with my storage system where if there is too much load, things start to crash. I was never able to figure out that issue and it kind of went away on its own, but then it resurfaced and now it's hitting the DB server instead of other VMs. I figured the failing drive was probably not helping though.

Will continue to monitor.

It's still safe to play; the worst thing that can happen is losing a couple of hours of progress if it does happen. I do need to look at redesigning part of the DB system, as it should not corrupt like this even if the DB becomes unavailable in the middle of saving (which is what seems to happen during high loads), so I will look at fixing that at some point.
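
One common way to keep an interrupted save from leaving half-written data is to wrap the whole save in a single transaction, so either every row lands or none do. The sketch below shows the idea in Python with the standard library's `sqlite3` module purely as a stand-in; the shard's real database, schema, and save code are not shown in this thread, so the `items` table and the `init_db` and `save_all` names are assumptions for illustration only.

```python
import sqlite3


def init_db(conn):
    """Create the hypothetical table used by this example."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, data TEXT NOT NULL)"
    )
    conn.commit()


def save_all(conn, rows):
    """Write every row in one transaction; on any failure nothing is stored."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT OR REPLACE INTO items (id, data) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.Error:
        return False  # previous contents are still intact; caller can retry
```

If the save fails partway through, the table still holds the last complete save, and the caller can simply try again once the database is reachable.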
Posted: Sat Mar 31, 2018 4:18 pm
Red Squirrel

Unfortunately the drive I replaced is not the cause of this. I really don't know what it is at this point. This issue just started randomly with no explanation.

Basically it looks like the network randomly drops for no reason, and that causes the shard to crash hard instead of just trying again later to write whatever it was trying to write. The crash logs don't give line numbers, I think because it's the core that's crashing and not the main part, which makes this extremely hard to troubleshoot.
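
For reference, the "try again later instead of crashing" behavior could look something like the Python sketch below. It is only an illustration under assumptions: `write_with_retry` and `write_fn` are hypothetical names, and it assumes a dropped network surfaces as `ConnectionError` or `TimeoutError`, which may not match how the shard's core actually fails.

```python
import time


def write_with_retry(write_fn, record, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call write_fn(record), retrying on connection errors with backoff.

    Returns True once the write succeeds, or False if every attempt failed,
    so the caller can keep the record and carry on instead of crashing.
    """
    for attempt in range(attempts):
        try:
            write_fn(record)
            return True
        except (ConnectionError, TimeoutError):
            if attempt < attempts - 1:
                # Wait longer after each failure: base_delay, 2x, 4x, ...
                sleep(base_delay * (2 ** attempt))
    return False
```

The `sleep` parameter exists only so the waiting can be swapped out when testing; in normal use the default `time.sleep` applies.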

Given the shard is pretty much dead, I'm just going to keep restoring backups for the time being every time this happens. If you log in and everything is missing, assume that the issue happened and that it will get restored. I do get alerts on my phone when it crashes, so chances are I already know about it when it happens; I might just be at work or sleeping or something.