As a continuation of our “Behind the Scenes” blog series, I wanted to highlight some maintenance that we’ve completed over the past two weeks.
For starters, we’ve gone through all of our switches and applied firmware updates as necessary. For the most part these were relatively minor, but some did address potential denial-of-service vulnerabilities. Overall this helps us avoid problems in the future, and keeping up with new updates is generally a good idea anyway. These updates went off without a hitch thanks to our ability to first test them on the equivalent spares that we keep on hand.
Hardware failing in some form or another is simply a fact of life, and we plan for this by building in redundancy and keeping spares on hand. We make sure to have full configuration backups for all of our switches should they need to be restored. On key switches the spares are even pre-loaded with the current configuration and powered on in-cabinet, so that only the network cabling would need to be swapped to recover from a failure.
As great timing would have it, our data center also performed similar maintenance this morning on their networking equipment, per our scheduled network maintenance announcement. This was likewise completed without issue.
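For readers who keep pre-loaded spares the same way, here is a minimal sketch of how one might confirm that a spare's configuration still matches the latest backup. It is an illustration only, not the tooling used here: the function names are hypothetical, and it assumes both configurations have already been exported as plain text.

```python
import hashlib


def config_digest(config_text: str) -> str:
    """Return a SHA-256 digest of a config, ignoring blank lines and trailing whitespace.

    Leading whitespace is kept, since indentation can be meaningful in switch configs.
    """
    lines = [line.rstrip() for line in config_text.splitlines()]
    normalized = "\n".join(line for line in lines if line)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def spare_matches_backup(spare_config: str, backup_config: str) -> bool:
    """Return True when the spare's config and the backup normalize to the same digest."""
    return config_digest(spare_config) == config_digest(backup_config)
```

A scheduled job could call spare_matches_backup() for each key switch and raise an alert whenever a spare has drifted from the backup.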
We use a VPN to securely connect to our internal management network for various tasks, including remote management of most of our hardware, from switches to servers and KVMs. If you’ve worked with IPMI-enabled servers at all, you’re probably well aware of all the vulnerabilities that have been found over the past year. Connecting these devices directly to the internet is a very bad idea, which is why we’ve always had them connected to a LAN that is only reachable on-site or via a secure VPN. Despite never having experienced a VPN failure of any kind, we opted to go ahead and deploy a second VPN server for redundancy. This, again, went off without a hitch, given that it is a rather trivial addition and easily tested before deployment. The biggest immediate benefit for us is that we can perform maintenance on one of the VPN servers without shutting down VPN connectivity.
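To make the redundancy idea concrete, here is a minimal sketch of client-side failover between two VPN servers: try each endpoint in order and use the first one that answers. It is an assumption-laden illustration rather than a description of the actual setup. The hostnames and port are hypothetical, and the default probe assumes the VPN servers accept TCP connections (a UDP-only VPN would need a different check).

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_reachable(endpoint: Endpoint, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    endpoints: Sequence[Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_reachable,
) -> Optional[Endpoint]:
    """Return the first endpoint the probe reports as reachable, or None if all fail."""
    for endpoint in endpoints:
        if probe(endpoint):
            return endpoint
    return None


# Hypothetical endpoints, listed in order of preference.
VPN_SERVERS = [("vpn1.example.net", 443), ("vpn2.example.net", 443)]
```

Calling first_reachable(VPN_SERVERS) while the first server is down for maintenance would return the second one, which is the kind of uninterrupted access a second server makes possible.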
We’re not huge fans of Windows here at Dathorn, as you might expect. We prefer to use various Linux distributions or OS X. There are a number of virtual machines involved as well, and in the past we’ve had to keep a few Windows ones around. Despite our best efforts, some things just don’t work outside of Windows. A couple of issues that come to mind are connectivity to a particular Cisco ASA firewall and some older IPMI firmware. A lot of this is just quirkiness with Java in general, but at the end of the day we do have to use Windows to some degree from time to time. To accommodate this, we’ve deployed a Windows server on our LAN that we can RDP (remote desktop) into locally or via our VPN should the need arise. The benefit here is that we no longer have to maintain various Windows VMs that really aren’t fun to keep up with, especially for how little they get used.
Lastly, we replaced a number of hard drives that we had to RMA with Western Digital. I will say, hands down, that the Western Digital RMA process is the best we’ve worked with and is a large part of the reason we’ll continue to use them for our mechanical drives. However, even that is coming to an end as we increasingly transition to SSDs. Failing hard drives are just a part of life here, and we plan accordingly by running various RAID levels, keeping hot spares in our servers, and keeping even more cold spares on site. More than anything, this was just a purge of all the failed drives on hand, since working spares were already in use. Instead of sending these off one at a time, we prefer to send a batch at once when possible.
That pretty well covers a good portion of the work we’ve been doing behind the scenes here over the past couple of weeks. Stay tuned for more updates as we continue this blog series!