Auction Request Downtime: What Happened & What We Will Do To Make Sure it Doesn't Happen Again

Hi Everybody,

As you are no doubt aware, we had some quite significant downtime this week. This blog post will hopefully explain the situation and what we'll do to mitigate these issues in the future.

What Happened?

On 7th September, at roughly 7pm UK time, our hosting provider (Hostinger) began noticing a problem with the US datacenter that Auction Request is located in. We had a hardware component failure.

This took down the site and everything associated with it: everything is hosted there, so we lost email, and pings to the server were hanging. After speaking with Scott (which was tricky in itself, as we had no email that wasn't hosted via Auction Request), we found that, unfortunately, there was very little we could do at the time, and we had no real way of communicating with users via official channels.

We resorted to unofficial channels where we hoped to reach a large proportion of our customers, to spread the message and to manage expectations. I sent out a message to users of my plugin, WP eBay Product Feeds (including emailing all known active customers), and Scott put messages on a PHPBay Facebook support group.

What Are We Doing to Prevent This from Happening Again?

Obviously, we don't like this happening. In my 10+ years of working on web development systems, I believe I've only experienced this amount of downtime once or twice. Although there's very little we can do about the actual downtime (we were unfortunate that our server was hit), we're looking to mitigate the issue so it doesn't affect us to this extent again.

This is what we're doing:-

  • A number of off-site support channels are being set up – First off, we're setting up a Twitter account, available at @AuctionRequest, as it's probably the easiest thing we can do. Whilst we'd prefer not to use this for support during normal operations (please contact us directly if you need support), it'll help us get the message out when there's a service disruption.
  • Decoupling of emails from the site – Obviously, having both emails and the website hosted in the same place is not great. We're looking into ways in which we can use third parties to host emails.
  • Reviewing where our DNS is hosted – We're going to take a look at a few things linked to the decoupling of the emails. One such thing is decoupling the DNS from our main host, with a service like Cloudflare. This will allow us to restore things a bit more quickly if we have a similar event (I feel I should reiterate that although we suffered downtime, no data was lost, and we take daily backups).
  • A comprehensive hosting review – Due to the original rush in getting the site live, we probably didn't spend as long as we would have liked in reviewing hosting options. Now that things are back to normal, we'll be spending some time reviewing our options moving forward.

Again, we're really sorry about what happened. I know how hard both Scott and I have been working on this project, and having this happen outside our control is frustrating. I hope that our level of support has helped show our commitment to what we're trying to do.

We're happy to take questions over email, and we'll update this post as needed.

Take Care

Rhys Wynne