Atomx Ad-serving Partial Outage | On 6 April 2017 at 15:20 UTC, Atomx ad-serving began to experience intermittent failures. This was noticed immediately, and engineers began analysing the problem and restarting the affected services. While restarting servers to keep ad-serving running, it became apparent that the VPN connection between our cloud providers, Google Cloud and Amazon, was failing intermittently. After disabling most SSPs to lighten the load on the connection, we decided that the fastest way to recover completely was to move all critical infrastructure to one cloud provider, bypassing the VPN. On 7 April 2017 at 01:00 UTC the last service was switched over, and ad-serving is now working smoothly again.
We are still talking to Amazon and Google to diagnose the VPN issue, in order to avoid similar problems in the future. For background: the Atomx infrastructure is spread across Google Cloud and Amazon, and we use a VPN to connect the networks of these two cloud providers.