by leugim on 5/14/20, 7:23 PM with 15 comments
You have 2 weeks, and you will keep the remaining balance after building the site.
The site will have unpredictable but reasonable traffic; let's say a maximum of 3k concurrent users at peak hours.
The site could be a target for script kiddies, so no big DDoS attacks are expected, although some small ones are possible.
What would you do to maximize your chances of walking off with a big chunk of cash?
by elmerfud on 5/14/20, 7:32 PM
You'd need to determine very precisely how the SLA is measured, whether there are any exclusion periods (maintenance), and what the penalty is for a violation. With an SLA like that, the site simply being slow to load may constitute a violation.
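To put that in numbers, here is a quick back-of-the-envelope check of what six nines actually allows (a sketch in Python; it assumes a 30-day month, and the contract may define the measurement window differently):

    # Allowed downtime for a given uptime SLA.
    # Assumes a 30-day month and a 365-day year; a contract may define these differently.
    sla = 0.999999  # six nines

    seconds_per_month = 30 * 24 * 3600   # 2,592,000
    seconds_per_year = 365 * 24 * 3600   # 31,536,000

    print((1 - sla) * seconds_per_month)  # ~2.6 seconds of allowed downtime per month
    print((1 - sla) * seconds_per_year)   # ~31.5 seconds of allowed downtime per year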
by jtchang on 5/15/20, 4:22 AM
Then take each of the cloud providers' static hosting (i.e. S3, Azure Storage, and a Google Cloud Storage bucket) and host the content on all of them. DNS queries can be set up to round-robin between all the hosts. Probably a nightmare to maintain.
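One way to make the maintenance part less painful is an external probe that fetches the same page from every provider and compares content hashes, so a half-finished deploy or a dead origin shows up immediately. A sketch in Python; the endpoint URLs below are hypothetical placeholders:

    import hashlib
    import urllib.request

    # Hypothetical website endpoints for the same content on each provider;
    # the real bucket / static-site URLs would differ.
    origins = [
        "https://example-site.s3-website-us-east-1.amazonaws.com/index.html",
        "https://examplesite.z13.web.core.windows.net/index.html",
        "https://storage.googleapis.com/example-site/index.html",
    ]

    digests = {}
    for url in origins:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                digests[url] = hashlib.sha256(resp.read()).hexdigest()
        except OSError as exc:
            digests[url] = f"unreachable ({exc})"

    for url, digest in digests.items():
        print(url, digest)
    # If the hashes differ, a deploy only partially propagated.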
by codegeek on 5/14/20, 8:19 PM
I would just get maybe 3 nice dedicated servers from a provider I trust (either colocated or leased) and then slap nginx in front, with nginx also acting as a load balancer spread across the 3 servers. I could also add something like Varnish as a cache. That way, you are reducing 3rd-party dependencies and the added risk (as opposed to relying on things like external DNS servers, Cloudflare, a CDN, etc). For 3,000 concurrent users on a static HTML site, it should be a breeze.
But you are still dependent on the domain registrar's nameservers, and possibly your own DNS server if you set one up.
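For the load-balancing part of this setup, what nginx's upstream round-robin effectively does can be sketched in a few lines of Python (purely illustrative, you would not serve traffic this way; the backend addresses are placeholders):

    import itertools
    import urllib.request

    # Placeholder addresses for the three dedicated servers.
    backends = ["http://10.0.0.1", "http://10.0.0.2", "http://10.0.0.3"]
    rotation = itertools.cycle(backends)

    def fetch(path="/"):
        """Try each backend in round-robin order, skipping ones that fail."""
        for _ in range(len(backends)):
            backend = next(rotation)
            try:
                with urllib.request.urlopen(backend + path, timeout=2) as resp:
                    return resp.read()
            except OSError:
                continue  # backend down, try the next one
        raise RuntimeError("all backends down")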
by ThePhysicist on 5/14/20, 9:42 PM
Then how do you measure uptime? Up for whom? And where does the responsibility of the company end? If someone cuts a subsea Internet cable and takes a country offline (it has happened before), does that violate the SLA? Depending on where that responsibility ends, the problem becomes much more or much less challenging.
These points aside, I think that if you want to build a very stable static site, the most important ingredient is redundancy. I'd argue that using a single CDN or cloud provider might give you good uptime, but it will not ensure your downtime stays below roughly 2.6 seconds per month, as a single outage of that provider will already be enough to break the SLA (and even CDNs like Cloudflare have outages).
What you could try is the following: Get your own IP space e.g. from RIPE or APNIC. Make deals with several hosting companies to announce that address space. Use several /24 subnets, announce each one via several of those companies. Create multiple DNS A records for all these subnets that your site's domain resolves to. Set the TTL of these records as high as possible to make sure recursive resolvers have them cached (and make sure your DNS servers won't be down for longer than your TTL). Put copies of your site on servers at all the hosting companies from where you announce your IPs.
I'm not a BGP expert, so I don't know how fast routes will adapt to a single data center or server being down. If you have multiple A records, most browsers should try them one by one until one works (but does that count if it takes longer than 2 seconds?), so in theory there should be a good chance of your site staying up even if one of your data centers goes down.
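That browser fallback behaviour is easy to imitate by hand: resolve all the A records and try each address in turn until one answers. A sketch in Python, with example.com standing in for the real domain:

    import socket
    import urllib.request

    host = "example.com"  # stand-in for the real domain

    # Collect every A record the resolver returns.
    addresses = sorted({info[4][0] for info in socket.getaddrinfo(host, 80, socket.AF_INET)})

    for addr in addresses:
        try:
            req = urllib.request.Request(f"http://{addr}/", headers={"Host": host})
            with urllib.request.urlopen(req, timeout=2) as resp:
                print(f"{addr}: HTTP {resp.status}")
                break  # first working address wins, like a browser falling back
        except OSError as exc:
            print(f"{addr}: failed ({exc})")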
Then again, this assumes that you have everything on the server side perfectly under control. In my experience most system outages are caused by human error, so the chance that you will destroy your own system by e.g. misconfiguring it is higher than the chance that one of your data centers will be taken out by a nuke ;)
We're in the process of building such an anycast network and it's absolutely doable. I'll let you know if we reach 99.9999% uptime.
by atmosx on 5/16/20, 11:05 AM
A BGP issue might bring down the website for a city, a country, or a continent for 5 hours.
That is not something you can control.
The same goes for DNS: there is caching, a load balancer might get updated at some point, and a PHP app that has cached the DNS result might miss a hit or two before invalidating it.
It’s an interesting exercise.
by jamieweb on 5/14/20, 11:14 PM
by wendelmaques on 5/15/20, 2:14 AM
Because you can route requests from DNS to CloudFront, then to an S3 static site, and get your content cached/distributed automatically across the globe.
That's it: super cheap, with a nice AWS SLA, and you can set it all up with TLS too.
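For the S3 end of that chain, the static-website configuration and upload look roughly like this with boto3 (a sketch; the bucket name is a placeholder, and the CloudFront distribution, Route 53 record, ACM certificate, and the bucket policy that makes the objects readable are separate steps not shown here):

    import boto3

    s3 = boto3.client("s3")
    bucket = "example-static-site"  # placeholder; the bucket is assumed to exist already

    # Enable static website hosting on the bucket.
    s3.put_bucket_website(
        Bucket=bucket,
        WebsiteConfiguration={
            "IndexDocument": {"Suffix": "index.html"},
            "ErrorDocument": {"Key": "error.html"},
        },
    )

    # Upload the page; CloudFront would then use this bucket as its origin.
    s3.put_object(
        Bucket=bucket,
        Key="index.html",
        Body=b"<html><body>hello</body></html>",
        ContentType="text/html",
    )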
by opendomain on 5/14/20, 9:07 PM
I took a site that had NO nines (less than 88% uptime) up to 99.995% in 3 months, and that one had dynamic content. Contact me - HN AT free DOT TV - I guarantee uptime.
by thomassteven on 5/14/20, 7:38 PM
by Spooky23 on 5/14/20, 9:44 PM
by mycall on 5/15/20, 5:28 AM
by whb07 on 5/14/20, 10:27 PM