from Hacker News

A Surprise AWS Bill

by oaf357 on 7/26/20, 12:40 PM with 312 comments

  • by stefan_ on 7/26/20, 2:25 PM

    That's a whole lot of text for "I had a 14 GiB VM image publicly linked and people discovered it", and none of it has much to do with AWS or Cloudflare.

    Presumably the author would do much better with a VM or something from OVH, they'll just shut you off or limit you before it becomes a problem (not that they would care about 30 TiB).

  • by dahdum on 7/26/20, 2:14 PM

    From the article:

    > Cloudflare was the least helpful service I could have imagined given the circumstances. A long term user and on and off customer thinks they were attacked for two days and you don’t lift a finger?

    > File this under, “Things I should’ve known but didn’t.” Did you know that “The maximum file size Cloudflare’s CDN caches is 512MB for Free, Pro, and Business customers and 5GB for Enterprise customers.” That’s right, Cloudflare saw requests for a 13.7 GB file and sent them straight to origin every time BY DESIGN.

    I don't really see how Cloudflare has much blame here. He's an "on and off customer" which I'm guessing means currently "off". They only cache a limited number of file extensions (qcow2 isn't one of them), and it's all documented.

    AWS always seems pretty generous in resolving these cases at least.

  • by ta1234567890 on 7/26/20, 2:18 PM

    Off topic, but along the same lines: one of the most annoying things about being a consumer in the US is the ubiquitous unknown-until-the-last-minute pricing.

    You rent a car, you don't know what the total is going to be. You go to the hospital, you don't know how much you're going to have to pay. You book a hotel and don't know the total until you check out. You go to a restaurant and even if you order just one thing and saw the exact price on the menu, that's not going to be the total. You go to the grocery store, see all the prices on the items, add them up, and then when you go pay, surprise!

  • by clarkevans on 7/26/20, 2:36 PM

    > Now that I’m aware of the 512 MB file limit at Cloudflare, I am moving other larger files in that bucket to archive.org for now (and will add them to my supported Causes).

    ...

    > I don’t feel like archive.org should be my site’s dumping ground since it can turn a profit if it gets popular. archive.org is a stop-gap for two files for the time being.

    I'm trying to understand... he has decided to burden a charity with his distribution expenses?

  • by HugThem on 7/26/20, 2:21 PM

    Summary:

    He published a 14 GB file and one day there were 2,700 downloads, resulting in ~30 terabytes of traffic.

    He had the file behind Cloudflare, but since Cloudflare does not cache files larger than 512 MB, all the traffic went to his S3 bucket and Amazon billed him $2,700 for it.
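
    A back-of-envelope check of those numbers (a sketch: it assumes a flat ~$0.09/GB S3 internet egress rate, while AWS's real tiering gets slightly cheaper past 10 TB):

```python
# Worst case if every one of the 2,700 downloads of the 13.7 GB image
# completed, versus the ~30 TB the article reports actually being served.
file_gb = 13.7
downloads = 2700
max_egress_tb = file_gb * downloads / 1000   # ~37 TB if no download aborted
billed_tb = 30                               # figure reported in the article
approx_bill = billed_tb * 1000 * 0.09        # flat-rate assumption
print(f"worst case ~{max_egress_tb:.0f} TB; ~${approx_bill:,.0f} for {billed_tb} TB served")
```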

  • by RKearney on 7/26/20, 2:04 PM

    CloudFlare's TOS is clear on using the service to serve up 13.7GB files.

    https://www.cloudflare.com/terms/

    2.8 Limitation on Serving Non-HTML Content

    [...] Use of the Service for serving video (unless purchased separately as a Paid Service) or a disproportionate percentage of pictures, audio files, or other non-HTML content, is prohibited.

    So 500MB limit or not, the author is already violating CloudFlare's terms of service.

  • by saddlerustle on 7/26/20, 2:31 PM

    Amazon gets away with high bandwidth pricing because almost all their customers are businesses with high revenue per byte served. If you want to serve large assets economically you have to look elsewhere.

    Bandwidth on Oracle Cloud is $0.0085/GB with the first 10TB free each month, so this would have cost only $170. Alternatively bandwidth on Backblaze B2 costs $0.01/GB, but is free out to Cloudflare, so this traffic would have been completely free.
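
    The quoted rates make for a stark comparison. A quick sketch using only the numbers cited in this thread (not official price sheets), for ~30 TB ≈ 30,000 GB of egress:

```python
# Egress cost at the per-GB rates mentioned above, minus any free allowance.
TRAFFIC_GB = 30_000

def egress_cost(rate_per_gb, free_gb=0):
    return max(TRAFFIC_GB - free_gb, 0) * rate_per_gb

aws = egress_cost(0.09)               # ~$0.09/GB
oracle = egress_cost(0.0085, 10_000)  # first 10 TB free, then $0.0085/GB
b2_via_cloudflare = 0                 # free out to Cloudflare (Bandwidth Alliance)
print(f"AWS ~${aws:,.0f}, Oracle ~${oracle:,.0f}, B2 via Cloudflare ${b2_via_cloudflare}")
```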

  • by schmichael on 7/26/20, 5:25 PM

    The fact that the cloud allows hobbyists, small businesses, and massive enterprises ~equal access to services is amazing.

    However, it means things like this sometimes happen, where a product’s incentives (serve any content at any cost) are wildly misaligned with a huge percentage of users’ needs (I’d rather my site, or preferably just the costly resource, be down than pay $2k).

    There’s endless tuning non-enterprises can do to get their ideal behavior, but that’s the difference between pre-cloud and post-cloud computing. It used to take monumental effort to build high-scale, high-availability systems. Your $5/mo Dreamhost site would just die under load instead of charging you thousands. Now enterprise use cases are supported by default, and it takes careful tuning to opt out.

  • by wonderlg on 7/26/20, 4:36 PM

    I think the main problem is that there isn’t an easy way to set a hard monthly limit on these services.

    I use a bunch of “freemium” services like S3 and Google Maps API and I’ve never paid a penny. I use them because they don’t cost a penny for my very limited usage, but I’m not looking forward to the day I mistakenly and disastrously exceed their free tier.

  • by jrott on 7/26/20, 2:06 PM

    This is the nightmare scenario with a personal AWS account. The AWS billing setup makes it impossible to know there is a giant bill until after the fact. I wish there were some way to cap the bill and just have everything shut down at that point for hobby projects.

  • by social_quotient on 7/26/20, 2:06 PM

    AWS is an amazing service and, in our experience, has been very accommodating about surprises as long as it seems we know what we’re doing and are going to mitigate the cause. I’ve rarely seen this allegiance to the “customer” from an enterprise company. They really do care, and they’ve figured out the recipe to make caring scale.

    What’s odd is that the touch points are cold: ticket-system support, phone callbacks, etc. It feels like it’s going to be robotic canned replies, but they figured out a way to make the people on the other side smart enough to understand the issue, empowered enough to do something about it, and empathetic enough to want to resolve things “fairly”.

  • by devwastaken on 7/26/20, 2:11 PM

    Cloud services need to have cost caps, plain and simple. This isn't Cloudflare's fault; it's Amazon's, and it's the author's. Cloudflare should be detecting overall data transfer, but there are plenty of cases where terabytes of traffic are entirely expected. We know Amazon won't fix their service, so perhaps Cloudflare could implement bandwidth limits.

  • by hasenpfote on 7/26/20, 4:58 PM

    I have said it over and over and will repeat it happily:

    If your service doesn't have a proper limit, you suddenly expose yourself to a much higher risk than before, and you have to be aware of this.

    It's the same shit as when you rent a car: NEVER rent a car without proper insurance.

    I work with GCP professionally and used AWS at my previous company. I'll ask my manager if I can use it to try a few things out, and that's fine, but I will not put my own credit card behind an account with unlimited cost risk (it's probably limited somewhere, but you know what I mean).

    And it's not even simple: everything costs you money. Storing data, receiving data, pushing data, making API requests, etc.

    And what I always find quite surprising: how often people, even on HN, present simple file-based APIs where you can upload images and edit them, or upload files and download them again, or offer free services, all with AWS as the backend.

    Maybe I've just been in this industry too long and see the pitfalls of exploits and risks everywhere, but I have the feeling that proper respect for cloud service billing is neglected by most.

  • by hoppla on 7/26/20, 7:14 PM

    When I was first looking at AWS, I received a billing-prediction alert that was ridiculously high. I could not find the culprit (I only had an EC2 instance and some other random services I was looking into). In the end I deleted my account to avoid this unexplainable billing. The next day, I got an email saying I had received the alert erroneously. The damage was already done, but at that time I was only exploring what the cloud had to offer, so no real harm... until a couple of years later, when I had to do some real AWS work for my job. Because I deleted/disabled my account, I cannot reopen an account with the same email address.

  • by rytrix on 7/26/20, 2:41 PM

    Everything is always crystal clear in hindsight. That being said, I always tell my clients to set up billing alarms as one of their first tasks when getting started on AWS. https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitori...
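
    For reference, the linked doc boils down to one CloudWatch alarm on the EstimatedCharges billing metric. A minimal sketch with boto3; note that billing metrics must first be enabled, they only exist in us-east-1, and the SNS topic ARN below is a placeholder:

```python
# Alarm parameters for CloudWatch's EstimatedCharges billing metric.
billing_alarm = dict(
    AlarmName="monthly-bill-over-50-usd",
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    Statistic="Maximum",
    Period=21600,                   # billing metrics update a few times a day
    EvaluationPeriods=1,
    Threshold=50.0,                 # notify once the month's estimate passes $50
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # placeholder
)

# To actually create it (requires credentials):
# import boto3
# boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**billing_alarm)
```
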
  • by prepend on 7/26/20, 2:05 PM

    It’s always frustrated me how it’s not possible to set a quota and turn off services at quota.

    Logistically I know this is hard for water or power, but it should be feasible for cloud computing. But I think this is an area where it’s not in AWS’ interest to set up that kind of billing control.

  • by gchamonlive on 7/26/20, 2:35 PM

    I think people underestimate how complicated it is to deal with cloud services. You can do a lot with minimal training and doc reading, but these holes in your formal understanding of how everything operates create these kinds of vulnerabilities.

    Everything can have side effects in the cloud. You can set up a fleet of cheap T-type EC2 instances and, without managing your CPU usage, be charged a fair amount for unlimited burst credits (which is the default in Terraform, for instance).

    You can quickly set up a WordPress instance with CloudFront and an invalidation plug-in and be inadvertently charged 6,000 USD (https://wordpress.org/support/topic/amazon-cloudfront-invali...)

    You can set up Lambda triggers and quickly do a proof of concept for an app, but forget to correctly dimension your memory usage and be charged more than you need.

    Cloud requires careful policy and topology consideration. There are many simple blocks that form a complex mesh, with opaque observability of potential vulnerabilities in both access and billing. Cloud is nice, but it requires time and care. And under the shared responsibility model, you are responsible for that.

  • by ryanmarsh on 7/26/20, 2:52 PM

    Saw the mention of PTSD and running towards danger and immediately thought, “oh nice hopefully I’ve found a kindred spirit”. This matters because I share little in common with my peers in this field. So I read Chris’ About page and...

    Do any other (combat) veterans smell something wrong with an Air Force Tech Controller (3C2X1) making statements like ”like back in the old days, when something would go bang or boom, and I’d run towards it” in a civilian venue? You know exactly what I mean, and we see it all the time.

    If you aren’t a veteran, especially with a job even remotely related to “running towards things that go boom” please just give us some space on this one. Thanks.

  • by namidark on 7/26/20, 6:26 PM

    AWS bills are un-auditable. I'm convinced every org is being overcharged for bugs in their billing and tracking software. I've asked on multiple occasions about charges that randomly started appearing (despite no infra changes) and weren't there the month before, and no one on the AWS support side was able to answer.

  • by knorker on 7/26/20, 6:31 PM

    Summary:

    1. The big cloud providers charge enormously for outgoing bandwidth. Most of us know this, but unfortunately it bites people a lot.

    2. If you host big files on these clouds with no limits or warnings, it's just a matter of time before this happens to you.

    This is why I don't run hobby things on these clouds. Any hobby project may have backends and services running on them, but NEVER anything user-accessible such as a webserver, S3/GCS bucket, or similar. It's just too much of a "click here to bankrupt me".

    For a business it's a different matter. You are making money, and you're spending money to do so. You still need to have a DDoS plan for your outgoing traffic, but it's much easier to solve these problems if you have revenue. Revenue buys time and people.

  • by sunilkumarc on 7/26/20, 3:10 PM

    Nicely written article. Interesting, and one should definitely be aware of how certain services are charged before using them. This is a good lesson for all of us :)

    On a different note, I was recently looking to learn AWS concepts through online courses. After much research I finally found this e-book on Gumroad, written by Daniel Vassallo, who worked on the AWS team for 10+ years. I found it very helpful as a beginner.

    This book covers most of the topics that you need to learn to get started:

    If someone is interested, here's the link: https://gumroad.com/a/238777459/MsVlG

    I highly recommend buying this e-book if you think AWS documentation is overwhelming.

  • by Ciantic on 7/26/20, 6:12 PM

    This is the reason I liked Azure when I used it a few years back. One could set up something like a prepaid plan, and there was no way to overspend it accidentally.

    Of course it is not ideal for companies who need their services to be available at all costs, but for home users it's a nice guarantee.

  • by rubenhak on 7/29/20, 1:05 AM

    Not directly related to the S3 traffic bill, but to overall cloud cost management. Maybe some of these are unintentional, but they are still very painful. My experience with AWS & GCP:

    - AWS CloudWatch: expensive service, virtually unusable, hard to turn it off.

    - AWS overall: finding and cleaning up resources is messy. The order of creation & cleanup is not the same. Closing an account is a painful process. GCP's project structure is way easier.

    - AWS EKS: You create a cluster, then a node group. Deleting the cluster fails if there is a node group. You go ahead and delete the node group, and it complains because of "dependencies". While you're randomly looking for that "dependency", the $ clock is still ticking. You have to delete the network interface before you can delete the node group, and only then the cluster. This does not make sense: if the network interface was created implicitly by the node group, I should not be responsible for deleting it. There should be symmetry between create and delete operations.

    - GCP GKE: You create a cluster, then delete it. The cluster gets deleted - kudos, usability is much better than with AWS EKS. But it turns out lots of LoadBalancers and Firewall rules are left over and still appear on the cloud bill. Those are implicitly created and should be cleaned up implicitly by GKE.

  • by ricardo81 on 7/26/20, 2:21 PM

    Maybe I missed it, but I didn't see which Cloudflare plan he was using, in the context of his comments about their support.

  • by BrandoElFollito on 7/26/20, 9:02 PM

    I asked AWS to set a limit on my spending. They said that they did not want to do that "not to break my business".

    I want them to - I do not care if my site is offline vs. having to pay a huge bill. That should be a choice.

    So I moved away from AWS. It is crazy that companies agree to such a racket (not the pricing - but the fact that you cannot set a limit).

    I considered using a virtual card with a limit on it - they couldn't grab more than the limit and would just have to sue me across the pond or remove my account. But I refuse to play these games with a company that does not give a shit about billing.

  • by logicallee on 7/26/20, 7:43 PM

    I think this is a very common occurrence!

    A good alternative to this ever-present risk is to use a dedicated virtual private server that is unmetered. This would make mistakes like this one impossible (and yes, it is a mistake - it is his fault he didn't read the Cloudflare details and publicly served a large VM image).

    Here is my referral code for the one I use[1]:

    https://crm.vpscheap.net/aff.php?aff=15

    This also (especially) applies to startups that might suddenly take off at any moment (but don't expect to). AWS is a ticking time bomb of unexpected charges. You never know what the Internet will bring you. Go for an unmetered VPS and have one single well-defined charge that doesn't change. That's what I do for my side projects.

    [1] I previously asked Dan, the moderator here, if I can share in this way and he said it's okay. I don't have other affiliation with that company and have found it good. The last time I posted this I got 80 visitors and no complaints (and got upvotes), so I figure it is a good resource for people.

  • by SergeAx on 7/26/20, 10:42 PM

    > Moving it back to AWS from GCP bumped the AWS bill to an average of $23/month. Not too bad given the site’s traffic.

    I've checked the traffic: it was 2.3k users for the entire month of June, about 75 users per day on average. That is effectively nothing; why does the author think it's okay to pay 1 cent per user per month to a hosting provider? A $5/mo VPS can handle two orders of magnitude more.

  • by Artur96 on 7/26/20, 4:45 PM

    This is why billing alerts are mega important to set up.

  • by tgsovlerkhgsel on 7/26/20, 9:35 PM

    This is why I'm deathly afraid of using any major cloud provider.

    External traffic is effectively unlimited, and a number of possible causes (popularity, a misconfigured script pulling something in a loop, someone intentionally generating traffic to hurt me) could throw me into arbitrary amounts of debt, with the only recourse being the hope that the cloud provider will be merciful.

    Even if I have alerts set up: someone pulling 10 Gbit/s can generate over 100 TB per day, at $80-100 per TB. If I don't check my e-mails for a weekend, I can be $30k in the hole before I notice.
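
    The parent's arithmetic checks out (a sketch, assuming a sustained line rate and decimal units):

```python
# 10 Gbit/s sustained for a day, priced at the $80-100/TB egress rates above.
gbit_per_s = 10
tb_per_day = gbit_per_s / 8 * 86_400 / 1000   # bits -> bytes, then GB -> TB
low, high = tb_per_day * 80, tb_per_day * 100
print(f"~{tb_per_day:.0f} TB/day, i.e. ${low:,.0f}-${high:,.0f} per day")
```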

  • by Edd314159 on 7/26/20, 4:12 PM

    I racked up a $120K+ Google Cloud bill via an unsupervised and poorly coded script which used the geocoding API. It didn’t take much to get Google to waive it. This happens all the time, I’m sure.

  • by nix23 on 7/27/20, 9:55 AM

    For a private person it's much better to use something like a private VM or a dedicated server. From Vultr or 1984hosting (Iceland) you can get a VM for just $2.50 (IPv6 only) or $5 (IPv4 and IPv6), or a dedicated server from Hetzner, OVH, or Scaleway (Arm64) for around $30; some have unlimited traffic (mostly the dedicated servers) on a 1 Gbit connection. NEVER use stuff you have to pay for without knowing what's coming (this goes for private use and small businesses).

  • by ColdHeat on 7/26/20, 4:18 PM

    DigitalOcean's S3 offering probably would've kept this bill down a bit. Probably about $300 versus $2000.

    https://www.digitalocean.com/docs/spaces/#bandwidth

    Digital Ocean may not be the best cloud platform but it's fairly cost effective.

  • by tjoff on 7/26/20, 7:29 PM

    Can someone argue that the complexity of the cloud does not easily surpass the time and effort of just setting things up yourself? And for once, all that time and effort spent setting it up is actually valuable, and you get much better insight into your own operations.

    And the alternative is paying someone to lock you into their ecosystem.

    Are we really that lazy?

  • by vmception on 7/26/20, 8:01 PM

    I had a $190,000 AWS bill for an account only used for static S3 hosting.

    And guess what, I didn't write a blog post about it. I just went to support, said remove the charges, they identified the services that created the issue so I could kill them, and they removed the charge.

    Look at that: no fanfare. I had no emotion about it whatsoever. Maturity.

  • by pavelevst on 7/26/20, 9:55 PM

    Good reminder to everyone, including myself, to use services that have budget alerts or spike protection. And to stay away from AWS; even $23 for static website hosting is a bit too much in 2020. At a fairly priced host without a fancy name, 30 TB of traffic can cost <$10.

  • by Borlands on 7/26/20, 8:45 PM

    He copied a 12 GB file and shared it publicly on S3 (how does AWS make money with S3, anyone care to answer?). I agree it’s expensive, maybe outrageously expensive. It’s great he got a refund! Not many would be that lucky. And we can all learn.

  • by zxcvbn4038 on 7/26/20, 3:39 PM

    The article shows again the importance of reaching out to AWS support for issues like this. They would rather have a long-term customer than a one-time score, and they are really forgiving of one-time mistakes.

  • by dvfjsdhgfv on 7/26/20, 9:34 PM

    > Praise Twitter for at least its ability to draw attention to things. I am not sure this would’ve ended up as well as it did without it.

    This is another bad aspect of these stories.

  • by afterwalk on 7/26/20, 3:59 PM

    I wonder if AWS would have given the same refund if the author weren’t a cloud evangelist with a Twitter following highly relevant to AWS.

  • by Sphax on 7/26/20, 2:46 PM

    Am I reading this wrong, or is the site's traffic almost always under 1,000 page views per day?

    Why would you need AWS or Cloudflare to serve that ?

  • by chmod775 on 7/26/20, 11:58 PM

    It would've literally been cheaper to burn that file to a few thousand DVDs and mail them to individual people.

    Nice pricing AWS.

  • by voltagex_ on 7/27/20, 1:00 AM

    So what do you do if you want to host a file on the Internet but don't have $2,000 USD kicking around?

  • by SeriousM on 7/26/20, 7:04 PM

    The title is clickbait.

  • by aaronchall on 7/27/20, 1:17 AM

    How's Linode on this sort of thing?

  • by 9nGQluzmnq3M on 7/26/20, 2:07 PM

    TL;DR: Don't leave 10+ GB VM images open to the world on S3 unless you want to pay everybody's bandwidth bills when they spin up a new instance using them. And set up billing alerts!

  • by jabo on 7/26/20, 4:32 PM

    > The primary motivation was that Google had so intertwined GSuite and GCP IAM that it became overly confusing.

    Glad I’m not the only one confused by this.

  • by johnklos on 7/26/20, 6:49 PM

    Does Amazon not have... logging?