from Hacker News

Ask HN: Suggestions to host 10TB data with a monthly +100TB bandwidth

by magikstm on 5/27/23, 2:52 PM with 197 comments

I'm looking for suggestions to possibly replace dedicated servers that would be cost-effective considering the bandwidth.
  • by kens on 5/27/23, 5:19 PM

    That reminds me of the entertaining "I just want to serve 5 terabytes. Why is this so difficult?" video that someone made inside Google. It satirizes the difficulty of getting things done at production scale.

    https://www.youtube.com/watch?v=3t6L-FlfeaI

  • by wongarsu on 5/27/23, 3:41 PM

    Depends on what exactly you want to do with it. Hetzner has very cheap Storage boxes (10TB for $20/month with unlimited traffic) but those are closer to FTP boxes with a 10 connection limit. They are also down semi-regularly for maintenance.

    For rock-solid public hosting Cloudflare is probably a much better bet, but you're also paying 7 times the price. It's more than a dedicated server to host the files, but you get more on other metrics.

  • by psychphysic on 5/27/23, 5:51 PM

    I'd suggest looking into "seedboxes" which are intended for torrenting.

    I suspect the storage will be a bigger concern.

    Seedhost.eu has dedicated boxes with 8TB storage and 100TB bandwidth for €30/month. Perhaps you could have that and a lower spec one to make up the space.

    Prices are negotiable so you can always see if they can meet your needs for cheaper than two separate boxes.

  • by jedberg on 5/27/23, 5:37 PM

    It's impossible to answer this question without more information. What is the use profile of your system? How many clients, how often, what's the burst rate, what kind of reliability do you need? These all change the answer.
  • by qeternity on 5/27/23, 3:50 PM

    Hetzner auctions have a 4x 6TB server for €39.70/mo.

    Throw that in RAID10 and you'll have 12TB usable space with > 300TB bandwidth.
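
    A quick back-of-envelope check of those numbers, assuming the standard 1 Gbit/s uplink on Hetzner dedicated servers:

```python
# Back-of-envelope check of the 4x 6TB RAID10 suggestion above.
# Assumes a 1 Gbit/s unmetered uplink (standard on Hetzner dedicated servers).

drives, size_tb = 4, 6
usable_tb = drives * size_tb / 2          # RAID10 mirrors everything once
print(f"usable: {usable_tb:.0f} TB")

uplink_gbps = 1.0
seconds_per_month = 30 * 24 * 3600
# Gbit/s -> GB/s (divide by 8), times seconds per month, then GB -> TB
max_tb_per_month = uplink_gbps / 8 * seconds_per_month / 1e3
print(f"theoretical max transfer: {max_tb_per_month:.0f} TB/month")
```

    Real-world throughput will be lower than the theoretical maximum (protocol overhead, uneven demand), but it shows why ">300TB" per month on a 1 Gbit/s line is plausible.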

  • by feifan on 5/28/23, 1:45 AM

    Consider storing the data on Backblaze B2 ($0.005/GB/month) and serving content via Cloudflare (egress from B2 to Cloudflare is free through their Bandwidth Alliance).

    (No affiliation with either; just a happy customer for a tiny personal project)
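
    At B2's quoted rate, the storage side of this works out cheaply, with egress to Cloudflare free under the Bandwidth Alliance:

```python
# Rough monthly cost of the B2 + Cloudflare setup described above,
# using B2's quoted $0.005/GB/month storage rate. Egress from B2 to
# Cloudflare is free via the Bandwidth Alliance, so the 100TB/month
# of traffic adds roughly nothing on the B2 side.

storage_gb = 10_000          # 10 TB
b2_rate = 0.005              # $/GB/month
monthly_storage_cost = storage_gb * b2_rate
print(f"storage: ${monthly_storage_cost:.2f}/month")
```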

  • by indigodaddy on 5/27/23, 7:44 PM

    BuyVM has been around a long time and have a good reputation. I’ve used them on and off for quite a while.

    They have very reasonably priced KVM instances with unmetered 1G (10G for long-standing customers) bandwidth that you can attach “storage slabs” up to 10TB ($5 per TB/mo). Doubt you will find better value than this for block storage.

    https://buyvm.net/block-storage-slabs/

  • by noja on 5/27/23, 4:38 PM

    To host it for what? A backup? Downloading to a single client? Millions of globally distributed clients uploading and downloading traffic? Bittorrent?
  • by wwwtyro on 5/27/23, 3:30 PM

    If it fits your model, WebTorrent[0] can offload a lot of bandwidth to peers.

    [0] https://github.com/webtorrent/webtorrent

  • by bosch_mind on 5/27/23, 3:15 PM

    Cloudflare R2 has free egress, cheap storage
  • by bombcar on 5/27/23, 3:37 PM

    That’s not even saturating a Gb/s line. Many places offer dedicated with that kind of bandwidth.
  • by andai on 5/27/23, 6:30 PM

    If it's for internal use, I have had good results with Resilio Sync (formerly BitTorrent Sync).

    It's like Dropbox except peer to peer. So it's free, limited only by your client side storage.

    The catch is it's only peer to peer (unless they added a managed option), so at least one other peer must be online for sync to take place.

  • by callamdelaney on 5/27/23, 4:41 PM

    Wasabi, unlimited bandwidth. $5.99/TB/month. Though it is object storage.

    See: https://wasabi.com/cloud-storage-pricing/#cost-estimates

    They could really do with making the bandwidth option on this calculator better.

  • by johnklos on 5/27/23, 5:46 PM

    If price is a consideration, you might consider two 10 TB hard drives on machines on two home gbps Internet connections. It's highly unlikely that both would go down at the same time, unless they were in the same area, on the same ISP.
  • by walthamstow on 5/27/23, 3:53 PM

    I would also like to ask everyone about suggestions for deep storage of personal data, media etc. 10TB with no need for access unless in case of emergency data loss. I'm currently using S3 intelligent tiering.
  • by rapjr9 on 5/27/23, 8:15 PM

    I helped run a wireless research data archive for a while. We made smaller data sets available via internet download but for the larger data sets we asked people to send us a hard drive to get a copy. Sneakernet can be faster and cheaper than using the internet. Even if you wanted to distribute 10TB of _new_ data every month, mailing hard drives would probably be faster and cheaper, unless all your customers are on Internet2 or unlimited fiber.
  • by bakugo on 5/27/23, 4:19 PM

    The answer to this question depends entirely on the details of the use case. For example, if we're talking about an HTTP server where a small number of files are more popular and are accessed significantly more frequently than most others, you can get a bunch of cheap VPS with low storage/specs but a lot of cheap bandwidth to use as cache servers to significantly reduce the bandwidth usage on your backend.
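    A minimal sketch of that cache-layer idea, stdlib-only and with a hypothetical origin URL; a real deployment would more likely use nginx's proxy_cache or similar, plus eviction and cache-control handling:

```python
# Sketch of the cheap-VPS cache layer described above: a tiny HTTP
# proxy that stores backend responses on local disk, so popular files
# are served without touching the bandwidth-billed origin.
# BACKEND is a hypothetical origin server, purely for illustration.

import hashlib, os, tempfile, urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CACHE_DIR = os.path.join(tempfile.gettempdir(), "miniproxy-cache")
BACKEND = "http://origin.example.com"   # hypothetical origin server

def get_cached(path, fetch):
    """Return the bytes for `path`, calling `fetch` only on a cache miss."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    key = hashlib.sha256(path.encode()).hexdigest()
    cache_file = os.path.join(CACHE_DIR, key)
    if not os.path.exists(cache_file):      # miss: pull from origin once
        data = fetch(path)
        with open(cache_file, "wb") as f:
            f.write(data)
    with open(cache_file, "rb") as f:       # hit (or freshly filled)
        return f.read()

def fetch_from_backend(path):
    with urllib.request.urlopen(BACKEND + path) as r:
        return r.read()

class CacheHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        data = get_cached(self.path, fetch_from_backend)
        self.send_response(200)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

# To serve: HTTPServer(("", 8080), CacheHandler).serve_forever()
```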
  • by ez_mmk on 5/27/23, 4:27 PM

    Had 300TB of traffic (up and down) on a Hetzner server with plenty of storage, no problem
  • by chaxor on 5/27/23, 11:51 PM

    I always assumed having a Raspberry Pi with a couple of HDDs in RAID1, plus IPFS or torrent, would be the best way to do this.

    Giving another one of these RAID1 RPis to a friend could make it reasonably available.

    I am very interested to know if there are good tools around this, though, such as a good way to serve a filesystem (NFS-like, for example) via torrent/IPFS, and whether the directories could be password protected in different ways, like with an ACL. That would be the revolutionary tech to replace huggingface/dockerhub, or Dropbox, etc.

    Anyone know of or are working on such tech?

  • by trhr on 5/30/23, 6:58 AM

    Hell, make me a fair offer and I'll throw it up on ye olde garage cluster. That thing has battery backup, a dedicated 5 Gbps pipe, and about 40 TB free space on Ceph. I'll even toss in free incident response if your URL fails to resolve. But it'll probably be your fault, cause I haven't needed a maintenance window on that thing in like three years.
  • by jayonsoftware on 5/27/23, 4:10 PM

    Spend some time on https://www.webhostingtalk.com/ and you will find a lot of info. For example https://www.fdcservers.net/ can give you 10TB storage and 100TB bandwidth for around $300... but keep in mind that the lower the price you pay, the lower the quality... just like any other product.
  • by tgtweak on 5/29/23, 8:52 PM

    OVH is probably your best bet and should be the cheapest both for hosting and serving the files. You'd be hard pressed to beat the value there without buying your own servers and colocating in eastern Europe.

    Most of their storage servers have 1 Gbps unmetered public bandwidth options and that should be sufficient to serve ~4TB per day, reliably.

  • by bicijay on 5/27/23, 3:16 PM

    Backblaze B2 + CDN on top of it
  • by scottmas on 5/27/23, 4:04 PM

    Surprised no one has said Cloudflare Pages. Might not work though depending on your requirements since there’s a max of 20,000 files of no more than 25 mb per project. But if you can fit under that, it’s basically free. If your requirements let you break it up by domain, you can split your data across multiple projects too. Latency is amazing too since all the data is on their CDN.
  • by jacob019 on 5/28/23, 1:10 AM

    Smaller VPS providers are a good value for this. I'm currently using ServaRICA for a 2TB box, $7/mo. I use it for some hosting, but mostly for incremental ZFS backups. Storage speed isn't amazing, but it suits my use case.

    I'm using cloudflare R2 for a couple hundred GB, where I needed something faster.

  • by bullen on 5/28/23, 6:16 AM

    I think 2x 1 Gb/s symmetric home fibers + SuperMicro 12x SATA Atom Mini-ITX with Samsung drives can solve this fairly cheaply and durably depending on write intensity.

    That said, going above 80 TB looks hard to sustain long-term, unless you can provide backup power and endure the noise of spinning drives.

  • by ignoramous on 5/27/23, 11:31 PM

    One way I can think of is to serve files over BitTorrent with a (HTTP) web seed stored across blob stores at Cloudflare R2, Backblaze B2, Wasabi.

    I briefly looked at services selling storage on FileCoin / IPFS and Chia, but couldn't find anything that inspired confidence.

  • by justinclift on 5/27/23, 3:56 PM

    Any idea on how many files/objects, and how often they change?

    Also, any idea on the number of users (both average, and peak) you'd expect to be downloading at once?

    Does latency of their downloads matter? eg do downloads need to start quickly like a CDN, or as long as they work is good enough?

  • by roetlich on 5/27/23, 6:41 PM

    As a previous employee there I'm very biased, but I think bunny.net has pretty good pricing :)
  • by cheeseprocedure on 5/27/23, 9:32 PM

    Datapacket has a number of dedicated server configurations in various locations and offers unmetered connections:

    https://www.datapacket.com/pricing

  • by j45 on 5/27/23, 8:38 PM

    The OP request would benefit from details; the solution depends on what format the data is in and how it's to be shared.

    Assuming the simplest need is making files available:

    1) Sync.com provides unlimited hosting and lets you share files directly from it.

    Sync is a decent Dropbox replacement with a few more bells and whistles.

    2) Backblaze business lets you deliver files for free via their CDN: $5/TB per month for storage plus free egress.

    https://www.backblaze.com/b2/solutions/developers.html

    As it claims, Backblaze does seem to be 70-80% cheaper than S3.

    Traditional "best practice" cloud paths are optimized to generate profit for the cloud provider.

    Luckily, you're rarely alone or the first to have a given need.

  • by ericlewis on 5/27/23, 7:15 PM

    Cloudflare R2, egress is free. Storing 10TB would be about $150 a month.
  • by mindcrash on 5/27/23, 9:10 PM

    Several huge infrastructure providers offer decent VPS servers and bare metal with free bandwidth for pretty reasonable prices nowadays.

    You might want to check out OVH or - like mentioned before - Hetzner.

  • by api on 5/28/23, 12:49 AM

    Bare metal hosting with bandwidth priced by pipe size rather than gigabyte. Check out DataPacket, FDCServers, Reliablesite, Hivelocity, etc.

    Cloud bandwidth is absolutely enormously overpriced.

  • by FBISurveillance on 5/27/23, 3:16 PM

    Hetzner would work too.
  • by sacnoradhq on 5/27/23, 8:19 PM

    Personally, at home, I have ~600 TiB and 2 Gbps without a data cap.

    I can't justify colo unless I can get 10U for $300/month with 2kW of PDU, 1500 kWh, and 1 GbE uncapped.

  • by winrid on 5/27/23, 7:11 PM

    You could do this for about $1k/mo with Linode and Wasabi.

    For FastComments we store assets in Wasabi and have services in Linode that act as an in-memory+on disk LRU cache.

    We have terabytes of data but only pay $6/mo for Wasabi, because the cache hit ratio is high and Wasabi doesn't charge for egress until your egress is more than your storage or something like that.

    The rest of the cost is egress on Linode.

    The nice thing about this is we get lots of storage and downloads are fairly fast - most assets are served from memory in userspace.

    Following thread to look for even cheaper options without using cloudflare lol
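
    A toy sketch of that in-memory+on-disk LRU idea (not FastComments' actual code; the byte budget and loader are illustrative): hot objects stay in RAM, cold ones get evicted and are re-fetched from object storage on the next request.

```python
# Illustrative LRU asset cache with a byte budget. `load` stands in for
# a fetch from object storage (e.g. Wasabi); on a hit the origin is
# never touched, which is what keeps the egress bill near zero.

from collections import OrderedDict

class LRUByteCache:
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.size = 0
        self.items = OrderedDict()          # key -> bytes, oldest first

    def get(self, key, load):
        if key in self.items:
            self.items.move_to_end(key)     # mark as recently used
            return self.items[key]
        data = load(key)                    # miss: fetch from object storage
        self.items[key] = data
        self.size += len(data)
        while self.size > self.max_bytes:   # evict least recently used
            _, old = self.items.popitem(last=False)
            self.size -= len(old)
        return data
```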

  • by k8sToGo on 5/27/23, 6:42 PM

    People here suggest Hetzner. Just be aware that their routing may not be as good as what you get with more expensive bandwidth.
  • by pierat on 5/27/23, 6:12 PM

    Sounds like you could find someone with a 1Gbps symmetric fiber net connection, and pay them for it and colo. I have 1Gbps and push that bandwidth every month. You know, for yar har har.

    And that's only 309Mbits/s (or 39MB/s).

    And with a used refurbished server you can easily get loads of RAM, cores out the wazoo, and dozens of TBs for under $1000. You'll need a rack, router, switch, and battery backup. Shouldn't cost much more than $2000 for all of this.

  • by jstx1 on 5/27/23, 4:18 PM

    Is there a reason people aren’t suggesting some cloud object storage service like S3, GCS or Azure storage?
  • by KaiserPro on 5/27/23, 8:51 PM

    What's your budget?

    Who are you serving it to?

    How often does the data change?

    Is it read only?

    What are you optimising for, speed, cost or availability? (pick two)

  • by dark-star on 5/27/23, 6:26 PM

    You can definitely do this at home on the cheap, as long as you have a decent internet connection ;) 10TB+ hard disks are not expensive; you can put them in an old enclosure together with a small industrial or NUC PC in your basement
  • by jszymborski on 5/27/23, 3:16 PM

    Wasabi.com is usually a good bet when your primary cost is bandwidth.
  • by risyachka on 5/27/23, 5:37 PM

    Find a bare metal server with 1GBit connection and you are all set.
  • by snihalani on 5/27/23, 9:58 PM

    Cloudflare R2 or Oracle Cloud Infra or Hetzner
  • by bluedino on 5/27/23, 7:31 PM

    What are you using now and what does it cost?
  • by influx on 5/27/23, 10:38 PM

    I would do it in S3 + Cloudfront if your budget supported it. Mostly because I don't want to maintain a server somewhere.
  • by sacnoradhq on 5/27/23, 8:14 PM

    Backblaze, MEGA, or S3 RRS.
  • by pollux1997 on 5/27/23, 8:09 PM

    Copy the data to disks and ship them to the place they need to be used.
  • by rozenmd on 5/27/23, 5:23 PM

    Cloudflare R2?
  • by wahnfrieden on 5/27/23, 9:28 PM

    bittorrent and some home servers
  • by kokizzu5 on 5/28/23, 2:45 AM

    contabo
  • by subhro on 5/27/23, 8:13 PM

    tarsnap
  • by delduca on 5/27/23, 8:52 PM

    I would go with BunnyCDN at front of some storage.
  • by aborsy on 5/27/23, 7:09 PM

    S3 is probably the highest quality. It's enterprise grade: fast and secure, with a lot of tiers and controls.

    If you retrieve only small amounts of data, it's also not expensive. The only problem is retrieving large amounts of data; that would be a major cost.
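
    For scale, a rough illustration of that caveat against the asker's 10TB/100TB numbers, using AWS's published rates around the thread's date (~$0.023/GB/month for S3 Standard storage; $0.09/GB as the top internet-egress tier, which AWS discounts at higher volumes, so this is an upper bound):

```python
# Rough upper-bound illustration of the S3 egress caveat above: at the
# asker's 100 TB/month of traffic, egress dwarfs the storage bill.
# Rates are AWS's published S3 Standard / internet-egress figures
# (top tier, not volume-discounted), used here only for illustration.

egress_gb = 100_000           # 100 TB/month out to the internet
storage_gb = 10_000           # 10 TB stored
egress_cost = egress_gb * 0.09
storage_cost = storage_gb * 0.023
print(f"egress:  ${egress_cost:,.0f}/month")
print(f"storage: ${storage_cost:,.0f}/month")
```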