from Hacker News

Make Your Own CDN with NetBSD

by jaypatelani on 9/3/24, 7:27 AM with 33 comments

  • by rahkiin on 9/4/24, 8:28 AM

    This is not a CDN (Content Delivery Network). The value is in the networking bit: storage all around the world for resiliency, lower bandwidth cost, scalability, and low latency.

    Having 1 server with some static file storage is called a web server.

  • by scrapheap on 9/4/24, 9:44 AM

    Varnish is one of those tools that has a very specific purpose (a highly configurable caching reverse proxy that is crazily fast). Most of the time I don't need it - but in the places where I have had to use it, it's made the difference between working services and failing services.

    One example of where it made the difference was where we had two commercial systems, let's call them System A and System B. System A was acting as the front end for System B, but System A was making so many API calls to System B that it was grinding it to a halt. System B's responses would only change when System A made a call to a few specific APIs - so we put Varnish between System A and System B, caching the common API responses. We also set it up so that when a request was made to one of the handful of APIs that would change the other APIs' responses for an account, we'd invalidate all the cache entries for that one specific account. Once System A was talking to the Varnish cache, the performance of both systems drastically improved.
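
    The per-account invalidation described above can be sketched in plain Python. This is only a model of the idea, not Varnish configuration: the class name `AccountCache`, the endpoint names, and the `backend` callable are all hypothetical, since the comment does not name the real APIs.

```python
from collections import defaultdict
from typing import Callable, Dict, Set, Tuple

# Hypothetical names: the comment does not say which APIs change state.
MUTATING_ENDPOINTS = {"update_profile", "change_plan"}


class AccountCache:
    """Read-through cache in front of a slow backend, invalidated per account."""

    def __init__(self, backend: Callable[[str, str], str]) -> None:
        self._backend = backend
        self._entries: Dict[Tuple[str, str], str] = {}
        self._keys_by_account: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
        self.backend_calls = 0  # counts how often the backend is actually hit

    def request(self, account: str, endpoint: str) -> str:
        if endpoint in MUTATING_ENDPOINTS:
            # Forward the write, then drop every cached entry for this account.
            response = self._call_backend(account, endpoint)
            self.invalidate(account)
            return response
        key = (account, endpoint)
        if key not in self._entries:
            # Cache miss: fetch once and remember which account owns the entry.
            self._entries[key] = self._call_backend(account, endpoint)
            self._keys_by_account[account].add(key)
        return self._entries[key]

    def invalidate(self, account: str) -> None:
        # Only this account's entries are removed; other accounts stay cached.
        for key in self._keys_by_account.pop(account, set()):
            self._entries.pop(key, None)

    def _call_backend(self, account: str, endpoint: str) -> str:
        self.backend_calls += 1
        return self._backend(account, endpoint)
```

    Repeated reads for the same account hit the backend once, and a call to a mutating endpoint clears only that account's entries, so other accounts keep their cached responses.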

  • by sirn on 9/4/24, 11:29 AM

    Some comments:

    - You don't really need to repeat the built-in VCL in default.vcl. In the article, you can omit `vcl_hit`, `vcl_miss`, `vcl_purge`, `vcl_synth`, `vcl_hash`, etc. If you want to modify the behavior of the built-in VCL, e.g. adding extra logs in `vcl_purge`, then just add the `std.log` line and don't `return` (it will fall through to the built-in VCL). You can read more about the built-in VCL on the Varnish Developer Portal[1] and in the Varnish Cache documentation[2].

    - Related to the above built-in VCL comment: `vcl_recv` currently lacks all the guards provided by Varnish's built-in VCL, so it's recommended to skip the `return (hash)` line at the end, so the built-in VCL can handle invalid requests and skip caching if a Cookie or Authorization header is present. You may also want to use vmod_cookie[3] to keep only the cookies you care about.

    - Since Varnish is sitting behind another reverse proxy, it makes more sense to enable PROXY protocol, so client IPs are passed to Varnish as part of Proxy Protocol rather than X-Forwarded-For (so `client.ip`, etc. works). This means using `-a /var/run/varnish.sock,user=nginx,group=varnish,mode=660,PROXY`, and configuring `proxy_protocol on;` in Nginx.

    [1]: https://www.varnish-software.com/developers/tutorials/varnis...

    [2]: https://varnish-cache.org/docs/7.4/users-guide/vcl-built-in-...

    [3]: https://varnish-cache.org/docs/trunk/reference/vmod_cookie.h...
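
    To illustrate the second point, here is a simplified Python model (not VCL) of the kind of guards the built-in `vcl_recv` applies when you let a request fall through to it. The function name `builtin_recv_decision` is hypothetical, and the rules are an approximation from memory of the built-in VCL, so check the documentation linked above for the authoritative version.

```python
from typing import Mapping

# Methods the built-in logic is assumed to recognise; anything else is piped.
KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE", "PATCH"}


def builtin_recv_decision(method: str, headers: Mapping[str, str]) -> str:
    """Return "pipe", "pass" or "hash" for a request, roughly as the guards would."""
    # Header names are case-insensitive in HTTP, so compare in lower case.
    names = {name.lower() for name in headers}
    if method not in KNOWN_METHODS:
        return "pipe"  # unknown method: hand the connection straight through
    if method not in ("GET", "HEAD"):
        return "pass"  # only GET and HEAD are looked up in the cache
    if "authorization" in names or "cookie" in names:
        return "pass"  # personalised requests are not cached by default
    return "hash"      # safe to look up in the cache
```

    An unconditional `return (hash)` at the end of a custom `vcl_recv` corresponds to skipping every check in this function and always answering "hash", which is why requests carrying cookies would end up being cached.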

  • by daniel_iversen on 9/4/24, 7:22 AM

    I’ve heard good things about Varnish and believe I used it for a few things back in the day. Squid was also good when I used it in the mid-2000s (not sure where it stands today) and I think I heard that Akamai was originally just Squid on NetBSD or something like that!! Can anyone confirm or deny?
  • by benterix on 9/4/24, 9:52 AM

    The first article in the series offers a better explanation of what and why:

    https://it-notes.dragas.net/2024/08/26/building-a-self-hoste...

  • by draga79 on 9/4/24, 9:45 AM

    This article is part of a series, and the goal is to create content caching nodes on hosts scattered around the world. When a user connects, the DNS will return the closest active host to them. On a larger scale, it's not much different from what commercial CDNs do.
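
    The "closest active host" selection can be sketched as follows. This is a minimal illustration that assumes the DNS layer already knows an approximate latitude and longitude for the client; the `Node` type, the function names, and the use of great-circle distance are assumptions for the example, not details taken from the article.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    name: str
    lat: float
    lon: float
    active: bool = True  # assumed to be set by some external health check


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_active_node(nodes: Iterable[Node], lat: float, lon: float) -> Optional[Node]:
    """Pick the nearest node that is currently active, or None if all are down."""
    candidates = [node for node in nodes if node.active]
    if not candidates:
        return None
    return min(candidates, key=lambda node: haversine_km(lat, lon, node.lat, node.lon))
```

    Geographic distance is only a stand-in for network proximity here; a real setup might rank hosts by measured latency instead.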
  • by systems_glitch on 9/4/24, 10:56 AM

    Always nice to see a project choosing NetBSD! It's pretty easy to manage with Ansible too, so we sometimes rotate it in on "this could be any *NIX" projects and services.
  • by jhdias on 9/4/24, 9:51 AM

  • by justmarc on 9/4/24, 10:14 AM

    NetBSD is just leet. FTW.
  • by kaycey2022 on 9/5/24, 6:32 AM

    What is the point of this? Isn’t a CDN’s primary purpose to cache content close to the client?
  • by eqballhejri on 9/4/24, 10:59 AM

    Electric hybrid
  • by opentokix on 9/4/24, 8:23 AM

    Useless

    Varnish is not better in any shape or form than nginx for static content. Varnish has one single use case: PHP sites. For everything else it will just add a layer of complexity that gives no gains. And since Varnish is essentially built on Apache, there are some issues with how it handles connections above about 50k/sec, where it gets complicated to configure, something that nginx does not have.