from Hacker News

Adblock via /etc/hosts

by lpsz on 2/12/16, 2:44 AM with 139 comments

  • by sugarfactory on 2/12/16, 8:20 AM

    I did this until some time ago to block ads and to prevent Google from collecting my web browsing history via Google Analytics. During that time I witnessed a strange phenomenon: every time I added "127.0.0.1 www.google-analytics.com" to C:\Windows\System32\Drivers\etc\hosts, I saw the line removed from the file some hours later. Although I had added tens of lines, only the Google Analytics line was ever removed. IIRC I finally decided to figure out what caused the removal. I used Filemon to watch for file changes, but the line got removed again while I was watching the file and nothing appeared in the log. I suspected Ring-0 processes were secretly running and causing the removal, but I knew nothing about the Windows kernel, so I gave up there. To this day I wonder what the cause was.
  • by sergiotapia on 2/12/16, 4:00 AM

    See, the thing is: if a website breaks because of the hosts file, you can't easily re-enable ads for just that one website.

    A situation we can all imagine ourselves in: you need to check Google Analytics for your website or company site. You can't, because it's blocked at the hosts-file level.

    What solution would there be for this use case?
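
    One possible answer, as a minimal sketch rather than an established tool: comment out only the entries for the one domain you need, then restore them later. The function name `unblock` and the sample hostnames are made up for illustration, and writing the result back to the real hosts file (plus flushing any DNS cache) is left out.

```python
def unblock(hosts_text, domain):
    """Return hosts_text with every entry naming `domain` commented out."""
    domain = domain.lower()
    out = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments, then split into "IP name [name ...]".
        fields = line.split("#", 1)[0].split()
        names = [f.lower() for f in fields[1:]]
        if domain in names:
            out.append("# " + line)   # disable this entry only
        else:
            out.append(line)          # leave everything else untouched
    return "\n".join(out)
```

    The match is on exact hostnames, so unblocking www.google-analytics.com leaves other blocked domains alone.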

  • by stblack on 2/12/16, 1:43 PM

    Hi Folks, this is my repo, thanks for all the comments.

    I'm always looking for ways to improve things so I'm open to all suggestions.

    EDIT: A couple of clarifications.

    1) This isn't just for adblock. Your hosts file is useful for thwarting all sorts of malware. If a bot or trojan phones home with a domain, a vigilant hosts file will block it. If a bot or trojan phones home with an IP, then the hosts file can't help you but, then again, an IP can be physically located fairly quickly.

    2) The key to a good hosts file is keeping it current. This hosts file amalgamates several well-curated sources. So your hosts file is only as good as your ability to keep it current. This repo helps with this.
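
    To illustrate what "amalgamating several sources" can look like, here is a minimal sketch. It is not the repo's actual script: the function name `merge_hosts`, the skip list, and the choice of 0.0.0.0 as the sink address are assumptions made for the example.

```python
def merge_hosts(sources, sink_ip="0.0.0.0"):
    """Merge several hosts-format texts into one sorted, deduplicated list of lines."""
    # Names that should keep their normal meaning and never be redirected.
    skip = {"localhost", "localhost.localdomain", "broadcasthost", "local"}
    seen = set()
    for text in sources:
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()   # drop comments
            parts = line.split()
            if len(parts) < 2:
                continue                            # blank or malformed line
            # parts[0] is the address; the rest are hostnames.
            for host in parts[1:]:
                host = host.lower()
                if host not in skip:
                    seen.add(host)
    return [f"{sink_ip} {host}" for host in sorted(seen)]
```

    Re-running this over freshly downloaded sources is one way to keep the merged file current; fetching the sources is deliberately left out.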

  • by baptistem on 2/12/16, 8:06 AM

    I've been running this for years.

    Using 127.0.0.1, I have an httpd that responds to every request with a 200. This avoids some anti-ad-block checks (such as "watch this ad before your video").

    You can also configure your server to reply with a cat GIF, but who would want to see such an Internet?
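
    A minimal sketch of such a catch-all responder, using only the Python standard library. This is not the commenter's setup; the names `EmptyOK` and `make_server` and the default port 8080 are assumptions for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class EmptyOK(BaseHTTPRequestHandler):
    """Answer every request with an empty 200 response."""

    def _ok(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Treat the common methods identically.
    do_GET = do_POST = do_HEAD = _ok

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def make_server(port=8080, addr="127.0.0.1"):
    """Build (but do not start) a server bound to addr:port."""
    return HTTPServer((addr, port), EmptyOK)
```

    Calling `make_server(80).serve_forever()` would cover plain HTTP requests to blocked hosts, though binding to port 80 usually needs elevated privileges. HTTPS requests would still fail, since this sketch has no TLS.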

  • by bloaf on 2/12/16, 4:46 AM

    Just gonna plug hostsman, which has been doing this on windows since forever:

    http://www.abelhadigital.com/hostsman

    Lets you choose which lists to use, and automatically updates those lists. Also makes it easy to temporarily disable your rules if you need something that's blocked. Has a button for flushing the DNS cache.

  • by stephendicato on 2/12/16, 4:35 PM

    (Full disclosure: I run a service that blocks and intercepts malware communication using DNS! https://strongarm.io)

    Blocking via your hosts file has some great benefits; it works regardless of network and is relatively easy to update. Unfortunately, it doesn't scale easily to many systems or give you any insight into whether or not you are trying to connect to blocked domains.

    Blocking via DNS is a good alternative and is suggested multiple times in this thread. You can easily protect a whole network by setting your recursive resolvers and it works across any system.

    If you are interested in this and don't want to operate and maintain your own DNS (as well as pulling down various domain lists) check out https://strongarm.io. We manage DNS, aggregating lists of bad domains, and (most uniquely) will alert you if you try and talk to a blocked domain.

    It's free for personal use. We are a growing startup and love feedback from HN. Feel free to contact me directly as well! stephen[at]strongarm.io

  • by vbezhenar on 2/12/16, 7:49 AM

    I'm using my own VPN server and I set up an unbound DNS server there. It's the only way for my old iPhone and iPad to browse the internet without ads. And it's really fast. I use https://pgl.yoyo.org/adservers/ for the ad server list and a little awk script to convert it to unbound format.
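
    The awk script itself isn't shown, so here is a rough Python sketch of the same kind of conversion. It assumes the input is one hostname per line and that 127.0.0.1 is the desired sink address; the function name `to_unbound` is made up. The output uses unbound's `local-zone` and `local-data` directives.

```python
def to_unbound(hostnames, sink_ip="127.0.0.1"):
    """Turn an iterable of hostnames into unbound.conf local-zone/local-data lines."""
    lines = []
    for name in hostnames:
        name = name.strip().lower().rstrip(".")
        if not name or name.startswith("#"):
            continue  # skip blanks and comment lines
        # "redirect" makes the zone and everything under it answer from local-data.
        lines.append(f'local-zone: "{name}" redirect')
        lines.append(f'local-data: "{name} A {sink_ip}"')
    return "\n".join(lines)
```

    The resulting text would go into a file included from the `server:` section of unbound.conf.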
  • by laumars on 2/12/16, 7:35 AM

    > Using 0.0.0.0 [instead of 127.0.0.1] is faster because you don't have to wait for a timeout.

    I'm not going to argue that localhost is better than 0, but that specific argument they've raised is incorrect. You don't have to wait for a timeout on localhost either. It will either fail instantly due to no listening processes on that IP and port, or it will connect to whatever process you have open on that address (eg a local instance of a http daemon).
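
    This can be checked with a small probe. A sketch, with the helper names `free_port` and `probe` made up for the example: it connects to a localhost port that nothing is listening on and reports how the attempt ended.

```python
import socket


def free_port():
    """Ask the OS for a port, then release it so nothing is listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def probe(host, port, timeout=5.0):
    """Return 'connected', 'refused', or 'timeout' for one TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"   # the peer actively rejected the connection
    except socket.timeout:
        return "timeout"   # no answer within `timeout` seconds
```

    On a typical Linux machine, `probe("127.0.0.1", free_port())` should come back as a refusal without waiting for the timeout, which matches the point made above. Behaviour on other operating systems may differ.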

  • by jedisct1 on 2/12/16, 1:32 PM

    Or use the DNSCrypt proxy. It has a module to filter DNS responses based on their name (full name or using expressions such as sex), or on the IP addresses they resolve to. Instead of returning 127.0.0.1, which can make you vulnerable to rebinding attacks, it returns responses with the standard "REFUSED" response code. https://simplednscrypt.org https://dnscrypt.org
  • by buro9 on 2/12/16, 3:42 PM

    I've been doing this on a Streisand-created VPN server: https://github.com/jlund/streisand and use most of the same lists (though I've had to remove a few things from "Someone that cares" and I've also had to add a few things - I target apps too, so I nuke some domains specific to mobile apps that are likely not in those lists).

    The reasons:

    * Block adverts in native mobile apps

    * Block adverts in mobile web browsing

    * Create a single connection for the mobile (reduce exposure to latency of new connections to different servers)

    * VPN connection keep-alive means I seldom reconnect

    * Side effect of mitigating risk of my telco screwing with my traffic or excessively logging metadata

    It works really, really well.

    I'm sure someone will say "battery!" but the cost of mobile adverts on batteries far outweighs the cost of connecting to a VPN.

    This is effectively adblock for mobile that works for all apps and websites.

  • by RKearney on 2/12/16, 4:01 AM

    Some of these map the domains to 127.0.0.1 which is wrong. It should be 0.0.0.0.

    On second thought, you shouldn't be using the hosts file for this at all.

  • by SixSigma on 2/12/16, 8:52 AM

    While this is OK as an idea, I prefer Privoxy [1] to get my ad blocking outside of the browser. It has the benefit that I can turn it on and off (I use a proxy switcher). It also means that I can have other devices use it (via LAN, SSH tunnel, or whatever).

    [1] http://www.privoxy.org/

  • by Stamy on 2/12/16, 8:13 AM

    For anyone using OpenWRT, this is a great script for blocking hosts: https://gist.github.com/teffalump/7227752. I guess it could be modified to use this source.
  • by riobard on 2/12/16, 8:13 AM

    Could someone please explain why advertisers do not re-use functional domain names to defeat domain-based filtering? I always find it fascinating that they still use such obvious ad-only (sub)domains to host assets.
  • by finishingmove on 2/13/16, 12:34 AM

    I've recently switched from Windows to Linux, and I feel kind of "naked", because I don't know what the Linux equivalent of Windows Firewall + NOD32 + Common Sense is. I've got ufw and AppArmor installed so far. Is using such a huge hosts file a common practice? Also, what about Flash? I don't want to install it, but some websites still insist on it. What would you advise, folks? I'd appreciate any suggestions.
  • by jcoffland on 2/12/16, 6:20 AM

    This already works quite well on my Android phone using AdAway.

    Edit: AdAway uses an /etc/hosts file.

  • by aapje on 2/12/16, 9:31 AM

    Wouldn't it be more efficient to add blocking at the router level, making all of your devices (at home) ad-free?

    Anyone experienced in doing this?

  • by sosuke on 2/12/16, 4:52 AM

    I loved using Gas Mask for multiple host file management on OSX. Not so much for Ads but a great app I had trouble discovering.
  • by jakeogh on 2/12/16, 4:12 AM

    dnsgate [1] should use this as its default source. Fixing.

    [1] https://github.com/jakeogh/dnsgate

  • by jdoss on 2/12/16, 5:59 AM

    I wanted a service for my laptop with custom blacklisting/whitelisting, blocking stats and a webserver to serve a blank HTML page for any domains in DNS list so I made:

    https://github.com/jdoss/dockerhole

    It was inspired by https://pi-hole.net/ and I am glad to see there are others making similar things to block Ads.

  • by stirner on 2/13/16, 12:02 AM

    I made a little C program that converts AdBlock Plus filter lists to hosts file entries: https://github.com/wwalexander/hostsblock

    There's a bit of an impedance mismatch since filter lists support some fairly advanced pattern matching while hosts file entries are obviously limited to specific domains, but it gets most domains.
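
    A small sketch of that mismatch in practice. This is not how the C program above works; it is an illustrative Python version with a made-up function name, `abp_to_hosts`, that keeps only plain `||domain^` rules and drops everything a hosts file cannot express.

```python
import re

# Matches only the simplest AdBlock Plus form: ||domain^ with nothing after it.
_RULE = re.compile(r"^\|\|([a-z0-9.-]+)\^$")


def abp_to_hosts(filter_text, sink_ip="0.0.0.0"):
    """Convert plain domain-anchor filter rules to hosts-file lines."""
    out = []
    for line in filter_text.splitlines():
        m = _RULE.match(line.strip().lower())
        # Require a dot so bare words are not mistaken for hostnames.
        if m and "." in m.group(1):
            out.append(f"{sink_ip} {m.group(1)}")
    return out
```

    Comments, exception rules (`@@`), path patterns, and rules with `$` options are all skipped here, which is where coverage is lost relative to the original filter list.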

  • by jsingleton on 2/12/16, 11:11 AM

    This technique works very well for blocking ads in Skype.

    You can also block the BBC Breaking News banner this way by adding polling.bbc.co.uk. Or if you want to play a prank use 192.30.252.153 as the IP. GitHub pages don't check if you own the domain.

    https://unop.uk/dev/breaking-the-news-blocking-the-bbc-news-...

  • by Borating on 2/12/16, 1:58 PM

    For GNU/Linux also check hostsblock [1]. It's available on the AUR.

    There's also notrack [2], a pi-hole clone.

    [1] https://gaenserich.github.io/hostsblock/ | [2] https://github.com/quidsup/notrack

  • by derFunk on 2/12/16, 8:22 AM

    I'm using http://pi-hole.net running on a Raspberry Pi. I use it as my home DNS; it runs dnsmasq and points a list of a million ad hostnames to its own IP, answering every request with a blank HTML page.
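
    For a sense of what the dnsmasq side can look like, here is a minimal sketch that emits `address=/domain/ip` lines. It is not pi-hole's own code; the function name `to_dnsmasq` and the example address are assumptions.

```python
def to_dnsmasq(hostnames, sink_ip):
    """Build dnsmasq 'address=' lines pointing each hostname at sink_ip."""
    return [
        f"address=/{h.strip().lower()}/{sink_ip}"
        for h in hostnames
        if h.strip()  # ignore blank entries
    ]
```

    With `sink_ip` set to the Pi's own LAN address, the web server on the Pi receives the requests for blocked names. Note that an `address=/domain/...` entry in dnsmasq also covers subdomains of that domain.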
  • by dbalan on 2/12/16, 12:41 PM

    Relevant: adblock with DNS Server https://hub.docker.com/r/kolyunya/afdns/

    This is for people who cannot edit /etc/hosts, but can change DNS server.

  • by dbg31415 on 2/12/16, 2:07 PM

  • by cdnsteve on 2/12/16, 1:12 PM

    Has anyone tried OpenDNS? It seems that a filtered DNS service is the way to go here.
  • by kamaal on 2/12/16, 5:58 PM

    This needs to be a larger effort: one hosts file, updated hourly or at some other regular interval, which we just configure a cron job to fetch and update, and then forget.

    Would love to see some project like that.

  • by kup0 on 2/12/16, 3:13 AM

    Is this small enough not to cause performance issues? I notice it mentions trying to keep the size reasonable.

    Ad-blocking via hosts files can often lead to a noticeable performance hit.

  • by IgorPartola on 2/12/16, 1:56 PM

    Does anyone have a convenient way to convert this to a bind9 config format? I would rather run this for the whole LAN than just one computer at a time.
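
    One possible route, sketched under the assumption that the list is loaded as a BIND 9 response policy zone (RPZ) rather than as many separate zones: each blocked name becomes a `CNAME .` record, which tells the resolver to answer NXDOMAIN. The function name `to_rpz_records` is made up for the example.

```python
def to_rpz_records(hostnames):
    """Turn hostnames into RPZ records that make the resolver answer NXDOMAIN."""
    records = []
    for h in hostnames:
        h = h.strip().lower().rstrip(".")
        if h:
            # Owner names are relative to the policy zone, hence no trailing dot.
            records.append(f"{h} CNAME .")
    return records
```

    The records still need to sit in a zone file with its own SOA and NS records, and that zone has to be named in a `response-policy` statement in the BIND configuration; both are omitted here.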
  • by LoSboccacc on 2/12/16, 10:05 AM

    I used to, but Windows tends to hang when the hosts file gets overly long, so I had to abandon it when the advertisers grew too numerous.

    Has Windows 10 gotten any better at that?

  • by simoncion on 2/12/16, 5:38 AM

    Another option is to stand up a DNS server that knows how to do something like BIND 9's Response Policy Zone [0][1].

    Although figuring out how to propagate RPZ changes to them isn't exactly straightforward (more on this below), if you're using BIND, you can set up views that match certain clients and provide one mix of RPZs to one set, and another to another set.

    On updating RPZs in a view (warning: BIND 9-specific instructions follow) :

    So, BIND has this nifty option for a zone called "in-view". This lets you say "The data for this particular zone lives in this other view, so when requests come in for this zone, in this view, use the data in this other view.". It might sound complicated, but it's really just a pointer to a pre-existing zone definition. This lets you define your master zones in one big "zone definition" view, and have client-specific views refer back to those definitions.

    However, you can't use in-view with RPZs. Why? Who knows? [2] But, what you can do is this:

    * Create one unique RNDC key per view

    * Add an allow-notify and match-clients entry in each view with that view's key

    * In the appropriate views, add a slave zone definition for each relevant RPZ, with localhost as the master, and whatever is your usual domain xfer key as the key [3]

    * Back up in your "zone definition" view, add to your also-notify list for each master RPZ definition an entry for localhost and each view key. [4] Having an ACL just for these RPZ slaves cleans up the RPZ definitions.

    Now you have dynamically updatable host blocking that can be deployed on a per-host basis, if you like. It's initially a bit more work than managing a local hosts file, but you can easily apply host blocking lists to any set of machines on your LAN, and you can programmatically update the RPZ lists with tools like nsupdate.

    [0] http://jpmens.net/2011/04/26/how-to-configure-your-bind-reso...

    [1] http://www.zytrax.com/books/dns/ch7/rpz.html

    [2] RPZs are handled just like regular zones in every other way except for this one. It's a bit frustrating.

    [3] This is actually less burdensome than it sounds, as you can write these slave zone definitions once and include the files containing the definitions in whatever view needs them.

    [4] That is, if you had three views, your also-notify list would have something like the following new entries: 127.0.0.1 key "view1-key"; 127.0.0.1 key "view2-key"; 127.0.0.1 key "view3-key"; You can have entries for just the views that use a given RPZ, but it doesn't hurt to have one ACL that notifies all views when any RPZ data changes.

  • by bitsoda on 2/12/16, 1:34 PM

    Don't the empty <div>s just get left behind taking up space when using this method?
  • by pette on 2/12/16, 10:28 AM

    It took me about a minute to find 12 false entries just by looking at which lines end in .de
  • by DyslexicAtheist on 2/12/16, 4:12 PM

    Very cool. It would be interesting to see this built into a router (maybe a Raspberry Pi) to have a more central point/policy for configuration, maybe together with caching (dnsmasq).
  • by pauly on 2/12/16, 10:04 AM

    been doing this for a long time with various facebook related domains - upsets people if they borrow my laptop though.
  • by Xophmeister on 2/12/16, 10:24 AM

    Whilst I'm sure this is no longer the case, I used to do this back in the day, but it was soooo slow!