from Hacker News

Page Weight Matters

by zacman85 on 12/23/12, 12:29 AM with 164 comments

  • by jerrya on 12/23/12, 1:39 AM

    I'm on a pretty fast line, and I am sometimes appalled at how long it takes pages to load, and often it comes down to loading content I have no interest in:

      * comments
      * ads and images
      * headers
      * sidebars
      * toolbars
      * menus
      * social network tools
      * meebo bars (really google, really?)
      * javascript to load all of the above crap
    
    The amazing thing is how often I can't even begin to read the page for what feels like the better part of a minute, because the page takes so long to render, or because various pieces of the page jump, shift, and scroll.

    I find that tools that point various ad servers at localhost help, and tools like Adblock Plus that load the crap but keep it off the page help too; even more so, Adblock Plus's filters let me shitcan all the crap.

    One of these days I want to write an extension similar to Adblock Plus that seeks out and removes jQuery crap. A lot of the reason I can't read pages anymore seems to be jQuery slideshows, jQuery toolbars, jQuery popups and the like.

    I am pretty sure that if you graph this out, you find the end of the web occurs sometime in 2018, when page designers and their bosses and engineers and marketing pukes have so larded down pages that the net runs out of available bandwidth and any page takes 4:33 to load.

  • by nikcub on 12/23/12, 9:08 AM

    There are a few ways to see this data in Google Analytics. You can view a world map of your site with page load time and server connection time color-coded.

    The first, and easiest way is to go to Content > Site Speed > Overview. By default this will show you a chart of page load time over time.

    First, to get enough data, change the time scale to a full year. Underneath the date picker there is an icon with 16 dots in a 4x4 arrangement, with some dots filled in. Click on that and move the slider all the way to the right. This will ensure higher precision and will capture some of the slower page loads.

    At the bottom, in the 'Site Speed' section instead of 'Browser' select 'Country/Territory'. It will change the data from pages to countries. Now click on 'view full report' and you will get a world map with page load times.

    It will look something like this:

    http://i.imgur.com/uCxcx.png

    The site I just did it on doesn't have enough data, but if you have a fairly popular site you should see a nice variation in page load times.

    Google have a post about this on their Analytics blog with much better maps and more information:

    http://analytics.blogspot.com.au/2012/04/global-site-speed-o...

    Their maps look a lot better.

    The other way of doing it is to create a custom report. I have one called 'Global Page Load'. Add 'Country/Territory' as a dimension, and 'City' as a sub-dimension. You may also drill in further by adding browser version, mobile, javascript, etc.

    As metric groups I have, in this order: DNS Lookup Time, Avg. Server Connection Time, Avg. Server Response Time, Avg. Page Load Time.

    This then gives you a pretty report where you can immediately see which visitors are getting slow responses, and you can drill in further to see what types of connections and which browsers or devices are slow. I was surprised that my light page, with compressed CSS and every static asset cached, was still taking ~20 seconds to fully load from 30% of countries.

    That is for pages that load; you need to add another tab to the report with error metrics to see who isn't getting through at all, and you would need to look at server logs to see who isn't even loading the Google Analytics javascript. All very handy, and eye-opening in terms of the types and speeds of connections that a lot of web users are on.

    Too many sites are guilty of having pages that are just far too heavy - as if they only test from their 100Mbit city-based connections. I am in Australia with a first-world internet connection at 24Mbit, and I avoid the theverge.com desktop site because of the page load.

    Edit: If anybody can work out a way to share custom reports in Google Analytics, let me know - I would be interested in sharing reports with others, for specific cases such as this.
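
    In the meantime, a rough way to pull the same per-country numbers programmatically - a sketch only, assuming the Core Reporting API (v3) via google-api-python-client; the auth setup is elided, and the metric/dimension names are from memory, so check them against the API reference:

      # Sketch: per-country/city site-speed report via the Core Reporting API (v3).
      # Auth is elided ('http' must already be an authorized OAuth2 Http object).
      from googleapiclient.discovery import build

      def global_page_load(http, profile_id):
          """Rows of (country, city, DNS, connect, response, page load) averages."""
          analytics = build('analytics', 'v3', http=http)
          result = analytics.data().ga().get(
              ids='ga:%s' % profile_id,
              start_date='2012-01-01',
              end_date='2012-12-23',
              dimensions='ga:country,ga:city',
              metrics=('ga:avgDomainLookupTime,ga:avgServerConnectionTime,'
                       'ga:avgServerResponseTime,ga:avgPageLoadTime'),
              sort='-ga:avgPageLoadTime',
              max_results=100,
          ).execute()
          return result.get('rows', [])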

  • by rwg on 12/23/12, 8:12 AM

    Since my home router and wireless access points all run Linux, I can play all kinds of sadistic games with routing, tunneling, traffic shaping, etc.

    One of the experiments I did a while back was creating a "satellite Internet connection from halfway around the globe" simulator on a dedicated wireless SSID. Basically, I created a new SSID and used Linux's traffic control/queueing discipline stuff to limit that SSID's outbound throughput to 32 kbps, limit inbound throughput to 64 kbps, add 900 milliseconds of latency on sent and received packets, and randomly drop 4% of packets in/out. Very, very few sites were even remotely usable. It was astonishing.
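
    (For anyone wanting to replicate that shaping, here is a rough sketch, assuming a hypothetical wlan1 interface carrying the slow SSID, with HTB for the rate cap and netem for delay and loss. It only shapes one direction; the return path needs the same treatment on the upstream interface or via an IFB device. Requires root.)

      import subprocess

      def sh(cmd):
          """Run a tc command, raising if it fails (requires root)."""
          subprocess.check_call(cmd, shell=True)

      def shape(dev, rate_kbit, delay_ms, loss_pct):
          # HTB caps throughput; a child netem qdisc adds fixed delay and random loss.
          sh(f"tc qdisc replace dev {dev} root handle 1: htb default 10")
          sh(f"tc class replace dev {dev} parent 1: classid 1:10 htb rate {rate_kbit}kbit")
          sh(f"tc qdisc replace dev {dev} parent 1:10 handle 10: netem "
             f"delay {delay_ms}ms loss {loss_pct}%")

      def clear(dev):
          sh(f"tc qdisc del dev {dev} root")

      if __name__ == "__main__":
          # Roughly the numbers above: 64 kbps toward the client, 900 ms delay, 4% loss.
          # "wlan1" is a placeholder for whatever interface carries the slow SSID.
          shape("wlan1", rate_kbit=64, delay_ms=900, loss_pct=4)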

    I think one of the most useful products that could ever be created for web developers is a "world Internet simulator" box that sits between your computer and its Internet connection. (Maybe it plugs into your existing wireless router and creates a new wireless network.) It would have a web interface that shows you a map of the world. You click a country, and the simulator performs rate shaping, latency insertion, and packet loss matching the averages for whatever country you clicked. Then devs can feel the pain of people accessing their websites from other countries.

    (Thinking about this for a minute, it could probably be done for about $30 using one of those tiny 802.11n travel routers and a custom OpenWRT build. It would just be a matter of getting the per-country data. Hmmmmm...)

  • by aaronjg on 12/23/12, 3:59 AM

    This is an interesting example of why randomization in experiments is important. If you allow users to self-select into the experiment and control groups, and then naively look at the results, the results can come out the opposite of what is expected. This is known as Simpson's Paradox. In this case, it was only the users for whom page load was already the slowest who picked the faster version of the page, so naively looking at page load times made the lighter pages look like they loaded slower.

    However, once Chris controlled for geography, he found a significant improvement.

    Moral of the story: run randomized A/B tests, or be very careful when you are analyzing the results.
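
    A toy illustration of the effect with made-up numbers (not YouTube's): the light page wins within each connection group, but because slow-connection users self-selected into it, its pooled average looks worse:

      # Made-up numbers purely to illustrate Simpson's paradox; not real data.
      # Each entry: (number of users, average page load time in seconds).
      data = {
          "fast connections": {"heavy page": (1000, 2.0), "light page": (100, 1.0)},
          "slow connections": {"heavy page": (100, 20.0), "light page": (1000, 15.0)},
      }

      def pooled_avg(page):
          """Average load time for one variant, pooled across both user groups."""
          total_time = sum(n * avg for n, avg in (data[g][page] for g in data))
          total_users = sum(data[g][page][0] for g in data)
          return total_time / total_users

      for group, variants in data.items():
          print(group, {v: t for v, (_, t) in variants.items()})  # light page wins in both

      print("pooled:", {v: round(pooled_avg(v), 1) for v in ("heavy page", "light page")})
      # pooled: heavy ~3.6 s vs. light ~13.7 s -- the light page *looks* slower overall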

  • by morpher on 12/23/12, 2:40 AM

    In addition to the developing world, page weight is a big concern for those of us with limited data plans. When a single blog post weighs in at 2-3 MB (which is common), the day's allotment of a 200MB/mo plan disappears really fast. I often end up reading only the HN comments while on the bus, since I don't want to play Russian roulette with the article size.

    (Of course, this is unrelated to YouTube specifically; it speaks to the general sentiment of the article.)

  • by habosa on 12/23/12, 1:53 AM

    I like this article a lot; it makes a good point that many of us often overlook. What's the dominant feeling about using AJAX to reduce page weight? Is it OK if my page is 1MB but only the initial 100KB is sent synchronously and necessary for interaction? An example would be YouTube sending just the main page layout and video player, and then async-loading all of the related stuff and comments.

  • by marcamillion on 12/23/12, 6:49 AM

    This story reminds me of a founder I met recently, building a wildly successful project on the most meager of resources (continue reading, I include Google Analytics screenshots below).

    This is a hacker in the truest sense of the word. He builds stuff just because he loves building stuff. He is a student at a university here in Jamaica. He isn't building it to be cool, or chasing the latest web 2.0 fad of the week. He didn't know about HN until I introduced it to him, and he is young (say 19/20).

    He built http://wapcreate.com - a WAP site creator.

    That's right, WAP...not iOS or Android optimized HTML5 sites. Good, old fashioned WAP sites.

    The most amazing part of the story, though, is that he is running it on 2 dedicated servers in Germany, and it's a hack job (PHP, a bit of Ruby & a bit of Java). But once he picked up traction, he got so much traffic that he hasn't been able to keep the servers online.

    In the first image - http://i.imgur.com/yEbyh.png - you will see that he got over 1.5M uniques. The vast majority of the time period covered here (the last year) was either very low traffic - pre-traction - or servers offline due to excessive traffic.

    In the 2nd image - http://i.imgur.com/Pu8da.png - you will see that about 1.2M of those visits came in the 3-month period of June 1st - Aug 31st. His servers melted down towards the end of August and he ran out of money to pay his server bills. He eventually got it back up a few weeks later, the traffic spiked again, and the servers crashed again a few weeks after that.

    In the 3rd image - http://i.imgur.com/HJ4gy.png - you will see that the vast majority of the visits are from Asia (even though he is 1 guy in the rural areas of Jamaica).

    In the 4th image - http://i.imgur.com/JSQ48.png - perhaps the most striking, you will see the diversity of devices that the visitors are coming from. Most of them are "feature phones", i.e. a multitude of versions of Nokia phones. Notice that this is just 1 - 10 of 508 device types.

    He presented at a conference I went to here in Jamaica, and he and I started speaking. I am helping him figure out how to proceed in a sustainable way, i.e. getting this thing stable and then generating revenue.

    After speaking to him for many weeks, I finally realized how insane his accomplishment is. Apparently, in all of this, he had been creating his site on computers that were not his. He either used his school's computers or borrowed machines from people. His aunt is buying him a 2nd-hand ThinkPad for Christmas - for which he is EXTREMELY stoked.

    So while we are all chasing the billions doled out by Apple on the App Store and the newest, sexiest SaaS app idea with fancy $29/mo recurring revenue, with our cutting-edge MacBook Pros and iPads - here is one guy using borrowed hardware, making a creation app for a technology that we have long since forgotten, generating crazy traffic and usage, and struggling to even make a dime from his creation.

    The world is a funny place, and this internet thing that we live on is massive. As big as TechCrunch & HN are, there is so much more out there.

    If you think you can help out in any way - whether by donating computing resources or anything else that can help us get this site back online and help him start to generate revenue - then feel free to reach out to me.

    P.S. If you want to dig into it some more, check out what some of the fans of the site are saying on its FB page. I am not trying to trick you into liking the page. It only has ~1500 likes, but you can see passionate users commenting (both complaining about the downtime and praising some of the features).

  • by justnoise on 12/24/12, 12:51 AM

    I wanted to post this last night but the internet went down in Vientiane (again) so I just went to sleep.

    Having spent the last couple of months connecting from various parts of SE Asia, I'm happy to see this getting the attention it deserves. You don't realize how utterly painful or unusable much of the web is until you experience a network that is not only incredibly slow (by US standards) but also incredibly unreliable and prone to 5-20 minute outages many times per hour or day depending on where you are.

    Before today, I hadn't clicked on a youtube video in months. The site was unusable.

    Most evenings in smaller cities (not villages, that's a different story entirely) the broadband will be OK, with downloads averaging around 3-15KB/sec. This allows you to access most sites or check gmail (usually I still opt for HTML mode). However, it's the frequent and intermittent outages that cause the most headaches. In many places I've visited they happen a couple of times per day, and in other places they happen many times per hour. These outages are a larger problem for AJAX-heavy sites, where clicking a button doesn't give a page saying "This webpage is not available"; instead, the site just sits there, waiting for the request to complete, giving the user no indication that the rest of the internet is unreachable.

    I have no idea what causes these outages (it's not the local network), but they lead to an awful experience on some sites. Hacker News, Wikipedia and other places do a great job of providing useful info that can be retrieved in a reasonable amount of time; countless others are so fat I've stopped visiting them entirely.

    Think about the people and parts of the world you'd like to reach and design your site to have an enjoyable experience in those areas. I'm grateful for features like gmail's HTML only mode. I only wish other heavy sites had an equivalent feature.

  • by plouffy on 12/23/12, 1:08 AM

    I must not be very clever, as it took me 3-4 attempts to understand the second-to-last paragraph (i.e. that without Feather it took 20 minutes to load a video, while with Feather it took only 2 minutes).

  • by WA on 12/24/12, 10:24 AM

    What I don't get from the article:

    Why could people in low-bandwidth regions suddenly watch YouTube videos? Getting the website itself down to 98KB is fine (even if it still took 2 minutes to load in some areas). But then you only get the first video frame.

    So then you'd go and watch a video, and that file is still the same size (several megabytes at least), which should still take them hours to load in its entirety.

    I understand that fewer requests and a smaller page weight do matter, and I try to keep my webpages as light as possible, but YouTube seems like the least interesting example of that, since the content itself can't be reduced much further without reducing either the image quality or the screen size.

  • by orangethirty on 12/23/12, 3:24 AM

    Re: Page Weight.

    I spent a bit more than a month studying how to build Nuuton in a way that it did not end up being a front-heavy piece of crap. Looked at a lot of different options. Considered client-side JavaScript MVC frameworks. Looked at the CSS side of things. Went as far as thinking about developing our own little app for Nuuton (no browser needed). But in the end, the simplest choice won. I decided to go with flat HTML/CSS. No Bootstrap or other framework either. Plain old markup and stylesheets. It uses very little JavaScript (no jQuery either), just what the simple UI requires (an event listener for the enter key, and an event listener for 4 links). Nothing else. No animations. No fancy scrolling. No carousels. Nothing like that. It's rendered on the server side (Django) and gets served to you in a very small package. I even went as far as breaking down the stylesheets so that you don't need to download CSS that does not apply to the page you are visiting. And you know what? It works, feels, and looks amazing. Even if the project ends up being a failure, achieving such little victories is just deliciously fun.

    Where did I get the inspiration to go against the current industry trends? HN's simple yet functional HTML setup. My god, it features a million tables, but the damn thing just works beautifully. By the way, Nuuton also uses tables. :)

  • by lilsunnybee on 12/23/12, 9:20 PM

    It would be nice if newer browsers could start including average speed data in an HTTP header. Then you could have servers offer up more or less weighty versions of the website based on projected load times. For browsers not attaching this header info, you could just assume the worst-case scenario.

    Even if maintaining multiple versions of the site would be a pain, it would at least be nice to do something like this during a site redesign. If a user has a fast enough connection, and load times aren't projected to be terrible, serve up the new heavier site. If not, serve up the older site. Then you could get some metrics about what percentage of your users still need the old design around, and how important it would be to cut weight on a redesign if needed.

    I'm a technical lightweight so I might have messed up some of the details, but hopefully the idea is still interesting.
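
    A minimal sketch of the server side of that idea, assuming Flask and a purely hypothetical header name (no browser actually sends this today):

      from flask import Flask, request, render_template

      app = Flask(__name__)

      SPEED_HEADER = "X-Avg-Downlink-Kbps"   # hypothetical header a browser might send
      LIGHT_THRESHOLD_KBPS = 256             # below this, serve the lightweight site

      @app.route("/")
      def index():
          try:
              kbps = float(request.headers.get(SPEED_HEADER, ""))
          except ValueError:
              kbps = 0.0  # header missing or malformed: assume the worst case
          # Fast enough: serve the newer, heavier design; otherwise serve the old one.
          page = "index_full.html" if kbps >= LIGHT_THRESHOLD_KBPS else "index_light.html"
          return render_template(page)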

  • by nnq on 12/24/12, 9:08 AM

    > traffic from places like Southeast Asia, South America, Africa, and even remote regions of Siberia

    ...and now someone will figure out that this traffic isn't generating any revenue, and you'll be back to catering to the folks with high bandwidth and deep pockets.

  • by Sami_Lehtinen on 12/24/12, 4:59 AM

    Posting from Egypt. On a good day I might be able to download 100kB/s, usually during evenings it's more like 10kB/s with high packet loss & latency.

    I often use Opera Mini, because most browsers and sites are just unbearably slow or do not work at all.

  • by dsr_ on 12/23/12, 12:06 PM

    All of you people pointing out that AdBlock and NoScript and using w3m really help reduce load times:

    you're right. And missing the point.

    A large chunk of the world is permanently bandwidth starved, and most of the mobile world lives on transfer caps and long latencies. If those are plausible scenarios for your audience, you need to reduce the quantity of invisible bits you are sending them.

    What's an invisible bit? Any bit that does not directly create text or images on the screen is invisible. Anything that runs client-side. Some of it may be necessary, but most of it is probably boilerplate or general-purpose code when what you actually need is more limited. Reduce!

  • by tomfakes on 12/23/12, 5:58 AM

    There's a great resource for testing page load speed for different browsers/locations/connection speeds.

    Frankly, I'm stunned that they run this at no cost.

    http://www.webpagetest.org

  • by keithpeter on 12/23/12, 10:18 AM

    I live less than one and a half miles from the centre of a major European city. We currently get 0.24Mbits/s download and 0.37Mbits/s upload on ADSL over the phone line. My broadband provider's help line have confirmed there are no faults on the line or with my router (I've swapped the cables and the router, and tried several computers). They think it's an automatic profiling change because we are often away from home and switch the router off. I'm battling my way through the help line 'levels', which is proving to be 'interesting'. I may go to a smaller, more expensive provider in the new year just so I can run an RDP session and get an interface I can work on.

    The low-bandwidth experiment has been educational. On Firefox/Ubuntu, you get the little status bar at the bottom that shows the requests. Some pages have a lot of those, and take ages to load. Distro-hopping is feasible (I'm trying out different interfaces); a CD-ROM downloads overnight quite easily. Software updates are a killer (go and have dinner, listen to some music...).

    I've started using a command-line web client (w3m in my case) to get to the text more quickly. You get to see which sites have 'skip to content' links embedded early in the page (with CSS set to not show them in Firefox). You also get to see which sites provide all content via HTML and which sites 'wrap up' their content in javascript functions &c.

    As many here provide Web content and applications, just try a command line Web browser on your site...

  • by KMBredt on 12/23/12, 10:03 AM

    Feather even exists today: http://www.youtube.com/feather_beta - and it works, as far as I can tell. You can opt into it using the link.

  • by twistedpair on 12/24/12, 5:41 PM

    This has been available in the Google Web Toolkit since 2006. I'm not sure why a Google employee would not have been aware of it. If you use GWT, only the bare minimum of resources (JS) is loaded to render the landing page. Further JS will be loaded for subsequent pages/actions as needed. Images and CSS are inlined too, so that there are no requests for them. "Perfect caching" of resources is built in through MD5-based resource names, so that with a simple Apache filter you never download static content twice.

    I'm glad Feather worked so well. It always drives me nuts when some major site/portal needs 250+ requests to load. If you like the sound of it (only 3 requests to load a complex app), check out GWT, as you get all of these optimizations (i.e. Closure) and more by just compiling your web app.

  • by gregpilling on 12/23/12, 2:43 PM

    I had this argument about 4 years ago with a young colleague who said I was out of date (and I am not a developer). Brett Tabke of Webmasterworld.com has been arguing for years that staying lightweight matters. It used to matter for dialup in the USA; now, with widespread broadband, it is less important to those people. Now it is important for mobile users and global users on dialup or slow connections. In a few years it will matter to the people who get their first Android phone in some country with poor infrastructure.

    It even matters to me personally :) since I live in a 50-year-old house that has slow DSL (YouTube buffers constantly), and that's in the center of Tucson, AZ.

    I recall Google doing a study showing that increasing the speed of browsing increases use of the internet. Page weight will always matter to someone.

  • by ghshephard on 12/23/12, 1:17 AM

    While I appreciate (and agree with) the overall theme (make your pages light so that people on < 10mbit/second links can access them, opening them up as a market to you - this works well for people on older GPRS/GSM links in the developing world), I don't understand the math. Presuming there are parts of the world still stuck on 1997 analog modem technology, then 100 kilobytes (800 kilobits) / 56.6 kilobits/second = 14 seconds.

    At two minutes, there are people out there with 6.6 kilobit links connecting to youtube?
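
    The working behind those two numbers, using the ~98KB figure mentioned elsewhere in the thread:

      page_kb = 98                    # the stripped-down Feather page
      kilobits = page_kb * 8          # ~784 kilobits on the wire, ignoring overhead

      modem_kbps = 56.6
      print(kilobits / modem_kbps)    # ~14 seconds on a 56.6k modem

      load_seconds = 120              # the two-minute figure from the article
      print(kilobits / load_seconds)  # ~6.5 kbps effective throughput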

    I suspect there might be a bit of hyperbole in this article as well, because even if there were people connecting on ultra-slow links, the average latency of those connections is likely to be wiped out by the tens of millions of broadband links.

  • by grego on 12/23/12, 8:22 AM

    Indeed, many who extol the virtues of cloud computing somehow have a blind spot for the fact that network coverage on this globe is not even. Actually, it would be nice to see a color-coded world map of network strength; is there one?

    Bad network coverage is a problem that can be addressed in many ways. Shameless plug: I've been working on an offline Wikipedia ( http://mpaja.com/mopedi ) that by its nature is always available and fast, everywhere. I'm sure there are many unexplored niches where the CPU power in your hand can be put to good use without needing the network.

  • by spyder on 12/23/12, 9:29 AM

    What about compression? Is he talking about gzipped size or the uncompressed page?
  • by erickhill on 12/24/12, 7:31 PM

    My experience is that this is most applicable to medium-to-large sites with very healthy monthly uniques. But it is excellent advice for all, particularly as more and more mobile devices access your site.

    Some excellent best practices from the Yahoo team here on how to increase front-end site performance (which can account for "80% of the end-user response time"): http://developer.yahoo.com/performance/rules.html

  • by ricardobeat on 12/23/12, 1:17 AM

    > places like Southeast Asia, South America, Africa

    Large swaths of South America have broadband access. I guess South Africa does too. Being a little more specific would be helpful.

  • by cedricd on 12/23/12, 4:24 PM

    This post makes a ton of sense. I spent a year traveling in the developing world a couple years back and was literally unable to watch a YouTube clip the entire time. It's amazing what parts of the Internet you can't see the moment you're on an insanely slow connection.

    I was in Myanmar for a bit and their internet was so slow that I couldn't even check news or email -- no need for a China-style firewall.

  • by bevan on 12/24/12, 11:06 PM

    This post prompted me to commission a script that calculates itemized page weight: https://bountify.co/ruby-script-to-display-itemized-page-wei....

  • by winter_blue on 12/23/12, 4:10 AM

    Oh, I totally know it does, because I've been stuck on and off on slow connections recently -- and a lot of pages (HN is a big and awesome exception) load quite slowly. I even ended up using Gmail's HTML interface for a while.

  • by Eliezer on 12/23/12, 5:36 AM

    Simpson's Paradox ftw!

  • by gprasanth on 12/24/12, 8:15 PM

    I thought this was the reason apps existed: to cut down loading time. I hate it so much when someone embeds a browser in their app and loads their site.

  • by dainix on 12/23/12, 9:06 AM

    We are working on the same thing: getting the site as lightweight as possible! Oh man, thinking that for somebody the site takes 20 minutes to load makes me cringe!!!

  • by davars on 12/23/12, 1:51 AM

    So the code to load the videos came in under 100KB. How large were the videos themselves?

  • by drivebyacct2 on 12/23/12, 6:14 AM

    Can someone with data give me a comparison of the people with the slowest connections and which browsers they use?

  • by jarcoal on 12/23/12, 1:17 AM

    I'm not really sure this post provides any useful information.