by jergason on 3/8/16, 3:11 PM with 308 comments
by onli on 3/8/16, 3:32 PM
by Gratsby on 3/8/16, 3:52 PM
> CocoaPods is a dependency manager for Swift and Objective-C Cocoa projects. It has over ten thousand libraries and can help you scale your projects elegantly.
The developer response:
> [As CocoaPods developers] Scaling and operating this repo is actually quite simple for us as CocoaPods developers whom do not want to take on the burden of having to maintain a cloud service around the clock (users in all time zones) or, frankly, at all. Trying to have a few devs do this, possibly in their spare-time, is a sure way to burn them out. And then there’s also the funding aspect to such a service.
--
So they want to be the go-to scaling solution, but they don't want to have to spend any time thinking about how to scale anything. It should just happen. Other people have free scalable services, they should just hand over their resources.
Thank goodness GitHub thought about these kinds of cases from the beginning and instituted automatic rate limiting. Having an entire end-user base use git to sync a directory tree with 16K+ entries is not a good idea in the first place. The developers should have been thinking about a more efficient solution long ago.
by pjc50 on 3/8/16, 3:55 PM
"Not having to develop a system that somehow syncs required data at all means we get to spend more time on the work that matters more to us, in this case. (i.e. funding of dev hours)"
In other words, using GitHub as a free unlimited CDN lets them be as inefficient as they like, such as having 16k entries in a single directory ( https://github.com/CocoaPods/Specs/tree/master/Specs ) which every user downloads.
Package management and sync seems to suffer really badly from NIH. Dpkg is over 20 years old and yum is over a decade old. What's up with this particular wheel that people keep reinventing it seemingly without improvement?
by indygreg2 on 3/8/16, 4:50 PM
Fortunately for us, Firefox is canonically hosted in Mercurial. So, I implemented support in Mercurial for transparently cloning from server-advertised pre-generated static files. For hg.mozilla.org, we're serving >1TB/day from a CDN. Our server CPU load has fallen off a cliff, allowing us to scale hg.mozilla.org cheaply. Additionally, consumers around the globe now clone faster and more reliably since they are using a global CDN instead of hitting servers on the USA west coast!
If you have Mercurial 3.7 installed, `hg clone https://hg.mozilla.org/mozilla-central` will automatically clone from a CDN and our servers will incur maybe 5s of CPU time to service that clone. Before, they were taking minutes of CPU time to repackage server data in an optimal format for the client (very similar to the repack operation that Git servers perform).
More technical details and instructions on deploying this are documented in Mercurial itself: https://selenic.com/repo/hg/file/9974b8236cac/hgext/clonebun.... You can see a list of Mozilla's advertised bundles at https://hg.cdn.mozilla.net/ and what a manifest looks like on the server at https://hg.mozilla.org/mozilla-central?cmd=clonebundles.
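For readers curious what a client does with such a manifest, here is a minimal sketch of parsing one. It assumes the layout described in the clonebundles extension documentation: one bundle per line, a URL followed by space-separated, percent-encoded `key=value` attributes. The function name, sample URLs, and attribute values are hypothetical and not taken from Mozilla's servers.

```python
from urllib.parse import unquote


def parse_clonebundles_manifest(text):
    """Parse a clone bundles manifest into a list of attribute dicts.

    Assumed format (one bundle per line):
        <URL> [<key>=<value> [<key>=<value> ...]]
    Keys and values are treated as percent-encoded.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        attrs = {"URL": fields[0]}
        for raw in fields[1:]:
            key, _, value = raw.partition("=")
            attrs[unquote(key)] = unquote(value)
        entries.append(attrs)
    return entries


# Hypothetical manifest content, for illustration only.
SAMPLE_MANIFEST = """\
https://cdn.example.org/repo/full.gzip.hg BUNDLESPEC=gzip-v2 REQUIRESNI=true
https://cdn.example.org/repo/full.bzip2.hg BUNDLESPEC=bzip2-v1
"""

if __name__ == "__main__":
    for entry in parse_clonebundles_manifest(SAMPLE_MANIFEST):
        print(entry["URL"], entry.get("BUNDLESPEC"))
```

A client would then pick the first entry it supports and download that static file from the CDN before pulling only the remaining changes from the origin server.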
A number of months ago I saw talk on the Git mailing list about implementing a similar feature (which would likely save GitHub in this scenario). But I don't believe it has manifested into patches. Hopefully GitHub (or any large Git hosting provider) realizes the benefits of this feature and implements it.
by jdcarter on 3/8/16, 3:36 PM
One correction to the post title: it's not maxing five nodes, but five CPUs.
by web007 on 3/8/16, 4:27 PM
Even Finder or `ls` will have trouble with that, and anything with * is almost certainly going to fail. Is the use-case for this something that refers to each library directly, such that nobody ever lists or searches all 16k entries?
by mikeash on 3/8/16, 6:39 PM
Think about it from their perspective. GitHub advertises a free service, and encourages using it. Partly it's free because it's a loss leader for their paid offerings, and partly it's free because free usage is effectively advertising GitHub. CocoaPods builds their project on this free service, and everything is fine for years.
Then one day things start failing mysteriously. It looks like GitHub is down, except GitHub isn't reporting any problems, and other repositories aren't affected.
After lots of headscratching, GitHub gets in touch and says: you're using a ton of resources, we're rate limiting you, you're using git wrong, and you shouldn't even be using git.
That's going to be a bit of a shock! Everything seemed fine, then suddenly it turns out you've been a major problem for a while, but nobody bothered to tell you. And now you're in hair-on-fire mode because it's reached the point where the rate-limiting is making things fail, and nobody told you about any of these problems before they reached a crisis point.
It strikes me as extremely unreasonable to expect a group to avoid abusing a free service when nobody tells them that it's abuse, and as far as they know they're using it in a way that's accepted and encouraged. If somebody is doing something you don't like and you want them to stop, you have to tell them, or nothing will happen!
I'm not blaming GitHub here either. I'm sure they didn't make this a surprise on purpose, and they have a ton of other stuff going on. This looks like one of those things where nobody's really to blame, it's just an unfortunate thing that happened.
(And just to be clear, I don't have much of a dog in this fight on either side. My only real exposure to CocoaPods is having people occasionally bug me to tag my open source repositories to make them easier to incorporate into CocoaPods. I use GitHub for various things like I imagine most of us do, but am not particularly attached to them.)
by wpeterson on 3/8/16, 4:09 PM
What seems insane is to use a single github repo as the universal directory of packages and their versions driving your package manager.
There's a reason rubygems has its own servers and web services to support this use case for the central library registry, even if the sources for the gems are all individual projects hosted on GitHub.
by riscy on 3/8/16, 4:35 PM
The CocoaPods developers seem to be missing the entire point of git: it's a _distributed_ revision control system.
Set up a post-receive hook on GitHub that notifies another server, itself set up with a basic installation of git, to pull from GitHub and mirror the master repo. Then have your client program randomly choose one of these servers to pull from at the start of an operation. That gives you a simple load balancer that solves this problem.
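The client-side half of that idea could look roughly like the sketch below. It is only an illustration: the two mirror hosts are hypothetical, and `mirror_order` and `pull_command` are made-up helper names, not anything CocoaPods ships.

```python
import random

# The first URL is the real Specs repo; the mirror hosts are hypothetical.
MIRRORS = [
    "https://github.com/CocoaPods/Specs.git",
    "https://mirror-a.example.org/Specs.git",
    "https://mirror-b.example.org/Specs.git",
]


def mirror_order(mirrors, rng=random):
    """Return the mirrors in random order.

    Trying them in this order spreads load across servers and gives
    the client a fallback list if the first choice is unreachable.
    """
    order = list(mirrors)
    rng.shuffle(order)
    return order


def pull_command(url, branch="master"):
    """Build the git command line that pulls `branch` from `url`."""
    return ["git", "pull", url, branch]


if __name__ == "__main__":
    for url in mirror_order(MIRRORS):
        # A real client would run this via subprocess and stop at the
        # first mirror that succeeds; here we only print the command.
        print(" ".join(pull_command(url)))
```

Random selection is the simplest possible policy; it needs no coordination between clients, at the cost of ignoring how close or how loaded each mirror is.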
by spoiler on 3/8/16, 4:32 PM
> GitHub Support is unable to help with issues specific to CocoaPods/CocoaPods.
---
by rmoriz on 3/8/16, 5:15 PM
by zymhan on 3/8/16, 3:50 PM
by iBotPeaches on 3/8/16, 4:06 PM
by paradite on 3/8/16, 7:13 PM
- My school uses GitHub to host and track our software engineering project (which can arguably still be called OSS).
- People using the GitHub issue system as a forum.
- Friends uploading pdfs to GitHub.
- Recently people posted on HN about using GitHub to generate a status page.
I think this is a really bad trend and people should stop doing that.
by fpgaminer on 3/8/16, 8:12 PM
Imagine a world where GitTorrent is fully developed, includes support for issue tracking, and has a nice GUI client that makes the experience on-par with browsing github.com.
I mention this not as an "Everybody bail out of GitHub and run to GitTorrent!!!" sort of statement, because I believe GitHub's response here was excellent and confidence inspiring. But it's an unnatural relationship for community-supported, open source projects to host themselves on commercial platforms such as GitHub. GitHub primarily hosts them to promote its business. That's not necessarily a bad thing, but it results in impedance mismatches like the one demonstrated here.
That isn't to say that a mature GitTorrent would replace GitHub. Rather, I envision GitHub becoming a supernode in the network, an identity provider, and a paid seed offering, all alongside their existing private repo business.
Honestly, once I scrape a few projects off my plate, I'm inclined to dive into GitTorrent, see where it's at in development, and see if I can start contributing code. It just seems like such a cool and useful idea.
by pavlov on 3/8/16, 3:56 PM
The potential downsides seem much more annoying. Do you really want to have your dependencies on an overloaded central server somewhere?
by jrochkind1 on 3/8/16, 4:04 PM
by sdegutis on 3/8/16, 3:37 PM
by ak217 on 3/8/16, 4:36 PM
by tjdetwiler on 3/8/16, 6:21 PM
by iamleppert on 3/8/16, 7:02 PM
by maaku on 3/8/16, 9:53 PM
by xemdetia on 3/8/16, 4:44 PM
by noahlt on 3/8/16, 5:59 PM
by SuperKlaus on 3/8/16, 4:21 PM
by fokinsean on 3/8/16, 4:23 PM
$ git fetch --depth=2147483647
by kodablah on 3/8/16, 5:28 PM
by rcthompson on 3/9/16, 1:36 AM
by kmm on 3/9/16, 2:22 AM
by soheil on 3/8/16, 6:44 PM
I had no idea it was just the CocoaPods repo because my other repos were working fine. I accepted defeat, went to bed, and everything was working great in the morning.
by sly010 on 3/8/16, 4:18 PM
by zoul on 3/8/16, 3:37 PM
by Negative1 on 3/8/16, 4:10 PM
by superuser2 on 3/8/16, 3:57 PM
by joeblau on 3/8/16, 5:40 PM
by debacle on 3/8/16, 6:13 PM
Seems like a poor design decision on the CocoaPods side.
by voltagex_ on 3/9/16, 3:03 AM
by LoneWolf on 3/9/16, 5:51 PM
by nimish on 3/8/16, 4:42 PM
by rdancer on 3/9/16, 2:53 PM
by speps on 3/8/16, 5:24 PM
by whitehat2k9 on 3/8/16, 7:44 PM
by Const-me on 3/9/16, 1:35 AM
Short term sure, they’re doing the right thing, implementing a nice way to manage the free rider problem without hurting them too much.
But long term it’s different.
Financially, one average programmer = $80k/year, one average cloud server = $4k/year. And GitHub has hundreds of millions in venture capital. That is more than enough to provision a few more servers, even if they end up installing new servers just for those pods.
The way they act now will lead someone to develop a decentralized git+torrent hybrid. When that happens, sure, those pods will no longer consume GitHub's precious resources. However, for the rest of the GitHub users, there will be no reason to stay on GitHub either.