from Hacker News

The etcd operator: Simplify etcd cluster configuration and management

by polvi on 11/3/16, 10:27 PM with 35 comments

  • by darren0 on 11/3/16, 11:07 PM

    I would love to understand the design rationale for why a custom controller is needed to run etcd, as opposed to leveraging existing k8s constructs such as replica sets or pet sets. While this is a very useful piece of technology, it gives me the wrong impression that if you want to run a persistent or "more complicated" workload, you must develop a significant amount of code for it to work on k8s. I don't believe that is the case, which is why I'm asking why this route was chosen.
  • by hatred on 11/3/16, 11:32 PM

    The concept of custom controllers looks similar to what schedulers are in Mesos. It's nice to see the two communities taking a leaf out of each other's books; e.g., Mesos will introduce experimental support for task groups (aka Pods) in 1.1.

    Disclaimer: I work at Mesosphere on Mesos.

  • by ex3ndr on 11/4/16, 2:39 AM

    Can someone clarify some points?

    * Isn't etcd2 required to start Kubernetes? I found that if etcd2 is not healthy, or the connection is just temporarily lost, then k8s freezes its scheduling and API. So what if the Operator and etcd2 are running on the same node and that node goes down? I also found that etcd2 freezes even when one node is down. Isn't that an unrecoverable situation?

    * The k8s/CoreOS manual recommends keeping etcd2 servers close to each other, mostly because etcd2 has very strict network requirements (a ping of 5ms or so) that some pairs of servers couldn't satisfy.

    * What if we lose ALL nodes? The operator would create an almost-new cluster from backups, but what if we need to restore the latest version (not the one from 30 minutes ago)?

  • by jbpetersen on 11/3/16, 11:01 PM

    As someone who's been getting more familiar with backend engineering lately and trying to make sense of the various options, I've got a strong enough impression of CoreOS that I'm betting my time that it'll dominate the next few years.

    I also can't wait to see an open version of AWS Lambda / Google Cloud Functions appear.

  • by russell_h on 11/3/16, 10:57 PM

    I've been thinking about implementing a custom controller that would use Third Party Resources as a way to install and manage an application on top of Kubernetes. The way Kubernetes controllers work (watching a declarative configuration and "making it so") seems like a great fit for the problem.

    It's exciting to see CoreOS working in the same direction; this looks much more elegant than what I would have hacked up.
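
    As a rough illustration of the "watch a declarative configuration and make it so" pattern described above: a controller repeatedly compares the declared state with the observed state and takes one corrective step at a time. The sketch below is hypothetical Go, not the operator's actual API; EtcdCluster and fakeCluster are made-up stand-ins.

        // Hypothetical sketch of a reconcile loop; not the etcd operator's real code.
        package main

        import (
            "fmt"
            "time"
        )

        // EtcdCluster models a third-party resource: the user declares a
        // desired size and the controller converges toward it.
        type EtcdCluster struct {
            Name string
            Size int
        }

        // fakeCluster stands in for the real etcd membership API.
        type fakeCluster struct{ members int }

        func (c *fakeCluster) addMember()    { c.members++ }
        func (c *fakeCluster) removeMember() { c.members-- }

        // reconcile takes one corrective step toward the declared state.
        func reconcile(desired EtcdCluster, actual *fakeCluster) {
            switch {
            case actual.members < desired.Size:
                actual.addMember()
            case actual.members > desired.Size:
                actual.removeMember()
            }
        }

        func main() {
            desired := EtcdCluster{Name: "example", Size: 3}
            actual := &fakeCluster{members: 1}
            // The control loop: observe, compare, act, repeat.
            for actual.members != desired.Size {
                reconcile(desired, actual)
                fmt.Printf("cluster %q now has %d member(s)\n", desired.Name, actual.members)
                time.Sleep(10 * time.Millisecond)
            }
        }

    The appeal of the pattern is that the loop is idempotent: it can crash, restart, or miss events and still converge on the declared state.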

  • by dantiberian on 11/3/16, 10:58 PM

    This sounds a lot like Joyent's Autopilot Pattern (http://autopilotpattern.io), but will be more integrated with Kubernetes, rather than being agnostic.

  • by adieu on 11/3/16, 11:23 PM

    This is great news. We developed an internal controller that manages the etcd cluster used by the Kubernetes apiserver, also using a third party resource. The control loop design pattern works really well.

  • by why-el on 11/4/16, 2:08 AM

    Somewhat unrelated, but I am just curious. For those who use etcd (and this is coming from a place of ignorance), does the key layout (which keys are currently stored, how they are structured) get out of hand? Meaning, does it get to a point where a dev working with etcd might not have an idea of what is in etcd at any given time? Or do teams enforce some kind of policy (in documentation or code) that everyone must respect?

    I am asking because I was in a situation where I was introduced to other key-value stores, and because the team working with them was big and no process was followed to group all keys in one place, it was hard to know "what is in the store" at any moment, short of exhausting all the entry points in the code.
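
    One common policy, sketched hypothetically below in Go, is to route every key path through a single package, so "what is in the store" is answerable by reading one file. The package and function names are illustrative, not from any real codebase.

        // Package etcdkeys centralizes every etcd key path the application uses.
        // Hypothetical illustration; all names here are made up.
        package etcdkeys

        import "fmt"

        // All keys live under one versioned prefix.
        const prefix = "/myapp/v1"

        // ServiceConfig returns the key holding configuration for a named service.
        func ServiceConfig(service string) string {
            return fmt.Sprintf("%s/services/%s/config", prefix, service)
        }

        // FeatureFlag returns the key for a named feature flag.
        func FeatureFlag(flag string) string {
            return fmt.Sprintf("%s/flags/%s", prefix, flag)
        }

    Code review on that one package then doubles as the "policy that everyone must respect."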

  • by NegatioN on 11/4/16, 12:16 AM

    I see it mentioned in the article that they have created a tool similar to Chaos Monkey for k8s, but I don't see any link to it.

    Will this be publicly available at some point? Although k8s ensures pods are rescheduled, many applications do not handle that well, so I think a lot of teams could benefit from having something like it.
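
    For a sense of what such a tool involves, here is a rough, hypothetical sketch in Go using client-go: it lists the pods in a namespace, deletes one at random, and relies on Kubernetes to reschedule it. This is not the tool the article mentions (which, as noted, is not linked); the namespace and kubeconfig path below are assumptions.

        // Hypothetical chaos sketch: kill one random pod and let k8s reschedule it.
        package main

        import (
            "context"
            "fmt"
            "math/rand"
            "os"
            "path/filepath"

            metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
            "k8s.io/client-go/kubernetes"
            "k8s.io/client-go/tools/clientcmd"
        )

        func main() {
            // Assumes a standard kubeconfig; in-cluster config would also work.
            kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
            config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
            if err != nil {
                panic(err)
            }
            clientset, err := kubernetes.NewForConfig(config)
            if err != nil {
                panic(err)
            }

            // List pods in an assumed target namespace.
            pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
            if err != nil {
                panic(err)
            }
            if len(pods.Items) == 0 {
                fmt.Println("no pods to delete")
                return
            }

            // Pick a victim at random; the controller that owns it should replace it.
            victim := pods.Items[rand.Intn(len(pods.Items))]
            err = clientset.CoreV1().Pods(victim.Namespace).Delete(context.TODO(), victim.Name, metav1.DeleteOptions{})
            if err != nil {
                panic(err)
            }
            fmt.Printf("deleted pod %s/%s\n", victim.Namespace, victim.Name)
        }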

  • by hosh on 11/3/16, 11:43 PM

    This is brilliant. It's like the promise-theory-based convergence tools (CFEngine, Puppet, Chef) on top of K8S primitives. Better yet, the extension works like other K8S addons -- you start it up by scheduling the controller pod. That means I could potentially use it in, say, GKE, where I might not have direct control over the kube-master.

    I wonder if it is leveraging PetSets. I also wonder how this overlaps or plays with Deis's Helm project.

    I'm looking forward to seeing some things implemented like this: Kafka/ZooKeeper, PostgreSQL, MongoDB, Vault, to name a few.

    I also wonder if it means something like Chef could be retooled as a K8S controller.

  • by otterley on 11/4/16, 1:36 AM

    Where is the functional specification for an Operator? It sounds like a K8S primitive; is that in fact true? If not, why does this post make it sound like one?