from Hacker News

Ask HN: How do you keep track of releases/deployments of dozens of micro-services?

by vladholubiev on 1/17/18, 9:33 AM with 52 comments

  • by dhinus on 1/17/18, 2:10 PM

    Our apps are made up of 5-15 (micro)services. I'm not sure this approach would scale to hundreds of services managed by different teams.

    We store the source code for all services in subfolders of the same monorepo (one repo <-> one app). Whenever a change in any service is merged to master, the CI rebuilds _all_ the services and pushes new Docker images to our Docker registry. Thanks to Docker layers, if the source code for a service hasn't changed, the build for that service is super-quick, it just adds a new Docker tag to the _existing_ Docker image.

    Then we use the Git commit hash to deploy _all_ services to the desired environment. Again, thanks to Docker layers, containers that haven't changed from the previous tag are recreated instantly because they are cached.

    From the CI you can check the latest commit hash that was deployed to any environment, and you can use that commit hash to reproduce that environment locally.
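
    A minimal sketch of what that CI step looks like, assuming placeholder registry and service folder names (this illustrates the idea, not our exact pipeline):

      import subprocess

      REGISTRY = "registry.example.com"          # hypothetical registry
      SERVICES = ["api", "worker", "frontend"]   # subfolders of the monorepo

      def sh(*cmd):
          subprocess.run(cmd, check=True)

      # the commit hash is the single identifier for the whole deployment
      commit = subprocess.check_output(
          ["git", "rev-parse", "--short", "HEAD"], text=True).strip()

      for svc in SERVICES:
          image = f"{REGISTRY}/{svc}:{commit}"
          # unchanged services hit the Docker layer cache, so this is quick
          sh("docker", "build", "-t", image, svc)
          sh("docker", "push", image)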

    Things that I like:

    - the Git commit hash is the single thing you need to know to describe a deployment, and it maps nicely to the state of the codebase at that Git commit.

    Things that do not always work:

    - if you don't write the Dockerfile the right way (e.g. ordering layers so rarely-changing steps come before the source copy), you end up rebuilding services that haven't changed --> build time increases

    - containers for services that haven't changed get stopped and recreated --> short unnecessary downtime, unless you do blue-green

  • by alex_duf on 1/17/18, 11:35 AM

    At the Guardian we use https://github.com/guardian/riff-raff

    It takes a build from your build system (typically TeamCity, but not exclusively), deploys it, and records the deployment.

    You can then check later what's currently deployed, or what was deployed at some point in time in order to match it with logs etc.
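
    The core of the "what was deployed at time T" query is tiny; a rough sketch (hypothetical record structure, not riff-raff's actual schema):

      from datetime import datetime

      # (timestamp, service, build) records, one per recorded deployment
      deployments = [
          (datetime(2018, 1, 15, 9, 0), "content-api", "build-341"),
          (datetime(2018, 1, 16, 14, 30), "content-api", "build-342"),
      ]

      def deployed_at(service, when):
          """Last build of `service` deployed at or before `when`."""
          candidates = [(ts, b) for ts, svc, b in deployments
                        if svc == service and ts <= when]
          return max(candidates)[1] if candidates else None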

    Not sure how usable it would be outside of our company though.

  • by perlgeek on 1/17/18, 4:17 PM

    We have separate repos for each service, and use https://gocd.org/ to build, test and deploy each separately. You could also configure it to trigger builds only from changes in certain directories. There is a single pipeline template from which all pipelines are instantiated.
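
    The directory check itself is simple if you ever need to roll it yourself; a sketch (layout assumed: one top-level directory per service):

      import subprocess

      def changed_services(base, head):
          """Top-level directories touched between two commits."""
          out = subprocess.check_output(
              ["git", "diff", "--name-only", f"{base}..{head}"], text=True)
          return {path.split("/")[0] for path in out.splitlines() if "/" in path}

      # only trigger pipelines for services whose folder actually changed
      for svc in changed_services("origin/master", "HEAD"):
          print(f"trigger pipeline for {svc}")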

    Independent deployments are one of the key advantages of microservices. If you don't use that feature, why use microservices at all? Just for scalability? Or because it was the default choice?

  • by joshribakoff on 1/17/18, 12:43 PM

    My experience with micro-services is with code-bases that adopted the pattern prematurely. Based on this, my advice is as follows...

    You can deploy the whole platform and/or refactor to a monolith, and maintain one change log, which is simple.

    That, however, has its own downsides, so you should find a balance. If you're having trouble keeping track, perhaps re-organize. I read in one HN article that Amazon had 7k employees before they adopted microservices. The benefits have to outweigh the costs. Sometimes the solution to the problem is taking a step back; without more details it's hard to say.

    So basically one option is to refactor [to a monolith] and re-evaluate the split so that you no longer have this problem. Just throw each repo in a sub-folder, make that your new mono-repo, and go from there. It's worth an exploratory refactoring, but it's not a silver bullet.

  • by vcool07 on 1/17/18, 10:31 AM

    'Integration testing' has to be done before the final build; it clearly flags any compatibility issues between components.

    Every component comes with a major/minor release number, which indicates the nature of the change that has gone in. For example: the major release is incremented for a change that introduces a new feature/interface. Minor release numbers are reserved for bug fixes/optimizations that are more internal to the component.
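
    As a toy illustration of that convention, a compatibility check might look like this (the two-part scheme here is assumed, not any particular tool):

      def parse(version):
          major, minor = version.split(".")
          return int(major), int(minor)

      def compatible(built_against, deployed):
          """Same major = same interface; higher minor = internal fixes only."""
          old_major, old_minor = parse(built_against)
          new_major, new_minor = parse(deployed)
          return new_major == old_major and new_minor >= old_minor

      assert compatible("2.3", "2.7")        # bug fixes only, fine
      assert not compatible("2.3", "3.0")    # new interface, flag it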

    The build manager can go through the list of all the delivered fixes and cherry-pick the few which can go into the final build.

  • by bootcat on 1/17/18, 12:15 PM

    In the company I worked for, they had their own CI/CD system which tracked information about each service and the systems onto which it had to deploy. Once it was all configured, it was basically button pushes. The system also tracked feedback after deployment to confirm whether the build went out fine or needed to be fixed - if certain parameters were unwell, it basically did a rollback! There were also canary deployments, where code was deployed to only a portion of systems first to make sure it was indeed pushed correctly and worked. If not, it was rolled back!
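
    That feedback-then-rollback logic reduces to a loop like this sketch (the deploy/rollback/health calls are placeholders for whatever your system exposes):

      import time

      def deploy_with_canary(hosts, version, deploy, rollback, healthy,
                             canary_count=2, soak_seconds=300):
          """Deploy to a few hosts first; roll back if the feedback is bad."""
          canaries, rest = hosts[:canary_count], hosts[canary_count:]
          for h in canaries:
              deploy(h, version)
          time.sleep(soak_seconds)   # let post-deployment feedback accumulate
          if not all(healthy(h) for h in canaries):
              for h in canaries:
                  rollback(h)        # parameters were unwell: roll back
              return False
          for h in rest:             # canaries look good: deploy everywhere
              deploy(h, version)
          return True
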
  • by wballard on 1/17/18, 12:57 PM

    We’ve been using our own setup for 4 years now. https://github.com/wballard/starphleet

    We have 200 services, counting beta and live test variants. Most of the difficulties vanished once we had declarative versioned control of our service config in the ‘headquarters’ repository.

    Not aware of anyone else using this approach.

  • by whistlerbrk on 1/17/18, 1:15 PM

    In the past I've used a single repo with all the code which gets pushed everywhere, and each service only runs its portion of the code. No guesswork involved, but this may not work for a lot of setups, of course. That, and your graceful restart logic has to be slightly more involved.
  • by twic on 1/17/18, 2:44 PM

    At an old company, we wrote this, er "model driven orchestration framework for continuous deployment":

    https://github.com/tim-group/orc

    Basically, there's a Git repo with files in that specify the desired versions and states of your apps in each environment (the "configuration management database").

    The tool has a loop which converges an environment on what is written in the file. It thinks of an app instance as being on a particular version (old or new), started or stopped (up or down), and in or out of the load balancer pool, and knows which transitions are allowed, e.g.:

      (old, up, in) -> (old, up, out) - ok
      (old, up, out) -> (old, up, in) - no! don't put the old version in the pool!
    
      (old, up, out) -> (old, down, out) - ok
      (old, up, in) -> (old, down, in) - no! don't kill an app that's in the pool!
    
      (old, down, out) -> (new, down, out) - ok
      (old, up, out) -> (new, up, out) - no! don't upgrade an app while it's running!
    
    Based on those rules, it plans a series of transitions from the current state to the desired state. You can model the state space as a cube, where the three axes correspond to the three aspects of the state, vertices are states, and edges are transitions, some allowed, some not. Planning the transitions is then route-finding across the cube. When I realised this, I made a little origami cube to illustrate it, and started waving it at everyone. My colleagues thought I'd gone mad.
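
    The cube is easy to reproduce in code; a sketch of the rules and the route-finding (my reconstruction of the model, not orc's actual implementation):

      from collections import deque

      # a state is (version, running, pool), e.g. ("old", "up", "in")
      def allowed(a, b):
          (v1, r1, p1), (v2, r2, p2) = a, b
          if sum((v1 != v2, r1 != r2, p1 != p2)) != 1:
              return False   # move along one edge of the cube at a time
          if p1 != p2 and p2 == "in" and v2 == "old":
              return False   # don't put the old version in the pool
          if r1 != r2 and r2 == "down" and p1 == "in":
              return False   # don't kill an app that's in the pool
          if v1 != v2 and r1 == "up":
              return False   # don't upgrade an app while it's running
          return True

      def plan(current, desired):
          """Breadth-first route-finding across the cube."""
          states = [(v, r, p) for v in ("old", "new")
                    for r in ("up", "down") for p in ("in", "out")]
          queue, seen = deque([[current]]), {current}
          while queue:
              path = queue.popleft()
              if path[-1] == desired:
                  return path
              for nxt in states:
                  if nxt not in seen and allowed(path[-1], nxt):
                      seen.add(nxt)
                      queue.append(path + [nxt])

      # out of the pool, stop, upgrade, start, back into the pool:
      print(plan(("old", "up", "in"), ("new", "up", "in")))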

    You need one non-cubic rule: there must be at least one instance in the load balancer at any time. In practice, you can just run the loop against each instance serially, so that you only ever bring down one at a time.

    This process is safe, because if the tool dies, it can just start the loop again, look at the current state, and plan again. It's also safe to run at any time - if the environment is in the desired state, it's a no-op, and if it isn't, it gets repaired.

    To upgrade an environment, you just change what's in the file, and run the loop.

  • by underyx on 1/17/18, 10:43 AM

    We wrote https://github.com/kiwicom/crane which posts and updates a nicely formatted Slack message with the status of releases. It also posts release events to Datadog (in a version we're publishing soon) and to an API that records them in a Postgres DB we keep for analytics queries.
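
    The bare-bones version of "post a release to Slack" is just an incoming-webhook call like this (a plain sketch, nothing like crane's formatted, self-updating messages):

      import json
      import urllib.request

      SLACK_WEBHOOK = "https://hooks.slack.com/services/..."  # your webhook URL

      def announce_release(service, version, env):
          payload = {"text": f":rocket: {service} {version} deployed to {env}"}
          req = urllib.request.Request(
              SLACK_WEBHOOK, data=json.dumps(payload).encode(),
              headers={"Content-Type": "application/json"})
          urllib.request.urlopen(req)
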
  • by drdrey on 1/17/18, 2:46 PM

    https://www.spinnaker.io/

    Full disclosure: I'm on the Spinnaker team

  • by mickeyben on 1/17/18, 10:27 AM

    What do you mean by keep track? Do you want to be aware of deployments?

    A Slack notification could do it. Or do you want to correlate deployments with other metrics?

    In that case: we instrument our deployments into our monitoring stack (influxdb/grafana) and use them as annotations for the rest of our monitoring.
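
    Concretely, a deploy event can be a single point written over InfluxDB's HTTP API (1.x line protocol; the database name is assumed), which Grafana can then render as annotations:

      import time
      import urllib.request

      INFLUX_WRITE = "http://influxdb:8086/write?db=deploys"  # assumed DB name

      def record_deploy(project, version):
          # line protocol: measurement,tags fields timestamp-in-ns
          ns = int(time.time() * 1e9)
          line = f'deployments,project={project} version="{version}" {ns}'
          req = urllib.request.Request(INFLUX_WRITE, data=line.encode())
          urllib.request.urlopen(req)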

    We can also graph the number of releases per project at different levels of aggregation.

  • by geocar on 1/17/18, 11:14 AM

    Service discovery contains all the versions and who should be directed at what.

    We also store stats in the service discovery app so versions can be promoted to "production" for a customer once the account management team has reviewed and updated their internal training.

  • by _drFaust on 1/17/18, 8:44 PM

    Got about 80+ services. One repo per service; each service has its own kubernetes yaml that details that service's deploys to the cluster. K8s has a huge ecosystem for monitoring, versioning, health, autoscaling and discovery. On top of that, each repo has a separate slack channel that receives notifications for repo changes, comments, deployments, container builds, datadog monitoring events, etc. There are also core maintainers per repo to maintain consistency.

    For anyone that has begun the microservice journey, kubernetes can be intimidating but is well worth it. Our original microservice infrastructure was rolled out way before k8s, and it's just night and day to work with now; the kubernetes team has thought of just about every edge case.

  • by discordianfish on 1/17/18, 11:21 AM

    Keep track as in having a version-controlled state of all revisions/versions deployed? That's something I would be interested in solutions to as well, especially in a kubernetes environment with CI.

    I could probably snapshot the kubernetes state to have a trail I can use to roll back to a point in time. Alternatively, I've thought about having CI update manifests in an integration repo and deploying from there, so that every change to the cluster is reflected by a commit in this repository.
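
    A crude version of the snapshot idea, assuming kubectl access and a local git repo acting as the trail (the resource list is illustrative):

      import subprocess

      RESOURCES = "deployments,services,configmaps"  # whatever you care about

      def snapshot(repo_dir):
          state = subprocess.check_output(
              ["kubectl", "get", RESOURCES, "--all-namespaces", "-o", "yaml"],
              text=True)
          with open(f"{repo_dir}/cluster-state.yaml", "w") as f:
              f.write(state)
          subprocess.run(["git", "-C", repo_dir, "add", "-A"], check=True)
          subprocess.run(["git", "-C", repo_dir, "commit", "-m", "snapshot"],
                         check=False)  # commit fails harmlessly if no changes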

  • by chillydawg on 1/17/18, 2:00 PM

    We built a small internal service that receives updates from the build & deployment scripts we run, and presents us with an HTML page that shows what branch & commit of everything is deployed (along with the branch and commit of every dependency), where, when and by whom. It's totally insecure, so it can be trivially spoofed, but it's our V1 for our fleet of golang services and it works well.
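
    The guts of such a service are small; a stdlib-only sketch of the receiver and the page (no auth, exactly as spoofable as described):

      import json
      from http.server import BaseHTTPRequestHandler, HTTPServer

      DEPLOYS = []  # in-memory; a real one would persist somewhere

      class Handler(BaseHTTPRequestHandler):
          def do_POST(self):  # deploy scripts POST {"service", "branch", ...}
              length = int(self.headers["Content-Length"])
              DEPLOYS.append(json.loads(self.rfile.read(length)))
              self.send_response(204)
              self.end_headers()

          def do_GET(self):   # humans GET a plain HTML table of what's where
              rows = "".join(
                  "<tr>" + "".join(f"<td>{d.get(k)}</td>" for k in
                                   ("service", "branch", "commit", "who"))
                  + "</tr>" for d in DEPLOYS)
              body = f"<table>{rows}</table>".encode()
              self.send_response(200)
              self.send_header("Content-Type", "text/html")
              self.end_headers()
              self.wfile.write(body)

      HTTPServer(("", 8080), Handler).serve_forever()
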
  • by ukoki on 1/17/18, 12:32 PM

    Have a CI/CD pipeline that does the following:

    - unit tests each service

    - all services fan in to a job that builds a giant tar file of source/code artefacts, including a metadata file that lists service versions or commit hashes (see the sketch after this list)

    - this "candidate release" is deployed to a staging environment for automated system/acceptance testing

    - it is then optionally deployed to prod once the acceptance tests have passed
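
    A sketch of the fan-in step (service list and checkout layout assumed):

      import json
      import subprocess
      import tarfile

      SERVICES = ["users", "billing", "search"]  # one checkout per service

      def commit_of(path):
          return subprocess.check_output(
              ["git", "rev-parse", "HEAD"], cwd=path, text=True).strip()

      # the metadata file pins every service version in this candidate release
      manifest = {svc: commit_of(svc) for svc in SERVICES}
      with open("release-manifest.json", "w") as f:
          json.dump(manifest, f, indent=2)

      with tarfile.open("candidate-release.tar.gz", "w:gz") as tar:
          tar.add("release-manifest.json")
          for svc in SERVICES:
              tar.add(svc)  # that service's source/artefacts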

  • by char_pointer on 1/17/18, 7:05 PM

    https://github.com/ankyra/escape (disclaimer: I'm one of the authors)

    We use Escape to version and deploy our microservices across environments, and we even relate them to the underlying infrastructure code so we can deploy our whole platform as a single unit if need be.

  • by nhumrich on 1/17/18, 2:08 PM

    We use gitlab CI for pipelines, which is great. You can figure out when everything was last deployed, etc. We even built our own dashboard using the gitlab api that shows all the latest deploys, just so it's easier to track down what was recently deployed when we are investigating issues.
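
    The gitlab api makes the dashboard part straightforward; roughly this (project ID and token are placeholders):

      import json
      import urllib.request

      GITLAB = "https://gitlab.example.com/api/v4"
      TOKEN = "..."  # a personal or CI access token

      def latest_deployments(project_id):
          req = urllib.request.Request(
              f"{GITLAB}/projects/{project_id}/deployments"
              "?order_by=created_at&sort=desc&per_page=5",
              headers={"PRIVATE-TOKEN": TOKEN})
          return json.load(urllib.request.urlopen(req))

      for d in latest_deployments(42):
          print(d["environment"]["name"], d["sha"], d["created_at"])
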
  • by ecesena on 1/19/18, 12:41 AM

    Maybe I'm misunderstanding the question, but you may want to have a look at Envoy: https://www.envoyproxy.io

  • by invisible on 1/17/18, 12:51 PM

    We use Jenkins for releases and kubernetes for deployments, if I understand the question correctly. We'd like to use something like linkerd to simplify finding dependencies.

  • by brango on 1/17/18, 3:05 PM

    Master=stable and in prod, non-master branches=dev & staging. Jenkins deploys automatically on git commits.

  • by hguhghuff on 1/17/18, 10:52 AM

    In what technical environment? More info needed.