from Hacker News

Why are we templating YAML? (2019)

by olestr on 1/23/24, 10:58 AM with 654 comments

by bilalq on 1/23/24, 4:39 PM
I'm completely done with configs written in YAML. Easily the worst part of GitHub Actions, even worse than the reliability. When I see some cool tool require a YAML file for config, I immediately get hit with a wave of apprehension. These same feelings extend to other proprietary config languages like HCL for Terraform, ASL for AWS Step Functions, etc. It's fine that you want a declarative API, but let me generate my declaration programmatically.
Config declared in and generated by code has been a superior experience. It's one of the things that AWS CDK got absolutely right. My config and declarative definition of my cloud infra is all written in a typesafe language with great IDE support without the need for random plugins that some rando wrote and hasn't updated in 2 years.
by Draiken on 1/23/24, 11:52 AM
I agree that YAML templating is kind of insane, but I will never understand why we don't stop using fake languages and simply use a real language.
If you need complex logic, use a programming language and generate the YAML/JSON/whatever with it. There you go. Fixed it for you.
Ruby, Python, or any other language really (I only favor scripting ones because they're generally easier to run), will give you all of that without some weird pseudo-language like Jsonnet or Go templates.
Write the freaking code already and you'll get bitten way less by obscure weird issues that these template engines have.
Seriously, use any real programming language and it'll be WAY better.
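As a rough illustration of the approach described above, here is a minimal Python sketch that builds manifests as ordinary data and serializes them at the end. The `deployment` and `render` helpers, the image name, and the environment names are all made up for the example. It emits JSON on the assumption that the consuming tool accepts JSON wherever it accepts YAML, as kubectl does.

```python
import json


def deployment(name: str, image: str, replicas: int = 1) -> dict:
    """Build a Deployment manifest as plain Python data (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def render(manifests: list[dict]) -> str:
    """Wrap the manifests in a List object and serialize it as JSON."""
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2)


if __name__ == "__main__":
    # Ordinary loops and variables take the place of template directives.
    replicas_per_env = {"staging": 1, "production": 3}
    manifests = [
        deployment(f"web-{env}", "example/web:1.0", count)
        for env, count in replicas_per_env.items()
    ]
    print(render(manifests))
```

Because the serializer only ever sees data structures, indentation and quoting are never the author's problem.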
by TheFuzzball on 1/23/24, 11:32 AM
I just knew this would be about Kubernetes when I saw the title.
The Kubernetes API is fairly straightforward and has a well-defined (JSON) schema. People should be spending the bulk of their time learning k8s on understanding how to use the API, but instead they spend it working out how to use a Helm chart.
I don't think Jsonnet, Ksonnet, Nu, or CUE ever gained that much traction. I'm convinced most people just use Kustomize, because it's fairly straightforward and built in to kubectl.
I'd like a tool that:
- Gives definition writers type checking against the k8s schemas - validation, version deprecations, etc.
- Gives users a single artefact that can be inspected easily and will fail (ACID) if deployed against a cluster that doesn't support any objects/versions.
- Is built into the default toolchain
---
I feel like writing a Bun or Deno TypeScript script that exports a function with arguments and returns a list of definitions would work well, esp. with `deno compile`, etc. but that violates the third point.
by valty on 1/23/24, 12:45 PM
It's funny how little developers think about how to do configuration right.
It's just a bunch of keys and values, stored in some file, or generated by some code.
But it's actually the whole ball game. It's what programming is.
Everything is configuration. Every function parameter is a kind of configuration. And all the configuration in external files inevitably ends up as a function parameter in some way.
The problem is the plain-text representation of code.
Declarative configuration files seem nice because you can see everything in one place.
If you do your configuration programmatically, it is hard to find the correct place to change something.
If our code ran in real-time to show us a representation of the final configuration, and we could trace how each final configuration value was generated, then it wouldn't be a problem.
But no systems are designed with this capability, even though it is quite trivial to do. Configuration is always an after-thought.
Now extend this concept to all of programming. Imagine being able to see every piece of code that depends upon a single configuration value, and any transformations of it.
Also, most configuration is probably better placed into a central database because it is relational/graph-like. Different configuration values relate to one another. So we should be looking at configuration in a database/graph editor.
Once you unchain yourself from plain-text, things start to become a lot simpler...of course the language capabilities I mentioned above still need to become a thing.
by ManBeardPc on 1/23/24, 11:20 AM
Worse yet, in some places (CI/CD) YAML becomes nearly a programming language. A very verbose, unintuitive, badly specified and vendor-specific one as well.
by jrockway on 1/23/24, 5:52 PM
Yeah, I'm very sad that helm won. We do OSS k8s stuff at work, and 100% of users have asked for us to make a helm chart. So we had to. It is miserable to work on; your editor can't help you because the files are named like "foo.yaml" but they aren't YAML. You have to make sure you pipe all your data through "indent 4" so that things are lined up correctly in the YAML. What depresses me the most is that you have to re-expose every Kubernetes feature in your own way. Someone wants to add deployment.spec.template.spec.fooBars? Now you have to add deploymentFooBars to your values.yaml file and plumb it in. For every. single. feature.
It's truly "worse is better" gone wrong. I have definitely done some terrible things like "sed -e s/$FOO/foo/g" to implement templating... and that's probably how Helm started. The result is a mess.
I personally grew up on Kustomize before it was in kubectl, and was always exceedingly happy with it. (OK, it has a lot of quirks. But at least it saves you time because it actually understands the semantics of the objects you are creating.)
I like Jsonnet a lot better. As part of our k8s app, we ship an Envoy deployment to do all of our crazy traffic routing (basically... maintaining backwards compatibility with old releases). Envoy configs are... verbose..., but Jsonnet makes it really easy to work on. (The code in question: https://github.com/pachyderm/pachyderm/blob/master/etc/gener...)
I'm seriously considering transpiling jsonnet to the Go template language and just implementing everything with Jsonnet. At least that is slightly maintainable, and nobody will ever know because "helm install" will Just Work ;)
But yeah, I think Helm will be the death of Kubernetes. Some competing computer allocator container runner thingie will have some decent language for configuration, and it will just take over overnight. Mark my words!
by roenxi on 1/23/24, 11:26 AM
I see a problem here. I'm not certain if the sort of person who would choose YAML as their configuration language sees a problem here.
There is a direct conflict between human-centred data representations and computer-centred ones. Computers love things that look a bit like a Lisp. Humans like things that look a bit like Python. If you're the sort of person who wants to use a computer to manipulate their Kubernetes config then you'd be secretly annoyed that Kubernetes uses YAML. However, it appears the Kubernetes community are mainly YAML people, so why would they mind that their config files will be horrible to work with once programming logic gets involved? The downside of YAML is exactly this scenario, and I believe the people involved in K8s are generally cluey enough to see that coming.
> YAML is a superset of JSON
The spec writers can put whatever they want in their document, but I don't think this is true. If you go in and convert all the YAML config to JSON, the DevOps team is going to get upset. The two data formats have the same semantic representation, but so do all languages compiled to the same CPU arch. JSON and YAML are disjoint in practice. Mixing the two isn't a good idea.
by neallindsay on 1/23/24, 12:23 PM
My personal philosophy is that string interpolation should not be used to generate machine-readable code, and template languages are just fancy string interpolation. We've all seen the consequences of SQL injection and cross-site scripting. That's the kind of thing that will keep happening as long as we keep putting arbitrary text into interpreters.
Yes, this means I don't think we should use template files to make HTML at all.
Alternatives to using template languages for HTML include Haml (for Ruby) and Pug (for JavaScript). These languages have defined ways to specify entire trees of tags, attributes, and text nodes.
If you don't like Python-style significant indentation, JavaScript has JSX. The HTML-looking parts of JSX compile down to a bunch of `createElement` expressions that create a web document tree. That tree can then be output as HTML if necessary.
Haml, Pug, and JSX are not template languages even though they can output HTML. Likewise, `JSON.stringify(myObj)` is not a template language for JSON. Generating machine-readable code should be done with a tool that understands and leverages the known structure of the target language when possible.
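A small Python sketch of the difference, using JSON because the standard library can both generate and parse it. The hostile `user_input` value is invented for the demonstration.

```python
import json

# A value that happens to contain the target language's own syntax.
user_input = 'Al", "admin": true, "x": "'

# String interpolation: the value is pasted in as raw text,
# so it is free to change the structure of the document.
templated = '{"name": "%s"}' % user_input
print(json.loads(templated))  # {'name': 'Al', 'admin': True, 'x': ''}

# Structured generation: the serializer escapes the value,
# so the whole input survives as a single string.
generated = json.dumps({"name": user_input})
print(json.loads(generated) == {"name": user_input})  # True
```

The interpolated version is still valid JSON, which is exactly the problem: nothing fails, the document just means something else.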
by gregwebs on 1/23/24, 11:39 AM
We are switching to cuelang [1]. IMHO it is better designed than Jsonnet. Since Kubernetes already has state reconciliation, the only thing missing in this setup is deletion. But that can now be accomplished with the prune feature. [2]
[1] https://cuelang.org/docs/integrations/k8s/
[2] https://kubernetes.io/blog/2023/05/09/introducing-kubectl-ap...
by BiteCode_dev on 1/23/24, 11:41 AM
This is where I usually pitch in with "Have you heard of CUELang, our lord and savior?": https://cuelang.org/
- Not turing complete yet sufficiently expressive to DRY
- Define schema and data with the same language, in a separate or same file. With union types.
- Generate YAML or JSON. Can validate itself, or a YAML or JSON file.
The biggest drawback is that the only implementation is currently in Go, meaning you may have to use a subprocess or FFI.
by gtirloni on 1/23/24, 12:26 PM
I love YAML and I curse it every single day that I'm working with Helm charts.
People ask me what I'd use to deploy apps on Kubernetes and I say I hate Helm and would still use it for a single reason: everybody is using it, I don't want to create a snowflake infrastructure that only I understand.
Still, back in the day I thought jsonnet would win this battle but here we are, cursing Helm and templates. That's the power of upstream decisions.
by resters on 1/23/24, 12:28 PM
In my view, the presence of YAML templating is a red flag in any codebase or system.
YAML got its popularity with the advent of Ruby on Rails, largely due to the simplicity of the database.yml file as an aid in database connection string abstraction that felt extremely clean to Java programmers who were used to complicated XML files full of DSN names and connection string peculiarities.
The evolution of the database.yml file into something arguably as complex as the thing it was intended to replace is described in the article below:
https://dev.to/andreimaxim/the-rails-databaseyml-file-4dm9
by Joker_vD on 1/23/24, 11:33 AM
The title of TFA was actually my reaction when I learned what Helm was actually doing. Initially I thought Helm would take an input file of YAML-with-template-bits, parse that YAML as an object, then use the provided template bits to fill in the parts of that object, then serialize the object back to YAML and write it out. Sounds reasonable, right? Nope, it's literal text substitution, so if you want to have a valid YAML as the output you better count your indentation on your fingers, and track where the newlines go or don't go.
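The failure mode is easy to reproduce outside Helm. The sketch below uses Python's `string.Template` as a stand-in for a text templating engine (it is not Go templates and not Helm's implementation), and the label values are invented.

```python
import textwrap
from string import Template

template = Template(
    "metadata:\n"
    "  labels:\n"
    "    $labels\n"
)
labels = "app: web\ntier: frontend"

# Naive substitution: only the first line inherits the template's indentation,
# so the second label falls out to the top level of the document.
naive = template.substitute(labels=labels)
print(naive)

# The manual workaround: indent every line of the value to the right depth,
# then strip the first line's padding because the template already supplies it.
fixed = template.substitute(labels=textwrap.indent(labels, "    ").lstrip())
print(fixed)
```

Both outputs are plausible-looking text. Only the second nests `tier` under `labels`, and the template author has to know the depth (here 4) at every substitution site.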
by nhumrich on 1/23/24, 12:56 PM
I will tell you exactly why we template yaml. It's the exact same reason every code base has ugly parts. And that's the evolution of complexity.
At first, you have a yaml file. No templates, no variables. Just a good old standard yaml. Then, suddenly you need to introduce a single variable. Templating out the one variable is pretty easy, so you do it, and it's still mostly for humans to edit.
Well, now you have a yaml file and template engine already in place. So when one more thing pops up, you template it out.
8 features later, you wonder what you've done. Only, if we go back in time, each step was actually the most efficient. Introducing anything else at step 1 would be over-engineering. Introducing it anywhere else would lead to a large refactor and possible regressions.
To top it off, this is not business logic. Your devs are not touching this yaml all that much. So is it worth "fixing"? Probably not.
by linsomniac on 1/23/24, 3:24 PM
Ansible convinced me that doing programming tasks in YAML is insanity, so I started an experiment: What would Ansible be like if its syntax were more like Python than YAML? https://github.com/linsomniac/uplaybook
I spent around 3 months over the holidays exploring that by implementing a "micro Ansible". I have a pretty solid tool that implements it, but haven't had much "seat time" with it: working on it rather than in it. But what I've done has convinced me that there are some benefits.
by difosfor on 1/23/24, 11:36 AM
Wouldn't it be a better idea to use an existing programming language instead of cooking up numerous half baked templating languages?
by globallyunique on 1/23/24, 4:13 PM
We wrote a backend service at Lyft in Python and at some point needed to do some string interpolation for experimentation. In a rush someone implemented this in YAML (no new deps needed). This ended up being the bane of the team's existence. It was almost impossible to test whether something was going to break at runtime; we could only verify it was valid YAML, many other checks were infeasible, and it was super hard to debug. It soured me on YAML for years.
by carlosrdrz on 1/23/24, 12:04 PM
Can someone help me understand what is the advantage of using jsonnet, cue, or something else vs a simple python script (or dialect, like starlark), when you have the need of dynamically creating some sort of config?
I've used jsonnet in the past to create k8s files, but I don't work in that space anymore. I don't remember it being better or easier than writing a python script that outputs JSON. Not even taking into account maintainability and such. Maybe I'm missing something?
by embik on 1/23/24, 11:19 AM
I am really sad that jsonnet / ksonnet never really took off. It’s a great way to template, but has a bit of a learning curve in my experience. I suspect that is why it’s niche.
If you like what is presented in this article, take a look at Grafana Tanka (https://tanka.dev).
by Thev00d00 on 1/23/24, 11:22 AM
In my experience there is near zero uptake of jsonnet or similar amongst "regular", i.e. less ops-inclined, developers.
gotmpl is a lot easier to grok if you are coming in cold. Yes it sucks for anything mildly complex, but the barrier to entry is significantly lower.
Generation via real programming languages is the future I am hoping for.
by datadeft on 1/23/24, 11:29 AM
Indeed why? However the conclusion I have is not to use JSON but to use a type safe configuration language that can express my intent much better making illegal states impossible. One example of such lang is Dhall.
https://dhall-lang.org/
by tovej on 1/23/24, 11:45 AM
We're talking about templating and generating files, but it seems like everyone has just collectively forgotten about M4?
Yes, it can be unsafe if you're not careful, but if you need to bang out a quick prototype it's the best tool there is. It's part of POSIX, and so it will always be available, the language is dead simple, and you can generate any text you want with it.
I wouldn't use it with YAML, but I would probably never template YAML in the first place: just generate JSON and feed it through `yq -y` if you need a quick YAML generator.
by kungfufrog on 1/23/24, 12:42 PM
There's 2 things on the horizon here for Kubernetes that give me hope. KCL, its own configuration language, and Timoni, which builds off CUE and corrects some of the shortcomings of Helm.
Though these days, OLM and the Quarkus operator SDK give you a completely viable alternative approach to Helm that enables you to express much more complex functionality and dependency relationships over the lifecycle of resources. An example would be doing a DB backup before upgrading to a new release etc. Obviously this power comes at a cost.
by m000 on 1/23/24, 5:40 PM
Yes, templating YAML is crazy. But is the answer jsonnet? That's even more batshit.
Why hasn't anyone opted for a "patch-based" approach? I.e. start with a base YAML/JSON file, apply a second file over it, apply this third one, and use the result as the config. How you generate these files is entirely up to you.
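A minimal sketch of what such layering could look like in Python, following the rules of JSON Merge Patch (RFC 7396): nested objects merge, `null` deletes a key, anything else replaces. The `merge_patch` name and the sample config keys are illustrative only.

```python
import copy
from functools import reduce


def merge_patch(target, patch):
    """Return a new value with patch applied over target (RFC 7396 rules)."""
    if not isinstance(patch, dict):
        # Scalars and lists replace the old value wholesale.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this key"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


base = {
    "replicas": 1,
    "debug": True,
    "resources": {"cpu": "100m", "memory": "128Mi"},
}
production = {"replicas": 3, "resources": {"cpu": "500m"}}
no_debug = {"debug": None}

# Apply the second file over the first, then the third over the result.
config = reduce(merge_patch, [production, no_debug], base)
print(config)
```

One known limitation of this style is lists: a patch can only replace a list as a whole, not edit one element of it.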
by somewhereoutth on 1/23/24, 11:48 AM
Meanwhile in JavaScript land, config is simply another js file, with all the Object and Array literal goodness that that gets us, and the full language environment backing it up.
by skywal_l on 1/23/24, 11:50 AM
For those who do not know it yet, the now classic noyaml site: https://noyaml.com/
by EdwardDiego on 1/23/24, 1:19 PM
If you're using Helm to deploy your own apps, I feel that's a code smell. I'll add jsonnet for your own apps to that list.
Just use dumb YAML, maybe kustomize if you really need, but if that's not sufficient, consider that a sign that you're not carving the wood the way it's telling you to.
Any form of templating for creating your own application manifest is another moving part that allows for new and fun errors, and the further away your source manifest is from the deployed result, the harder it is to debug.
If you really want to append a certain set of annotations to each and every pod in a cluster, instead of using shared templates (and enforcing their usage), there's other approaches in K8s for these kinds of use cases, that you have a lot more control over.
by seadan83 on 1/23/24, 6:13 PM
I'm confused on two points:
(A) why not use the yaml syntax that is not whitespace sensitive? In the author's example, that could be: {name: Al, address: something}
(B) do env variables not go a long way to avoiding the need for a template? Instead of generating a complete YAML, put env variable placeholders in and set those values in the target environment. At this rate, the same YAML can generally be deployed anywhere. I've seen that style implemented several times, works pretty well.
I do agree that generating config itself, and not just interpolating values - is potentially really gnarly. I do wonder, instead of interpolating variables at deploy time, why not use env variables and do the interpolation at runtime?
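For point (B), a minimal sketch of runtime placeholder resolution in Python. The `DB_HOST` and `DB_PORT` variable names and the `resolve` helper are assumptions made for the example. It works on the config text before any parser sees it.

```python
import os
from string import Template

CONFIG_TEXT = """\
database:
  host: ${DB_HOST}
  port: ${DB_PORT}
"""


def resolve(text: str, env=os.environ) -> str:
    """Fill ${NAME} placeholders from env; a missing variable raises KeyError."""
    return Template(text).substitute(env)


if __name__ == "__main__":
    # The same file can ship everywhere; only the environment differs.
    print(resolve(CONFIG_TEXT, {"DB_HOST": "db.internal", "DB_PORT": "5432"}))
```

This is still text substitution, so it is only safe for simple scalar values. A value containing a newline or a `: ` would change the document structure in the same way a template would.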
by throwaway143829 on 1/23/24, 11:18 AM
This article made me think it'd be nice to generate k8s JSON using TypeScript. Just a node script that runs console.log(JSON.stringify(config)), and you pipe that to a yaml file in your deploy script. The syntax seems more sane and has more broad appeal than jsonnet, and I'd wager that the dev tooling would be better given good enough typings.
By the way the answer to the question "why are we templating yaml?" is: people are just more familiar with it and don't want to have to translate examples to jsonnet that they copy and paste from the web. Do not underestimate this downside :) Same downside would probably apply to TypeScript-generated configs I bet.
by b33j0r on 1/23/24, 6:35 PM
Separate generated content from maintained content. Works for me. But on the specifics here, from a very python POV.
Strict YAML is easier to maintain than json if you have deeper than one or maybe two levels of nesting, multiline strings, or comments.
So, I build my config systems to _generate_ YAML instead of “templating YAML.”
PyYAML extensions and ruamel.yaml exist—Though kind of out of date, and more new projects are using TOML. (From project description: “ruamel.yaml is a YAML parser/emitter that supports roundtrip comment preservation”)
Confession: but yeah, not when I use ansible. Ansible double-dog-dares you to “jinja2 all the things” without much in the way of structured semantics.
by sdflhasjd on 1/23/24, 11:31 AM
I need a restraining order against YAML DSLs.
by hintymad on 1/23/24, 5:57 PM
I can't remember how many times I heard or saw the argument "but that is in YAML", which implies that the configuration (or god forbid, the code) is simple and well designed. I find it hilarious.
And an even worse contender is embedding a text template like jinja in a YAML config and forcing everyone to use such an abomination to change production config via deployment. Yes, I'm talking about Terraform or the like. Why people think this kind of design is acceptable is beyond my comprehension.
by tipiirai on 1/23/24, 11:24 AM
I think YAML is a good pick for non-developers / content creators. The front matter section in Markdown files is a good example. Or is there a better, human-friendly alternative?
by larve on 1/23/24, 11:54 AM
Serendipity strikes as I'm implementing an emrichen interpreter in golang after getting too annoyed about templating YAML as a string.
The reason I like yaml is that I can see the tree structure directly, and to my lisp brain it is extremely easy to read. Furthermore, in our age of LLMs, I find LLMs to be able to generate "correct" YAML more easily than JSON, since the tree depth is encoded in every line, and doesn't require matching larger structures. It also uses significantly fewer tokens.
I find it extremely easy to have LLMs generate decent DSLs by asking them to use a YAML output format, and found it very robust to generate code out of these (or generate an interpreter for the newly created DSL).
I didn't know about !tags until Sunday, which is quite shameful, but I find that the emrichen solution is actually quite elegant, and really kind of feels like a lisp macro expander.
Overall, YAML is just good enough for me to get shit done, I can read and skim it quickly, LLMs do well with it, and it's easy to work with. It has aliases, multiline strings and some other quality-of-life features built in.
https://github.com/con2/emrichen
by rho4 on 1/23/24, 3:29 PM
The cycle seems: Invent a new static information format (XML/JSON/HTML/...), reduce verbosity, add GUI, variables, comments, expressions, control flow, validation, transformation, static typing, compilers, IDE support, dependency management, and maybe a non-backwards-compatible major version etc. And you end up with yet another Java/C# clone, just inferior because it was never meant to support all these things.
by break_the_bank on 1/23/24, 12:18 PM
Obviously biased but we at Kurtosis are trying to solve this problem through Starlark.
We took Starlark and added a few more of our instructions that make our Starlark container native. The complex Starlark definition supports:
- Composition - you can import a remote definition and just use it
- Decomposable - you can break things apart
- Parametrizability - want one of a service and 10 of the other, just pass an argument
- Portable - it runs pretty much anywhere
Our runtime takes the Starlark and creates environments in both Docker and Kubernetes from one definition.
Our CTO wrote this - https://docs.kurtosis.com/advanced-concepts/why-kurtosis-sta...
We are source available https://github.com/kurtosis-tech/kurtosis
Here is a popular environment definition - https://github.com/kurtosis-tech/ethereum-package that protocol developers use to setup custom Ethereum test environments
by marttilaine on 1/23/24, 2:05 PM
For anyone struggling with Helm YAML syntax errors in their day job, I shamelessly advertise my browser-based debug tool Helm Playground:
https://helm-playground.com/ - https://github.com/shipmight/helm-playground
by vitiral on 1/23/24, 8:43 PM
I'm designing a simple dev environment from scratch.
My solution for this is a sandboxed lua for programmatic configuration:
https://github.com/civboot/civlua/tree/main/lib/luck
I can't stand JSON (for many reasons) so I created a serialization format that combines it and CSV for nested objects
https://github.com/civboot/civlua/tree/main/lib/tso
I wish the industry would standardize on a solution like this. IMO you shouldn't use a "real" language unless you can lock it down to be deterministic. JSON is supposed to be human readable but fails for lots of real-world data like multi-line strings or lists of records.
CSV is more readable but doesn't support nested objects.
by JohnMakin on 1/23/24, 5:33 PM
I wrote a monstrosity of a terraform module that takes pre-existing helm charts/templates, feeds some json into them via terraform, translates the results to HCL, and deploys them.
It's kind of a Rube Goldberg machine that I made as a bespoke solution to a weird problem but it's been fairly pleasant to work with so far.
by willhbr on 1/23/24, 11:23 PM
I actually wrote about my thoughts on the tradeoffs between purely-config and purely-code:
https://willhbr.net/2024/01/18/the-code-config-continuum/
by 5Qn8mNbc2FNCiVV on 1/23/24, 11:14 PM
The worst part about this mess is that it's all fine and dandy with any other tool as long as it's first party but as soon as you want to use another software you're stuck reading and understanding the spaghetti that Helm inevitably becomes and hope that you can configure it the way you need to.
It's come to the point that I don't even think about using most charts and just build them myself. The issue with that is that software like Prometheus, Loki, Grafana or f.e. Postgres operators are so complex that it's almost impossible to "fix" them.
This really is like that dog sitting in a burning house and saying "This is fine" because that's how most go on with their day after hitting their head for a few hours and running dozens of pipelines.
by 20after4 on 1/24/24, 12:15 PM
I'm no fan of YAML but for an example of templating YAML that is tolerable, take a look at esphome[1] (and I suppose also home assistant).
In your main yaml file you have something like this:
```
  packages:
    left_garage_door: !include
      file: garage-door.yaml
      vars:
        door_name: Left
        door_location: left
        open_switch_gpio: 25
        close_switch_gpio: 26
```
Then in garage-door.yaml you can reference the vars directly with ${door_name} syntax.
It's the best version of templating YAML that I have experienced.
1. https://esphome.io/guides/configuration-types#packages-as-te...
by im3w1l on 1/23/24, 4:13 PM
From my vantage point it seems to have happened roughly like this
First we had arbitrary code that did something. Then we thought, "hey wouldn't it be nice if we could do this declaratively in a standard way using configuration instead?" Then we could reason about it more easily. But then came the realization "this declarative system isn't quite powerful enough, what if we could sprinkle some logic on top of it". "Hey wouldn't it be nice if we could go back to doing it declaratively? I guess we can just add the missing features to our not-turing-complete config language". "Wow now it can almost do everything I want.... but there is just this tiny little thing I want to do in addition, let's do templating!" etc
by solatic on 1/23/24, 5:16 PM
YAML is fine for human-maintained configuration. Yeah it has its footguns (like Norway) but if you're actually writing human-maintained configuration then you quickly pick these up with practice and they turn into a non-issue.
If your configuration is complicated enough that it needs to be generated, then use a real general-purpose language to generate it. Not crummy pseudo-imperative constructs bolted onto the YAML.
At the very least, systems that take YAML as configuration should also take JSON as configuration. GitOps-style systems should allow you to define, not just system.yaml config, not just system.json config, but also system.js config, that is evaluated in some kind of heavily-restricted sandbox.
by killthebuddha on 1/23/24, 6:31 PM
I would love for someone to eviscerate the following idea:
Every deployed process receives exactly 3 environment variables, NONCE, TAGS and CONFIG_DB_PARAMS. Every process is bootstrapped in the same way:
```
  1. Initialize config db client.
  2. config = db.fetchConfig(tags)
  3. Use NONCE to signal config changes if needed.
```
Of course there's some environments where this Just Won't Work. I'm wondering if there are some very serious issues with this approach in a "standard" web application environment. It seems so straightforward but I've literally never seen it done before, so I feel like I'm missing something.
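To make the idea concrete enough to criticize, here is one possible reading of it as a Python sketch. Everything beyond the three variable names is an assumption: SQLite stands in for the config db, `CONFIG_DB_PARAMS` is treated as a database path, `TAGS` as a comma-separated list, and the `config` table layout is invented. The comment does not say how NONCE is used, so the sketch only carries it along.

```python
import json
import os
import sqlite3


def fetch_config(db: sqlite3.Connection, tags: list[str]) -> dict:
    """Merge the rows for each tag in order; later tags override earlier ones."""
    config: dict = {}
    for tag in tags:
        rows = db.execute("SELECT key, value FROM config WHERE tag = ?", (tag,))
        config.update({key: json.loads(value) for key, value in rows})
    return config


def bootstrap(env=os.environ) -> dict:
    # 1. Initialize config db client (assumed here: a SQLite file path).
    db = sqlite3.connect(env["CONFIG_DB_PARAMS"])
    # 2. config = db.fetchConfig(tags)
    config = fetch_config(db, env["TAGS"].split(","))
    # 3. Keep NONCE with the config so a changed value can signal a reload.
    config["_nonce"] = env["NONCE"]
    return config


if __name__ == "__main__":
    # In-memory demo data in place of a real config database.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE config (tag TEXT, key TEXT, value TEXT)")
    demo.executemany(
        "INSERT INTO config VALUES (?, ?, ?)",
        [("base", "port", "8080"), ("base", "debug", "false"), ("web", "port", "9090")],
    )
    print(fetch_config(demo, ["base", "web"]))
```

Even in this toy form, the database becomes a hard dependency of every process start, which is one obvious place to begin the evisceration.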
by brodouevencode on 1/23/24, 2:52 PM
It's fashionable to hate YAML. And sometimes rightly so. But what are the alternatives? JSON, XML, INI, TOML, Dhall, Cue, JSONNET, HCL, your programming language of choice. Also let's agree on the target use case - in that YAML is largely used for configuration and operational tasks. If I were to rank-order the features necessary in a good configuration language they would be 1) readability 2) data/schema validation 3) stackability/composability 4) language support 5) editor support 6) industry adoption. So let's do a comparison:
YAML is fairly easy to read, has schema validation with the right library, and is pretty ubiquitous. It can get unwieldy like JSON though.
XML is big, ugly, unreadable. No one likes XML despite its robust schema validation capabilities.
Your programming language of choice doesn't work because of the target use case unless you truly are a build-run group.
INI is too simplistic for many environments.
HCL is included because I'm a bit of a Terraform fanboy and it has great features like validation, readability and composability. However you're not going to find it in the wild as a general purpose configuration language - outside of Terraform it just hasn't taken hold.
Does anyone really use Dhall? (Serious question.)
JSON is nice because everyone understands JSON. JSON is not nice because all the brackets, braces, quotes, etc get in the way and make sufficiently large configurations hard to read. With the right library you can get schema validation.
JSONNET suffers from the same problems that JSON does, but adds more operations which makes sufficiently large things very hard to read.
TOML is nice and reminds me of INI in its simplicity.
Cue looks and smells like JSON, has schema validation, but is much more readable.
If I were to rank-order these options it would be 1) Cue 2) TOML 3) YAML 4) JSON 5) your programming language of choice 6) JSONNET 7) INI 8) HCL 9) XML 10) Dhall (maybe?). My point here is that while YAML leaves a lot to be desired it's still very useful for most implementations and is better than many of the alternatives.
by imwillofficial on 1/23/24, 1:38 PM
You guys are missing the big reason.
Most devops folks are former sysadmins with no developer experience.
The tools needed to be able to be picked up quickly without too much pain.
This is why our devops tooling in popular usage is not as robust as developer tools.
Know your audience.
by drybjed on 1/23/24, 4:27 PM
YAML, TOML and JSON can be ingested to represent the same data structures internally; it's just a few lines of code to decide which load() function we should use for a particular file. Why not support all three formats in your applications for configuration and just let users decide which one they want to use? Put a 'config.json' in '/etc/app/conf.d/' and you get the same data as with 'config.yml' or 'config.toml'. Then users can use whichever format they prefer for the input data.
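Those few lines could look roughly like this in Python. The `load_config` and `LOADERS` names are made up; the sketch assumes Python 3.11+ for the standard-library `tomllib` and the third-party PyYAML package for YAML, and imports both lazily so that a JSON-only deployment needs neither.

```python
import json
from pathlib import Path


def _load_toml(text: str) -> dict:
    import tomllib  # standard library from Python 3.11

    return tomllib.loads(text)


def _load_yaml(text: str) -> dict:
    import yaml  # third-party PyYAML, assumed to be installed

    return yaml.safe_load(text)


# Map file extensions to parser functions that take text and return data.
LOADERS = {
    ".json": json.loads,
    ".toml": _load_toml,
    ".yml": _load_yaml,
    ".yaml": _load_yaml,
}


def load_config(path: Path) -> dict:
    """Pick a parser from the file extension and return plain Python data."""
    try:
        loader = LOADERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported config format: {path.suffix}") from None
    return loader(path.read_text(encoding="utf-8"))
```

The three formats do not map onto identical type systems (TOML has native datetimes, for example), so "the same data" holds only for the subset of types they share.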
by taspeotis on 1/23/24, 11:36 AM
Yelling At My Laptop
by hosh on 1/23/24, 4:30 PM
This article is not convincing me about yaml or json templating.
The only thing I can think of is that generating these files requires picking a language platform. I chose Ruby to generate the k8s manifests I need.
If you are picking something that is meant to be language-agnostic, or to have very little ramp-up time, then sure, templating. It just comes at a cost where the templating language itself approaches that of a full blown, Turing-complete language as more features get added, often with shaky foundations (such as HCL).
by wodenokoto on 1/23/24, 3:32 PM
Like many other things, Azure services can be deployed using JSON. Of course it’s not just json, it’s an entire language of deployment definitions and templating language hidden within json markup.
But next to that, Microsoft came out with bicep, which is a domain specific language for defining resources. It comes with a full language server and is honestly quite nice to use (if only azure services had some sort of reasonable logic to them)
I think k8 and friends need their own biceps.
by throwawaaarrgh on 1/23/24, 3:48 PM
If you can write Python, Perl, Ruby, etc. (hell, even yq in the shell), then you have a full programming language that can output YAML or JSON in any way you want. No weird DSL, no twisting yourself into knots.
Just write normal code, make any data structure, print it as any data format. Call the code, output to temp file, use file, delete file.
Is it clunky? Yes. But it works, and you can't get any simpler.
Keep it simple, silly.
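A rough sketch of that workflow (build a data structure, write it to a temp file, use the file, delete the file), assuming Python and only the standard library. The function names and the config fields are made up for illustration; the sketch emits JSON because that needs no third-party package.

```python
import json
import os
import tempfile


def build_config(env):
    """Plain code instead of a template: conditionals and loops are just Python."""
    return {
        "env": env,
        "replicas": 3 if env == "prod" else 1,
        "services": [{"name": f"worker-{i}", "port": 8000 + i} for i in range(2)],
    }


def with_rendered_config(env, consumer):
    """Write the config to a temp file, pass its path to `consumer`, then delete it."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(build_config(env), fh, indent=2)
        return consumer(path)
    finally:
        os.remove(path)
```

Here `consumer` stands in for whatever tool actually reads the file; it is a hypothetical callback, not a real API.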
by anon-3988 on 1/23/24, 1:53 PM
I have no idea why people are so against code generation. If I want a hundred different kinds of YAML or code, I just fully generate them.
by Hendrikto on 1/23/24, 11:40 AM
Why do we need a special purpose language for this? Just for the slightly nicer syntax? Is that worth learning a completely new language?
by tristenharr on 1/23/24, 4:35 PM
Clearly lots of people have tried to replace YAML with something else and that hasn’t worked. What’s your wishlist on making YAML actually work for declarative systems? Or can it work?
Would things like a great LSP with semantic autocomplete, native cross-file imports, and conditionals based on environment make things feel different/better?
What should modern declarative systems be doing in your opinion?
by gtirloni on 1/23/24, 4:31 PM
I think we're barking up the wrong tree here (speaking about Kubernetes workloads).
This ugly mess of low-ish level details will never go away. Developers that are trying to focus on developing apps will never enjoy these things and that's fine.
Something like score.dev which abstracts things even further seems to be the way to go as the interface that is exposed to developers.
by transfire on 1/23/24, 12:56 PM
Is Kubernetes using YAML 1.1 still? Because some of the complaints I hear shouldn’t be an issue with 1.2.
Moreover the YAML spec allows for specifying the tags recognized per application.
So on two counts, if “on”, “true”, “false”, as well as “yes”, “no”, “y”, “n”, “off”, and all capitalized and uppercase variants, are all boolean literals, it is not YAML’s fault.
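To make the version difference concrete, here is a small Python sketch that checks a plain scalar against the boolean pattern of the YAML 1.1 bool type and against the YAML 1.2 core schema. It is not a YAML parser, the helper name `is_bool_scalar` is an assumption, and individual parsers may deviate from either pattern.

```python
import re

# Boolean pattern from the YAML 1.1 "bool" type definition.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)

# Boolean pattern from the YAML 1.2 core schema.
YAML12_CORE_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")


def is_bool_scalar(text, version="1.1"):
    """Return True if an unquoted scalar would resolve to a boolean."""
    pattern = YAML11_BOOL if version == "1.1" else YAML12_CORE_BOOL
    return pattern.fullmatch(text) is not None


for scalar in ("on", "NO", "y", "true"):
    print(scalar, is_bool_scalar(scalar, "1.1"), is_bool_scalar(scalar, "1.2"))
```

Under the 1.2 core schema only the true/false spellings remain booleans, which is why the answer to the version question matters.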
by PaulHoule on 1/23/24, 12:24 PM
You’re going to hate absolutely everything involved in configuring Kubernetes because you’re configuring Kubernetes.
by jaxxstorm on 1/23/24, 3:04 PM
Author of the post here.
It's always been relatively shocking to me that this is still relevant 4 years after I wrote it. Helm is as ubiquitous as ever, despite attempts to replace it with Jsonnet, Cue and programming languages.
I've personally moved on from Jsonnet and would recommend Pulumi to anyone experiencing this problem.
by pyuser583 on 1/23/24, 4:34 PM
Yeah I knew that answer would be Helm Charts.
I love Helm. Helm is magic. Helm makes my life tolerable.
But I’ve templated way too many YAML files.
If it were up to me XML would be default and we would use XSLT for templates.
I’m pretty sure you can use XSLT on JSON, which is logically identical to YAML. So something could be worked out.
Maybe that should be my next open source project.
by alkonaut on 1/23/24, 5:07 PM
Two things need to die and die quickly:
1) “blind configuration” Such as a CI pipeline config that you commit but can’t validate before you do.
2) Complex markup for config. Anything that can’t fit into a single screen of flat .toml or .ini style config should be code.
by reissbaker on 1/23/24, 12:50 PM
I think Steve Yegge got it right when he wrote:
I know, I know — everyone raves about the power of separating your code and your data . . . But it's [not] what you really want, or all the creepy half-languages wouldn't all evolve towards being Turing-complete, would they?
https://sites.google.com/site/steveyegge2/the-emacs-problem
Templating YAML is the same, but... Honestly, Jsonnet is too. If you're going to generate JSON — by god use a normal programming language. You have to teach your team one fewer thing (approximately no one knows Jsonnet); it already integrates with your existing build system; if you wrote a useful util function in your main codebase you can reuse it; if you have a typechecker or a linter you can use it; etc etc.
by hellodanylo on 1/23/24, 4:08 PM
If you need anything more complicated than simple $var substitution, it's time to use a general purpose scripting language with appropriate libraries to generate your data structure. A half-baked template DSL will never work.
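For reference, the "simple $var substitution" end of that spectrum can be sketched with Python's standard `string.Template`. The template text, image name, and variable names below are invented for illustration.

```python
from string import Template


def render(template_text, variables):
    # substitute() raises KeyError on a missing variable instead of
    # silently leaving a hole in the rendered output.
    return Template(template_text).substitute(variables)


TEMPLATE = "image: registry.example.com/$app:$tag\nreplicas: $replicas\n"
print(render(TEMPLATE, {"app": "web", "tag": "1.4.2", "replicas": 2}))
```

There are no loops or conditionals here; once those are needed, building the data structure in code and serializing it is the approach the comment recommends.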
by hiAndrewQuinn on 1/23/24, 11:49 AM
In my case, I'm templating YAML because the Obsidian Templater plugin can read YAML frontmatter it asks you for and then fill in a Markdown file with the Mad Libs you choose to populate it with.
I understand this is a niche use case.
by osigurdson on 1/23/24, 2:36 PM
Perhaps we need something along the lines of an infrastructure description language. Some of these yamls get pretty long. Using a real language (like Python) is probably not constrained enough however.
by andix on 1/23/24, 12:59 PM
A lot of people don't know that JSON is valid YAML. You can mix JSON into your YAML wherever you need it.
So if you have issues expressing something complex in YAML (like nested arrays) just write JSON instead.
by Alifatisk on 1/23/24, 11:29 AM
If you start templating yaml then you might want to give jsonnet a try
by pachico on 1/23/24, 11:59 AM
I keep coming back to tanka (https://tanka.dev/) and hoping it had more traction in the industry.
by HeckFeck on 1/23/24, 12:30 PM
I'd like to quote a snarky sig I once read on Slashdot:
> You know what's good about YAML? No, me neither.
Personally, I love using a Perl hash for my config files. But maybe I'm just mad.
by davedx on 1/23/24, 12:30 PM
Yeass. I encountered jsonnet thanks to Ory Kratos, it’s great. Yaml is an awful hack that’s only still around in devops because of chance and circumstance. I hate IT.
by imajoredinecon on 1/23/24, 4:52 PM
It’s funny that the post ends up recommending Jsonnet, which is a very close relative of the Google-internal “GCL,” which everyone at Google hates
Pick your poison, I guess.
by schappim on 1/23/24, 5:16 PM
The Sidekiq configuration file is a great example of when you’d want to template YAML.
In that instance the queue name is derived dynamically from the hostname.
by danfritz on 1/23/24, 2:45 PM
Give it a couple of months: Why the f*ck are we templating JSON
This article feels very ironic to me hence the sarcasm.
Give me code as configuration and I'll be interested
by neilwilson on 1/23/24, 1:08 PM
Obligatory reference to Rubynetes - using Ruby and rspec to generate the YAML files that k8s loves.
[0]: https://www.brightbox.com/blog/2020/02/24/rubynetes-getting-...
[1]: https://www.brightbox.com/blog/2020/02/12/rubynetes-kubernet...
[2]: https://www.brightbox.com/blog/2020/02/17/using-openapi-to-v...
by Sharlin on 1/23/24, 12:13 PM
I guess templating your configuration is just an instance of the old wisdom that any problem is solvable by adding a layer of indirection.
by oalders on 1/23/24, 8:09 PM
YAMLScript now exists: https://yamlscript.org/
by boringuser2 on 1/23/24, 6:12 PM
Config-as-code is this strange nightmare made for non-programmers to trick them into writing extractable formal logic.
by elicox on 1/23/24, 11:21 AM
For me, environment variables are super simple and remove all the complexity of these configuration files.
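A minimal sketch of what that can look like in Python, assuming hypothetical variable names (`APP_HOST`, `APP_PORT`, `APP_DEBUG`) and defaults chosen only for illustration:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    debug: bool


def load_settings(environ=os.environ):
    """Read settings from a mapping of environment variables, with defaults."""
    return Settings(
        host=environ.get("APP_HOST", "127.0.0.1"),
        port=int(environ.get("APP_PORT", "8080")),  # env values are always strings
        debug=environ.get("APP_DEBUG", "0") == "1",
    )
```

Everything arrives as a flat set of strings, so this stays simple only as long as the configuration itself is flat.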
by booleandilemma on 1/23/24, 1:21 PM
There are people out there who, for various reasons, like to make things more complex than they have to be.
by tonnydourado on 1/23/24, 12:41 PM
Helm becoming a de facto tool is up there with null in the top biggest mistakes our industry committed.
by Ensorceled on 1/23/24, 12:54 PM
I feel like I wrote something on usenet with a similar title about imake in the early 90's ...
by rco8786 on 1/23/24, 12:29 PM
I guess I'll ask the dumb question - why not use Javascript directly instead of Jsonnet?
by ryandv on 1/23/24, 5:01 PM
I'm reminded of DHH's article, "Rails is Omakase" [0] back during the time when "convention over configuration" [1] was a common refrain, meant to avoid the proliferation of (YAML or XML) configuration files by assuming sensible defaults and pre-selecting various parts of the solution stack or architecture, instead of letting it be freely specified by the developer.
You lose a few degrees of freedom and flexibility in your implementation this way, but at the same time you also don't need to wade through pages and pages of configuration documents.
Everything is cyclical. I'm waiting for the next "omakase" offering that provides a sane low-configuration platform for building "cloud native" apps. Right now it looks like we're in an analogue of the XML hell that prompted the design philosophy of Rails and "convention over configuration."
[0] https://dhh.dk/2012/rails-is-omakase.html
[1] https://wiki.c2.com/?ConventionOverConfiguration
by brainwipe on 1/23/24, 11:58 AM
Having XSLT flashbacks. Send help.
by byyoung3 on 1/23/24, 3:15 PM
any1 want to invent jsonc with me? It will be json but with comments
by motbus3 on 1/23/24, 2:53 PM
Yaml template is horrible but jsonnet also hurts my feelings :/
by enonimal on 1/23/24, 4:50 PM
JSON not including comments has caused so much labor for the world.
by rockwotj on 1/23/24, 12:06 PM
Honestly I find all these different config languages either too much to learn or they get too unwieldy quickly.
I have arrived at the conclusion that using TypeScript to generate JSON is the ultimate solution.
Easy JSON support, optional typing, you already know it, and adding reusable functions and libraries is understandable. Just prevent external node modules, and have a tool that takes a typescript file with a default export of some JSON and renders the JSON to a string on stdout.
by gsky on 1/23/24, 4:55 PM
I forgo Kubernetes because of my hate for YAML.
by kube-system on 1/23/24, 3:53 PM
Templated yaml is readable if you do it right.
by liveoneggs on 1/23/24, 12:35 PM
I, too, loved hiera and puppet.
by sigwinch28 on 1/23/24, 12:04 PM
Relevant: https://noyaml.com/
YAML and its ecosystem is full of footguns and ergonomics problems, especially when the length of the document extends beyond the height of a user's editor or viewport. Loss of context with indentation, non-compliant or unsafe parsers, and strange boolean handling to name a few.
It becomes even worse when people decide that static YAML data files should have variable substitution or control flow via templating. "Stringly-typed programming" if you will. If we all started writing JSON text templates I think a lot of people would rightly argue we should write small stdlib-only programs in Python, Typescript, or Ruby to emit this JSON instead of using templated text files. Then it becomes apparent that the YAML template isn't a static data file at all, but part of a program which emits YAML as output. We're already exposing people to basic programming if we're using YAML templates. People brew a special kind of YAML-templated devops hell using tools like Kustomize and Helm, each of which is "just YAML" but is full of idiosyncrasies and tool-specific behaviour which make the use of YAML almost coincidental rather than a necessity.
Yes, sometimes people would prefer to look at YAML instead of JSON, in which case I suggest you use a YAML serialization library, or pipe output into a tool like `yq` so you can view the pretty output. In a pinch you could even output JSON and then feed it through a YAML formatter.
The Kubernetes community seems to have this penetrating "oh, it's just YAML" philosophy which means we get mediocre DSLs in "just YAML" which actually encode a lot of nuanced and unintuitive behaviour which varies from tool to tool.
Look at kyverno, for example: it uses _parentheses_ in YAML key names to change the semantics of security policies! https://kyverno.io/docs/writing-policies/validate/ . This is different to (what I think are the much better ideas of) something like kubewarden, gatekeeper, or jspolicy, which allow engineers to write their policies in anything that compiles to WASM, OPA, and Typescript/Javascript respectively.
We engineers, as a discipline, have decades of know-how building and using general purpose programming languages with type checkers, linters, packaging systems, and other tools, but we throw them all away as soon as YAML comes along. It's time to put the stringified YAML templates away and engage in the ecosystem of mature tools we already know to perform one simple task they are already good at: dumping JSON on stdout.
Let's move the control flow back into the tool and out of the YAML.
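As a hedged illustration of "a small stdlib-only program that dumps JSON on stdout", here is a Python sketch in which the control flow lives in the program rather than in a text template. The `deployment` helper, the image name, and the per-environment rules are assumptions made up for this example; the field layout is meant to resemble a Kubernetes Deployment but has not been validated against the API schema.

```python
import json
import sys


def deployment(name, image, env):
    """Build a Deployment-shaped dict; branching happens in ordinary code."""
    replicas = {"prod": 3, "staging": 2}.get(env, 1)
    container = {"name": name, "image": image}
    if env != "prod":
        # Extra debug logging everywhere except production (illustrative rule).
        container["env"] = [{"name": "LOG_LEVEL", "value": "debug"}]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": {"app": name, "env": env}},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    env = sys.argv[1] if len(sys.argv) > 1 else "dev"
    manifest = deployment("web", "registry.example.com/web:1.4.2", env)
    json.dump(manifest, sys.stdout, indent=2)
    sys.stdout.write("\n")
```

Because the result is an ordinary dict until the last line, it can be unit-tested, linted, and type-checked like any other code before anything is serialized.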
by shp0ngle on 1/23/24, 5:45 PM
now I play a bit with nix and its config language.
i like yaml now.
by mordae on 1/23/24, 1:32 PM
The real question here is, why the fuck are you not using a restricted LISP dialect, generating an S-expression using macros.
by armchairhacker on 1/23/24, 11:39 AM
To me YAML seems like the CoffeeScript of JSON, and unlike CoffeeScript I don’t understand why people are still using it.
I guess XML and JSON are too verbose. But YAML is so far in the opposite direction, we get the same surprise conversions we’ve had in Excel (https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-fr...). Why is “on” a boolean literal (of course so are “true”, “false”, as well as “yes”, “no”, “y”, “n”, “off”, and all capitalized and uppercase variants)? And people are actually using this in production software?
Then when you add templating it’s no longer readable and concise anyways. So, why? In JSON, you can add templating super easily by turning it into regular JavaScript: use global variables, functions and the like. I don’t understand how anyone could prefer YAML with an ugly templating DSL over that.
And if you really care about conciseness, there’s TOML. Are there any advantages of YAML over TOML?
by lasermike026 on 1/23/24, 11:41 AM
YAML is for humans. JSON is for computers. I love YAML.
by jmbwell on 1/23/24, 3:14 PM
If only we had yet another system to solve this problem. Oh look, the author has one.
xkcd 927.
by usrbinbash on 1/23/24, 3:26 PM
The problem is very specifically the fact that YAML, as a config language, sucks.
I have no idea why people started using it. "bUt jSOn dOeSn'T HaVe cOmMenTS" ... oh gimme a break! You want a comment in JSON?
```
    {
      "//": "This is a comment explaining key1.",
      "key1": "value1",
      "//": "This is a comment explaining key2.",
      "key2": "value2"
    }
```
There. Not so hard. Writing a config parser that just ignores all keys starting with "//" is trivially easy...if it's necessary to ignore them at all that is, because most config parsers I have seen couldn't care less about unknown keys, let alone repeated keys.
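A sketch of such a parser in Python, using the standard `json` module's `object_pairs_hook`. The "//" convention is the commenter's, not a JSON feature, and `load_commented_json` is a hypothetical helper name. Note that how repeated keys are handled is parser-dependent in general; this hook sees every pair, so all "//" entries are dropped before duplicates can collide.

```python
import json


def _drop_comment_keys(pairs):
    # `pairs` holds every key/value of one JSON object in document order,
    # including repeated keys, so every "//" entry is discarded here.
    return {key: value for key, value in pairs if not key.startswith("//")}


def load_commented_json(text):
    """Parse JSON, ignoring any object key that starts with "//"."""
    return json.loads(text, object_pairs_hook=_drop_comment_keys)


SAMPLE = """
{
  "//": "This is a comment explaining key1.",
  "key1": "value1",
  "//": "This is a comment explaining key2.",
  "key2": "value2"
}
"""
print(load_commented_json(SAMPLE))
```

The hook is applied to every object in the document, so comment keys inside nested objects are removed as well.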
So what other "reasons" were there for YAML?
Oh, the human readability thing. Yeah. Because syntactically relevant whitespace is such a joy in a data serialization format. It's bad enough when a programming language does that (and I am saying this as someone who likes python), but a serialization format? Who thought that would make things easier?
And then of course there are other things filled with joy and happiness...like the multiple ways to write "true" and "false", because that's absolutely necessary for some reason.
"Oh but what about strict-yaml?!" I hear the apologists coming...great, so now I have the ambiguity of which parser is used on top of the difficulties introduced by the language itself. Amazing stuff. If that's the solution, then I'd rather not have the problem (aka. the language).
But despite all[1] these[2] problems[3] and more, YAML somehow became the go-to language for configuring pretty much everything in DevOps, first in containerization, then in cloud, and everything in between. And as a result, we now have to make sure our config template parsers get whitespace right. Great.
So bottom line: The problem here is maybe 1/3 the complexity of config files and 2/3 the fact that YAML should never have been used as a configuration format in the first place. Its benefits are too small, and its quirks cause too many problems for that role, outside of really trivial stuff like a throwaway Dockerfile.
Want config? Use JSON. And if you need something more "human friendly", use TOML.
[1]: https://github.com/cblp/yaml-sucks
[2]: https://changelog.com/posts/xml-better-than-yaml
[3]: https://noyaml.com/
by asylteltine on 1/23/24, 2:08 PM
So true. I abhor yaml. It’s impossible to know correct indentation without a plugin that shows it, which isn’t available in many cases where you edit yaml. It’s whitespace sensitive. The data types are not obvious. It’s just all-around bad.
I love json. It’s explicit and easy to read. We should just be using json for everything that needs to be human readable.
by spacebanana7 on 1/23/24, 11:25 AM
I wonder whether doing away with config altogether is a solution.
Just build an application that calls AWS APIs directly when you want to deploy or update an environment.