by karagenit on 2/1/24, 3:46 PM with 398 comments
by schacon on 2/1/24, 5:26 PM
The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line. So in the case of this example commit, all you'd see is the very simple message about a generic "US-ASCII error" problem. Everything they talk about in this article is what is great about the _rest_ of the commit message, which, given modern tools, is _almost never_ seen by anyone.
The main problem is that Git was built so that the commit message is the _email body_, meant to be read by everyone in the project. But for better or worse, that is not generally the role of this text today. Almost nobody ever sees it. Unless it's discussed in a bunch of patch series over a mailing list, nobody reads anything other than the first 50 chars of the headline. It's actively difficult to do, by nearly every tool built around the Git ecosystem.
Even if you're _very good_ at Git, the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to find the messages relevant to the code blocks you care about is not widely known, and even once you find it, the output still only shows the first line. Then you need to "git show" the identified commit SHA to get the long-form message. There is just no good way to find this information, even if it's well written.
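As a rough sketch of the two-step lookup described above (blame first, then show), the Python below pulls the distinct commit SHAs out of `git blame --porcelain` output so that each one can be passed to `git show`. The function name `shas_from_porcelain` and the sample text are made up for illustration. `-w`, `-C` and `--porcelain` are real `git blame` options, but which combination is the "correct" one for a given repository is left open here, just as it is in the comment.

```python
import re

# A porcelain-format header line: 40-hex SHA, original line number,
# final line number, and (on the first line of a group) a line count.
HEADER = re.compile(r"^([0-9a-f]{40}) \d+ \d+(?: \d+)?$")


def shas_from_porcelain(porcelain: str) -> list[str]:
    """Return distinct commit SHAs in order of first appearance.

    `porcelain` is assumed to be the text printed by something like
    `git blame -w -C --porcelain <file>` (illustrative invocation).
    """
    seen: list[str] = []
    for line in porcelain.splitlines():
        match = HEADER.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Illustrative sample only, not output from a real repository.
SAMPLE = "\n".join([
    "a" * 40 + " 1 1 2",
    "author Alice",
    "summary Convert template to US-ASCII to fix error",
    "filename example.erb",
    "\tfirst line",
    "a" * 40 + " 2 2",
    "\tsecond line",
    "b" * 40 + " 3 3 1",
    "author Bob",
    "summary Add template",
    "filename example.erb",
    "\tthird line",
])

for sha in shas_from_porcelain(SAMPLE):
    # Each SHA can then be fed to `git show -s --format=%B <sha>`
    # to print the full commit message rather than just the subject.
    print(sha)
```

Even with a helper like this, the long message is still one extra command away per commit, which is the friction being complained about.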
This is one of my biggest complaints with Git (or, indeed, any VCS before it), and I think it's why people just don't care much about good commit messages. It's just not easy to get this data back once it's written.
If you want an example of this, search through the Git project's history. Run a blame on any file. It's _so hard_ to figure out a story of any function implementation in any file, but the commit messages are _pristine_. Paragraphs and paragraphs of high quality explanation for almost every single commit. Look at any single commit that Jeff King has done for the last decade. Hundreds of hours of amazing documentation from a true genius that almost nobody will ever appreciate. It's horrifying.
I don't know exactly what the answer is, but the sad truth of Git is that writing amazing documentation via commit message, for most communities, is almost entirely a waste of time. It's just too difficult to find them.
by rpsw on 2/1/24, 4:49 PM
Tells me exactly what the problem was straight away, but I'm still free to choose to read more if I want to know more.
by spenczar5 on 2/1/24, 4:22 PM
It’s a bit sad, but I have a growing suspicion that beautiful commit messages are a bit of vanity by the programmer. The person primarily impressed is often the author; others will walk on by without noticing.
There is room sometimes for those aesthetic flourishes but I am not convinced they have much practical value, and I have stopped really being bothered by commit messages of “fix whitespace issue” from others. I think I am a better colleague for that.
Things might be different on a project like Git or Linux with huge distributed teams and tons of commits, versus the projects I am used to which have between 1 and 100 contributors, mostly from the same organization.
by gumby on 2/1/24, 5:30 PM
The key is not to put what you did in that first line, but why. Anyone interested in the "what" can just look at the code, perhaps via a diff.
So something like "nginx .conf files must be in us-ascii"
Then "changed blahblah.erb to remove nonbreaking space character"
Then the rest of the commit message which is quite good.
Think of it as a news article: write in decreasing levels of importance and increasing levels of detail, assuming the reader could stop reading at any point.
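A minimal sketch of that ordering, assuming a hypothetical helper named `build_message`: the "why" goes on the first line, the "what" follows, and the long explanation comes last, with blank lines between the parts as Git convention expects. The two strings are taken from the example above; the third is a placeholder.

```python
def build_message(why: str, what: str, details: str = "") -> str:
    """Assemble a commit message in decreasing order of importance.

    The names and the three-part split are illustrative assumptions,
    not a Git feature: Git itself only treats the first line as the
    subject and everything after the blank line as the body.
    """
    parts = [why.strip(), what.strip()]
    if details.strip():
        parts.append(details.strip())
    return "\n\n".join(parts) + "\n"


message = build_message(
    why="nginx .conf files must be in us-ascii",
    what="changed blahblah.erb to remove nonbreaking space character",
    details="(the rest of the original commit message would go here)",
)

# A reader who stops after the first line still learns the reason.
print(message.splitlines()[0])
```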
by adrianmsmith on 2/1/24, 4:46 PM
(I mean you could obviously with "rebase" but are you really going to alter something written one year ago, already merged to "main", and cause a bunch of pain with everyone's feature branch etc.?)
Compare that with documentation stored in a .md file, or even a Wiki or even Confluence. My colleague can write something and if I see a way to improve it I can go ahead and do that, and other colleagues can improve on what I've written.
In this particular case I suppose the bug is fixed and won't come up again. But I also find it tempting to describe the design of a particular component when I commit that component, and that's something I now avoid. What about when that component needs to be changed by a future commit, e.g. due to the business requirements changing? Will the commit documentation just describe the differences? Then in order for a new team member to find out how the system works by reading the documentation they've got to read multiple commit messages and "merge" them in their head.
by OJFord on 2/1/24, 4:19 PM
> I wouldn’t expect all commits (especially ones of this size) to have this level of detail.
(emphasis added) - actually in my experience it's often the little ones, innocuous looking things that might really need a relatively longer explanation.
Yesterday I wrote three paragraphs on why I added `--limit=999` to a `gh pr list` because it's confusing: there's already a `limit(` in the `--jq` argument, and the higher it is (given say infinite PRs in total) the lower the end result will actually be. (Yes I wrote a comment too. And probably spent even longer thinking about and working it up than writing about it; hopefully I'll recall it as an example the next time someone implies the job is about churning out code!)
by macspoofing on 2/1/24, 7:14 PM
1) For all that text, the first line "Convert template to US-ASCII to fix error" - could be better. Maybe a couple of extra words to state what whitespace character caused the error, and what the error was. That comment plus the diff is all the context you need.
2) Honestly, everything else is kind of pointless. It doesn't hurt, but there's not a lot of value here. The author documented their journey in tracking this bug .. who cares?
by bhasi on 2/1/24, 5:06 PM
The first line always mentions the subsystem affected by the change, followed by a one-line imperative-mood summary of the change. Subsequently, three questions are answered in as much detail as possible:
1. What is the current behaviour?
2. What led to this change?
3. What is the new behaviour after applying this change?
Example:
"Currently, code does X. When running test case T, unexpected behaviour U was observed. This is because of reason R. Fix this by doing F."
by RustyRussell on 2/1/24, 10:43 PM
If about existing code, the comment belongs with the code. If it's a process thing (e.g. code that is removed or didn't work), it belongs in the commit.
Most importantly, while commit messages can reference issues for convenience, they MUST reproduce the critical details: GitHub is transient, git messages are not!
by keybored on 2/1/24, 4:38 PM
This is so common that the maintainer wrote this[2]
[1] https://github.com/git/git/commit/d70f554cdf38b0b05cfaa8e8eb...
by simonw on 2/2/24, 1:20 AM
Then a friend pointed out that I was effectively writing documentation and hiding it in commit messages.
Instead, I switched a lot of that effort to updating actual documentation (in a docs/ folder) that was relevant to the commit - so the commit would still have the information in it, it's just it was in an actual file and not just the commit message.
I also make sure my commits almost always link to an issue thread, as that's a great place to put all kinds of extra context around the commit that can be updated independently of the commit itself.
by ryandrake on 2/1/24, 6:09 PM
"ArgumentError: Invalid byte sequence in US-ASCII" is a terrible, hard-to-action error message. What file? What line? What byte sequence? This "let's give the user another problem to solve" style of error messages is pervasive in our tools.
Also, why does the tool even require US-ASCII as input in the first place? Are we still living in 1995?
Also, if only ASCII characters are allowed, why does the code editing tooling allow non-breaking spaces in source code? Is there a good reason for having such a character in this file? This problem could have been avoided if the editor could have been smarter or highlighted the "bad" character better.
This developer lost an hour of his life because of a cascading chain of defective tools.
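For comparison, here is a sketch of the kind of report the questions above ask for: file, line, column, and the offending byte. It is not how the tool in the article works; `find_non_ascii` and the sample config are hypothetical, and columns are counted in bytes.

```python
def find_non_ascii(data: bytes, name: str = "<input>") -> list[str]:
    """List every byte above 0x7F with its file, line, and byte column."""
    problems = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                problems.append(
                    f"{name}:{line_no}:{col}: non-ASCII byte 0x{byte:02X}"
                )
    return problems


# Made-up config containing a non-breaking space (U+00A0), which is
# encoded in UTF-8 as the two bytes 0xC2 0xA0.
sample = "server {\n  listen\u00a080;\n}\n".encode("utf-8")

for problem in find_non_ascii(sample, "example.conf"):
    print(problem)
# example.conf:2:9: non-ASCII byte 0xC2
# example.conf:2:10: non-ASCII byte 0xA0
```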
by mrinterweb on 2/1/24, 6:45 PM
by krmbzds on 2/1/24, 5:49 PM
Also, if you're on macOS just use a Karabiner rule [0] that converts all non-breaking space characters to regular space characters to prevent yourself from accidentally typing it out.
[0] https://ke-complex-modifications.pqrs.org/#nonbreaking_space
by gtirloni on 2/1/24, 5:22 PM
The worst I've seen are dozens of tiny commits pushed to the master branch directly. If you want to find out what it took to implement a feature, good luck.
I'm a fan of tiny commits during code review but afterwards I prefer to squash everything into a functionally relevant commit. It makes git archeology much easier.
by dgunay on 2/1/24, 11:00 PM
by aeurielesn on 2/1/24, 4:18 PM
by nickm12 on 2/2/24, 3:51 AM
The commit message should explain the change being made, what impact it will have, and why it is being made. The audience is the developers reviewing the change or someone looking through the logs to determine why this line changed.
by Forge36 on 2/1/24, 7:00 PM
by AeroNotix on 2/1/24, 5:34 PM
Do with this information what you will.
9 times out of 10 the code is already documentation enough for what the code currently does. If you need to go back through history, then look at the commits. The messages are generally noise.
I've literally never cared _why_ someone made a change, I can see the change, I can see the effect of the old and new code. Rarely, if ever, has the thought process ever changed how I will interact with the code in question.
If I am at the level of debugging or history spelunking that the _commit message_ is the thing that saves me - I've already lost and there are other glaring organizational or design issues that are the actual problem.
by tehnub on 2/1/24, 7:29 PM
by thrdbndndn on 2/1/24, 5:55 PM
by WalterBright on 2/1/24, 4:58 PM
P.S. Having multiple Unicode values that look identical when displayed is a huge veer-into-the-ditch mistake. I.e. the notion that code points should have semantic value is simply wrong.
by eduction on 2/1/24, 5:47 PM
BUT it's not super useful to say this is good, the hard part is knowing WHEN to put this much effort in and when you can skip it.
I have many instances where I could do a longer story like this but it would be exhausting to do it every time. I try to do it when the commit might look unclear in intent or effect to an outsider, when the change is being made for an important reason, /and/ where there is a potentially negative consequence (like naively reverting or writing bad code) if the change is not explained.
I think this is a decent example of that but not great, because no one is intentionally going to go in and start introducing nonbreaking spaces.
by fl0ki on 2/1/24, 6:59 PM
This will typically also involve completely unnecessary changes, because when you're merging unrelated changes anyway, the unnecessary ones are swept up in the noise. At best they complicate rebases for other contributors, but too often they also cause outright regressions.
It goes without saying that the commit messages are a write-off at this point, because even if they felt motivated to take the time to comment it clearly, the change is so messy and nebulous that it becomes hard to comment on. If their code gets reviewed it's more likely the reviewer gives up and stamps it so they no longer have to look at it.
Most people still don't seem to know that `git add -p`, `git reset HEAD`, `git stash`, `git rebase --interactive`, etc. are even available. They never learn what git is capable of, so they act like version control is a bureaucratic obligation rather than the peerless superpower that it can be. The problems they cause don't end at their terminal though, because now they've made a mess of the repository for every other contributor as well.
by 2devnull on 2/1/24, 8:19 PM
by mo_42 on 2/1/24, 5:28 PM
I think commits should be small steps that display the thought process of the author. Every individual commit should be self-explanatory. So the commit message should not describe (again) what the changes are but why they’re necessary. Sometimes the change is not self-explanatory and then I'd put a longer description below.
Somehow I came up with this on my own, so I'd be interested if it really makes sense or if others have a similar style.
by tomcam on 2/1/24, 11:34 PM
by schnatterer on 2/2/24, 2:06 PM
How to Write a Git Commit Message / The seven rules of a great Git commit message https://cbea.ms/git-commit/#seven-rules
by MeteorMarc on 2/1/24, 6:17 PM
by daitangio on 2/1/24, 8:21 PM
by eternityforest on 2/3/24, 8:06 AM
Something like a VS Code command that would add to the message for the commit I'm working on.
by chjj on 2/1/24, 5:12 PM
Something like:
syn match unicodeWhitespace /[list of unicode whitespace]/
hi def link unicodeWhitespace Error
by Zambyte on 2/1/24, 4:39 PM
by pawelmi on 2/1/24, 10:00 PM
by dlvhdr on 2/1/24, 4:21 PM
by darioush on 2/1/24, 7:50 PM
by andrewfromx on 2/1/24, 4:57 PM
https://github.com/andrewarrow/feedback/commits/main/
hp | git commit -a -F -
hp is a golang binary that just spits out a hacker phrase. I have this aliased with the letter q for "quick" so I'm always checking in stuff with q return push done.
by jeffrallen on 2/2/24, 6:51 AM
by merdaverse on 2/2/24, 10:37 AM
If you really like the descriptive, verbose message, then the most important description should be at the top and it should gradually go into less interesting details as you read down, like in a news article.
by eviks on 2/1/24, 6:07 PM
by excitom on 2/1/24, 4:39 PM
by hartator on 2/1/24, 7:51 PM
No need for a very lengthy explanation.
by gfiorav on 2/1/24, 6:48 PM
by ajuc on 2/1/24, 5:01 PM
But I disagree about the format. I prefer commit messages like:
JIRA-123 one-line 80-char-at-most description
Long description if needed (but preferably keep it in JIRA).
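A format like this is easy to check mechanically. The sketch below is one possible checker, not an existing tool: the name `check_subject`, the regular expression for the ticket key, and the error wording are all assumptions. Only the "ticket key first, 80 characters at most" rule comes from the format above.

```python
import re

# Assumed ticket-key shape: uppercase project key, dash, number,
# then a space and the description (e.g. "JIRA-123 Fix the thing").
SUBJECT = re.compile(r"^[A-Z][A-Z0-9]*-\d+ \S.*$")


def check_subject(message: str, max_len: int = 80) -> list[str]:
    """Return a list of problems with the first line of `message`."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    errors = []
    if not SUBJECT.match(subject):
        errors.append("subject should start with a ticket key such as JIRA-123")
    if len(subject) > max_len:
        errors.append(f"subject is {len(subject)} characters, limit is {max_len}")
    return errors


print(check_subject("JIRA-123 Remove non-breaking space from nginx template"))
print(check_subject("fix whitespace issue"))
```

A check like this could run from a `commit-msg` hook, which Git invokes with the path of the file holding the proposed message.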
by smashah on 2/1/24, 7:10 PM
by vonwoodson on 2/1/24, 7:37 PM
by gavinhoward on 2/1/24, 6:39 PM
by fusslo on 2/1/24, 4:47 PM
As an aside, I'm tired of documenting:
- in code
- in commits
- in jira
- in confluence
- in daily standups
- in release notes
by jiveturkey on 2/1/24, 10:41 PM
by declan_roberts on 2/1/24, 6:59 PM
I must be getting old because I hate this.