from Hacker News

Resistance Against Git Merge Hell (2015)

by qsantos on 5/1/24, 4:02 AM with 111 comments

by pjc50 on 5/1/24, 10:58 AM
This is a very short advert for rebasing your personal branch.
People seem to have very strong opinions about this. I don't much, either way; I have to use the gerrit flow at work, which mandates a lot of rebasing, and in general I prefer not having a commit which only records a merge. Some people seem to want highly detailed tracking of what an individual developer has done in their personal checkout. I will only note that the act of observing changes what is observed.
by wobblyasp on 5/1/24, 10:12 AM
Clean is such a weasel word. Does it matter if your visual tooling produces a series of squiggly lines?
The information is all there. You can grep it to find specific commits.
This isn't a hill worth dying on with your team if it's their MO already.
by INTPenis on 5/1/24, 11:33 AM
I was guilty of merge hell until I learned about rebase.
I think it's a natural instinct to want to protect your production environment so I'm sure others have made the same mistake as I did. Main is my default branch, I don't want anything pushed into the default branch to result in a deployment to production.
So naturally I create a new branch called production. The relationship makes sense to me because we develop in main, might even deploy to staging from main, but it's not until we feel we're ready that we merge main with production. And to a user of git this requires a manual step where you explicitly specify the word production, git checkout production.
But that resulted in merge hell until I learned I could just rebase main onto production.
And the command structure is even exactly the same, standing in production branch I either do git merge main, or git rebase main.
This comment was a message to my younger self.
by hardlianotion on 5/1/24, 10:48 AM
Oh. I wanted this to be about the London Tube Map.
by giantg2 on 5/4/24, 3:31 PM
Most merge issues can be avoided with proper architecture and planning. If your files are so large that you have multiple devs working in the same file at the same time on a regular basis, then it's probably time to get more modular.
I honestly hate Git compared to SVN. The Git tool seems technically better. However, the easier aspects of the tool seem to have promoted less planning, more stepping on each other's toes, and what I see as non-value-added work (merges, rebasing, retesting, etc).
by Aethelwulf on 5/1/24, 10:17 AM
Do people really look at their git commit history like this? Why?
by planede on 5/1/24, 11:19 AM
problem statement: git history, as normally presented, is hard to follow and contains too much noise.
article's proposed solution: simplify the git history by destroying information that is irrelevant. That's what rebase is.
The problem is that what is or isn't relevant depends on context. I think the right way to go about it is to simplify the presentation and otherwise improve tooling to get information out of the git history. git already has some ways to filter the history, but it lacks a feature-rich query language like mercurial's. git guis should also step their game up.
by breckenedge on 5/1/24, 12:43 PM
Surprised to not see a mention of the --no-merges flag.
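For readers who have not used it, here is a minimal sketch of what that flag does, run in a throwaway repository. The branch name `topic`, the commit messages, and the identity values are made up for illustration.

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"

git commit -q --allow-empty -m "initial commit"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q main
git commit -q --allow-empty -m "main work"
git merge -q --no-ff -m "Merge branch 'topic'" topic   # a true merge commit

git log --oneline               # every commit, merge included
git log --oneline --no-merges   # same history with the merge commit hidden
```

Nothing is rewritten here; the merge commit is still in the repository and only the listing changes.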
by juped on 5/4/24, 3:41 PM
What a disjointed writeup of a terrible idea. We only use git _because_ its history is a graph rather than a sequence, freeing us from the necessity to ever acquire locks on fellow humans. There's a reason git.git never does anything like this, despite the history being completely constructed by Junio Hamano (who applies patches with git am, rather than taking a set of commits from someone's remote as is typical on Github). A clean history full of useful information looks like git.git's history: nonlinear.
by adityaathalye on 5/4/24, 2:38 PM
Rebasing-only is a manual way to "linearise history". Whereas, a git log is a query target. And the `git-log` tool is one's friend. I bash-alias these for quick use:
```
  git log --oneline # `gl`, to see "linear" history with branch annotations
  git log --oneline --merges # `glm`, to see merge commits only
  git log --oneline --graph # `glg` see the train tracks too, if I need to
```
And I know that git log has me covered, if/when I need to narrow / slice history more.
by stevage on 5/4/24, 1:32 PM
So why can't the tools that display history just filter out all the merge commits? I don't really understand why this requires merges to be done differently at the time.
by cess11 on 5/4/24, 3:29 PM
I have never seen something like the example consisting of pretty much only merges, and most places where I've worked didn't use rebase or squash or whatever it's called. 'fix' and 'did a thing' commit messages have also been uncommon; usually people type in what they actually did, except in one code base, but that was someone who really didn't enjoy software development.
by tilsammans on 5/1/24, 10:42 AM
This doesn't look dirty or confusing to me at all.
by jFriedensreich on 5/4/24, 3:55 PM
To be able to work with large repos i found a few more important strategies:
- in your git client: enable "only follow first parent" for the base branch which only shows merge commits and only open merge commit history as needed/ when relevant, i'm only aware of fork.app doing this nicely but should be an option in more git clients
- hide all branches by default and only show your base branch and your own work, just let go of caring what others are doing, the only way to interact with other branches should be a review system at which point you can selectively pull in the relevant commits as needed to try locally. (exception is maybe a technical manager or team lead who should of course care somewhat if branches get abandoned or not cleaned up properly, but this is a completely separate workflow)
- Look at Sapling's notion of public vs draft commits, this is a game changer and works just as well for git, just not supported by the tooling in the same way. Don't try to argue for and adopt workflows that are the same for both, they are completely different phases of work with completely different needs. In a nutshell: a) public commits are the ones that were pushed to a branch that is shared with anyone else. This may be main, a feature branch worked on by multiple colleagues etc. These commits are considered immutable, never amended or rebased, so the only option to bring them up to date with another branch is a merge. b) BUT every commit that is not part of one of these branches is considered a "draft"; these are never merged into, always amended or modified, and stay in mutable draft state. It is considered rude to expose your internal working struggles, like merging in master 100 times a day to avoid big merge conflict buildup or reverting changes, in the work you submit for review; you submit a clean stack of pull requests, one per reviewable batch of changes
- rebasing and amending have a few other big advantages. One of them is that syncing to the base branch does not pollute history and can be done as often as possible without a downside. This allows keeping all work in sync without buildup of big, hard-to-resolve conflicts. There is also tooling to do this for all your open work at once and to resolve conflicts smarter (e.g. git rerere)
- merges should be set to allow only squash + rebase as this works the same for devs working with clean commits as well as devs keeping their commit history in their branches. this way at least the base branch is mostly clean
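Two of the points above map onto plain git commands. A rough sketch in a throwaway repository, assuming that "only follow first parent" corresponds to `git log --first-parent` (what fork.app does internally is not verified here); branch names, messages, and identity values are made up:

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "initial commit"
git checkout -q -b feature
git commit -q --allow-empty -m "feature: step 1"
git commit -q --allow-empty -m "feature: step 2"
git checkout -q main
git commit -q --allow-empty -m "main: unrelated work"
git merge -q --no-ff -m "Merge branch 'feature'" feature

# Base-branch view: follow only the first parent of each merge, so the
# feature branch's individual commits stay folded behind its merge commit.
git log --oneline --first-parent main

# Record conflict resolutions so that repeated rebases can reuse them.
git config rerere.enabled true
```

The full history is still reachable with a plain `git log`; `--first-parent` only changes which commits are walked.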
by Charon77 on 5/4/24, 3:29 PM
There's this handy git config 'pull.rebase' in case you miss it.
A lot of merge commits are created when you pull and merge instead of rebasing locally.
https://git-scm.com/book/en/v2/Git-Branching-Rebasing
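A small sketch of the effect, using a local bare repository as a stand-in for the shared remote. The names `alice` and `bob`, the file names, and the identity values are hypothetical.

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
cd "$work"

# A bare repository standing in for the shared remote.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/main

# Alice publishes the first commit.
git clone -q origin.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"
git push -q -u origin main

# Bob clones, commits, and pushes before Alice does.
git clone -q ../origin.git ../bob
echo "bob" > ../bob/bob.txt
git -C ../bob add bob.txt
git -C ../bob commit -q -m "bob: add bob.txt"
git -C ../bob push -q origin main

# Alice has local work that the remote does not have yet.
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "alice: add alice.txt"

git config pull.rebase true   # this repository only; add --global for all
git pull -q                   # fetch, then replay local commits on top

git log --oneline --graph     # linear: no "Merge branch 'main' of ..." commit
```

Without `pull.rebase`, the same `git pull` would have to reconcile the two diverged histories with a merge commit.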
by gsliepen on 5/1/24, 11:48 AM
Ideally, your main branch is always in a known good state (every commit results in compilable code that passes the tests you have so far), which makes it bisectable. Keeping all commits (including broken ones and subsequent fixes) in topic branches and then merging them without some squashing and rebasing gets in the way of that.
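To make "bisectable" concrete, a minimal sketch with a made-up five-commit history in which a stand-in test (a `grep` on a hypothetical `status.txt`) starts failing at commit 4. Identity values and file names are assumptions.

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Commits 1-3 "pass", commits 4-5 "fail".
for i in 1 2 3 4 5; do
  if [ "$i" -lt 4 ]; then
    echo "pass $i" > status.txt
  else
    echo "fail $i" > status.txt
  fi
  git add status.txt
  git commit -q -m "commit $i"
done

first="$(git rev-list --max-parents=0 HEAD)"
git bisect start HEAD "$first"                  # bad = HEAD, good = first commit
git bisect run sh -c 'grep -q pass status.txt'  # exit 0 = good, 1 = bad

first_bad="$(git log -1 --format=%s refs/bisect/bad)"
echo "first bad commit: $first_bad"
git bisect reset
```

The search only works because every commit it lands on can be built and tested; a commit that is broken for unrelated reasons has to be skipped by hand or sends the search the wrong way.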
by larsnystrom on 5/1/24, 11:33 AM
The more I work with git, the more I wish there was a rebase (including squash/fixup) which kept the original commits but hid them. I’m not sure how that would work in practice, but there is value in keeping all change history, and there is also value in having a readable commit history, but git does not let you do both.
by Ferret7446 on 5/5/24, 7:34 AM
Very few Git users use a true distributed workflow that merge is designed for; naturally, most Git users would have no use for merge.
Merge is useful when there are multiple "canonical" repos which occasionally merge changes in between each other. Think web of trust vs central authority.
by IshKebab on 5/1/24, 12:37 PM
I definitely agree with this. If you merge to this extent then you've basically given up on having an understandable history. Rebase is much better where possible.
I think a lot of the disagreement about this is really people talking about different things. Some people say "always squash; nobody cares about the trivial typo fix commits and whatnot" and other people say "never squash; you lose important history" and really they're both right... you should squash when it's not important to preserve the history. Obviously people are going to disagree about when that is but in my experience if a PR is big enough that you think it should be more than one commit then it's too big, unless it's a big feature branch that has been worked on by multiple authors.
Similarly with rebase vs merge, if it's a small single author PR then definitely rebase. For big feature branches you may want to use merges though I would still suggest rebase is better. You just need to make sure everyone is using the safety flags when they force push.
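The comment does not name the flags it means; `--force-with-lease` is one likely candidate, and that is an assumption. A sketch of what it guards against, with hypothetical names (`alice`, `bob`, branch `feature`) and a local bare repository standing in for the remote:

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work="$(mktemp -d)"
cd "$work"
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/feature

# Alice publishes the shared feature branch.
git clone -q origin.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/feature
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "alice: first draft"
git push -q -u origin feature

# Meanwhile a teammate pushes to the same branch.
git clone -q ../origin.git ../bob
echo "bob" > ../bob/bob.txt
git -C ../bob add bob.txt
git -C ../bob commit -q -m "bob: follow-up"
git -C ../bob push -q origin feature

# Alice rewrites her commit without fetching first.
echo "draft v2" > notes.txt
git commit -q -a --amend -m "alice: first draft (reworded)"

# A plain --force would silently discard Bob's commit. --force-with-lease
# only overwrites if the remote branch is still where Alice last saw it.
if git push -q --force-with-lease origin feature 2>/dev/null; then
  lease_result="pushed"
else
  lease_result="rejected"
fi
echo "force-with-lease: $lease_result"
```

After a rejection the usual next step is to fetch, rebase onto the teammate's work, and push again.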
by peanut-walrus on 5/1/24, 10:59 AM
You squash merge when stuff gets added to master (or any other shared/long-lived branch) and delete the development branch, nobody has absolutely any interest in what happened on your development branch. There you go, clean commit history if you care about that sort of thing.
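Roughly what that looks like on the command line, as a sketch in a throwaway repository (the branch name `dev`, the commit messages, and the identity values are made up; hosted platforms offer the same thing as a merge button):

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

echo "base" > README.txt
git add README.txt
git commit -q -m "initial commit"

# A messy development branch.
git checkout -q -b dev
for msg in "wip" "fix typo" "actually works now"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit -q -m "$msg"
done

# Squash merge: stage the combined change, then record it as one commit.
git checkout -q main
git merge -q --squash dev
git commit -q -m "Add feature"

# -D rather than -d: git does not consider a squashed branch "merged".
git branch -q -D dev

git log --oneline
```

The three development commits are no longer reachable from any branch afterwards, which is exactly the trade-off the rest of the thread argues about.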
Rebase workflows are awful and unintuitive. Leave the rebasing to the git wizards who actually know what they're doing, in no circumstances should this be part of your day-to-day work.
by echelon_musk on 5/1/24, 12:39 PM
> end up with git commit history which looks like London tube map
I've saved you a click. TFA has nothing to do with TFL beyond this line.
by fargle on 5/4/24, 2:48 PM
the way i look at it is this: we actually use git for at least two different but related things
- a public change history (e.g. commit history of master, dev-1.0, whatever)
- a personal change history, especially useful when prototyping a fix or feature on some independent branch.
you might even call this independent branch "master" but it's your personal version of master in a clone on your laptop. it's a time machine, you can go backward/forward. you can make a change and then reverse course. you can commit every work-in-progress that compiles as you work to get it running or passing tests. whatever you like. you can pull or merge or rebase as you wish.
when submitting it to the public "master" branch, you are effectively saying "i want to commit this package of changes on top of the master branch".
do you really want to see "fhgiry pulled from master into fix-splork-feature on date x" and https://xkcd.com/1597/ in your public history of master? every branch you merged from or rebased on and every failed attempt and spelling mistake and WIP and addressing of review comments?
or do we want to see a series of commits that just record the end result in a series of nice little patches that simply add finished changes like: "splork: fix canoe paddle explosion issue by decreasing default gamma"
the common ubiquitous guidance to "never rewrite history" is nearly always valid for the public history.
converting a messy personal history into a neat series of patches/changes/commits that apply cleanly onto the latest public master should not fall under that same guidance. i'd say it's closer to "never rewrite public history"
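one way to do that conversion, sketched in a throwaway repository: record follow-ups as fixup commits, then let `git rebase --autosquash` fold them in while replaying onto the latest base. the file names, the branch name `fix-splork`, and the identity values are made up; `GIT_SEQUENCE_EDITOR=true` just accepts the generated todo list so the example runs without opening an editor.

```shell
set -eu
# Illustrative identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

echo "gamma=10" > splork.conf
git add splork.conf
git commit -q -m "initial commit"

# Personal branch: one real change plus follow-up noise.
git checkout -q -b fix-splork
echo "gamma=5" > splork.conf
git commit -q -a -m "splork: fix canoe paddle explosion issue by decreasing default gamma"
target="$(git rev-parse HEAD)"

echo "# lower gamma avoids the explosion" >> splork.conf
git commit -q -a --fixup="$target"     # review comment addressed

echo "splork: default gamma lowered" > CHANGELOG.txt
git add CHANGELOG.txt
git commit -q --fixup="$target"        # forgotten changelog entry

# Meanwhile the public branch moves on.
git checkout -q main
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "main: unrelated change"
git checkout -q fix-splork

# Fold the fixups into their target and replay onto the latest main.
GIT_SEQUENCE_EDITOR=true GIT_EDITOR=true git rebase -q -i --autosquash main

git log --oneline main..HEAD
```

only the unpublished branch is rewritten here; `main` itself is never touched, which matches the "never rewrite public history" reading.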