version control - Git huge diff between branches due to cherry picks


posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)


The background

We are developing a small product that has many "versions", each containing different customizations of themes, texts, etc. We currently have a single Git repo and keep each customized version in its own branch. However, the customized content is scattered throughout the repo. Whenever we introduce a new feature or fix a bug, we do it on one branch, then cherry-pick it (without -x, though I don't know if that matters) onto the other branches. We may then change some colors or texts with --amend or during conflict resolution.

The issue

Each branch has its own history, and Git reports that the branches are tons of commits behind and ahead of each other, although the commit contents are almost the same.

The question

1. Now I am trying to refactor the code and collect the customizations in one place. After doing so, is there a way to get rid of the enormous old commit differences?
2. If I still want to keep many branches (only for different meta info and server configs, instead of injecting those values at compile time via build tools), are there better solutions than cherry-pick? Will they depend on how the old commit differences mentioned in 1 are resolved (i.e. merge or rebase)? Ideally, we won't need to amend the commits after picking them, as the customized content will be extracted out.
3. Are there any performance or disk usage issues for Git when branches have large commit differences (different history trees) but only minor code differences between branch heads?

question from:https://stackoverflow.com/questions/65879291/git-huge-diff-between-branches-due-to-cherry-picks


1 Reply

深蓝 · Answer 1 · 2021-10-06T19:21:31+0000

The fact that you have large diffs between branches is not intrinsically a problem. Git stores data as a set of snapshots of the repository at each commit, not as a series of diffs between branches. It does, when packing, store individual objects as deltas against other similar objects, but unless you are storing many binary objects, this usually isn't a problem. So you're probably fine.
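As a rough illustration of the snapshot model, the sketch below builds a throwaway repository and checks that a file left unchanged on a diverging branch is stored as the same blob object on both branches. The branch name `version-a` and the file names are made up for the demonstration, and it assumes `git` is on the `PATH`.

```shell
#!/bin/sh
set -eu

# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "shared code" > app.txt
git add app.txt
git commit -q -m "base"

# A version branch with its own, diverging history.
git checkout -q -b version-a
echo "theme=a" > theme.txt
git add theme.txt
git commit -q -m "customize theme"

# app.txt is untouched on version-a, so both snapshots point at one blob.
blob_main=$(git rev-parse main:app.txt)
blob_a=$(git rev-parse version-a:app.txt)
echo "main:      $blob_main"
echo "version-a: $blob_a"

# Object count and on-disk size for the whole repository.
git count-objects -v
```

Both `rev-parse` calls print the same object ID, since identical content is stored once regardless of how many branches or commits reference it. Running `git count-objects -v` in the real repository gives a quick view of how much space the object store actually takes.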

However, the approach of keeping one main branch and cherry-picking is, as you've found out, a bit of a maintenance nightmare. As you've hinted at, it would be better to have one unified code base with some configurability plus a set of configuration options for each version.

If you go that route, then you can either have just a main branch with a directory or directories full of config files, or you can have a main branch plus a separate branch per version that differs only in its config file. With the latter approach, your best bet is to merge the main branch in instead of cherry-picking, or to rebase your subsidiary branches onto the main branch. Rebased branches can be tricky for others to work with, so I recommend the merge approach.
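A minimal sketch of the merge approach, run in a throwaway repository: the version branch changes only its config file, a fix lands on main, and the fix is brought over with `git merge` rather than `git cherry-pick`. The names `version-a`, `app.txt`, and `config.txt` are assumptions for the example, not anything from the question's repository.

```shell
#!/bin/sh
set -eu

# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "v1" > app.txt
echo "theme=default" > config.txt
git add app.txt config.txt
git commit -q -m "base"

# The version branch changes nothing except its config file.
git checkout -q -b version-a
echo "theme=a" > config.txt
git commit -q -am "version-a config"

# A fix lands on main only.
git checkout -q main
echo "v2 with the fix applied" > app.txt
git commit -q -am "fix a bug"

# Bring the fix over with a merge instead of a cherry-pick.
git checkout -q version-a
git merge -q -m "merge main into version-a" main

# Commits only on main (left) and only on version-a (right).
counts=$(git rev-list --left-right --count main...version-a)
echo "behind/ahead of main: $counts"

# Files whose content differs between the two branch heads.
changed=$(git diff --name-only main version-a)
echo "files that differ from main: $changed"
```

After the merge, version-a is zero commits behind main, and the only file that differs between the two heads is `config.txt`. Repeating the merge after each change on main keeps the "behind" count at zero, which is what cherry-picking never achieves.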

However, if you insist on cherry-picking, then in the scenario you mention, most of your code isn't going to differ, so git cherry-pick isn't likely to have a problem. The history of a branch beyond the merge base isn't relevant when you're merging or cherry-picking, so having a nasty history shouldn't be a problem.
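To judge how much of the behind/ahead count between two cherry-pick-maintained branches is real divergence, one option is to compare the plain count with a count that skips commits whose patch exists on both sides. The sketch below does this in a throwaway repository with made-up branch and file names.

```shell
#!/bin/sh
set -eu

# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "shared code" > app.txt
git add app.txt
git commit -q -m "base"

# A version branch with one customization commit of its own.
git checkout -q -b version-a
echo "theme=a" > theme.txt
git add theme.txt
git commit -q -m "customize theme"

# A fix lands on main ...
git checkout -q main
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix a bug"

# ... and is cherry-picked onto version-a without being amended.
git checkout -q version-a
git cherry-pick main >/dev/null

# Commits only on main (left) and only on version-a (right).
raw=$(git rev-list --left-right --count main...version-a)
echo "behind/ahead: $raw"

# Same comparison, omitting commits whose patch appears on both sides.
unique=$(git rev-list --left-right --count --cherry-pick main...version-a)
echo "ignoring patch-equivalent commits: $unique"
```

Here the plain count reports one commit behind and two ahead, while the `--cherry-pick` count leaves only the customization commit. A pick that was amended or changed during conflict resolution no longer carries the same patch, so it would still be counted on both sides.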
