Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

version control - Git huge diff between branches due to cherry picks

The background

We are developing a small product that has many "versions", each with different customizations to themes, texts, and so on. We currently have a single git repo and keep each customized version in its own branch. However, the customized content is scattered throughout the repo. Whenever we introduce a new feature or fix a bug, we do it on one branch, then cherry-pick it (without -x, though I don't know if it matters) to the other branches. Then we may change some colors or texts with --amend or during conflict resolution.

The issue

Each branch has a unique history tree, and git says the branches are tons of commits behind and ahead of each other, although the commit contents are almost the same.

The question

  1. Now I am trying to refactor the code and collect the customizations in one place. After doing so, is there a way to kill the enormous old commit diffs?
  2. If I still want to keep many branches (only for different meta info and server configs, instead of injecting those values at compile time via build tools), do I have better options than cherry-pick, i.e. merge or rebase? Will they depend on resolving the old commit diffs mentioned in 1? (Ideally, we won't need to amend the commits after picking them, as the customized contents will be extracted out.)
  3. Are there any performance or disk usage issues for git when branches have large commit diffs (different history trees) but only minor code diffs between branch heads?
question from:https://stackoverflow.com/questions/65879291/git-huge-diff-between-branches-due-to-cherry-picks


1 Reply

The fact that you have large diffs between branches is not intrinsically a problem. Git stores data as a set of snapshots of the repository at each commit, not as a series of diffs between branches. It does, when packing, store individual objects as deltas against other similar objects, but unless you are storing many binary objects, this usually isn't a problem. So you're probably fine.
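
If you want to see this for yourself, here is a minimal sketch that builds a throwaway repo and compares the commit-level divergence with the actual content difference between two branch heads. The branch name variant-a and the file names app.txt and theme.txt are made up for illustration; substitute your own.

```shell
#!/bin/sh
# Sketch only: a scratch repo with a hypothetical "variant-a" branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > app.txt
echo "color=default" > theme.txt
git add app.txt theme.txt
git commit -q -m "Initial commit"

# The variant carries one customization of its own.
git checkout -q -b variant-a
echo "color=blue" > theme.txt
git commit -q -am "Customize theme for variant A"

# A feature lands on main...
git checkout -q main
echo "feature" >> app.txt
git commit -q -am "Add feature"

# ...and is cherry-picked onto the variant as a new, separate commit.
git checkout -q variant-a
git cherry-pick main >/dev/null

# Commits reachable from only one side (left = main, right = variant-a).
ahead_behind=$(git rev-list --left-right --count main...variant-a)
# Commits on variant-a whose patch already exists on main ("-" lines).
equivalent=$(git cherry main variant-a | grep -c '^-')
# Files whose content really differs between the two heads.
changed_files=$(git diff --name-only main variant-a | wc -l | tr -d ' ')

echo "commits only on main / only on variant-a: $ahead_behind"
echo "variant-a commits already present on main as patches: $equivalent"
echo "files that differ between heads: $changed_files"

# Rough view of how much space the object store takes.
git count-objects -vH
```

In a real repo you would run only the last few commands against your own branch names. The ahead/behind numbers count commits, not content, so they grow with every cherry-pick even when the heads stay nearly identical.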

However, the approach of keeping one main branch and cherry-picking is, as you've found out, a bit of a maintenance nightmare. As you've hinted at, it would be better to have one unified code base with some configurability plus a set of configuration options for each version.
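
As a rough illustration of what "a set of configuration options for each version" could look like, here is a sketch that keeps one settings file per version in a config directory and loads one of them at build time. The directory layout, the file names, the variable names, and the load_variant helper are all hypothetical; your build tooling will likely have its own mechanism.

```shell
#!/bin/sh
# Sketch only: hypothetical layout with one settings file per version.
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir config

# Made-up per-version settings; in practice these live in the repo.
printf 'THEME_COLOR=blue\nSERVER_URL=https://a.example.com\n' > config/variant-a.env
printf 'THEME_COLOR=green\nSERVER_URL=https://b.example.com\n' > config/variant-b.env

# Hypothetical helper: load the settings for one version into the shell.
load_variant() {
  # $1 is the version name and must match a file under config/
  . "./config/$1.env"
}

load_variant variant-b
echo "Building with theme $THEME_COLOR against $SERVER_URL"
```

With something along these lines, every version builds from the same commit and only the selected settings file changes.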

If you go that route, then you can either have just a main branch with a directory or directories full of config files, or you can have a main branch plus separate branches that each differ only in their config file. With the latter approach, your best bet is to merge the main branch in instead of cherry-picking, or to rebase your subsidiary branches onto the main branch. Rebased branches can be tricky for others to work with, so I recommend the merge approach.
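
The merge approach could look roughly like the sketch below, which again uses a throwaway repo. The branch name variant-a and the files app.txt and config.txt are assumptions for the example; the point is only that the shared fix arrives through a merge while the branch keeps its own config.

```shell
#!/bin/sh
# Sketch only: a variant branch that differs from main in config.txt alone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"
git config user.name "Example"
git config user.email "example@example.com"

echo "shared code v1" > app.txt
echo "server=default.example.com" > config.txt
git add app.txt config.txt
git commit -q -m "Initial commit"

# The variant branch changes only its config file.
git checkout -q -b variant-a
echo "server=a.example.com" > config.txt
git commit -q -am "Configure variant A"

# All shared work happens on main.
git checkout -q main
echo "shared code v2" > app.txt
git commit -q -am "Fix bug in shared code"

# Bring the fix into the variant with a merge instead of a cherry-pick.
git checkout -q variant-a
git merge -q --no-edit main

merged_app=$(cat app.txt)
merged_config=$(cat config.txt)
# Commits on main that the variant does not contain yet.
behind=$(git rev-list --count variant-a..main)

echo "app.txt: $merged_app"
echo "config.txt: $merged_config"
echo "commits behind main: $behind"
```

Because the merge records main as a parent, the variant is no longer "behind" main afterwards, and the next merge starts from this point rather than from the original fork.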

However, if you insist on cherry-picking, then in the scenario you mention, most of your code isn't going to differ, so git cherry-pick isn't likely to have a problem. The history of a branch beyond the merge base isn't relevant when you're merging or cherry-picking, so having a nasty history shouldn't be a problem.
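
If you do keep cherry-picking, a sketch like the following shows the -x option from the question and the merge base that git would use for a later merge. The branch and file names are made up; -x only appends a note about the original commit to the message and does not change ancestry.

```shell
#!/bin/sh
# Sketch only: cherry-pick with -x in a scratch repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > app.txt
echo "server=default.example.com" > config.txt
git add app.txt config.txt
git commit -q -m "Initial commit"
base_sha=$(git rev-parse HEAD)

git checkout -q -b variant-a
echo "server=a.example.com" > config.txt
git commit -q -am "Configure variant A"

git checkout -q main
echo "fix" >> app.txt
git commit -q -am "Fix bug"
fix_sha=$(git rev-parse HEAD)

# -x records where the change came from in the new commit message.
git checkout -q variant-a
git cherry-pick -x "$fix_sha" >/dev/null

# The common ancestor is still the original fork point.
merge_base=$(git merge-base main variant-a)
picked_message=$(git log -1 --format=%B)

echo "merge base: $merge_base"
echo "$picked_message"
```

The recorded hash makes it easier to trace which commits were already copied to which branch, which the plain cherry-picks described in the question do not give you.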
