Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
150 views
in Technique[技术] by (71.8m points)

How to handle a large git repository?

I am currently using git for a large repository (around 12 GB, each branch having a size of 3 GB). This repository contains lots of binary files (audio and images).

The problem is that clone and pull can take lots of time. Specially the "Resolving deltas" step can be very very long.

What is the best way to solve this kind of problem?

I tried to remove delta compression, as it it explain here using the delta option in .gitattributes but it seems to not improve the clone duration.

Thanks in advance

Kevin

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Update April 2015: Git Large File Storage (LFS) (by GitHub).

It uses git-lfs (see git-lfs.github.com) and tested with a server supporting it: lfs-test-server:
You can store metadata only in the git repo, and the large file elsewhere.

https://cloud.githubusercontent.com/assets/1319791/7051226/c4570828-ddf4-11e4-87eb-8fc165e5ece4.gif


Original answer (2012)

One solution, for large binary files that don't change much, is to store them in a different referential (like a Nexus repository), and version only a text file which declares which version you need.
Using an "artifact repository" is easier than storing binary elements in a source repo (made for comparing versions and merging between branches, which isn't of much use for said binaries).

The other solution, more git-centric, is git-annex:

git-annex allows managing files with git, without checking the file contents into git.
While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

It is however not compatible with Windows.

A more generic solution could be git-media, which also allows you to use Git with large media files without storing the media in Git itself.

Finally, the easiest solution is to isolate those binaries in their own git submodule as you mention in your question: it isn't very satisfactory, and the initial clone will still take times, but the next updates for the parent repo will be short.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.9k users

...