Field notes on Git Maintenance

Git is a distributed version control system for software configuration management, built on a content-addressable object store and optimized for tracking and compressing changes to text files.

Regular Git maintenance helps to:

  • Improve Git performance
  • Facilitate migration to another hosting platform
  • Reduce repository size and complexity

Below are practical methods for Git maintenance, ranging from simple cleanup to advanced repository surgery.

Simple Cleanup: Delete Stale Branches or Tags

Remove unused branches or tags from the remote repository:

git push origin --delete <branch_or_tag_name>
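Before deleting anything, it can help to check which remote branches are already fully merged. The following is a minimal, self-contained sketch: the throwaway bare repository standing in for origin and the branch name stale-feature are assumptions made for the demo, not part of any real project.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A local bare repository plays the role of the remote
git init -q --bare remote.git
git clone -q remote.git local 2>/dev/null
cd local
git commit -q --allow-empty -m "initial commit"
git push -q origin HEAD

# Hypothetical stale branch that exists on the remote
git branch stale-feature
git push -q origin stale-feature

# Remote branches whose commits are all contained in the current branch
git branch -r --merged HEAD

# Delete the branch on the remote, then confirm it is gone
git push -q origin --delete stale-feature
remaining=$(git ls-remote --heads origin stale-feature)
echo "remaining refs for stale-feature: '${remaining}'"
```

Listing merged branches first is a cautious habit rather than a requirement; the delete command itself is the one shown above.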

Advanced Cleanup: Repository Surgery

Always clone a fresh copy of your repository before performing advanced operations. These actions rewrite history and are irreversible after pushing.

To clone a mirror repository for safe experimentation (a mirror clone is bare, so its directory keeps the .git suffix):

git clone --mirror <repo.git url>
cd <project>.git
git remote set-url origin <repo.git url>
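To see what a mirror clone actually contains, the sketch below builds a small throwaway repository and mirrors it. The names source, project.git, feature, and v1.0 are hypothetical and exist only for the demo.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical source repository with one extra branch and one tag
git init -q source
git -C source commit -q --allow-empty -m "initial commit"
git -C source branch feature
git -C source tag v1.0

# Mirror clone: bare, and it copies every ref, not just the default branch
git clone -q --mirror source project.git
cd project.git

git rev-parse --is-bare-repository   # prints: true
git for-each-ref --format='%(refname)'
```

Because the mirror has no working tree, commands that need one (such as rebase) have to run in a regular clone instead.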

Lever 1: Migrate Large Files to Git LFS

Use Git Large File Storage (LFS) to manage large files efficiently.

Step 1: Install the Git LFS client with your package manager, then set up its hooks and filters:

git lfs install

Step 2 (Optional): Configure custom LFS storage location:

git config lfs.url https://<artifactory-instance>/artifactory/api/lfs/<repository-name>

Step 3: Track specific file types with LFS:

git lfs track '*.<file_extension>'

Step 4: Migrate existing files to LFS:

git lfs migrate import --include='*.<file_extension>'

Step 5: Verify LFS-tracked files:

git lfs ls-files
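Tracking rules live in .gitattributes, so plain Git can also report whether a given path would be routed through the LFS filter. The sketch below writes by hand the attribute line that git lfs track '*.psd' produces, so it runs even without git-lfs installed; the .psd extension and the file names are assumptions for illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Hand-written equivalent of: git lfs track '*.psd'
printf '%s\n' '*.psd filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ask Git which filter applies to each (hypothetical) path
git check-attr filter -- design.psd   # prints: design.psd: filter: lfs
git check-attr filter -- README.md    # prints: README.md: filter: unspecified
```

This only checks the attribute rules. Whether a file's content is actually stored as an LFS pointer is what git lfs ls-files reports.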

Lever 2: Remove Unwanted Files by Rewriting History

Use git-filter-repo to permanently remove unwanted files from repository history.

Step 1: Install git-filter-repo.

Step 2: Analyze repository history:

git filter-repo --analyze

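Step 3: Review large or unwanted files in the analysis report path-all-sizes.txt. In a regular clone it is written to .git/filter-repo/analysis/; in a mirror (bare) clone, to filter-repo/analysis/.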
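If git-filter-repo is not available yet, a rough list of the largest blobs can be produced with plain Git plumbing. This is a sketch, not a replacement for the analysis report: the demo repository and the file names big.bin and small.txt are assumptions, and paths containing spaces would be cut off by the awk step.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Hypothetical content: one large file and one small file
head -c 200000 /dev/zero > big.bin
echo "hello" > small.txt
git add big.bin small.txt
git commit -q -m "add files"

# Walk every object reachable from any ref, keep blobs only,
# and print "<size in bytes> <path>" sorted largest first
report=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob" { print $3, $4 }' |
  sort -rn)

echo "$report" | head -n 10
largest=$(echo "$report" | head -n 1)
```

The sizes shown are uncompressed object sizes, so they indicate candidates for removal or LFS migration rather than exact savings on disk.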

Step 4: Remove unwanted files with the filter that fits your case. Each command below is an independent alternative:

git filter-repo --invert-paths --path <dir_or_file>
git filter-repo --invert-paths --path-glob <glob>
git filter-repo --invert-paths --path-regex <regex>
git filter-repo --strip-blobs-bigger-than <size_eg_10M>
git filter-repo --strip-blobs-with-ids <blob_id_filename>

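Step 5 (Optional): Back up the commit-map for reference (the path below applies to a mirror clone; in a regular clone the file is at .git/filter-repo/commit-map):

cp filter-repo/commit-map ./_filter_repo_commit_map_$(date +%s)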

Lever 3: Drop Specific Commits

Remove unwanted commits from history using interactive rebase. Rebase needs a working tree, so run it in a regular (non-mirror) clone.

Step 1: Identify problematic commits (use tools like git-sizer):

git-sizer --verbose

Step 2: Start an interactive rebase from a point before the commit you want to remove (here, three commits back):

git rebase -i HEAD~3

Step 3: Review the rebase todo list. Git opens an editor with the list of commits, oldest first. It might look something like this:

pick i7j8k9l Initial commit
pick e4f5g6h Fixed a bug
pick a1b2c3d Added new feature

Step 4: Mark the unwanted commit with drop. For example, to remove the second commit, e4f5g6h, replace pick with drop on its line:

pick i7j8k9l Initial commit
drop e4f5g6h Fixed a bug
pick a1b2c3d Added new feature

Step 5: Save and exit the editor to apply the changes.

Git then replays the remaining commits, excluding the one you marked to drop.
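The same drop can be scripted, which is handy for trying the procedure on a scratch repository first. In the sketch below, GIT_SEQUENCE_EDITOR replaces the interactive editor with a sed command that edits the todo list. The repository, the file names, and the choice of which commit to drop are assumptions for the demo.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Three commits that touch separate files, so dropping one cannot conflict
for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "add $name.txt"
done

# The todo list for HEAD~2 has two lines, oldest first:
#   pick <hash> add b.txt
#   pick <hash> add c.txt
# sed rewrites the first "pick" to "drop" instead of a manual edit
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" \
  git rebase -i HEAD~2 >/dev/null 2>&1

# The middle commit and its file are gone from the rewritten history
git log --format=%s
```

Dropping a commit that later commits depend on will stop the rebase with a conflict; the independent files here avoid that case on purpose.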

Lever 4: Prune Reference Logs

Expire old or unreachable reflog entries so that the objects they reference can be garbage-collected.

Option 1: Immediately expire reflog entries that are not reachable from the current tip of their branch:

git reflog expire --expire-unreachable=now --all

Option 2: Expire all reflog entries, regardless of age:

git reflog expire --expire=now --all
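The effect of the two options can be observed on a scratch repository by counting reflog entries before and after. This is an illustrative sketch with an assumed demo repository, where an amended commit leaves an entry pointing at history that is no longer reachable.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

git commit -q --allow-empty -m "first draft"
# Amending replaces the commit; the old one is now referenced only by the reflog
git commit -q --allow-empty --amend -m "final"

before=$(git reflog | wc -l | tr -d ' ')

# Option 1: drop only entries that point at unreachable history
git reflog expire --expire-unreachable=now --all
after_unreachable=$(git reflog | wc -l | tr -d ' ')

# Option 2: drop every entry
git reflog expire --expire=now --all
after_all=$(git reflog | wc -l | tr -d ' ')

echo "entries: before=$before after-unreachable=$after_unreachable after-all=$after_all"
```

Expiring the reflog does not delete any objects by itself; it only removes the references that would otherwise keep them alive through garbage collection.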

Lever 5: Garbage Collection

Clean up dangling objects and optimize repository storage.

Step 1: Review unreachable and dangling objects:

git fsck --full
git fsck --unreachable

Step 2: Prune unreachable objects and repack the repository:

git gc --prune=now --aggressive

Step 3: Verify integrity and connectivity of all objects:

git fsck --full
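The steps above can be tried end to end on a scratch repository. The sketch below deliberately writes an orphaned blob (an assumption for the demo; in practice such objects come from rewritten or deleted history), shows it in the fsck report, and confirms that garbage collection removes it.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git commit -q --allow-empty -m "initial commit"

# Store a blob that no commit, tree, or ref points to
blob=$(echo "orphaned content" | git hash-object -w --stdin)

# The report lists the blob as unreachable
git fsck --unreachable

# Prune unreachable objects right away instead of waiting for the grace period
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
  echo "blob still present"
else
  echo "blob pruned"
fi
```

The --aggressive flag is left out here because it only affects how hard Git works on repacking, not which objects are pruned.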

Final Step: Push Changes to Remote

After performing advanced cleanup, push to rewrite remote history:

Option 1: Rewrite all references (branches, tags, and any other refs). Note that --mirror also deletes remote refs that no longer exist locally:

git push origin --force --mirror

Option 2: Rewrite all branches only:

git push --all --force

Option 3: Rewrite branches by refspec (narrow the pattern to target specific branches):

git push origin --force 'refs/heads/*'

Option 4: Rewrite all tags only:

git push --tags --force

Option 5: Rewrite tags by refspec (narrow the pattern to target specific tags):

git push origin --force 'refs/tags/*'

Force-pushing rewrites remote history. Ensure all collaborators are informed and have synchronized their local repositories accordingly.
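For collaborators, synchronizing after a rewrite usually means discarding the old local history rather than merging it. The sketch below simulates this with two throwaway clones; the directory names seed, remote.git, maintainer, and collaborator are assumptions, and reset --hard throws away local commits and uncommitted changes, so anything worth keeping should be saved first.

```shell
set -eu
# Throwaway identity so commits work on any machine (demo assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the shared remote
git init -q seed
git -C seed commit -q --allow-empty -m "original history"
git clone -q --bare seed remote.git
git clone -q remote.git maintainer
git clone -q remote.git collaborator

# Maintainer rewrites history and force-pushes it
git -C maintainer commit -q --allow-empty --amend -m "rewritten history"
git -C maintainer push -q --force origin HEAD

# Collaborator fetches the rewritten branch and moves to it.
# '@{u}' is the upstream of the current branch.
git -C collaborator fetch -q origin
git -C collaborator reset -q --hard '@{u}'

git -C collaborator log --format=%s -1   # prints: rewritten history
```

Re-cloning from scratch achieves the same result and is often the simpler instruction to give a large team.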

This post is licensed under CC BY 4.0 by the author.