Comparing Git Mirror Options


I wanted to mirror my Git repositories on my infrastructure, mainly for my peace of mind. Mirroring involves cloning everything, so this way, I have a backup should I need it.

I have looked into this several times and have remained ambivalent about the options. I needed a simple, automatic, read-only mirror, but some possibilities were overly simple, and others were too complex, did too much, or were frustrating to configure and maintain. In total, I considered three self-hosted choices.

The three major features I want in a source control mirror are the ability to clone from it, automatic and scheduled pulls from the primary repository, and a UI of some kind that can toggle between unified and side-by-side diffs.

Only one of the options supports mirroring by itself. For the rest, I had a Bash script running via cron. The full script is longer and includes sanity checks and logging, but its core is straightforward.

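    # REPOS is expected to hold one repository path per line.
    echo "$REPOS" | while read -r repo; do
        if [ -d "$repo/.git" ]; then
            # Skip this repository if the directory cannot be entered.
            cd "$repo" || continue
            echo "Updating repository: $repo"
            # Fetch from all remotes and drop refs that were deleted upstream.
            git remote update --prune
            echo "Last 5 commits:"
            # tformat (rather than format) ends the last line with a newline.
            git log -n 5 --pretty=tformat:" - %h %ad %s" --date=short
            cd - > /dev/null
        fi
    done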

This was then set up to run every five minutes.

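    crontab:
      # Note: the stock alpine image does not include git, so it has to be
      # installed (for example with apk add git) before /mirror.sh can work.
      image: alpine:latest
      container_name: git-mirror-cron
      volumes:
        - git-repos:/srv/git
        - ./mirror.sh:/mirror.sh
      # Write root's crontab to run the mirror script every five minutes,
      # then keep crond in the foreground as the container's main process.
      command: sh -c "echo '*/5 * * * * sh /mirror.sh' > /etc/crontabs/root && crond -f"
      restart: unless-stopped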
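The sanity checks and logging mentioned earlier are not shown in the post. A minimal sketch of what they could look like follows. The helper names `log`, `is_git_repo`, and `update_repo` are hypothetical and not taken from the author's script, and the check also accepts bare repositories, which is an assumption about how the clones might be stored.

```shell
#!/bin/sh
# Hypothetical helpers; the author's real checks are not shown in the post.

# Print a message prefixed with a UTC timestamp.
log() {
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
}

# Succeed if the directory looks like a Git repository: either a normal
# clone (.git directory) or a bare one (HEAD file plus objects/ directory).
is_git_repo() {
    [ -d "$1/.git" ] || { [ -f "$1/HEAD" ] && [ -d "$1/objects" ]; }
}

# Update one repository, logging the outcome instead of failing silently.
update_repo() {
    repo=$1
    if ! is_git_repo "$repo"; then
        log "SKIP $repo: not a Git repository"
        return 1
    fi
    if git -C "$repo" remote update --prune; then
        log "OK $repo"
    else
        log "FAIL $repo: git remote update exited with an error"
        return 1
    fi
}
```

Inside the loop, `update_repo "$repo"` would stand in for the `cd`, `git remote update`, and `cd -` lines. `git -C` runs the command in the given directory without changing the shell's working directory.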

There was no definitive winner among the options I found, but I am particularly drawn to the straightforward, information-dense, compact designs of GitWeb and cgit. I have long preferred this type of UX/UI to the overly spacious form-over-function “modern” trend that plagues computer interface design.

Linux kernel repository shown in cgit

GitWeb

This was the first one I tried. To my surprise, this is part of Git. It has an information-dense and compact design. I was briefly content with this option until I noticed that the Linux kernel source site uses cgit, which already has the features and minor customisation I was expecting GitWeb to contain.

GitWeb’s diff view is lacking: it doesn’t support side-by-side diffs, and it only colours the text of added and removed lines rather than giving the whole line of code a background colour.

I also noticed that GitWeb’s performance is lacking. Admittedly, my mirror runs on a Raspberry Pi instead of the more powerful servers beside it. That said, the other two options I have written about appear to perform better than GitWeb on the same hardware.

I hypothesise that GitWeb is not caching anything and requires multiple tree traversals for every operation in the UI, such as listing the last commit in every branch of the mirror. The fact that GitWeb is written in Perl is probably of little consequence.

Unfortunately, I couldn’t commit (pun intended) to using it because of an error I couldn’t find a solution for: I couldn’t view blobs. I considered this enough of a showstopper and moved on to the next option.

Interestingly, the Linux kernel source site uses cgit and not GitWeb, even though GitWeb ships with Git. I’ve taken this to mean that even the Git and Linux maintainers would prefer to use cgit.

cgit

cgit is outwardly very similar to GitWeb but written in C. It has some small but significant improvements: side-by-side diffs, slightly faster navigation through a repository, configurable groups of repositories, and some statistics, which are disabled by default.

While it is more polished and customisable to a greater degree than GitWeb, configuring it is a headache. The configuration mechanism is fundamentally flawed in that providing your own configuration file completely overrides, rather than merges values into, the default configuration file - including essential settings like paths to CSS files and other assets critical for a functioning web UI. This left me with three equally unappealing options:

  1. Accept the default configuration as-is, leaving useful settings disabled.
  2. Write a brittle Bash script (I wrote more about these problems previously) to locate and modify the config file in the Docker volume, which will increase the maintenance burden.
  3. Maintain a complete copy of the configuration file, which creates an ongoing maintenance burden whenever upstream changes occur in cgit.

This “all-or-nothing” approach to configuration turned what should have been a simple customisation into a disproportionately complex task. I decided to go with the first option while reviewing cgit.
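For illustration, the second option could be as small as the following sketch. It assumes the container's configuration file uses cgit's usual `name=value` lines. The function name `set_cgit_option` is hypothetical. When a key is missing, it is added at the top of the file, on the assumption that settings need to precede any `scan-path` line to apply to scanned repositories.

```shell
#!/bin/sh
# Sketch: set one option in a cgitrc-style file of name=value lines.
# set_cgit_option is a hypothetical helper, not part of cgit.
# Values are assumed to be simple strings without backslashes.
set_cgit_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if grep -q "^${key}=" "$file"; then
        # Key already present: rewrite every matching line in place.
        awk -v k="$key" -v v="$value" '
            index($0, k "=") == 1 { print k "=" v; next }
            { print }
        ' "$file" > "$tmp"
    else
        # Key absent: put it first so it precedes any scan-path line.
        { printf '%s=%s\n' "$key" "$value"; cat "$file"; } > "$tmp"
    fi
    # Overwrite the contents rather than moving the temp file over the
    # original, so a bind-mounted file keeps pointing at the same inode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

A call such as `set_cgit_option /etc/cgitrc max-stats quarter` would then turn statistics on; the path here is only an example and depends on the image in use. The script remains brittle in the way described above: it has to be re-run whenever the container recreates its default file.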

Another major problem with cgit is its implementation of side-by-side diffs. It’s broken. It does not have toggleable word wrap. In fact, there is no word wrap at all. This means diffs frequently extend beyond the width of my monitor, making it difficult to view them properly, especially since the font size is tiny. It’s like the worst of both worlds.

The final major problem is that it does not seem to update its internal cache of repositories. With the mirror script running on its schedule, I expected that if I just left cgit alone for a while, it would pick up any changes in the repositories.

It did not. When I connected to the cgit container shell and ran git log, I could see the latest commits, but they did not show in the UI. I found that I could only trigger a UI update by manually restarting the container, for example with docker compose restart.

Commits would eventually show up, but not in any particular hurry. This occurred consistently, and I was not able to determine why. It may be configurable, but with the previously discussed problematic configuration, I decided to proceed to review the next solution.

Forgejo

The third and final option I tried was Forgejo. This is a fork of Gitea, which in turn is a fork of Gogs. There seems to be some history and politics involved.

Forgejo’s UI departs from the compact and information-dense UI of GitWeb and cgit. It’s closer to a like-for-like clone of GitHub. I don’t consider this a problem; it’s just different from the other two I tried.

Unlike GitWeb and cgit, Forgejo is not simply a web-based repository viewer; it’s an entire solution featuring practically all the features you’d expect from so-called “software forges” such as GitHub and GitLab.

I had some concerns that this would become an overengineered time sink for my modest requirements. Of course, I could simply ignore its numerous features, but my primary concerns were initial configuration and long-term maintenance compared to the lightweight alternatives. I just wanted a simple, reliable mirror.

xkcd #974

It runs either as a simple standalone binary (with proper user permissions and systemd setup) or inside a container. Debian packages are also available, but unless you’re on unstable, it might be a “recent” version in only the broadest possible sense.

I was pleasantly surprised by how easy the setup was. Within minutes, I had a running instance, and completing the welcome screen’s administrator setup took a minute at most.

Additionally, Forgejo has a functional side-by-side diff view, which puts it ahead of the other two options.

A feature I had missed was that Forgejo already supports acting as an automatic mirror, removing the need for my mirror script and cron job.

There is one downside to this approach, though. My mirror script allows me to provide a Git repository URL, which I can batch up. This can’t be done as easily with Forgejo. There is a forgejo forgejo-cli mirror command, but it tells me that “F3 is disabled and for development purposes,” whatever that means. Also, talk about redundant commands: forgejo forgejo-cli.

From what I can tell, forgejo-cli is either in early development or has never had much feature work. While it was disappointing to discover I’d need to navigate through web interfaces to set up future mirrors rather than automate the process, I continued my evaluation, otherwise reasonably happy with the platform’s other capabilities.
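A possible alternative to clicking through the web interface, not tested in this evaluation, is the HTTP API that Forgejo inherits from Gitea. The sketch below assumes that the `POST /api/v1/repos/migrate` endpoint accepts `clone_addr`, `repo_name`, `repo_owner`, and `mirror` fields along with token authentication, which should be checked against the instance's own API documentation. The environment variable names (`FORGEJO_URL`, `FORGEJO_TOKEN`, `FORGEJO_OWNER`) and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: request pull mirrors in bulk through Forgejo's HTTP API.
# FORGEJO_URL, FORGEJO_TOKEN and FORGEJO_OWNER are hypothetical names
# for the instance base URL, an access token and the owning account.

# Derive a repository name from a clone URL,
# e.g. "https://example.com/alice/project.git" -> "project".
repo_name_from_url() {
    name=${1%/}
    name=${name##*/}
    printf '%s\n' "${name%.git}"
}

# Build the JSON request body for one mirror.
# Assumes the URL and owner contain no characters needing JSON escaping.
migrate_payload() {
    printf '{"clone_addr":"%s","repo_name":"%s","repo_owner":"%s","mirror":true}\n' \
        "$1" "$(repo_name_from_url "$1")" "$2"
}

# Read clone URLs from stdin, one per line, and request a mirror for each.
create_mirrors() {
    while read -r url; do
        [ -n "$url" ] || continue
        migrate_payload "$url" "$FORGEJO_OWNER" |
            curl --fail --silent --show-error \
                -X POST "$FORGEJO_URL/api/v1/repos/migrate" \
                -H "Authorization: token $FORGEJO_TOKEN" \
                -H "Content-Type: application/json" \
                --data @- > /dev/null ||
            echo "Failed to create mirror for $url" >&2
    done
}
```

With the three variables exported, `create_mirrors < urls.txt` would request one pull mirror per line of a hypothetical `urls.txt`. Private upstream repositories would need extra credential fields that this sketch leaves out.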

I chose Forgejo

I decided to use Forgejo for now. I still prefer the UIs of GitWeb and cgit, and I haven’t ruled out cgit entirely. Although out of the scope of a review of source control mirrors, Forgejo provides a whole host of features like user accounts, CI/CD build pipelines, commit graphs, and pull requests. Also, having features turned on by default (unlike cgit, where they are off) means there was very little to set up at all.
