From 1bf17dfc9d6bde554e3310b5deb7550be54822a2 Mon Sep 17 00:00:00 2001 From: Joshua Simmons Date: Sat, 30 Aug 2025 13:03:16 +0200 Subject: [PATCH] Rework we-have-github-at-home article --- site/posts/we-have-github-at-home.dj | 275 +++++++++++++-------------- 1 file changed, 132 insertions(+), 143 deletions(-) diff --git a/site/posts/we-have-github-at-home.dj b/site/posts/we-have-github-at-home.dj index c76cef7..4ceb823 100644 --- a/site/posts/we-have-github-at-home.dj +++ b/site/posts/we-have-github-at-home.dj @@ -4,131 +4,127 @@ author: Josh Simmons slug: we-have-github-at-home published: 2025-06-02 modified: 2025-06-02 +modified: 2025-08-25 tags: programming git web rust at-home --- -In typical engineer fashion I decided last weekend to quickly throw together my -own github. This of course being a small weekend task. Let's compare feature -lists: - -| | github | nega.tv | -| -------------------- | :----: | :-----: | -| Git Hosting | ☑ | ☑ | -| SSH Repo Access | ☑ | ☑ | -| HTTPS Repo Access | ☑ | ☑ | -| Web Code View | ☑ | ☑ | -| Issue Tracking | ☑ | ☐ | -| Code Review | ☑ | ☐ | -| Automation | ☑ | ☐ | -| Project Wiki | ☑ | ☐ | -| Analytics and Social | ☑ | ☐ | -| AI Dogshit | ☑ | ☐ | - -Recently I've found my enjoyment of the github platform rapidly declining, along -with my expectation of future enjoyment. It seems some boffin at microsoft hq -has commanded a hard pivot to AI and thus everything is going to get The -Treatment. I also expect they'll start to squeeze what you actually get out of -the 'free' (not that payment is an issue per-se, I just don't want to pay for a -shitty LLM) account. So I figured it's time to migrate away, and vaguely -considered a few options: - - - [SourceHut](https://sourcehut.org/) is a bit too much for me, with - mailing-list driven workflows, and requirements that don't really scale for - game development (no support for large repositories). 
- - - [Codeberg](https://codeberg.org/) is open-source exclusive, which is cool - but I don't want to be stuck with only public repos. +The art of writing software is the art of willful underestimation, combined with +creative choice of goalposts. Last weekend I decided to quickly throw together +my own github. + +Over the past few years I've found my enjoyment of the github platform rapidly +declining. It seems some boffin over at microsoft hq has commanded a hard pivot +to AI and we regular folk are to be left dealing with the fallout. + +It's also about time they start squeezing on cost and features. I would be happy +to pay for github (and in fact, I was happy paying), however, I'm not going to +pay for llm garbage. For me at least, github has become sourceforge. + +But what to use instead? + + - [SourceHut](https://sourcehut.org/) is a bit much for me, mailing-list + driven workflows aren't my vibe, and anyhow their lack of support for large + repositories is a non-starter for games. + + - [Codeberg](https://codeberg.org/) is open-source exclusive, which is cool, + but I don't want to be stuck with only public open-source repositories. - Self-hosted [Gitlab](https://about.gitlab.com/) is really just a pure clone - of github, AI memes and all, which didn't feel like an improvement. Plus - it's gigantic and largely unnecessary for my needs. + of github, LLM memes and all, which is not an improvement. It's also just + pure overkill and complex to host. - [Forgejo](https://forgejo.org/) has a cringe name. - - I didn't actually think of [Bitbucket](https://bitbucket.org/product/) at - the time, but urgh Atlassian. And more AI memes. + - [Bitbucket](https://bitbucket.org/product/) is made by Atlassian... Plus more + LLM nonsense. - [Azure Git Repos](https://azure.microsoft.com/en-us/products/devops/repos) - would maybe work, but I expect it's only a matter of time for that to just - be github in a trenchcoat with all the same problems. 
Plus they just can't - manage to be normal and have a normal pricing page. - -With all the reasonable options out of the way, the only solution left is to -self-host our own thing. I already had this VPS sitting idle anyway and I hate -free time so this leaves us with three major issues. + Just can't manage to build a normal pricing page. Also under the microsoft + umbrella, so hard to imagine it not becoming regular github over time. + +With the reasonable options out of the way, the only choice remaining is to build +and self-host my own software! The VPS which hosts this site is mostly idle +anyway, and there's always room for one more hobby project in the clown car +called "free time". + +Let's compare feature lists: + +| | github | git.nega.tv | +| -------------------- | :----: | :---------: | +| Git Hosting | ☑ | ☑ | +| SSH Repo Access | ☑ | ☑ | +| HTTPS Repo Access | ☑ | ☑ | +| Web Code View | ☑ | ☑ | +| Issue Tracking | ☑ | ☐ | +| Code Review | ☑ | ☐ | +| Automation | ☑ | ☐ | +| Project Wiki | ☑ | ☐ | +| Analytics and Social | ☑ | ☐ | +| AI Dogshit | ☑ | ☐ | # SSH Access -In the most basic variation ssh access for git repos is trivial, you create a -new linux user for everybody with git access, and set their shell to [git-shell](https://git-scm.com/docs/git-shell) -so they can't do anything much other than push changes. However, that approach -is a bit questionable if you plan on having more than a handful of users, or -more than a single server, or to give multiple users commit access to a single -repository. - -I had no plan to do any of those things, but I wanted to support them anyway, -and to isolate git access behind a 'git' user account in the way that the Real -Github does it. We're serious engineers here. Once again there are a few turnkey -options, most notably [gitosis](https://github.com/tv42/gitosis) and -[gitolite](https://gitolite.com/gitolite/index.html), but we'll roll our own. 
- -There's a handy feature of [sshd](https://linux.die.net/man/8/sshd) which we -will use, `command=` in the `AuthorizedKeys` file. Essentially, instead of -executing the user's login shell (configured in `/etc/passwd`), we can -configure — per authorized key — the specific command to run after -authentication. This enables us to associate a "git user" with each public key, -and interject our own code to decide whether to authorize a particular action -based on that information. - -For example, git user account's `authorized_keys` file might look like: +Basic SSH access for git repos is trivial, + + 1. Create a linux user account for each git user. + 2. Set their shell to [git-shell](https://git-scm.com/docs/git-shell) + +However, that approach breaks down if you plan on more than a handful of users. +It also fails if you need fine-grained access control where multiple maintainers +can share a single account. + +I had no plan for either of those things, but I wanted to support them +regardless. I also wanted to isolate git access behind a 'git' user account in +the way that the Real Github does it. I'm a serious engineer. + +Once again, there are a few turnkey options, most notably [gitosis](https://github.com/tv42/gitosis) +and [gitolite](https://gitolite.com/gitolite/index.html), but I'll roll my own. + +The goal is to have a single git user, which everyone can access, and then to +filter repository access based on explicit permissions. There's a handy feature +of [sshd](https://linux.die.net/man/8/sshd) which I use, `command=` in the +`authorized_keys` file. + +Essentially, instead of executing the user's login shell (configured in +`/etc/passwd`), I can configure per authorized key a specific command, +including arbitrary parameters, to run after authentication. + +For example, the git user account's `authorized_keys` file might look like: ``` -command="/home/git/git-shell-multiplex josh",restrict sk-ecdsa-nsa-backdoor AAAA... 
-command="/home/git/git-shell-multiplex sophie",restrict sk-ecdsa-nsa-backdoor AAAB... -command="/home/git/git-shell-multiplex bazza69",restrict sk-ecdsa-nsa-backdoor AAAC... +command="/home/git/git-shell-multiplex josh",restrict sk-ecdsa-nsa-backdoor ... +command="/home/git/git-shell-multiplex sophie",restrict sk-ecdsa-nsa-backdoor ... +command="/home/git/git-shell-multiplex bazza69",restrict sk-ecdsa-nsa-backdoor ... ``` -That alone is enough to associate public keys with a user, but we still need to -implement `git-shell-multiplex` (in Rust, of course) to enable git access -without breaking our security model. - -The goal here is to only accept the three commands that a git remote actually -requires, and to validate the permissions for the repo accessed against the -username associated with the public key that we setup in `authorized_keys`. Rust -is to some degree, a terrible choice for this kind of scripting, but we don't -let bad ideas get in the way of doing whatever we like and the script is -[here!](https://git.nega.tv/?p=josh/git-shell-multiplex;a=blob;f=src/main.rs;h=db2d9cbc1c3dbf763a6204c8fc0a2b39f6c7e30f) - -For permissions this currently re-uses the `git-daemon-export-ok` marker file to -determine whether a project is *public* (readable by anybody with an account, -this file name will make sense later), and looks for a file called -`git-shell-multiplex-contributors` for a list of users with write access in -addition to the *owner* of the repository (the one whose name is at the front of -the path e.g. `josh/repo.git` would be owned by the user "josh"). - -With these pieces in place, we can do `git clone git@nega.tv:josh/narcissus.git` -and `git push origin main`! Not too hard after all. - -One downside of this is the configuration nightmare contained in -`authorized_keys`. 
Since we have one user it's no big deal, but it's worth -noting that you can also replace the `authorized_keys` _file_ with an -`authorized_keys` _command_ using `AuthorizedKeysCommand` in -[sshd_config](https://linux.die.net/man/5/sshd_config). This could then find the -appropriate keys and user configuration directly from some database, and -likewise the multiplexer itself could source the user and permission data from a -central location rather than digging around looking for specially named files. +Then, I implement _git-shell-multiplex_ (in Rust, of course) to run permission +checks and validate users are only using git commands. Rust is a questionable +choice for this kind of scripting, but I don't let bad ideas get in the way of +doing whatever I like. The script is [available here!](https://git.nega.tv/?p=josh/git-shell-multiplex;a=blob;f=src/main.rs;h=db2d9cbc1c3dbf763a6204c8fc0a2b39f6c7e30f). + +In order to define the repository permissions I'm currently reusing the `git-daemon-export-ok` +marker file to enable read-access, and a new file, `git-shell-multiplex-contributors`, +which contains a list of users with write access[^write-access]. + +[^write-access]: This is *in-addition* to the account which contains the +repository. So 'josh' always has full access to 'josh/repo.git' + +With that sorted, `git clone git@nega.tv:josh/narcissus.git` and +`git push origin main` work! Not too hard after all. + +One downside is the configuration nightmare in the `authorized_keys` file, but +I'm just one person so it's not a big deal. You could also replace the +`authorized_keys` _file_ with an `authorized_keys` _command_ using +`AuthorizedKeysCommand` in [sshd_config](https://linux.die.net/man/5/sshd_config). +This would allow writing a single script to look up the appropriate keys and +configuration from a database shared with the multiplexer. 
# HTTPS Git Access Git has two different [http protocols](https://git-scm.com/docs/http-protocol), -a simple v1 protocol, and a smart v2 protocol. This doesn't matter too much for -us because I don't care about pushing via https, so we're basically just going -to setup [git-http-backend](https://git-scm.com/docs/git-http-backend) in -read-only configuration. This requires fastcgi to be -configured! - -My server is using [nginx](https://nginx.org/), and the configuration shown will -reflect that. +a simple v1 protocol, and a smart v2 protocol. Since I don't care about pushing +over https, I just setup [git-http-backend](https://git-scm.com/docs/git-http-backend) +in read-only configuration. `nginx.conf (excerpt)` @@ -145,28 +141,24 @@ location ~ ^(.*)\.git/(HEAD|info/refs|objects/info/.*|git-(upload|receive)-pack) } ``` -In short, we filter out a bunch of git-specific paths and punt them to -`git-http-backend`. Of special note is the captures in the location regex, -they're reconstructed into `PATH_INFO` somewhat strangely so that we can expose -a http url in the form `https://git.nega.tv/josh/git-shell-multiplex.git`, and -have it find the repo folder itself at `/var/git/josh/git-shell-multiplex/`, -without the `.git` suffix. +I filter out a bunch of git-specific paths and punt them to _git-http-backend_ +over fastcgi. Of special note are the captures in the location regex; they let +me drop the `.git` suffix when constructing `PATH_INFO` so an incoming url like +`https://git.nega.tv/josh/git-shell-multiplex.git` invokes _git-http-backend_ +without the suffix, finding the actual repository at `/var/git/josh/git-shell-multiplex/`. -`git-http-backend` looks for the file `git-daemon-export-ok` in a repo, and only -allows access to those with the marker file. This actually leaks the existance -of private repos, since it reports 'no permissions' rather than 'not found' but -for our purposes this isn't a big deal. 
+_git-http-backend_ looks for the file `git-daemon-export-ok` in a repo, and only +allows access to those with the marker file. This leaks the existence of private +repos, since it reports 'no permissions' rather than 'not found' but for my +purposes this isn't a big deal. # Web Interface There are two simple options for a web based repository viewer, [cgit](https://git.zx2c4.com/cgit/about/), and -[gitweb](https://git-scm.com/docs/gitweb). Since gitweb is hosted as part of the -git distribution I chose that one, but honestly I'm not super excited about -either option. This requires fastcgi. - -My server is using [nginx](https://nginx.org/), and the configuration shown will -reflect that. +[gitweb](https://git-scm.com/docs/gitweb). Since _gitweb_ is hosted as part of the +git distribution I use that one, but honestly I'm not super excited about +either option. This also requires fastcgi. `nginx.conf (excerpt)` @@ -186,23 +178,23 @@ location / { } ``` -There's nothing really interesting in the config here, in `gitweb.conf` we -enable syntax highlighting and that's about it. Everything is as it comes. -gitweb will only expose repositories which contain the `git-daemon-export-ok` -marker file. +There's nothing exciting in the config here. In `gitweb.conf` I enable syntax +highlighting and blame, but that's it. In the same way as _git-http-backend_, +_gitweb_ is configured to only expose repositories which contain the +`git-daemon-export-ok` marker file. # FastCGI Setup -Honestly I'm somewhat baffled by the fact that CGI still exists in the current -day. One of the first things I ever did with a webserver, trying to configure a -CGI script, continues to be a thing which exists in the current day. Incredible. +Honestly I'm somewhat baffled by the fact that CGI still exists. One of the +first work tasks I ever was given was configuring CGI scripts, and somehow this +practice continues in the current day. Incredible. 
-Both [gitweb](xnoxronsnxroorxy), and [git-http-backend](zrqzpyrszyrsrzol) use -CGI, so we're going to set it up to play nice with the git user and the httpd -user, and we're going to do all of this in a systemd service triggered by a -systemd socket. +Both _gitweb_, and _git-http-backend_ use FastCGI, so I set it up to play +nice with the git user and the httpd user. For sanity, I set this up as a +systemd service triggered by a systemd socket. `/etc/systemd/system/fcgiwrap.socket` + ``` [Unit] Description=fcgiwrap Socket @@ -231,20 +223,17 @@ StandardError=syslog Also=fcgiwrap.socket ``` -Take note of the user and group of the service, the user being git avoids issues -with the git repository being accessed (via gitweb) from a user that does not -own the repository, and the group being nginx gives access to the socket to the -nginx service which actually hosts everything. This is probably not strictly -ideal, you maybe want to add one or the other group to one or the other user -instead, but it was enough to get us up and running. +Take note of the user and group. git tooling doesn't like repositories that it +doesn't own, so I make the owner 'git', and nginx needs access to the unix socket, +so I use the 'nginx' group. This is probably not ideal - you might want to change +the groups of each user instead. -Also note the `StandardError=syslog` line. You're going to want this when you -try to debug why the stupid CGI script isn't working properly. +Also note the `StandardError=syslog` line. It's important when trying to debug. # Future Work -This is enough to achieve the most basic basics of a scm host. It's good enough -for me, but I wouldn't mind having some more bells and whistles. +I've duct-taped enough software together to create the basics of a scm host. It's +good enough for me, but I wouldn't mind having some more bells and whistles. - Issue Tracker. @@ -259,6 +248,6 @@ the code view so it's actually explorable. file soup. 
However these are all significantly more work than is achievable in a single -weekend, so they'll have to wait until next weekend. +weekend; they'll have to wait until next weekend. -Finally, you can find the fruits of my labors over at [git.nega.tv](https://git.nega.tv/) \ No newline at end of file +You can find the fruits of my labors over at [git.nega.tv](https://git.nega.tv/) \ No newline at end of file -- 2.49.0