Motivation
So, where do we come from? I started using the Mercurial version control system around 2009, if I remember correctly. I had used Subversion and SVK (does anyone remember that?) before and was curious about distributed version control. Back then Mercurial was better suited for platform-independent use on Linux and Windows, and I still had to use the latter at work. Mercurial’s user interface was very much the same as Subversion’s with basically just push and pull added, so switching from Subversion to Mercurial was easy. Git was weird at the time, at least for me. These days we use Git exclusively at work, and it got a lot better on Windows over time. However, for nostalgic reasons and to stay somewhat fluent in other VCS, I kept most of my private projects on Mercurial.
For “hosting” my own repos I used the Debian package mercurial-server, at least up to Debian 9 (stretch), but after upgrading that server to Debian 10 (buster) things started falling apart, and I looked around for a new hosting solution. For the record: I thought about converting all those repos to Git, but opted against it, because I have accumulated quite a number of repos, and although I did convert one or two already, I figured it would be easier to switch the hosting than each repo.
Speaking of hosting: I don’t need a huge forge for myself, just some rather simple solution for having server-side “central” repos so I can easily work from different laptops or workstations. So I scanned over MercurialHosting in the Mercurial wiki, and every self-hosting solution seemed like cracking a nut with a sledgehammer, except HGKeeper.
Introducing HGKeeper
HGKeeper introduces itself like this in its own repo:
HGKeeper is an server for mercurial repositories. It provides access control for SSH access and public HTTP access via hgweb.
It was originally designed to be run in a container but recently support has been added to run it from an existing openssh-server.
SSH and simple HTTP is all I need, and running in a container suits me well, especially since I had started deploying things with Docker and Ansible and could use a little more practice with that. Running in a container is especially helpful when running things implemented in those fancy new languages like Go or Rust on an old-fashioned Linux like Debian. (For reference see for example Package managers all the way down on lwn about how modern languages create a dependency hell for classical Linux distributions.)
Running the HGKeeper Docker container itself was easy, however SSH access would go through a non-standard port, at least if I wanted to keep accessing the host machine through port 22.
The README promised that HGKeeper can also be run together with OpenSSH on the default port. But is it possible to do all of this at once? Run in a container, access HGKeeper through port 22, and keep access to the host on the same port? I reached out to Gary Kramlich, the author of HGKeeper, and that was a very nice experience. Let’s say I nerd sniped him somehow?!
Installing HGKeeper
So the goal is to run HGKeeper in a Docker container and access it through OpenSSH. While doing the setup I decided to route everything through an SSH server on a different machine, the one that is exposed to the internet from my local network anyway, and where mercurial-server was installed before. So access from outside goes through OpenSSH on standard port 22 on hg.example.com, which is an alias for the virtual machine falbala.internal.example.com. That machine tunnels the traffic to another virtual machine, miraculix.internal.example.com, where HGKeeper runs in the Docker container, with SSH port 22022 exposed to the local network.
Preparations
We follow the HGKeeper README and prepare things on the Docker host (miraculix) first. I created a directory /srv/data/hgkeeper where all related data is supposed to live. In the subfolder host-keys I created the SSH host keys as suggested in section “SSH Host Keys”:
$ ssh-keygen -t rsa -b 4096 -o -f host-keys/ssh_host_rsa_key
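An RSA key is what the README shows; if you want an ed25519 host key as well, it can be generated the same way. Whether HGKeeper actually picks up additional key types in this directory is an assumption on my part, so check the README for your version:

```shell
# Assumption: hgkeeper reads all ssh_host_* keys found in host-keys/;
# verify against the README for your version.
# -N '' sets an empty passphrase, -f names the output file.
mkdir -p host-keys
ssh-keygen -t ed25519 -N '' -f host-keys/ssh_host_ed25519_key
```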
The Docker container itself needs some preparation, so we run it once manually, as suggested in section “Running in a Container”. The important part here is to pass the SSH public key of the client workstation you will first access the HG repos from. I copied it from my laptop to /srv/data/hgkeeper/tmp/ beforehand. The admin username passed here (alex) should also be adapted to your needs:
cd /srv/data/hgkeeper
docker run --rm \
    -v $(pwd)/repos:/repos \
    -v $(pwd)/tmp/id_rsa.pub:/admin-pubkey:ro \
    -e HGK_ADMIN_USERNAME=alex \
    -e HGK_ADMIN_PUBKEY=/admin-pubkey \
    -e HGK_REPOS_PATH=/repos \
    docker.io/rwgrim/hgkeeper:latest \
    hgkeeper setup
Setting up OpenSSH
As stated before, I tried to set up as many things as possible with Ansible. The preparation steps above could probably also be done with Ansible, but I had them in place from playing around and did not bother at the time. It depends on your philosophy anyway whether you want to automate sensitive tasks like creating crypto keys. From here on, however, I have everything in a playbook and will show snippets to illustrate my setup. The OpenSSH config is for the host falbala and thus lives in falbala.yml; see the first part here:
---
- hosts: falbala
  become: true

  vars:
    hg_homedir: /var/lib/hg

  tasks:
    - name: Add system user hg
      user:
        name: hg
        comment: Mercurial Server
        group: nogroup
        shell: /bin/sh
        system: yes
        create_home: yes
        home: "{{ hg_homedir }}"
That system needs a local user. You can name it as you like, but it was named hg on my old setup and I wanted to keep that, so I don’t have to change my working copies. The user needs a shell set, otherwise OpenSSH won’t be able to run the commands needed later. I use the $HOME directory to hold the SSH known_hosts file, so it does not clutter my global settings. Doing this manually on Debian would look like this:
% sudo adduser --system --home /var/lib/hg --shell /bin/sh hg
The next step is that known_hosts file. You can create it manually by logging in as the hg user once and making a connection to the SSH server on the other machine like this:
$ sudo -i -u hg
$ ssh -p 22022 hg@miraculix.internal.example.com
For Ansible I prepared a known_hosts file, which was somewhat tricky due to the non-standard port. You cannot just look into your existing files for reference, because host and port are hashed in there, and the documentation (man 8 sshd) does not cover that part. I had to guess the format from ssh -v output. The file I came up with is named pubkeys/hgkeeper in my Ansible project and it looks like this:
[miraculix.internal.example.com]:22022 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDD8hrAkg7z1ao3Hq1w/4u9Khxc4aDUfJiKfbhin0cYRY7XrNIn3mix9gwajGWlV1m0P9nyXiNTW4E/ZW0rgTF4I1PZs3dbh66dIAH7Jif4YLFj5VPj350TF5XeytyFjalecWBa36S1Y+UydIY/o/yC104D5Hg7M27bQzL+blqk1eIlY0aaM+faEuxFYHexK5fa+Xq150F6NswHdsVPCYOKu6t+myGHpe2+X6qhVNuftDOP5JQO6BzxhN6MmG1arZ9dkeBb6Ry++R4o3soeV1k9uZ33jbJGnqFryvL3cyOPq7mVdoSffwqef1i4+0fNTGgO8U93w2An6z5fRjvPufA+VIVvFDwRoREFKvO1Q+WdeOSUWOl6QwVjKPrv0M3QnSnTJHpZpNlshOaDZyQNHLXLEO43vdbGr6rk7l9ApUcF34Y7eLWp42XktQLlzDitua009v7uNBAuIzKR3+UAWaFpj+CGl1jDm7a3n8kXlJjumVN5hfXo0lLz7n+G/Yd/U87dHftL0kiYcVRR4n1qMmhV5UL4lq0FNDBwwzRzSKyNw80mRoMHRiKBBUTFXJApzlIAXiJ7g1JThM2rcNnskpyhZSrL38ses5Ns2GBOzEZsi51U+S5O91+KwHDTb10sxoJskUvIyJxCUILkOGZpbd4uWI+6tAWycP4QMT33MUHFEQ==
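Instead of guessing the format from ssh -v output, the entry can also be produced mechanically from the host public key you already have on the Docker host. A sketch with a throwaway key (on the real machine you would read /srv/data/hgkeeper/host-keys/ssh_host_rsa_key.pub instead); ssh-keyscan -p 22022 miraculix.internal.example.com would print entries in the same [host]:port format, assuming the server is already up:

```shell
# Sketch: build the "[host]:port key-type key" line from a host public key.
# Throwaway key for the demo; on the Docker host you would use
# /srv/data/hgkeeper/host-keys/ssh_host_rsa_key.pub.
ssh-keygen -q -t rsa -b 2048 -N '' -f demo_host_key
printf '[miraculix.internal.example.com]:22022 %s\n' \
    "$(cut -d' ' -f1,2 demo_host_key.pub)" > hgkeeper.known_hosts
cat hgkeeper.known_hosts
```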
With that in place, it’s straight forward in the playbook:
    - name: Ensure user hg has .ssh dir
      file:
        path: "{{ hg_homedir }}/.ssh"
        state: directory
        owner: hg
        group: nogroup
        mode: '0700'

    - name: Ensure known_hosts entry for miraculix exists
      known_hosts:
        path: "{{ hg_homedir }}/.ssh/known_hosts"
        name: "[miraculix.internal.example.com]:22022"
        key: "{{ lookup('file', 'pubkeys/hgkeeper') }}"

    - name: Ensure access rights for known_hosts file
      file:
        path: "{{ hg_homedir }}/.ssh/known_hosts"
        state: file
        owner: hg
        group: nogroup
For the SSH daemon configuration two things are needed. First, if you use domain names instead of IP addresses, you have to set UseDNS yes in sshd_config. This snippet does it with Ansible:
    - name: Ensure OpenSSH does remote host name resolution
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?UseDNS'
        line: 'UseDNS yes'
        validate: /usr/sbin/sshd -T -f %s
        backup: yes
      notify:
        - Restart sshd
The second and most important part is matching the hg user, authenticating against the HGKeeper app running on the other host, and tunneling the traffic through. This is done with the following snippet, which contains the literal block to add to /etc/ssh/sshd_config if you’re doing it manually:
    - name: Ensure SSH tunnel block to hgkeeper is present
      blockinfile:
        path: /etc/ssh/sshd_config
        marker: "# {mark} ANSIBLE MANAGED BLOCK"
        insertafter: EOF
        validate: /usr/sbin/sshd -T -C user=hg -f %s
        backup: yes
        block: |
          Match User hg
              AuthorizedKeysCommand /usr/bin/curl -q --data-urlencode 'fp=%f' --get http://miraculix.internal.example.com:8081/hgk/authorized_keys
              AuthorizedKeysCommandUser hg
              PasswordAuthentication no
      notify:
        - Restart sshd
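For context: sshd substitutes %f with the fingerprint of the public key the client offered, calls the command, and expects matching authorized_keys lines on stdout. To get a feel for the value that gets passed around, you can compute such a fingerprint locally; this is a sketch with a throwaway key, and the curl line in the comment mirrors the one from the config, which is of course only reachable inside the network:

```shell
# %f in AuthorizedKeysCommand expands to the fingerprint (SHA256 by
# default) of the key offered by the client. Demo with a throwaway key:
ssh-keygen -q -t ed25519 -N '' -f demo_id
fp=$(ssh-keygen -lf demo_id.pub | awk '{print $2}')
echo "$fp"
# On falbala, sshd would then effectively run (hgkeeper must be up):
#   curl -q --data-urlencode "fp=$fp" --get \
#       http://miraculix.internal.example.com:8081/hgk/authorized_keys
```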
You might have noticed two things: curl has to be installed, and sshd should be restarted after its config has changed. Here:
    - name: Ensure curl is installed
      package:
        name: curl
        state: present

  handlers:
    - name: Restart sshd
      service:
        name: sshd
        state: reloaded
Running HGKeeper Docker Container with Ansible
Running a Docker container from Ansible is quite easy. I just translated the docker run call and its arguments from the HGKeeper documentation:
---
- hosts: miraculix
  become: true

  vars:
    data_basedir: /srv/data
    hgkeeper_data: "{{ data_basedir }}/hgkeeper"
    hgkeeper_host_keys: host-keys
    hgkeeper_repos: repos
    hgkeeper_ssh_port: "22022"

  tasks:
    - name: Setup docker container for HGKeeper
      docker_container:
        name: hgkeeper
        image: "docker.io/rwgrim/hgkeeper:latest"
        pull: true
        state: started
        detach: true
        restart_policy: unless-stopped
        volumes:
          - "{{ hgkeeper_data }}/{{ hgkeeper_host_keys }}:/{{ hgkeeper_host_keys }}:ro"
          - "{{ hgkeeper_data }}/{{ hgkeeper_repos }}:/{{ hgkeeper_repos }}"
        env:
          HGK_SSH_HOST_KEYS: "/{{ hgkeeper_host_keys }}"
          HGK_REPOS_PATH: "/{{ hgkeeper_repos }}"
          HGK_EXTERNAL_HOSTNAME: "miraculix.internal.example.com"
          HGK_EXTERNAL_PORT: "{{ hgkeeper_ssh_port }}"
        ports:
          - "8081:8080"                     # http
          - "{{ hgkeeper_ssh_port }}:22222" # ssh
        command: hgkeeper serve
Client Settings
Almost done: after copying my old repos from the old virtual machine to the new one, the server is ready. For the laptops and workstations nothing in my setup has to change, except one thing: the new setup needs agent forwarding in the SSH client config. That is simple; see the lines I added to ~/.ssh/config here:
Host hg.example.com hg
    HostName hg.example.com
    ForwardAgent yes
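With that in place, working with the repos looks just like before. A sketch of a clone through the whole chain; the repo name “hgkeeper” is the admin repo from the setup step, any other repo path would work the same way (this obviously only runs against the live server):

```shell
# Clone via falbala (port 22), which tunnels to hgkeeper on miraculix.
# "hgkeeper" is the admin repo created by `hgkeeper setup`; substitute
# your own repo path as needed.
hg clone ssh://hg@hg.example.com/hgkeeper
```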
All in all, this was a pleasant endeavor, both in working on the project itself and in the outcome I have now.