Recovering from git push fuckup

For some of my projects the central Git server is hosted with gitolite for a long time now. Some time ago I forked a repo bss (example name) to a new repo gee, where both should keep the history of bss, but head in a different direction, including different tags. While doing changes on both repos in different branches I still need to merge an old branch of bss to the recent branch of gee, but usually not in the other direction.

To not accidentally break things, my working copies of both projects only have one remote, the one designated for the original or the fork. For merge or cherry-pick operations in one or the other directions I have a special working copy with both remotes configured. Today after a cherry-pick from gee to bss I messed up pushing, and ended up with gee/master pushed to bss/master, which was wrong as hell, and cold sweat appeared on my head. Damn.

So I had to rewind bss/master to a known good state, but on the server.
WARNING: Make 100% sure you know what you do here to prevent DATA LOSS or your colleagues being mad at you or both.

First try was with the obvious git push --force which was denied by gitolite right away. If you do not know why force pushing usually is a bad thing, stop reading here and learn why it is a bad thing.

Did I warn you about possible data loss?

Second try usually works with branches which are not the default branch, like this (assuming I messed up origin/mybranch and want to push a different version):

$ git push --delete origin mybranch
$ git push --set-upstream origin mybranch

This won’t work with the default branch however with this error message:

$ git push --delete origin master
remote: error: By default, deleting the current branch is denied, because the next
remote: 'git clone' won't result in any file checked out, causing confusion.
remote:
remote: You can set 'receive.denyDeleteCurrent' configuration variable to
remote: 'warn' or 'ignore' in the remote repository to allow deleting the
remote: current branch, with or without a warning message.
remote:
remote: To squelch this message, you can set it to 'refuse'.
remote: error: refusing to delete the current branch: refs/heads/master
To ssh://git.example.com/bss
 ! [remote rejected]   master (deletion of the current branch prohibited)
Fehler: Fehler beim Versenden einiger Referenzen nach 'ssh://git.example.com/bss'

This is not a gitolite problem, but Git itself complains here. Why? For bare repositories Git has a concept of a default branch. This is the branch you get when cloning a repo without specifying a branch. It is marked by the ref named HEAD. Now the idea to recover is like this:

  1. create a new temporary branch on the same changeset you want to have as master on the server
  2. push that branch to the server
  3. change the default branch on the server
  4. do the push –delete / push dance from above (with master now not being the default branch anymore)
  5. change the server’s default branch back
  6. delete the temporary branch both local and remote

In recent Git changing the default branch is done with git symbolic-ref and that would have to be executed on the gitolite server.

Thankfully on recent gitolite V3 you can execute that through SSH, and I learned this from stackoverflow. These were the commands I used (including output), with main being my temporary branch (see above):

$ ssh git@git.example.com symbolic-ref bss HEAD
refs/heads/master
$ ssh git@git.example.com symbolic-ref bss HEAD refs/heads/main
$ ssh git@git.example.com symbolic-ref bss HEAD                
refs/heads/main

Now this is the time to delete master on the remote and push it again from a known good state. Then switch back the default branch.

$ ssh git@git.example.com symbolic-ref bss HEAD refs/heads/master
$ ssh git@git.example.com symbolic-ref bss HEAD                  
refs/heads/master

Everything fine again now. Wrote this for my future self.

Remote debugging with KDevelop and ptxdist

At work we use ptxdist for building firmware images for embedded devices. Our own software is built with CMake and my personal Linux desktop of choice is KDE with KDevelop as Integrated Development Environment. This is nothing new and I wrote about different aspects of that in the past. Today it’s about remote debugging.

Introduction

Writing C/C++ software and building it with CMake is well integrated in KDevelop. Running and debugging things on the same host (your personal computer or laptop) from within KDevelop with gdb is smooth. Things get interesting if you do cross compiling, and the software runs on some embedded target which has not enough space for debugger plus debug symbols plus source fragments. You could use NFS, but every now and then we find ourselves in the position where we need to debug something on devices deployed in the field. What you usually do in this case is installing gdbserver on the target and connect to it from the gdb on your host computer.

The command line way

Remote debugging on the command line is easy from within your ptxdist based Board Support Package (BSP) if you are comfortable enough with gdb or cgdb on the shell. The process is basically as described in HowTo: Debug core dumps with ptxdist 2017.07.0 already, you call ptxdist with the argument gdb. For remote debugging you would do this:

  1. Build gdbserver for your target by selecting it from the menu (like with your other target packages)
  2. Start gdbserver on the target e.g. like this:
    gdbserver 192.168.10.184:12345 mytool --my-arguments
  3. Start gdb on the host from your BSP workspace:
    ptxdist gdb
  4. In the gdb command line connect to gdbserver like this:
    target remote 192.168.10.184:12345
  5. Use gdb as usual (set breakpoints, run, inspect, etc.)

The magic of choosing the correct gdb suitable for the target and setting the necessary options for finding debug symbols and getting paths right is done by ptxdist in step 3. Maybe all this could go to ptxdist’s documentation, but it’s more or less straightforward for experienced developers.

Using KDevelop to edit CMake based projects in your ptxdist BSP

Before we come to debugging, we start with importing a CMake project from the BSP into KDevelop. Please make sure everything builds fine first, a ptxdist go should be completed without errors to have everything in place.

For demonstration I’ll import mosquitto, because it’s a CMake project. I first enabled PTXCONF_MOSQUITTO and let ptxdist built it once.

Then in KDevelop I choose Open / Import Project … and a dialog pops up. The project file I select is /path/to/my/BSP/platform-xyz/built-target/mosquitto-2.0.5/CMakeLists.txt and I usually just select the main CMakeLists.txt of a project here.

CMake projects are usually built out-of-tree, so the source file folders are not clobbered with build artefacts and ptxdist does OOT builds too. In the next dialog KDevelop asks for a build directory, but the suggested one is not the one ptxdist uses. We should select the one from ptxdist however, which in this case is /path/to/my/BSP/platform-xyz/built-target/mosquitto-2.0.5-build and it’s always directly next to the source dir in ptxdist just with -build appended. In that dialog KDevelop tells us now: „Using an already created build directory.“ and that’s exactly what we want here, because ptxdist already set this up with the necessary cross-compiling options.

You could use that to actually code in that source tree now, and even let KDevelop build it locally. That’s a whole other can of worms however and out of scope of what we actually wanna show in this blog post (remote debugging).

Remote debugging with KDevelop

If you never used gdb from within KDevelop before, you should read Running programs in KDevelop and Debugging programs in KDevelop from the KDevelop documentation first.

So, create a Launch Configuration, name it as you like it, instead of using some project target as executable, select „Executable“ and use the binary from the BSP root folder, e.g. this in case of mosquitto_pub (we just use this as example now): /path/to/my/BSP/platform-xyz/root/usr/bin/mosquitto_pub

Then go to the Debug dialog of that Launch Configuration which should look like this somehow:

Screenshot of KDevelop Launch Configurations dialog

As Debugger executable we select the one from the toolchain. Easiest is to click through to your BSP folder, then further to platform-xyz/selected_toolchain and from there choose the gdb. In my case I have
/path/to/my/BSP/platform-xyz/selected_toolchain/arm-v7a-linux-gnueabihf-gdb
here, which actually is in
/opt/OSELAS.Toolchain-2020.08.0/arm-v7a-linux-gnueabihf/gcc-10.2.1-clang-10.0.1-glibc-2.32-b
inutils-2.35-kernel-5.8-sanitized/bin

but it’s a lot easier to use that selected_toolchain symlink here.

Up to here that was the „easy“ part. The main thing I struggled with a lot comes with those three scripts for Remote Debugging seen in the lower half of the dialog. You put those in three different files and those can reside anywhere you like. Let’s dive in.

Gdb config script

The Gdb config script is used to setup gdb and you can think of it like things done before gdb is started. The KDevelop help if you hover over it says: „This script is sourced by gdb when the debugging starts.“ We will put the same in here ptxdist uses when calling gdb, but have to add some more. This is essential. If you don’t get this right, things will not work as expected.

set debug-file-directory /path/to/my/BSP/platform-xyz/root/usr/lib/debug
add-auto-load-safe-path /path/to/my/BSP/platform-xyz/root
set sysroot /path/to/my/BSP/platform-xyz/root
set substitute-path platform-xyz /path/to/my/BSP/platform-xyz
directory /path/to/my/BSP

The first four lines do the same what ptxdist does when calling it with the gdb argument. The last line is exceptionally important, because KDevelop is not started from the BSP workspace directory as it’s the case when calling ptxdist gdb. This is essential however, because since ptxdist-2018.10.0 you only have paths relative to your BSP in the debug symbols. If you don’t add that extra directory gdb started by KDevelop won’t find the source files and breakpoints won’t work. (You can set the breakpoints in KDevelop, but gdb does not find the source files as communicated by KDevelop. Result is: breakpoints stay pending and execution is not interrupted. If you set the breakpoints manually through the GDB console, execution is interrupted, but KDevelop does not jump to the correct source lines. Both not satisfying.)

(Debugging without that directory line probably works when calling ptxdist gdb from the BSP workspace folder, because gdb has some paths added automatically. See Specifying Source Directories from the gdb documentation and especially the section about two special entries ‘$cdir’ and ‘$cwd’.)

Run shell script

The Run shell script can be used to let KDevelop start the application on the remote target. This works with ssh for example, but is only smooth if that connection can be made without typing a password. One example to put into such a script:

ssh root@192.168.10.185 'gdbserver 192.168.10.185:12345' /usr/bin/mosquitto_pub -h 192.168.10.74 -t 'hello' -m 'world'

Consider this optional. You might as well start gdbserver on the target manually with whatever options you need. (This is different for example if you want to attach to an already running process.)

Run gdb script

This is executed by KDevelop on your behalf, usually to connect to the remote gdbserver. You could as well type it into the GDB console in KDevelop. This is what you could put in such a script:

shell sleep 3
target remote 192.168.10.185:12345

Aftermath

The whole thing caused me some headaches, especially because things changed when ptxdist introduced layers back in late 2018. However it’s a lot more comfortable to debug in the same environment as you code and especially inspection of frame stacks and variables is much more comfortable in a graphical environment than on the console. I think it’s worth the effort and maybe this HowTo helps someone else to use a similar setup.

You can of course create those three files manually. What we did at work was writing a shell script creating those from some template, so we can recreate them again with different settings for hosts, binary paths, etc. This is not complicated however, you can do that by yourself.

Building multiple inter-dependent autotools based projects

One of the main reasons for this blog is documenting things for my future self. This post is one of these.

Building a HTTP REST API with C++ is a quite unique type of challenge, but doing it with libhttpserver takes a lot of the lower level protocol (read: HTTP and below) burden away from you, and lets you focus on the content part of your problem. This is what we did for a new project at work lately. But sometimes when working with external dependencies you hit some bugs and corner cases and have to dig into those projects.

Building libhttpserver is almost straight forward, if you had already built autotools based projects before. You’ll notice however it depends on a quite recent version of libmicrohttpd. So the task for hacking on libhttpserver is to build both libraries from source, without interfering with the rest of your system.

The usual way to build autotools based projects is like this:

% ./configure
% make

(It works exactly like this if building from an extracted source tarball. There’s another step in front of those, if you build from a Git working copy.)

Now after the naïve way to build libmicrohttpd, how can we tell libhttpserver to pick that up as a dependency? What I do is “installing” them both to the same place in a tree just for this purpose, and I use the --prefix option of ./configure for that with a directory I created before. So for libmicrohttpd from Git it would look like this:

% ./bootstrap
% mkdir -p build && cd build
% ../configure --prefix=/home/adahl/build/sysroot/http
% make
% make install

After that I have the following file tree:

% tree ~/build/sysroot/http
/home/adahl/build/sysroot/http
├── include
│   └── microhttpd.h
├── lib
│   ├── libmicrohttpd.a
│   ├── libmicrohttpd.la
│   ├── libmicrohttpd.so -> libmicrohttpd.so.12.60.0
│   ├── libmicrohttpd.so.12 -> libmicrohttpd.so.12.60.0
│   ├── libmicrohttpd.so.12.60.0
│   └── pkgconfig
│       └── libmicrohttpd.pc
└── share
    ├── info
    │   ├── dir
    │   ├── libmicrohttpd.info
    │   ├── libmicrohttpd_performance_data.png
    │   └── libmicrohttpd-tutorial.info
    └── man
        └── man3
            └── libmicrohttpd.3

7 directories, 12 files

So let’s try to build libhttpserver from Git master now:

% ./bootstrap
% mkdir -p build && cd build
% ../configure --prefix=/home/adahl/build/sysroot/http

But that gives:

checking for microhttpd.h... no
configure: error: "microhttpd.h not found"

So just giving the prefix is not enough. We need to pass some directories to preprocessor (for finding header files) and linker (to link against the other shared lib). You can do it like this:

% CPPFLAGS=-I/home/adahl/build/sysroot/http/include LDFLAGS=-L/home/adahl/build/sysroot/http/lib ../configure --prefix=/home/adahl/build/sysroot/http

Looking complicated? I bet. In our case we have only two projects, but what if there are even more? Gnu autotools has a nice feature which can help here though: Overriding Default Configuration Setting with config.site. In short: Put those preprocessor, compiler, and linker flags into a special file in your install tree, and ./configure will pick it up automatically and use the settings. In my case, I create a file /home/adahl/build/sysroot/http/share/config.site and put in the following:

test -z "$LDFLAGS" && LDFLAGS=-L/home/adahl/build/sysroot/http/lib
test -z "$CPPFLAGS" && CPPFLAGS=-I/home/adahl/build/sysroot/http/include

We can omit that stuff when calling ./configure now and everything is found and compiled and linked together. We can tell it’s picked up by the very first line of ./configure output:

% ../configure --prefix=/home/adahl/build/sysroot/http  
configure: loading site script /home/adahl/build/sysroot/http/share/config.site
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
…

So that’s all for today, I hope it will help my future self to look up what that magic file is called, where it must be placed, and where the documentation for that can be found.

Update: I make use of this mechanism in some of my CMake superbuild projects, e.g. in glowing-tribble-build.

Moving to HGKeeper

Motivation

So, where do we come from? I started using the Mercurial version control system around 2009 if I remember correctly. Had used Subversion and SVK (does anyone remember that?) before and was curious about distributed version control. Back then Mercurial was better suited for platform independent use on Linux and Windows, and I still had to use the latter at work. Mercurial’s user interface was very much the same as Subversion’s and basically just push and pull were added, switching from Subversion to Mercurial was easy. Git was weird at the time, at least for me. Meanwhile we use Git at work exclusively and it got a lot better on Windows over time. However for nostalgic reasons and to stay somewhat fluent in other VCS I kept most of my private projects on Mercurial.

For “hosting” my own repos I used the Debian package mercurial-server, at least up to Debian 9 (stretch), but after upgrading that server to Debian 10 (buster) things started falling apart, and I looked out for a new hosting solution. For the record: I thought about converting all those repos to Git, but opted for not doing that, because I have accumulated quite a number of repos and although I did convert one or two already, I supposed it would be easier to switch the hosting instead of each repo.

Speaking of hosting: I don’t need a huge forge for myself, just some rather simple solution for having server side “central” repos so I can easily work from different laptops or workstations. So I scanned over MercurialHosting in the Mercurial wiki and every self hosting solution seemed like cracking a nut with a sledgehammer, except HGKeeper.

Introducing HGKeeper

HGKeeper introduces itself like this in its own repo:

HGKeeper is an server for mercurial repositories. It provides access control for SSH access and public HTTP access via hgweb.

It was originally designed to be run in a container but recently support has been added to run it from an existing openssh-server.

SSH and simple HTTP is all I need and running in a container suits me right, especially after I had started deploying things with Docker and Ansible and could a little more practice with that. Running in a container is especially helpful when running things implemented in those fancy new languages like Go or Rust on an old fashioned Linux like Debian. (For reference see for example Package managers all the way down on lwn about how modern languages create a dependency hell for classical Linux distributions.)

Running the HGKeeper Docker container itself was easy, however SSH access would go through a non-standard port, at least if I wanted to keep accessing the host machine through port 22.

The README promised HGKeeper can also be run together with OpenSSH running on default port. But is it possible to do both or all of this? Run in a container, access HGKeeper through port 22 and keep access to the host on the same port? I reached out to Gary Kramlich, the author of HGKeeper and that was a very nice experience. Let’s say I nerd sniped him somehow?!

Installing HGKeeper

So the goal is to run HGKeeper in a Docker container and access that through OpenSSH. While doing the setup I decided to go through an SSH server on a different machine, the one that’s exposed to the internet from my local network anyways, and where mercurial-server was installed before. So access from outside goes through OpenSSH on standard port 22 hg.example.com which is an alias for the virtual machine falbala.internal.example.com. That machine tunnels the traffic to another virtual machine miraculix.internal.example.com where HGKeeper runs on in the Docker container, with SSH port 22022 exposed to the local network.

Preparations

We follow the HGKeeper README and prepare things on the Docker host (miraculix) first. I created a directory /srv/data/hgkeeper where all related data is supposed to live. In the subfolder host-keys I created the SSH host keys as suggested in section “SSH Host Keys”:

$ ssh-keygen -t rsa -b 4096 -o host-keys/ssh_host_rsa_key

The docker container itself needs some preparation, so we run it once manually like suggested in section “Running in a Container”. The important part here is to pass the SSH public key of the client workstation you will access the HG repos first from. I copied that from my laptop to /srv/data/hgkeeper/tmp/ before. The admin username passed here (alex) should also be adapted to your needs:

cd /srv/data/hgkeeper
docker run --rm \
    -v $(pwd)/repos:/repos \
    -v $(pwd)/tmp/id_rsa.pub:/admin-pubkey:ro \
    -e HGK_ADMIN_USERNAME=alex \
    -e HGK_ADMIN_PUBKEY=/admin-pubkey \
    -e HGK_REPOS_PATH=/repos \
    docker.io/rwgrim/hgkeeper:latest \
    hgkeeper setup

Setting up OpenSSH

As stated before, I tried to setup as much things as possible with Ansible. The preparation stuff above could probably also done with Ansible, but I had that in place from playing around and did not bother at the time. It depends on your philosophy anyways if you want to automate such sensitive tasks as creating crypto keys. However from here on I have everything in a playbook, and will show that snippets to illustrate my setup. The OpenSSH config is for the host falbala and thus in falbala.yml so see the first part here:

---
- hosts: falbala
 become: true

 vars:
   hg_homedir: /var/lib/hg

 tasks:
   - name: Add system user hg
     user:
       name: hg
       comment: Mercurial Server
       group: nogroup
       shell: /bin/sh
       system: yes
       create_home: yes
       home: "{{ hg_homedir }}"

That system needs a local user. You can name it as you like, but it was named hg on my old setup and I wanted to keep that, so I don’t have to change my working copies. The user needs to have a shell set, otherwise OpenSSH won’t be able to call commands needed later. I use the $HOME dir to put the SSH known_hosts file in it, so it does not clutter my global settings. Doing this manually on Debian would look like this:

% sudo adduser --system --home /var/lib/hg --shell /bin/sh hg

Next step is that known_hosts file. You can do this manually by logging into that hg user once and do a manual connection to the SSH server on the other machine like this:

$ sudo -i -u hg
$ ssh -p 22022 hg@miraculix.internal.example.com

For Ansible I prepared a known_hosts file and that was somewhat tricky due to the different port used. You can not just look into your present files for reference, because host and port are hashed in there, and the documentation (man 8 sshd) does not cover that part. I had to guess from ssh -v output. The file I came up with is named pubkeys/hgkeeper in my Ansible project and it looks like this:

[miraculix.internal.example.com]:22022 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDD8hrAkg7z1ao3Hq1w/4u9Khxc4aDUfJiKfbhin0cYRY7XrNIn3mix9gwajGWlV1m0P9nyXiNTW4E/Z
W0rgTF4I1PZs3dbh66dIAH7Jif4YLFj5VPj350TF5XeytyFjalecWBa36S1Y+UydIY/o/yC104D5Hg7M27bQzL+blqk1eIlY0aaM+faEuxFYHexK5fa+Xq150F6NswHdsVPCYOKu6t+myGHpe2+X6qhVNuftDOP5J
QO6BzxhN6MmG1arZ9dkeBb6Ry++R4o3soeV1k9uZ33jbJGnqFryvL3cyOPq7mVdoSffwqef1i4+0fNTGgO8U93w2An6z5fRjvPufA+VIVvFDwRoREFKvO1Q+WdeOSUWOl6QwVjKPrv0M3QnSnTJHpZpNlshOaDZyQ
NHLXLEO43vdbGr6rk7l9ApUcF34Y7eLWp42XktQLlzDitua009v7uNBAuIzKR3+UAWaFpj+CGl1jDm7a3n8kXlJjumVN5hfXo0lLz7n+G/Yd/U87dHftL0kiYcVRR4n1qMmhV5UL4lq0FNDBwwzRzSKyNw80mRoMH
RiKBBUTFXJApzlIAXiJ7g1JThM2rcNnskpyhZSrL38ses5Ns2GBOzEZsi51U+S5O91+KwHDTb10sxoJskUvIyJxCUILkOGZpbd4uWI+6tAWycP4QMT33MUHFEQ==

With that in place, it’s straight forward in the playbook:

   - name: Ensure user hg has .ssh dir
     file:
       path: "{{ hg_homedir }}/.ssh"
       state: directory
       owner: hg
       group: nogroup
       mode: '0700'

   - name: Ensure known_hosts entry for miraculix exists
     known_hosts:
       path: "{{ hg_homedir }}/.ssh/known_hosts"
       name: "[miraculix.internal.example.com]:22022"
       key: "{{ lookup('file', 'pubkeys/hgkeeper') }}"

   - name: Ensure access rights for known_hosts file
     file:
       path: "{{ hg_homedir }}/.ssh/known_hosts"
       state: file
       owner: hg
       group: nogroup

For the SSH daemon configuration two things are needed. First, if you use domain names instead of IP addresses, you have to set UseDNS yes in sshd_config. This snipped does it in Ansible:

   - name: Ensure OpenSSH does remote host name resolution
     lineinfile:
       path: /etc/ssh/sshd_config
       regexp: '^#?UseDNS'
       line: 'UseDNS yes'
       validate: /usr/sbin/sshd -T -f %s
       backup: yes
     notify:
       - Restart sshd

The second, and most important part is matching the hg user, authenticating against the HGKeeper app running on the other host and tunneling the traffic through. This is done with the following snippet, which contains the literal block to be added to /etc/ssh/sshd_config if you’re doing it manually:

   - name: Ensure SSH tunnel block to hgkeeper is present
     blockinfile:
       path: /etc/ssh/sshd_config
       marker: "# {mark} ANSIBLE MANAGED BLOCK"
       insertafter: EOF
       validate: /usr/sbin/sshd -T -C user=hg -f %s
       backup: yes
       block: |
         Match User hg
             AuthorizedKeysCommand /usr/bin/curl -q --data-urlencode 'fp=%f' --get http://miraculix.internal.example.com:8081/hgk/authorized_keys
             AuthorizedKeysCommandUser hg
             PasswordAuthentication no
     notify:
       - Restart sshd

You might have noticed two things. curl has to be installed, and sshd should be restarted after its config has change. Here:

   - name: Ensure curl is installed
     package:
       name: curl
       state: present

  handlers:
    - name: Restart sshd
      service:
        name: sshd
        state: reloaded

Running HGKeeper Docker Container with Ansible

Running a Docker container from Ansible is quite easy. I just translated the call to docker run and its arguments from the HGKeeper documentation:

---
- hosts: miraculix
  become: true

  vars:
    data_basedir: /srv/data
    hgkeeper_data: "{{ data_basedir }}/hgkeeper"
    hgkeeper_host_keys: host-keys
    hgkeeper_repos: repos
    hgkeeper_ssh_port: "22022"

  tasks:
   - name: Setup docker container for HGKeeper
     docker_container:
       name: hgkeeper
       image: "docker.io/rwgrim/hgkeeper:latest"
       pull: true
       state: started
       detach: true
       restart_policy: unless-stopped
       volumes:
         - "{{ hgkeeper_data }}/{{ hgkeeper_host_keys }}:/{{ hgkeeper_host_keys }}:ro"
         - "{{ hgkeeper_data }}/{{ hgkeeper_repos }}:/{{ hgkeeper_repos }}"
       env:
         HGK_SSH_HOST_KEYS: "/{{ hgkeeper_host_keys }}"
         HGK_REPOS_PATH: "/{{ hgkeeper_repos }}"
         HGK_EXTERNAL_HOSTNAME: "miraculix.internal.example.com"
         HGK_EXTERNAL_PORT: "{{ hgkeeper_ssh_port }}"
       ports:
         - "8081:8080"   # http
         - "{{ hgkeeper_ssh_port }}:22222"   # ssh
       command: hgkeeper serve

Client Settings

Almost done, after trivially copying over my old repos from the old virtual machine to the new, the server is ready. For the laptops and workstations nothing in my setup has to be changed, but one thing. The new setup needs Agent Forwarding in the SSH client config. But is simple, see the lines I added to ~/.ssh/config here:

Host hg.example.com hg
HostName hg.example.com
ForwardAgent yes

After all this was a pleasant endeavor in both working on the project itself as well as the outcome I have now.

Troubleshooting: Firefox Hangs when Connecting to Embedded Devices with Self Signed Certificates

As you might know my day job is about embedded Linux devices and as for all modern network connected devices you want TLS encrypted connections to those – which leads to all kinds of problems, especially with certificates. What we do here: let each devices create its own self signed certificate and teach the users to ignore the warnings in the webbrowsers.1

So almost all of our devices share the same CommonName (CN) and Issuer, and when developing I connect to numerous of them over time. So far no big deal, however Mozilla Firefox does not like all those similar but not equal certificates. It gets slow over time, really slow, so slow connections to the devices run into timeouts. Other sites are not affected however.

My colleague ran into this last year, my browser was not affected until today. What he did: start over with a new user profile in Firefox. Works, but all your addons and customizations are gone. You could of course also use another browser. Well, I did not want that, so I searched the web for a cause or even a solution.

I found that blog post Troubleshoot Firefox’s “Performing TLS Handshake” Message and it gave me a hint for a working workaround. What I did, deleted those two files from my profile (while Firefox was not running):

  • cert8.db
  • cert9.db

That did it, connections to my devices are nice and fast again. Have to search for the bugreport at the Mozilla Firefox project next …

Update: I tried to find some matching bug reports, and found some.

  1. Yes, we are aware of the problems, it’s complicated … []

Running EAGLE 9.6 on Debian 10 (buster)

For some side project I wanted to look at the schematics of the Adafruit PowerBoost 500C, which happened to be made with Autodesk EAGLE.

Having run EAGLE on Debian 9 (stretch) for a while now without great hassle, I did not expect much difficulties, I was wrong. First I downloaded the tarball from their download site. Don’t worry, there’s still the “free” version for hobbyists, however it’s not Free Software, but precompiled binaries for amd64 architecture, better than nothing.

After extracting, I tried to start it like this and got the first error:

alex@lemmy ~/opt/eagle-9.6.0 % ./eagle
terminate called after throwing an instance of 'std::runtime_error'
  what():  locale::facet::_S_create_c_locale name not valid
[1]    20775 abort      ./eagle

There’s no verbose or debug option. And according to the comments on the blog post “How to Install Autodesk EAGLE On Windows, Mac and Linux” the problem also affects other users. I vaguely remembered somewhere deep in the back of my head, I already had this problem some time ago on another machine, and tried something not obvious at all. My system locale is German, looked like this before:

alex@lemmy ~ % locale
LANG=de_DE.UTF-8
LANGUAGE=
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=

alex@lemmy ~ % locale -a
C
C.UTF-8
de_DE.utf8
POSIX

As you can see, no English locale, so I added one. On Debian you do it like this. Result below:

% sudo dpkg-reconfigure locales

alex@lemmy ~/opt/eagle-9.6.0 % locale -a
C
C.UTF-8
de_DE.utf8
en_US.utf8
POSIX

This seems like a typical “works on my machine” problem from an US developer, huh? Next try starting eagle, you’ll see the locale problem is gone, the next error appears:

alex@lemmy ~/opt/eagle-9.6.0 % ./eagle 
./eagle: symbol lookup error: /lib/x86_64-linux-gnu/libGLX_mesa.so.0: undefined symbol: xcb_dri3_get_supported_modifiers

That’s the problem if you don’t build from source against the libraries of the system. In that case EAGLE ships a shared object libxcb-dri3.so.0, which itself is linked against the libxcb.so.1 from my host system, and those don’t play well together. I found a solution to that in a forum thread “Can’t run EAGLE on Debian 10 (testing)” and it is using the ld_preload trick:

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libxcb-dri3.so.0 ./eagle

This works. Hallelujah.

Performance Analysis on Embedded Linux with perf and hotspot

This is about profiling your applications on your embedded Linux target or let’s say finding the spots of high CPU usage, which is a common concern in practice. For an extensive overview see Linux Performance by Brendan Gregg. We will focus on viewing flame graphs with a tool called hotspot here, based on performance data recorded with perf. This proved to be helpful enough to solve most of the performance issues I had lately.

Installing the Tools

We have two sides here: the embedded Linux target and your Linux workstation host. For your computer you need to install hotspot. In Debian it is available from version 10 (buster). You can build from source of course, I did that with Debian 9 (stretch) a while ago. IIRC there are instructions for that upstream. Or you build it from the deb-src package from Debian unstable (sid) by following this BuildingTutorial.1

The embedded target part needs basically two parts. You have to set some options in the kernel config and you need the userland tool perf. For ptxdist here’s what I did:

  • Add -fno-omit-frame-pointer to global CFLAGS and CXXFLAGS
  • Enable PTXCONF_KERNEL_TOOL_PERF

Note: I had to update my kernel from v4.9 to v4.14, otherwise I got build errors when building perf.

Configuring the Kernel

I won’t quote the whole kernel config here, but I have a diff on what I had to set to make perf record useful things. These options are probably important, at least I had those on in my debug sessions (others might also be needed):

  • CONFIG_KALLSYMS
  • CONFIG_PERF_EVENTS
  • CONFIG_UPROBES
  • CONFIG_STACKTRACE
  • CONFIG_FTRACE

Using it

For embedded use, I basically followed the instructions of upstream hotspot. You might however want to dive a little into the options of perf, because it is a very powerful tool. What I did to record on the target was basically this to get samples from my daemon application mydaemon for 30 seconds:

perf record --call-graph dwarf --pid=$(pgrep mydaemon) sleep 30

This can produce quite a lot of data, so use it with short times first to not fill your filesystem. Luckily I had enough space on the flash memory of the embedded target available. Then just follow what the hotspot README says: copy the file and your kernel symbols to your host and call hotspot with the right options to your sysroot. This was the call I used (from a subfolder of my ptxdist BSP, where I copied those files to):

hotspot --sysroot ../../../platform-ncl/root --kallsyms kallsyms perf.data

Happy performance analysing!

  1. I did not test that with hotspot []

Troubleshooting nextcloud running on lighttpd

For historic reasons my personal nextcloud runs on lighttpd. (In fact I started in 2014 with ownCloud on Debian 8 (Jessie) and never changed the webserver.) This works, although it is not a first citizen supported platform, but after all nextcloud just needs some PHP/MySQL, so the webserver should not matter, should it?

After upgrading to nextcloud 15 and installing the social app/plugin, I got some warnings about missing rewrites/redirects. So it seems there are some new rewrites for service discovery in town, you can learn about those on the troubleshooting page of the administration manual.

You get config snippets for Apache and nginx (which are by far the most used webservers, I don’t blame nextcloud for not giving advise on every exotic webserver out there). For lighttpd you have to do it on your own and this is what I came up with. Load mod_redirect and mod_rewrite in your main lighttpd.conf and for the nextcloud config add this:

        # for social app
        # TODO  After upgrade to lighttpd > 1.4.50 revisit
        #       https://redmine.lighttpd.net/projects/1/wiki/docs_modrewrite
        url.rewrite-once = (
                # Match when there are no query variables
                "^/.well-known/webfinger$" => "/public.php?service=webfinger",
                # Match query variables and append with &
                "^/.well-known/webfinger\?(.*)$" => "/public.php?service=webfinger&$1"
        )

        url.redirect = (
                "^/.well-known/caldav$" => "/remote.php/dav",
                "^/.well-known/carddav$" => "/remote.php/dav"
        )

If you installed in a subfolder (the subfolder is still called owncloud here, adapt if needed):

        # for social app
        # TODO  After upgrade to lighttpd > 1.4.50 revisit
        #       https://redmine.lighttpd.net/projects/1/wiki/docs_modrewrite
        url.rewrite-once = (
                # Match when there are no query variables
                "^/owncloud/.well-known/webfinger$" => "/owncloud/public.php?service=webfinger",
                # Match query variables and append with &
                "^/owncloud/.well-known/webfinger\?(.*)$" => "/owncloud/public.php?service=webfinger&$1"
        )

        url.redirect = (
                "^/owncloud/.well-known/caldav$" => "/owncloud/remote.php/dav",
                "^/owncloud/.well-known/carddav$" => "/owncloud/remote.php/dav"
        )

For the future I consider changing the webserver, because lighttpd is not recommended by SabreDAV, which is used by nextcloud. But that’s not decided yet, for today this is it.

Wireshark USB capture setup with groups and udev

Wireshark does not only capture network traffic, but also different things like USB traffic. I needed that today and it needs some additional setup on Linux. There’s something in the Wireshark wiki on that topic, but I consider that not an elegant solution: USB capture setup.

The solution I use is basically one proposed on stackoverflow and uses a separate Linux system group and udev: usbmon (wireshark, tshark) for regular user.

On Debian you can do this:

addgroup usbmon
addgroup adahl usbmon

You have to log off and on again, check if you are in that group with the command id.

Now create a new file /etc/udev/rules.d/75-usbmon.rules and put this into it:

SUBSYSTEM=="usbmon", GROUP="usbmon", MODE="640"

After doing modprobe usbmon your devices /dev/usbmon* should belong to the new usbmon group and you can start capturing things with Wireshark.

SSH Remote Capture mit Wireshark

Manchmal will man ja mal auf einem entfernten Gerät oder einem Embedded Board den Netzwerktraffic beobachten. Auf meinem PC würde ich dazu Wireshark nehmen, das läuft so natürlich nicht auf einem Gerät ohne grafische Oberfläche. Üblicherweise kommt dort tcpdump zum Einsatz, dann allerdings mit sehr viel weniger Komfort als man von Wireshark gewohnt ist.

Einen Tipp für eine komfortablere Variante gab es im RFC-Podcast Folge 12: RFCE012: IP Routing III + in eigener Sache.

Man kann nämlich den Traffic auf dem fraglichen Gerät aufzeichnen und live zu Wireshark auf dem PC rüber schubsen, bspw. mit SSH.

Eine übliche Variante, die schon seit langem funktioniert, ist folgende. Man startet auf dem PC in der Konsole dies hier:

ssh root@192.168.10.113 "/usr/sbin/tcpdump -i eth0 -U -w - 'not (host 192.168.10.62 and port 22)'" | wireshark -i - -k

Was bedeutet das? Es wird auf dem PC der SSH-Client aufgerufen und angewiesen sich als Nutzer ‘root’ mit dem Rechner auf der IP 192.168.10.113 zu verbinden. Dort soll er dann das Programm tcpdump mit gewissen Optionen starten. Ich rufe hier tcpdump mit vollem Pfad auf, weil es sein kann, dass es sonst nicht gefunden wird. Den Output davon bekomme ich im PC auf stdout und pipe den dann nach Wireshark, die entsprechenden Optionen bei dessen Aufruf bewirken, dass er die Daten auch verarbeiten kann und damit sofort loslegt.

Bisschen knifflig sind die Optionen von tcpdump, daher im einzelnen:

-i interface

Netzwerk-Schnittstelle, auf der tcpdump lauschen soll

-U
“Packet buffered” Output, d.h. tcpdump sammelt nicht, sondern schickt den Output pro Paket raus
-w file
tcpdump schreibt keinen lesbaren Output auf die Konsole, sondern ein maschinenlesbares Format in eine Datei, in diesem Fall nach stdout
filter expression
Hier will man den Traffic der SSH-Verbindung selbst natürlich ausfiltern, vorsichtig sein mit den Klammern, die müssen entweder escapet werden oder man schließt den Filterausdruck in Hochkommata ein.

Soweit so gut. In neueren Versionen von Wireshark gibt es noch die Variante, das direkt aus dem GUI heraus aufzurufen. Die entsprechenden Optionen sind leider nicht so super dokumentiert. Was für mich funktioniert, sieht man im folgenden Screenshot:

Screenshot Wireshark SSH Remote Capture Setup

Einige leere Optionsfelder sehen so aus wie Argumente für tcpdump, die scheint Wireshark zumindest in der hier gerade eingesetzten Version v2.6.3 nicht zu berücksichtigen, daher auch Remote Interface und Remote Capture Filter mit in der Zeile Remote Capture Command.