Blog posts

2022-03-22 – DNSSEC, ssh and VerifyHostKeyDNS

OpenSSH has this very nice setting, VerifyHostKeyDNS, which, when enabled, pulls SSH host keys from DNS, so you no longer need to either trust on first use or copy host keys around out of band.

Naturally, trusting unsecured DNS is a bit scary, so this requires the record to be signed using DNSSEC. This has worked for a long time, but then broke, seemingly out of the blue. Running ssh -vvv gave output similar to

debug1: found 4 insecure fingerprints in DNS
debug3: verify_host_key_dns: checking SSHFP type 1 fptype 2
debug3: verify_host_key_dns: checking SSHFP type 4 fptype 2
debug1: verify_host_key_dns: matched SSHFP type 4 fptype 2
debug3: verify_host_key_dns: checking SSHFP type 4 fptype 1
debug1: verify_host_key_dns: matched SSHFP type 4 fptype 1
debug3: verify_host_key_dns: checking SSHFP type 1 fptype 1
debug1: matching host key fingerprint found in DNS

even though the zone was signed, the resolver was checking the signature and I even checked that the DNS response had the AD bit set.

The fix was to add options trust-ad to /etc/resolv.conf. Without this, glibc will discard the AD bit from any upstream DNS servers. Note that you should only add this if you actually have a trusted DNS resolver. I run unbound on localhost, so if somebody can do a man-in-the-middle attack on that traffic, I have other problems.
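A quick way to check whether a machine already has the option set is to scan resolv.conf for it. This is a minimal sketch; the function name and the sample contents are made up for illustration, and it assumes that any options line mentioning trust-ad enables it.

```python
def has_trust_ad(resolv_conf_text):
    """Return True if an 'options' line in the given text contains trust-ad."""
    for raw in resolv_conf_text.splitlines():
        # Lines with ';' or '#' in the first column are comments.
        if raw.startswith(("#", ";")):
            continue
        fields = raw.split()
        if fields and fields[0] == "options" and "trust-ad" in fields[1:]:
            return True
    return False


# Hypothetical resolv.conf for a host running a local validating resolver.
SAMPLE = """\
nameserver 127.0.0.1
options edns0 trust-ad
"""

if __name__ == "__main__":
    print(has_trust_ad(SAMPLE))
```

To check the live file, pass it the contents of /etc/resolv.conf instead of SAMPLE.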

2016-04-16 – Blog moved, new tech

I moved my blog around a bit and it appears that static pages are now in favour, so I switched to that, by way of Hugo. CSS and such need more tweaking, but it’ll make do for now.

As part of this, RSS feeds and such changed. If you want to subscribe to this (very seldom updated) blog, use https://err.no/personal/blog/index.xml

2015-04-22 – Temperature monitoring using a Beaglebone Black and 1-wire

I’ve had a half-broken temperature monitoring setup at home for quite some time. It started out with an Atom-based NAS, a USB-serial adapter and a passive 1-wire adapter. It sometimes worked, then stopped working, then started when poked with a stick. Later, the NAS was moved under the stairs and I put a Beaglebone Black in its old place. The temperature monitoring thereafter never really worked, but I didn’t have the time to fix it. Over the last few days, I’ve managed to get it working again, of course by replacing nearly all the existing components.

I’m using the DS18B20 sensors. They’re about USD 1 apiece on Ebay (when buying small quantities) and seem to work quite ok.

My first task was to address the reliability problems: Dropouts and really poor performance. I thought the passive adapter was problematic, in particular with the wire lengths I’m using and I therefore wanted to replace it with something else. The BBB has GPIO support, and various blog posts suggested using that. However, I’m running Debian on my BBB which doesn’t have support for DTB overrides, so I needed to patch the kernel DTB. (Apparently, DTB overrides are landing upstream, but obviously not in time for Jessie.)

I’ve never even looked at Device Tree before, but the structure was reasonably simple and with a sample override from bonebrews it was easy enough to come up with my patch. This uses pin 11 (yes, 11, not 13, read the bonebrews article for explanation on the numbering) on the P8 block. This needs to be compiled into a .dtb. I found the easiest way was just to drop the patched .dts into an unpacked kernel tree and then run make dtbs.

Once this works, you need to compile the w1-gpio kernel module, since Debian hasn’t yet enabled that. Run make menuconfig, find it under “Device drivers”, “1-wire”, “1-wire bus master”, build it as a module. I then had to build a full kernel to get the symversions right, then build the modules. I think there is or should be an easier way to do that, but as I cross-built it on a fast AMD64 machine, I didn’t investigate too much.

Insmod-ing w1-gpio then works, but for me, it failed to detect any sensors. Reading the data sheet, it looked like a pull-up resistor on the data line was needed. I had enabled the internal pull-up, but apparently that wasn’t enough, so I added a 4.7kOhm resistor between pin 3 (VDD_3V3) on P9 and pin 11 (GPIO_45) on P8. With that in place, my sensors showed up in /sys/bus/w1/devices and you can read the values using cat.
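Reading the values programmatically is not much harder than using cat. The sketch below assumes the usual w1_slave layout, where the first line ends in YES when the CRC check passed and the second line ends in t= followed by the temperature in thousandths of a degree Celsius. The function names and the sample reading are made up for illustration.

```python
from pathlib import Path


def parse_w1_slave(text):
    """Return the temperature in degrees Celsius, or None on a bad reading."""
    lines = text.strip().splitlines()
    # The first line ends in YES only if the CRC check succeeded.
    if len(lines) < 2 or not lines[0].endswith("YES"):
        return None
    _, sep, millidegrees = lines[1].rpartition("t=")
    if not sep:
        return None
    return int(millidegrees) / 1000.0


def read_all(base="/sys/bus/w1/devices"):
    """Map sensor IDs to temperatures; DS18B20 IDs start with family code 28."""
    readings = {}
    for path in Path(base).glob("28-*/w1_slave"):
        value = parse_w1_slave(path.read_text())
        if value is not None:
            readings[path.parent.name] = value
    return readings


# Made-up sample in the assumed format.
SAMPLE = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)

if __name__ == "__main__":
    print(parse_w1_slave(SAMPLE))
    print(read_all())
```

Returning None for a failed CRC rather than raising makes it easy to skip the occasional bad reading instead of graphing it.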

In my case, I wanted the data to go into collectd and then to graphite. I first tried using an Exec plugin, but never got it to work properly. Using a python plugin worked much better and my graphite installation is now showing me temperatures.

Now I just need to add more probes around the house.

The most useful references were

In addition, various searches for DS18B20 pinout and similar, of course.

2014-11-16 – Resigning as a Debian systemd maintainer

Apparently, people care when you, as a privileged person (white, male, long-time Debian Developer) throw in the towel because the amount of crap thrown your way just becomes too much. I guess that’s good, both because it gives me a soap box for a short while, but also because if enough people talk about how poisonous the well that Debian is has become, we can fix it.

This morning, I resigned as a member of the systemd maintainer team. I then proceeded to leave the relevant IRC channels and announced this on twitter. The responses I’ve gotten have almost all been heartwarming. People have generally been offering hugs, saying thanks for the work put into systemd in Debian and so on. I’ve greatly appreciated those (and I’ve been getting those before I resigned too, so this isn’t just a response to that). I feel bad about leaving the rest of the team, they’re a great bunch: competent, caring, funny, wonderful people. On the other hand, at some point I had to draw a line and say “no further”.

Debian and its various maintainer teams are a bunch of tribes (with possibly Debian itself being a supertribe). Unlike many other situations, you can be part of multiple tribes. I’m still a member of the DSA tribe for instance. Leaving pkg-systemd means leaving one of my tribes. That hurts. It hurts even more because it feels like a forced exit rather than because I’ve lost interest or been distracted by other shiny things for long enough that I don’t really feel like part of a tribe. That happened to me with debian-installer. It was my baby for a while (with a then quite small team), then a bunch of real life things interfered and other people picked it up and ran with it and made it greater and more fantastic than before. I kinda lost touch, and while it’s still dear to me, I no longer identify as part of the debian-boot tribe.

Now, how did I, standing stout and tall, get forced out of my tribe? I’ve been a DD for almost 14 years, I should be able to weather any storm, shouldn’t I? It turns out that no, the mountain does get worn down by the rain. It’s not a single hurtful comment here and there. There’s a constant drum about this all being some sort of conspiracy and there are sometimes flares where people wish people involved in systemd would be run over by a bus or just accusations of incompetence.

Our code of conduct says, “assume good faith”. If you ever find yourself not doing that, step back, breathe. See if there’s a reasonable explanation for why somebody is saying something or behaving in a way that doesn’t make sense to you. It might be as simple as your native tongue being English and theirs being something else.

If you do genuinely disagree with somebody (something which is entirely fine), try not to escalate, even if the stakes are high. Examples from the last year include talking about this as a war and talking about “increasingly bitter rear-guard battles”. By using and accepting this terminology, we, as a project, poison ourselves. Sam Hartman puts this better than me:

I’m hoping that we can all take a few minutes to gain empathy for those who disagree with us. Then I’m hoping we can use that understanding to reassure them that they are valued and respected and their concerns considered even when we end up strongly disagreeing with them or valuing different things.

I’d be lying if I said I didn’t ever feel the urge to demonise my opponents in discussions. That they’re worse, as people, than I am. However, it is imperative to never give in to this, since doing that will diminish us as humans and make the entire project poorer. Civil disagreements with reasonable discussions lead to better technical outcomes, happier humans and a healthier project.

2013-11-29 – Redirect loop with interaktiv.nsb.no (and how to fix it)

I’m running a local unbound instance on my laptop to get working DNSSEC. It turns out that with the captive portal of NSB (the Norwegian national rail company), this doesn’t work too well and you get into an endless series of redirects. Changing resolv.conf so you use the DHCP-provided resolver stops the redirect loop and you can then log in. Afterwards, you’re free to switch back to using your own local resolver.

2013-10-03 – Fingerprints as lightweight authentication

Dustin Kirkland recently wrote that “Fingerprints are usernames, not passwords”. I don’t really agree, I think fingerprints are fine for lightweight authentication. iOS at least allows you to only require a pass code after a time period has expired, so you don’t have to authenticate to the phone all the time. Replacing no authentication with weak authentication (but only for a fairly short period) will improve security over the current status, even if it’s not perfect.

Having something similar for Linux would also be reasonable, I think. Allow authentication with a fingerprint if I’ve only been gone for lunch (or maybe just for a trip to the loo), but require password or token if I’ve been gone for longer. There’s a balance to be struck between convenience and security.

2013-06-27 – Getting rid of NSCA using Python and Chef

NSCA is a tool used to submit passive check results to nagios. Unfortunately, an incompatibility was recently introduced between wheezy clients and old servers. Since I don’t want to upgrade my server, this caused some problems and I decided to just get rid of NSCA completely.

The server side of NSCA is pretty trivial, it basically just adds a timestamp and a command name to the data sent by the client, then changes tabs into semicolons and stuffs all of that down Nagios' command pipe.

The script I came up with was:

#!/usr/bin/python3

import sys
import time

# format is:
# [TIMESTAMP] COMMAND_NAME;argument1;argument2;…;argumentN
#
# For passive checks, we want PROCESS_SERVICE_CHECK_RESULT with the
# format:
#
# PROCESS_SERVICE_CHECK_RESULT;<host_name>;<service_description>;<return_code>;<plugin_output>
#
# return code is 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
#
# Read lines from stdin with the format:
# $HOSTNAME\t$SERVICE_NAME\t$RETURN_CODE\t$TEXT_OUTPUT

if len(sys.argv) != 2:
    print("Usage: {0} HOSTNAME".format(sys.argv[0]))
    sys.exit(1)
HOSTNAME = sys.argv[1]

timestamp = int(time.time())
with open("/var/lib/nagios3/rw/nagios.cmd", "w") as nagios_cmd:
    for line in sys.stdin:
        # The hostname sent by the client is ignored in favour of HOSTNAME.
        # Strip the trailing newline so we don't write an empty line too.
        (_, service, return_code, text) = line.rstrip("\n").split("\t", 3)
        nagios_cmd.write(
            "[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{hostname};{service};{return_code};{text}\n".format(
                timestamp=timestamp,
                hostname=HOSTNAME,
                service=service,
                return_code=return_code,
                text=text))

The reason for the hostname in the line (even though it’s overridden) is to be compatible with send_nsca’s input format.

Machines submit check results over SSH, using its excellent forced-command capabilities. The Chef template for the authorized_keys file looks like:

<% for host in @nodes %>
command="/usr/local/lib/nagios/nagios-passive-check-result <%= host[:hostname] %>",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa <%= host[:keys][:ssh][:host_rsa_public] %> <%= host[:hostname] %>
<% end %>

The actual chef recipe looks like:

nodes = []
search(:node, "*:*") do |n|
  # Ignore not-yet-configured nodes
  next unless n[:hostname]
  next unless n[:nagios]
  next if n[:nagios].has_key?(:ignore)
  nodes << n
end
nodes.sort! { |a,b| a[:hostname] <=> b[:hostname] }
print nodes

template "/etc/ssh/userkeys/nagios" do
  source "authorized_keys.erb"
  mode 0400
  variables({
              :nodes => nodes
            })
end

cookbook_file "/usr/local/lib/nagios/nagios-passive-check-result" do
  mode 0555
end

user "nagios" do
  action :manage
  shell "/bin/sh"
end

To submit a check, hosts do:

printf '%s\t%s\t%s\t%s\n' "$HOSTNAME" "$SERVICE_NAME" "$RET" "$TEXT" | ssh -i /etc/ssh/ssh_host_rsa_key -o BatchMode=yes -o StrictHostKeyChecking=no -T "nagios@$NAGIOS_SERVER"

2013-06-18 – An otter, please (or, a better notification system)

Recently, there have been discussions on IRC and the debian-devel mailing list about how to notify users, typically from a cron script or a system daemon needing to tell the user their hard drive is about to expire. The current way is generally “send email to root” and for some bits “pop up a notification bubble, hoping the user will see it”. Emailing me means I get far too many notifications. They’re often not actionable (apt-get update failed two days ago) and they’re not aggregated.

I think we need a system that at its core has level and edge triggers and some way of doing flap detection. Level triggers mean “tell me if a disk is full right now”. Edge triggers mean “tell me if the checksums have changed, even if they now look ok”. Flap detection means “tell me if the nightly apt-get update fails more often than once a week”. It would be useful if it could extrapolate some notifications too, so it could tell me “your disk is going to be full in $period unless you add more space”.
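To make the distinction between these concrete, here is a rough sketch of the four checks as plain functions. None of this is from an existing tool: the names, the thresholds and the straight-line extrapolation are all assumptions for illustration.

```python
DAY = 86400


def level_trigger(value, threshold):
    """Level: fire for as long as the value is at or above the threshold."""
    return value >= threshold


def edge_trigger(previous, current):
    """Edge: fire when the state changed, whatever it looks like now."""
    return previous != current


def is_flapping(failure_times, now, window, max_failures):
    """Flap: fire when there are more than max_failures within the window."""
    recent = [t for t in failure_times if 0 <= now - t <= window]
    return len(recent) > max_failures


def seconds_until_full(samples, capacity):
    """Extrapolate linearly from the first and last (timestamp, used) sample.

    Returns None when usage is not growing, so there is nothing to predict.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0 or used1 <= used0:
        return None
    return (capacity - used1) * (t1 - t0) / (used1 - used0)


if __name__ == "__main__":
    # Disk at 95% against a 90% threshold.
    print(level_trigger(95, 90))
    # Checksum changed between two runs.
    print(edge_trigger("abc123", "abc124"))
    # Nightly job failed twice within the last week.
    print(is_flapping([1 * DAY, 5 * DAY], now=6 * DAY, window=7 * DAY, max_failures=1))
    # Usage grew from 50 to 60 units in 100 seconds on a 100 unit disk.
    print(seconds_until_full([(0, 50), (100, 60)], capacity=100))
```

A real system would want something better than a straight line through two points for the extrapolation, but even this crude version turns “disk is full” into “disk will be full in $period”.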

The system needs to be able to take in input in a variety of formats: syslog, unstructured output from cron scripts (including their exit codes), snmp, nagios notifications, sockets and fifos and so on. Based on those inputs and any correlations it can pull out of it, it should try to reason about what’s happening on the system. If the conclusion there is “something is broken”, it should see if it’s something that it can reasonably fix by itself. If so, fix it and record it (so it can be used for notification if appropriate: I want to be told if you restart apache every two minutes). If it can’t fix it, notify the admin.

It should also group similar messages so a single important message doesn’t drown in a million unimportant ones. Ideally, this should be cross-host aggregation. It should be possible to escalate notifications if they’re not handled within some time period.

I’m not aware of such a tool. Maybe one could be rigged together by careful application of logstash, nagios, munin/ganglia/something and sentry. If anybody knows of such a tool, let me know, or if you’re working on one, also please let me know.

2013-03-22 – Sharing an SSH key, securely

Update: This isn’t actually that much better than letting them access the private key, since nothing is stopping the user from running their own SSH agent, which can be run under strace. A better solution is in the works. Thanks to both Timo Juhani Lindfors and Bob Proulx for pointing this out.

At work, we have a shared SSH key between the different people manning the support queue. So far, this has just been a file in a directory where everybody could read it and people would sudo to the support user and then run SSH.

This has bugged me a fair bit, since there was nothing stopping a person from making a copy of the key onto their laptop, except policy.

Thanks to a tip, I got around to fixing this and figured writing up how to do it would be useful.

First, you need a directory readable by root only, I use /var/local/support-ssh here. The other bits you need are a small sudo snippet and a profile.d script.

My sudo snippet looks like:

Defaults!/usr/bin/ssh-add env_keep += "SSH_AUTH_SOCK"
%support ALL=(root)  NOPASSWD: /usr/bin/ssh-add /var/local/support-ssh/id_rsa

Everybody in group support can run ssh-add as root.

The profile.d goes in /etc/profile.d/support.sh and looks like:

if [ -n "$(groups | grep -E "(^| )support( |$)")" ]; then
    export SSH_AUTH_ENV="$HOME/.ssh/agent-env"
    if [ -f "$SSH_AUTH_ENV" ]; then
        . "$SSH_AUTH_ENV"
    fi
    ssh-add -l >/dev/null 2>&1
    if [ $? = 2 ]; then
        mkdir -p "$HOME/.ssh"
        rm -f "$SSH_AUTH_ENV"
        ssh-agent > "$SSH_AUTH_ENV"
        . "$SSH_AUTH_ENV"
    fi
    sudo ssh-add /var/local/support-ssh/id_rsa
fi

The key is unavailable to the user in question because ssh-agent is sgid and so runs with group ssh, which makes the process debuggable only by root. Two things are still missing: there’s no way to have the agent prompt before using a key, and I would like it to die or at least unload keys when the last session for a user is closed, but that doesn’t seem trivial to do.

2013-01-29 – Abusing sbuild for fun and profit

Over the last couple of weeks, I have been working on getting binary packages for Varnish modules built. In the current version, you need to have a built, unpacked source tree to build a module against. This is being fixed in the next version, but until then, I needed to provide this in the build environment somehow.

RPMs were surprisingly easy, since our RPM build setup is much simpler and doesn’t use mock/mach or other chroot-based tools. Just make a source RPM available and unpack + compile that.

Debian packages, on the other hand, were not easy to get going. My first problem was to just get the Varnish source package into the chroot. I ended up making a directory in /var/lib/sbuild/build which is exposed as /build once sbuild runs. The other hard part was getting Varnish itself built. sbuild exposes two hooks that could work: a pre-build hook and a chroot-setup hook. Neither worked: Pre-build is called before the chroot is set up, so we can’t build Varnish. Chroot-setup is run before the build-dependencies are installed and it runs as the user invoking sbuild, so it can’t install packages.

Sparc32 and similar architectures use the linux32 tool to set the personality before building packages. I ended up abusing this, so I set HOME to a temporary directory where I create a .sbuildrc which sets $build_env_cmnd to a script which in turn unpacks the Varnish source, builds it and then chains to dpkg-buildpackage. Of course, the build-dependencies for modules don’t include all the build-dependencies for Varnish itself, so I have to extract those from the Varnish source package too.

No source available at this point, mostly because it’s beyond ugly. I’ll see if I can get it cleaned up.