All posts by The Steel Geek

Structural engineer, automotive geek, and general enthusiast for making ... well, almost anything. Also the proprietor here. :-)

January 2024 ROOter updates

Only a handful of news items on my end right now, thanks to busy day jobs eating into the router fun. Here are the updates on the parts that I’m involved with:

New ROOter build systems

Dairyman has released four new build systems since my September update. All four are “special purpose” systems meant to cover a handful of routers. They are:

  • SourceMaster (link) to cover MT3000, WS1698, X3000, and Z8102. This is a fork off of the OpenWRT master as of roughly 27 Nov 2023.
  • SourceBPIR4 (link) to cover BPI4, the Banana Pi R4. Currently a single image system.
  • SourceRPI5 (link) to cover RASPPI5 and RASPPI5USB, which should be the Raspberry Pi 5 board with either the usual SD card or a USB-mounted root file system.
  • Source1800 (link) to cover AW1000, AX1800, and AXT1800. These are the Arcadyan AW1000 and two GL-iNet variants, all with ipq chipsets.

None of these are currently expected to become major build systems, but they will be used where necessary to support new routers not yet covered by one of our main systems. I’m still hearing that 21.02 is likely to stay our primary system for a while, as 22.03 and 23.05 produce much larger images that won’t fit on a lot of the routers we currently support. Kernel growth in particular has been a big issue for several years now.

Build machine upgrades

The autobuild machine also got an upgrade around New Year’s. With the current number of router images, the main system SSD was staying around 80-90% full. The build machine now has a second SSD of the same size, so we’re set for quite a bit of future growth.

This laptop has been running the autobuild systems for almost two years now. Not bad for a total investment of roughly $250 US – all this is done off of a salvaged Dell E6410 laptop with a 2 core (4 virtual) i7 chip and 8GB of RAM, with two 1TB SSDs. It has cranked out roughly 250 images per week in recent builds, usually in 55h or less. Never underestimate the value of ancient hardware to do real work, as these laptops were released in 2010. With weekly builds only taking roughly 1/3 of the laptop’s possible duty cycle (in hours of the week), I expect this little guy should be up for the task for a lot longer, barring catastrophic failure.

One point of interest is that the CPU load shown on the build system status page here is the same as on a ROOter status page (it’s the number given by the uptime command). In the first part of the build, it’s not unusual to see utilizations hovering in the upper 20s. Since this is a two core machine with hyperthreading, it can handle a load of roughly 1.4 per core depending on the type of load, so for anything over about 2.8, the CPU is 100% utilized. Something to remember when people tell you to tune your builds so that your CPU load stays under your core count – that’s absolute BS. For best builds, you want your load ABOVE your core count. The lower recommendations are to make sure your system stays responsive, but it’s much, much better to handle that using the nice command when you start your build. Despite my CPU being overloaded by an order of magnitude, the machine is still perfectly responsive, and the only way to tell how hard it’s working is by hearing the fan spin up or seeing that load number.
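If you want to try that yourself, starting a build under nice is a one-liner. The make invocation here is just a stand-in, not the actual autobuild entry point:

nice -n 19 make -j6

With -n 19 the build gets the lowest scheduling priority, so interactive use of the machine barely notices it, while the build still soaks up every idle cycle.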

Docker Autobuild Image updates

The base image for the autobuild containers has been updated to Debian 11.8 from 11.7. This is because one of the new build systems needed a pair of new build tools added. This can be done directly within the container for testing, but really, any change in the toolchain is worth rolling into the base image instead. This keeps the resulting images a lot more consistent.
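The update cycle itself is nothing exotic. Here’s a hedged sketch of the kind of commands involved – the image and repository names are made up, not the actual ROOter ones:

# Rebuild the base image on the new Debian point release, then publish it.
docker build -t autobuild-base:11.8 .
docker tag autobuild-base:11.8 somehub/autobuild-base:latest
docker push somehub/autobuild-base:latest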

The new base image has been applied to all 9 build systems (which took half an evening), tested, and run through a week of full builds. All of the build system changes have been pushed to GitHub, and the new images are already up on Docker Hub.

Bonus: there’s one more feature announcement coming (which the folks on the forum today will have already seen), but it warrants its own post.

September 2023 ROOter updates

I’ve had a few chances over the last few weeks to make minor updates to the autobuild systems. Nothing major. This month’s news:

  • The autobuild status system has been reconfigured to update every minute, with an approximately one minute lag.
  • The status has also been added to a new Autobuilds headline page, easily accessible from the top menu. This is probably the quickest way to see the current system status. It doesn’t auto update, though – you’ll have to refresh occasionally.
  • A while back we were able to finish uploading all of the old autobuild images to carp4’s Autobuild archive space. Everything should be there, from just after the first runs, sorted by router name. You may find a few routers that are split into several folders due to name changes over time, but everything seems to be where it belongs. This archive also updates with new images within a day or so of them being built.
  • The old 18.06.7 system was retired about a week ago. For at least a year it had only been building the WR703N, which barely fit an image with the older kernel. New images no longer fit, so this one has finally been wound down.

I have a few more tweaks on my wish list, but everything gets fit between the two real jobs, both of which have been hopping this year. Maybe more time will appear as things go on.

ROOter Autobuilds Archive

This has already been linked from the forum and the autobuilds page, but still great news. We only have space on the server to host the most recent two images for each router. However, user carp4 has donated a significant amount of space on a Google Drive to host all of the older images. Without further ado:

ROOter Autobuild Archive

It will take two to three months to upload all of the older images – there’s almost half a terabyte worth already. They’re all on a local drive now, and will trickle online overnight during off-peak hours. A rough estimate would be the end of June 2023 to have them all online, if all goes well.

My Samsung Galaxy S20 FE UW debloat list

I know I’m not the only one who’s driven nuts by all the extra stuff loaded on most new phones. Honestly, I mostly prefer a “stock” Android experience, but I do like some of Samsung’s hardware. I just don’t like their software. Thankfully, ADB makes it pretty easy to rip most of their software out by the roots, even the things that are blocked from disabling via the user interface.

And that’s good, since this is the second one of these phones I’ve set up in about nine months (thanks to a catastrophe a few days ago, and decent insurance coverage). Unfortunately, I didn’t save a list of what I removed last time, so this time, once I had the phone workable, I exported the list of disabled apps.

There are lots of other good references on how to set up ADB and how to disable individual packages, so the below is simply a copy of my own disable list. This rips out almost everything Samsung and a good bit of the garbage Verizon preloaded. It should be very similar to what I had removed from my last phone, which was very nice to use after the junk-ectomy.
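For reference, the standard per-package removal command looks like this, shown with one package from the list below (-k keeps the app’s data around in case you change your mind, and --user 0 limits the removal to the main user profile):

adb shell pm uninstall -k --user 0 com.samsung.android.bixby.agent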

Without further ado, the list:

com.diotek.sec.lookup.dictionary
com.google.android.apps.tachyon
com.LogiaGroup.LogiaDeck
com.microsoft.appmanager
com.microsoft.skydrive
com.netflix.mediaclient
com.netflix.partner.activation
com.samsung.android.app.camera.sticker.facearavatar.preload
com.samsung.android.app.contacts
com.samsung.android.app.routines
com.samsung.android.app.settings.bixby
com.samsung.android.app.spage
com.samsung.android.app.watchmanagerstub
com.samsung.android.ardrawing
com.samsung.android.aremoji
com.samsung.android.arzone
com.samsung.android.authfw
com.samsung.android.beaconmanager
com.samsung.android.bixby.agent
com.samsung.android.bixby.service
com.samsung.android.bixbyvision.framework
com.samsung.android.bixby.wakeup
com.samsung.android.calendar
com.samsung.android.da.daagent
com.samsung.android.forest
com.samsung.android.game.gamehome
com.samsung.android.game.gametools
com.samsung.android.livestickers
com.samsung.android.mateagent
com.samsung.android.mdx
com.samsung.android.messaging
com.samsung.android.samsungpass
com.samsung.android.samsungpassautofill
com.samsung.android.scloud
com.samsung.android.service.stplatform
com.samsung.android.stickercenter
com.samsung.android.tapack.authfw
com.samsung.android.themestore
com.samsung.android.visionintelligence
com.samsung.desktopsystemui
com.samsung.systemui.bixby2
com.sec.android.app.billing
com.sec.android.app.clockpackage
com.sec.android.app.desktoplauncher
com.sec.android.app.dexonpc
com.sec.android.app.samsungapps
com.sec.android.app.sbrowser
com.sec.android.autodoodle.service
com.sec.android.daemonapp
com.sec.android.desktopmode.uiservice
com.sec.spp.push
com.securityandprivacy.android.verizon.vms
com.vcast.mediamanager

Since I see far fewer explanations of how to conveniently generate this list of everything you’ve manually disabled, here’s the command you need to run (from a Linux shell, outside adb shell, because the sifting relies on command line tools that aren’t available within most phone operating systems):

diff <(adb shell pm list packages) <(adb shell pm list packages -u) > ~/rawlist.txt

That command takes a list of all currently installed packages, then a list of all packages including uninstalled ones, diffs the two so that only the removed packages remain, and saves the result into a file in your home directory (in this case named rawlist.txt).

The results produced by that command are pretty ugly, so you can use the following command to trim out all the gibberish and leave just the clean package list above:

cat rawlist.txt | grep "> package" | sed 's/> package://' | sort > ~/cleanlist.txt

That leaves you a clean list of uninstalled packages from the stock image that you could even use as input for a script to bulk-uninstall everything via ADB later, as sketched below. That’s my real end goal here – if something large and steel falls on my poor phone again, I don’t want to spend another two hours remembering what junk I ripped out, when I could save the list and whip up a script to remove it all in about ten minutes. Wish I’d thought of that the first time.
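Since I haven’t actually written that script yet, here’s a minimal sketch of what it could look like, assuming the cleanlist.txt produced by the commands above:

#!/bin/sh

# Re-remove every package named in cleanlist.txt over ADB.
# Assumes one package name per line and a connected, authorized device.
while read -r pkg; do
        # strip any stray carriage returns left over from adb output
        pkg=$(printf '%s' "$pkg" | tr -d '\r')
        [ -n "$pkg" ] && adb shell pm uninstall -k --user 0 "$pkg"
done < ~/cleanlist.txt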

Filtering logd in OpenWRT

So I’ve been running into an odd problem with log spam in my ring buffer log, and just kludged together a solution when I couldn’t find one elsewhere. I use nft-qos on my main OpenWRT router, and some days it spams my logs like crazy with lines that look like this:

Aug 22 04:15:11 hexenbiest nft-qos-monitor: ACTION=update, MACADDR=a8:86:dd:ac:73:ad, IPADDR=192.168.1.191, HOSTNAME=
Aug 22 04:15:12 hexenbiest nft-qos-dynamic: ACTION=update, MACADDR=a8:86:dd:ac:73:ad, IPADDR=192.168.1.191, HOSTNAME=
Aug 22 04:26:46 hexenbiest nft-qos-monitor: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:26:47 hexenbiest nft-qos-dynamic: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:26:50 hexenbiest nft-qos-monitor: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:26:50 hexenbiest nft-qos-dynamic: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:26:59 hexenbiest nft-qos-monitor: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:26:59 hexenbiest nft-qos-dynamic: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:27:15 hexenbiest nft-qos-monitor: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick
Aug 22 04:27:15 hexenbiest nft-qos-dynamic: ACTION=update, MACADDR=a4:08:01:3c:9e:bb, IPADDR=192.168.1.207, HOSTNAME=FireStick

That’s just a sampling – and it’s not just the one device; more often it will spit out a line like this for every device on the network. Sometimes it does that multiple times per minute. Other times, it’ll go days and never do it once. I don’t have time to track down why nft-qos is logging this, or figure out how to turn it off, but I did have good luck with this kludge to filter it out.

Most people won’t need to filter this unless they’re using their logs for something else. I send my logs to an rsyslog server in my LAN and filter them there, so this isn’t an issue in my long term logs – only in the ring buffer copy that stays on the router itself. So why does it matter? I have a script that runs once a minute to count missed pings from mwan3 and update a conky dashboard on one of my monitoring servers, and it’s much more efficient to have it parse the ring buffer than to ask the rsyslog server across the network for the whole day’s log once a minute!

In hunting around, I’ve seen a dozen or so people asking how to filter logs on OpenWRT, and the answer is universally to either implement rsyslog on the router, or send the logs to an external syslog server. Mine is clearly a corner case, but it’s come up often enough that other people have asked, and I’ve never seen a solution that actually works with logd and logger on the box itself, without an external server.

So I made one. And it turned out to be ridiculously simple. It’s also very easy for you to modify to filter out anything else you need to as well.

The Fix, without further ado

What you need to know: logger on OpenWRT isn’t a real program. Entering “type logger” in your shell will point to /usr/bin/logger, but that’s not a real file, just another link to /bin/busybox.
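You can confirm that yourself from the router’s shell – these two commands will show you the busybox symlink:

type logger
ls -l /usr/bin/logger

When you type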

logger -t testlog "Some test message"

into your terminal on OpenWRT, it just gets redirected into

/bin/busybox logger -t testlog "Some test message"

and busybox does the work. Filtering the log input is as simple as moving the link at /usr/bin/logger out of the way and putting a small ash script in its place.
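Moving the link out of the way is one command (the backup name is just a suggestion):

mv /usr/bin/logger /usr/bin/logger.bak

Then create the following new ash script at /usr/bin/logger: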

#!/bin/ash
# We are going to filter input to logger by intercepting it.

# Parse options if they're there, else use defaults.
STDERR=1        # 1 = -s was not given; 0 = -s was given, pass it through below
TAG=$USER
PRIO=user.notice        # matches busybox logger's default priority

while getopts st:p: flag ; do
        case "${flag}" in
                s) STDERR=0 ;;
                t) TAG=${OPTARG} ;;
                p) PRIO=${OPTARG} ;;
                *) exit 1 ;;
        esac
done

shift $((OPTIND -1))

# now filter the input string and exit early if we pick up a filtered string.

# Message filters (chain more if needed):
FILTERED=`echo "$*" | grep -v 'ACTION=update,'`

# Exit trigger lines (add more if needed):
[ -z "$FILTERED" ] && exit 0

# We have a good filtered string. Send it to the real logger.

if [ $STDERR -eq 1 ] ; then
        # -s was not given on our input
        echo "$FILTERED" | /bin/busybox logger -t "$TAG" -p "$PRIO"
else
        # -s was given, pass it through to the real logger
        echo "$FILTERED" | /bin/busybox logger -s -t "$TAG" -p "$PRIO"
fi

Now, here’s what that script does:

  • Parse any flags on the input using getopts. Busybox logger only accepts the three flags shown, so this is fairly simple. We’re setting defaults for all three, and then if a flag is actually given, replacing the defaults with what was given.
  • Shift the input over by the number of flags and flag parameters, leaving only the log message.
  • Using grep -v (think “grep inverse”) to filter according to the message content. This FILTERED line is either going to return exactly what you give it unchanged, or nothing at all if it matches your filter. It’s easy to add more filters here if needed, more info below.
  • Checking to see if the message has been filtered out (the code [ -z "$FILTERED" ] simply returns success if the string you feed it is zero length) and exiting the script if so, basically discarding this log entry. This line is important to remember later if you want to filter on things like tags or priority, so remember it as an “exit trigger line.”
  • Finally, passing the message to the REAL logger facility, using either the default flag values or the values given in the input. The -s flag is effectively on or off, and in doing this quickly I found it easiest to just handle it with an if statement.
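One step not to skip: make the new script executable, then give it a quick sanity check. With the stock 'ACTION=update,' filter above, the first message below should show up in logread, and the second should silently vanish:

chmod +x /usr/bin/logger
logger -t testlog "Some test message"
logger -t testlog "ACTION=update, this one should be dropped"
logread | tail -n 3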

Modifying the script for your own purposes

Want to know how to modify this to make it your own, but not a bash scripting guru? It’s pretty simple, so I’ll give you the framework for the most common changes.

Changing it to filter based on different text is the easiest – just change what’s in the single quotes on the filter line. If you wanted to filter out messages where the content is “foobar” for example:

FILTERED=`echo "$*" | grep -v 'foobar'`

Changing it to filter multiple different things is also fairly easy. You just need to duplicate the filter line, and change any lines after the first to work with the output of the line before it instead of $*. You can chain together however many you want; at the end, FILTERED will either contain your original input or be a zero length “” string if a filter matched. If we wanted my original filter plus foo and bar:

FILTERED=`echo "$*" | grep -v 'ACTION=update,'`
FILTERED=`echo "$FILTERED" | grep -v 'foo'`
FILTERED=`echo "$FILTERED" | grep -v 'bar'`

You can change it to match anything in the message this way. One note: grep as it’s called here is case sensitive, and all the other usual caveats around grep matching still apply – but overall, it’s easy to tweak these message filters any way you want.

What about filtering based on a whole tag? What if you want to filter out all log entries with the tag “testlog”? Now, instead of modifying the filter string, you need to modify the exit trigger line (if you only want to filter on the tag, not the message) or add additional exit trigger lines if you want to filter on both message and tags.

Example, filtering on one or more messages as well as tag “testlog”:

[ -z "$FILTERED" ] && exit 0
[ "$TAG" = "testlog" ] && exit 0

You can add as many exit triggers as you need – matching any one of them will cause the message to be discarded. What if you also want to filter by message priority, discarding any message with priority lower than (number higher than) 5? This is a functionality that’s been broken in OpenWRT since 2013, but it turns out it’s also trivial to fix by adding an exit trigger line:

[ -z "$FILTERED" ] && exit 0
[ "$TAG" = "testlog" ] && exit 0
[ "$TAG" = "foobar" ] && exit 0
[ "$PRIO" -gt 5 ] && exit 0

That one filters on one or more messages, plus two tags and priority. As you can see, you really can easily tweak this to do whatever you want.

In short, log filtering has been broken ever since the switch to logd, and it turns out it was always trivial to implement – no one had bothered. I’ve kludged it myself, but I don’t have the time to implement a real replacement in the original source either – I wish I did! If you do, please feel free to take this framework and run with it. It would be almost as easy to have this script read filter settings from uci, so that settings in /etc/config/system (like conloglevel) could actually be made functional again. You could trivially include lots of message filter strings in /etc/config/system and have the script loop through them, instead of hard coding them like my examples above. Really, this is a trivial framework for fixing OpenWRT’s long-broken log filtering, and all it needs is someone with the time and willingness to run with it.
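To make that concrete, here’s a hedged sketch of the uci idea – the log_filter option name is made up (nothing reads it today), and because of shell word splitting this simple version only handles single-word filter strings:

# Hypothetical: read message filters from a uci list such as
#   list log_filter 'ACTION=update,'
# under "config system" in /etc/config/system.
FILTERED="$*"
for f in $(uci -q get system.@system[0].log_filter); do
        FILTERED=`echo "$FILTERED" | grep -v "$f"`
done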

But meanwhile, since you’re probably like me and just looking for a quick way to fix this – here you go, tweak as you see fit and enjoy your spam-free ring buffer logs!

Automatic Autossh Tunnel under Systemd

It took me half the afternoon to figure out how to do this with all the bells and whistles (mostly a learning experience on systemd), so I’d better write it down! Meanwhile, it took me at least a dozen reference docs to piece it together, because autossh has pretty brief documentation.

Edit 2021-03-27: One thing in the original version of this article that didn’t work as intended was keeping the systemd unit file in a different location and linking it where it needed to be. That turns out not to work right under reload. Further info below.

I had a remote server running a mail forwarder which is password protected but exposed, yet I was only using it from devices within my LAN. That’s not only not ideal, but also a smidge less reliable, due to various odd properties of my LAN (with regard to WAN failover behaviors), so I wanted to move this traffic into an SSH tunnel. My local tunnel endpoint would be a Debian 10 (Buster) machine that is fully within the LAN, not a perimeter device. I want this connection to come up fully automatically on boot/reboot, and give me simple control as a service (hence systemd).

Remote: an unprivileged, high port PPPPP on server RRR.RRR.RRR.RRR, whose ssh server listens on port xxxxx.

Local: the same unprivileged, high port PPPPP on server LLL.LLL.LLL.LLL

My remote machine was already appropriately set up; all I needed to do was add an unprivileged user account for this connection. In the code below, you’ll see the account name (on both the local and remote servers) is raul, which is also the local server’s name on the network. Substitute your account name of choice wherever you see this. Before beginning the real work, you need this account set up on both machines, with key-based authentication (no passphrase). Log into the remote machine from the local account at least once, to verify the key works, and to store the host key.
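If you haven’t done key setup before, it’s only a few commands from the local raul account (the ed25519 key type is just a sensible default; substitute your real address and port):

ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
ssh-copy-id -p xxxxx raul@RRR.RRR.RRR.RRR
ssh -p xxxxx raul@RRR.RRR.RRR.RRR

That last login is the one that verifies the key and stores the host key.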

Install autossh on your local Debian machine with sudo apt install autossh.

Since everything will be run as your unprivileged user, it’s actually a bit easier to do all your initial editing from that account so that you don’t have to play with permissions later. That’s not how I started, but it would’ve saved me some steps. So, switch to your new account with su raul or equivalent. We’ll be keeping the three files necessary to run this in that account’s home directory at /home/raul/, but one of them is created automatically by your script, so we really only need to write the startup script and the systemd unit.

One thing to note beforehand – the formatting and word wrapping on this site can make it less than obvious what’s an actual newline in code snippets, and what’s just word wrap. Because of this, I’ve linked a copy of the maillink.sh script where you can get to it directly, and just change out your account name, addresses, and ports.

Startup Script

First, we’ll create the startup script, which is the meat of the work. Without further ado, create /home/raul/maillink.sh:

#!/bin/bash

# This script starts an ssh tunnel to matter to locally expose the mail port, PPPPP.                                                                           

logger -t maillink.sh "Begin autossh setup script for mail link."
S_TIME=`awk '{print int($1)}' /proc/uptime`
MAX_TIME="300"

# First, verify connection to outside world is working. This bit is optional.

while true ; do
        M_RESPONSE=`ping -c 5 -A RRR.RRR.RRR.RRR | grep -c "64 bytes from"`
        C_TIME=`awk '{print int($1)}' /proc/uptime`
        E_TIME=`expr $C_TIME - $S_TIME`
        [[ $M_RESPONSE == "5" ]] && break
        if  [ $E_TIME -gt $MAX_TIME ]
        then
                logger -t maillink.sh "Waiting for network, timed out after $E_TIME seconds."                                                                  
                exit 1
        fi
        sleep 10
done

logger -t maillink.sh "Network detected up after $E_TIME seconds."

# Now, start the tunnel in the background.

export AUTOSSH_PIDFILE="/home/raul/maillink.pid"

autossh -f -M 0 raul@RRR.RRR.RRR.RRR -p xxxxx -N \
        -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" \
        -L LLL.LLL.LLL.LLL:PPPPP:127.0.0.1:PPPPP

MAILPID=`cat /home/raul/maillink.pid`

logger -t maillink.sh "Autostart command run, pid is $MAILPID."

As you can see, this script has a few more bells and whistles than just the most basic. While systemd should take care of making sure we wait until the network is up, I want the script to verify that it can reach my actual remote server before it starts the tunnel. It will try every ten seconds for up to five minutes, until it gets 100% of its test pings back in one go.

I also like logs … putting a few log lines into your script sure makes it easier to figure out why things are going wrong.

The “AUTOSSH_PIDFILE” line is necessary for making this setup play pretty with systemd and give us a way to stop the tunnel the nice way (systemctl stop maillink) instead of manually killing it. That environment variable will cause autossh to store its pid once it starts up. Autossh responds to a nice, default kill by neatly shutting down the ssh tunnel and itself, so it makes that control easy. Of course, finding that and figuring out how to do it was … less easy, but it’s simple once you know. That pid file is the second important file in making this work, but this line creates it automatically whenever the tunnel is working.

Now, the meat of this file is the autossh command. Some of these are for autossh itself, and most get passed to ssh. Here’s a breakdown of each part of this command:

  • -f moves the process to background on startup (this is an autossh flag)
  • -M 0 turns off the built in autossh connection checking system – we’re relying on newer monitoring built into OpenSSH instead.
  • raul@RRR.RRR.RRR.RRR -p xxxxx are the username, address, and port number of the remote server, to be passed to ssh (along with all that follows). If your remote server uses the default port of 22, you can leave the port flag out. If your local and remote accounts for this service will be the same, you can also leave out the account name, but I find it clearer left in.
  • -N tells ssh not to execute any remote commands. The manpage for ssh mentions this is useful for just forwarding ports. What nothing tells you is that, in my experience, this autossh tunnel simply won’t work without the -N flag. It failed silently until I added it in.
  • -o “ServerAliveInterval 60” -o “ServerAliveCountMax 3” are the flags for the built in connection monitoring we’ll be using. ServerAliveInterval causes the client to send a keep-alive null packet to the server at the specified interval in seconds, expecting a response. ServerAliveCountMax sets the limit of how many times in a row the response can fail to arrive before the client automatically disconnects. When it does disconnect, it will exit with a non-zero status (i.e. “something went wrong”), and autossh will restart the tunnel – that’s the main function of autossh. If you kill the ssh process intentionally, it returns 0 and autossh assumes that’s intentional, so it will end itself.
  • -L LLL.LLL.LLL.LLL:PPPPP:127.0.0.1:PPPPP is the real meat of the command, as this is the actual port forward. It translates to, “Take port PPPPP of the remote server’s loopback (127.0.0.1) interface, and expose it on port PPPPP of this local client’s interface at address LLL.LLL.LLL.LLL.” That local address is optional, but if you don’t put it in, it will default to exposing that port on the local client’s loopback interface, too. That’s great if you just need to access it from the client computer, but I needed this port exposed to the rest of my LAN.

One handy thing to note is that you can forward multiple ports through this single tunnel – you can just keep repeating that -L flag for however many ports you need, as in the example below. Or, if you’re forwarding for various different services that you might not want all active at the same time, it’s easy to duplicate the startup script and service file, tweak the names and contents, and have a second separate service to control.
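For example, forwarding a hypothetical second port QQQQQ through the same tunnel just means one more -L flag:

autossh -f -M 0 raul@RRR.RRR.RRR.RRR -p xxxxx -N \
        -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" \
        -L LLL.LLL.LLL.LLL:PPPPP:127.0.0.1:PPPPP \
        -L LLL.LLL.LLL.LLL:QQQQQ:127.0.0.1:QQQQQ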

Before you test this the first time, it’s important to make sure it’s executable!

chmod +x /home/raul/maillink.sh

At this point, if you aren’t looking to make this a systemd service, you’re done – you can just start your connection manually using “. /home/raul/maillink.sh” (note the . and space up front) and stop it manually using kill <pid>, where the pid is the number saved in the maillink.pid file. (If you’re planning to do this, it’s easiest to keep these files in your main user’s home directory and modify the script’s pid location accordingly.) Either way, you should manually test the script now to ensure everything works the way you expect. You should see some helpful output in your syslog, and you should also see the port listening on your local machine if you run netstat -tunlp4.

Systemd Unit

However, with just a little more work, making this controllable turns out to be pretty simple. It took way longer to corral the info on how to do it, than it would’ve taken to do if I’d already known how.

Edit 2021-03-27: I originally tried placing this unit file in /home/raul/ and sym linking it into /etc/systemd/system/. That … well, doesn’t work. It works fine the first time, when you run systemctl daemon-reload to pull the unit into the system. The problem is, for whatever reason systemd will not find that file on reboot, even though the link is still there. You’d have to reload manually every time, which just defeats the purpose. Edits have been made accordingly below.

First, create the systemd unit file, which must be located in /etc/systemd/system/maillink.service:

[Unit]
Description=Keeps a tunnel to remote mailserver open
Wants=network-online.target
After=network-online.target

[Service]
Type=forking
RemainAfterExit=yes
User=raul
Group=raul
ExecStart=/home/raul/maillink.sh                                                                                                                               
ExecStop=/bin/sh -c 'kill $(cat /home/raul/maillink.pid)'

[Install]
WantedBy=multi-user.target

This file contains:

  • A description
  • Two lines that ensure it won’t run until after networking is up
  • Two lines that instruct the system to run it under your “raul” account
  • A command to be run to start the service
  • A command to be run to stop the service – this shuts it down clean using the pid file we saved on startup

Next, we need to update systemd to see the new service:

sudo systemctl daemon-reload

Now your systemd unit is loaded. You should be able to verify this by running systemctl status maillink, which should give you a bit of information on your service and its description.

Next, we can test the systemd setup out. First start the service once using sudo systemctl start maillink, and make sure it starts without error messages. Check it as well with systemctl status maillink, and verify the port is there using netstat -tnlp4.

If all went well, that status command should give you some nice output, including the process tree for the service, and an excerpt of relevant parts of the syslog.

Make sure you also verify the stop command, with systemctl stop maillink. This should turn off the port in netstat, and you should also no longer see autossh or ssh when you run ps -e.

If all looks good, you’re good to set this up and leave it! Enable the service to autostart using systemctl enable maillink, and if it’s not started already, start it back up with systemctl start maillink.

And, here’s hoping this was clear and helpful. If you catch any bugs, please let me know!

Quick and Dirty Live View of rsyslog Output

I mentioned in a post yesterday that I was watching the syslog of my router to see when it sent a boot image over tftp. However, OpenWRT does not have a “live updating” syslog view – so how was I doing that, just clicking refresh over and over with my third hand, while holding down a reset button with the other two? No, there’s a really easy way to do this with a stupidly simple bash script.

My routers use remote logging to an internal rsyslog server on my LAN, and you’ll see my script is set up to support that. However, this is very easy to modify to support OpenWRT’s default logging, as well.

Without further ado, here’s the script, which lives in my log folder:

#!/bin/sh

# Live log display

while true; do
        tail -n 52 "$1"
        sleep 5
done

The various consoles I log into this from have anywhere from 40 to 50 line displays set up, hence the “52” – it’s reading and displaying the last 52 lines of the log every five seconds. If you always use shorter consoles, you can easily trim this down.

By passing the name of the log file you want to read, this script has also been made “universal” in that it can be used to monitor any log on your machine. I also use it on a couple of my other servers, with no modifications. If you want to monitor “hexenbiest.log” you simply cd into the appropriate log folder, and run:

./loglive hexenbiest.log

Stock OpenWRT doesn’t write the log to a real file; it uses a ring buffer in memory that may be accessed using the command logread. To modify this script to run on a stock OpenWRT router, place it in the home folder (/root/) instead, and modify it to read as follows:

#!/bin/sh

# Live log display

while true; do
        logread | tail -n 52
        sleep 5
done

This way, instead of reading the last 52 lines of a file every five seconds, it’s reading the last 52 lines of the logread output instead.

You might think it would make sense to clear the terminal before each output, but I didn’t personally find that helpful. In fact, it resulted in a visible flicker every time the log updated – handy if you want to see each time it reads, but not something I found useful myself.

Using dnsmasq under OpenWRT as a TFTP boot server

Update (2022-03-13): this past year, Mikrotik did something that broke their ability to netboot over DHCP (affecting any RouterOS version newer than 6.46.6), which makes flashing these routers using most people’s usual methods (often tftp32, which is distributed with ROOter images, for example) much more difficult. The method in this article is unaffected, and still works fine. Using this method, dnsmasq is actually transparently handling the Mikrotik using BOOTP which is not broken, and is Mikrotik’s default netboot mode. I didn’t even realize this when first writing the article, because at the time it didn’t matter. However, I just flashed a new hEX yesterday using these instructions, and it was on RouterOS 6.47.9.

Lots of routers now offer a nice little web interface you can use to upload firmware. However, there are still a lot of routers that are easiest to flash using netboot methods like tftp. There are plenty of tutorials on doing this, but most focus on using a server installed on your computer. If this is a second router and you already have a working OpenWRT main router, it’s often actually much easier to just use your main router to TFTP boot, which is something dnsmasq (the default DHCP and DNS server) can do out of the box.

In my case, I already have a primary router with external USB storage up and running. This brief tutorial gives the bare bones steps on what you need to do to use this to flash a second router that supports netboot. I’ll be flashing a Mikrotik hEX RB750Gr3 in this example, since I had one I needed to do anyway. If you don’t already have some external storage set up on your main router, take care of that first – the existing tutorials for that are pretty good, so I won’t duplicate that here.

First, boot up your new router at least once and get its MAC address. For some reason things will go more smoothly if you assign it a static IP when it first boots up as a DHCP client.

Configure /etc/config/dhcp (which controls dnsmasq) on your main router. First, turn on the tftp server, and point it to your USB storage:

config dnsmasq
        ...
        option enable_tftp '1'
        option tftp_root '/mnt/stor/tftp'

Make sure that second line you added points to the correct folder on your USB storage.

Add a static IP for the box you’ll flash:

config host
        option mac 'B8:27:EB:2B:08:85'
        option name 'somehost'
        option ip '192.168.1.240'

Change that MAC to match your new router, and give its WAN whatever name and address you can remember. You won’t actually need them once it boots up, and you can delete this section once your new router is flashed.

Now, drop the files in the appropriate folder. For TFTP booted routers, you usually need two firmware images: one the router can netboot from over TFTP (which usually has “factory” in the name), and the real copy that gets written to flash memory (usually has “upgrade”). This is a two step process – the netbooted image will not actually be saved to the router, which makes this a great way to test an OpenWRT build before you flash. You then use the netbooted “factory” image to flash the router with the permanent “upgrade” image. If you don’t do that second step, when you reboot the router, it’ll boot straight back into its original OS and settings from its own flash.

Now, the critical part – take that netboot image in your folder (mine is “openwrt-RB750gr3-GO2021-02-03-factory.bin” for the OpenWRT ROOter fork), and rename it “vmlinux”.
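On the router, that’s just (using my image name and the tftp_root folder from earlier):

cd /mnt/stor/tftp
mv openwrt-RB750gr3-GO2021-02-03-factory.bin vmlinux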

Some router manufacturers also need to find your TFTP server at a specific address. Mikrotik apparently expects 192.168.1.10. If your LAN is already at 192.168.1.0 and the .10 address is free, it is trivial to add .10 as a second address on your main router (this will not affect anything else of significance). From your main router’s command line, simply run:

ip addr add 192.168.1.10 dev eth0.5

Change the bit after “dev” to match whichever interface or port connects to your LAN. In my case, my LAN is on a VLAN port of my router, hence eth0.5.

Now, it’s time to netboot. Shut down your new router if it isn’t already, and plug its WAN port into your existing network.

For the Mikrotik hEX, to trigger a netboot, you plug in the power jack with the reset button already held down. The button does several things depending on how long you hold it: the LED comes on steady, then starts flashing, then goes off completely. Once it’s off completely, you can release the button, as the router will then be looking for a netboot. If you’re watching the log on your main router, it’ll post a log line when it sends the boot image to a DHCP client.

Give it time to boot up, and then try connecting from a client plugged into the LAN side of the new router. One advantage of doing it this way is that you don’t tie up your main computer as both the boot tftp server and the tool you need to log into the new router with. If your OpenWRT image netbooted successfully, you should find your new router at 192.168.1.1 from your test computer.

Now, for the last important part – flash the permanent image! You need to go to System -> Backup / Flash Firmware on the new router and flash that upgrade image, or what you’ve just done won’t stay permanent.

Dell Latitude E6410 with GPU overheating – Solved!

This one took stupidly long to sort out.

My work, which shall remain unnamed, had a pile of these Dell Latitude E6410s for years, most of which worked adequately if never particularly well. They were quirky. They were slowly retired in favor of better equipment, and a handful were kept around as “emergency spares” until they got so out of date that they were finally kicked off the domain. They became off-network utility machines until my IT folks couldn’t even keep Windows working on them anymore. The last one finally got officially “disposed” and handed to me to see if I could make anything useful for the office out of it, because I seem to be able to keep stuff alive.

Here’s the problem it had – you could get it to run for a few minutes, and then it’d just overheat and switch off. I switched it over to Debian because it’s a little lighter on resources (and we needed a spare Linux box anyway), and that did improve things … slightly, for a year or so.

If you search “overheating E6410” on Google, you’ll see a pile of reports, with almost no solutions. I did eventually conclude that the thermal pads on its heat sink had died, and pulled it apart to replace the pads with decent thermal paste. This got us almost another year of usable performance out of it – the CPU performed well, but the GPU would still overheat if it did anything hard.

Finally, a year ago, it got back to just overheating the GPU after two minutes, and I stuffed it in a drawer until I had time to screw around with it.

I had a use for it this afternoon, and an hour to spare to look it back over, so I went ahead and pulled the heat sink back off to get a look. It’s easy to get to on these – one screw and the bottom cover slides off, two screws to remove the fan, and four to pull the rest of the heat sink. It’s one combined unit for the CPU and GPU:

[Photo: Dell Latitude E6410 heat sink]

I still had decent thermal paste that hadn’t hardened, and the radiator on the right hand end wasn’t clogged up. I could hear the fan working. But I finally spotted the problem – the GPU wasn’t making very good contact with this oddly shaped heat sink module! The CPU would purr along at 47C, and the GPU would shoot up to 95C and trigger a shutoff within minutes.

Since this machine was already “technically trash” and had one foot in the recycle bin, I said heck with it. The GPU is under the little, studded bit of the aluminum casting, right under where the copper heat bar reverses curvature. I pulled the assembly back out, took it in both hands, and just bent the heat sink bar. I bent it down in the middle as shown, so that with the radiator in its case slot on the right, and the CPU screws mounted, it might have a shot in heck of actually having the heat sink properly contact the GPU.

Turns out that’s all it needed. Now it’s sitting here with the GPU running at 47C as well, and it’s useful again. Not bad for a machine I was about two minutes away from drop kicking toward the recycle bin.

So, if you’ve been wading through the dozens of search results on overheating E6410s, and you’re at your wits’ end – pull the heat sink off and bend it to get better contact. It’s quick, it’s easy, and you might well save your sanity, too.

Update 2021-03-17: This little machine has now been running for eight days straight, without a single GPU excursion over about 60C that I’ve noticed. This was just bad contact between the GPU and heat sink all along. Heck of a note … but it bodes well for getting a few more years use out of the thing!

ROOter GoldenOrb Hosting

We’re helping provide overflow hosting space for the wonderful team that keeps this OpenWRT fork going! However, during this morning’s transition, I hear a few people are having cache problems that have redirected them here to the blog front page, instead of the upload and build folders.

If that’s you, here are your direct links to the new folder locations:

http://www.aturnofthenut.com/upload/

http://www.aturnofthenut.com/builds/

Hopefully the redirect issues will clear up quickly. However, if you ever land on my front page accidentally, there will always be an entry at the top of the page with direct links.

Thanks for your patience!