blog dot Prolixium dot com – strings /dev/urandom|festival

Useless Linux Kernel Error Messages

I run lots of Linux-based software routers on my home network to route IPv4 and IPv6. Periodically, they freak out with some IPv6-related errors that seem to indicate a problem but there is no corresponding forwarding impact. Here are two of them:

(trill:18:56:EDT)% dmesg|tail
[40878917.324479] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878920.039800] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878920.875706] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878921.920218] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878924.426656] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878925.471213] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878926.515702] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878929.022420] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878930.066760] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.
[40878931.111439] ICMPv6: Received fragmented ndisc packet. Carefully consider disabling suppress_frag_ndisc.

I started getting this message in bursts (every few seconds, recurring every few hours) for a week or so on a router that had 400+ days of uptime. The sysctl documentation describes suppress_frag_ndisc as follows:

suppress_frag_ndisc - INTEGER
	Control RFC 6980 (Security Implications of IPv6 Fragmentation
	with IPv6 Neighbor Discovery) behavior:
	1 - (default) discard fragmented neighbor discovery packets
	0 - allow fragmented neighbor discovery packets

I really shouldn’t have any fragmented ND packets on my network. Everything is 1500 MTU on this specific router, but since it’s the first hop for my general purpose Wi-Fi network, maybe there is a misbehaving device? Well, I would love to debug further but the printk does not include the MAC address or link-local source. So, my options are to tcpdump all ND until it happens again, maybe?
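One way to avoid capturing all ND traffic might be to match only IPv6 packets that carry a Fragment header and print their link-layer source. This is a minimal sketch, not something I have run against the offending traffic. It assumes the Wi-Fi segment is bridged on br0, that tcpdump is installed, and that the Fragment header is the first extension header (a packet with other extension headers ahead of it would not match). The function name is made up for illustration.

```shell
#!/bin/sh
# Assumption: br0 is the interface facing the Wi-Fi segment; override with IFACE=...
IFACE="${IFACE:-br0}"

# Byte 6 of the IPv6 header is Next Header; 44 means "Fragment header".
FRAG_FILTER='ip6[6] = 44'

capture_fragments() {
    # -e prints the Ethernet header (source/destination MAC), -n skips name lookups.
    # Needs root (or CAP_NET_RAW) and runs until interrupted.
    tcpdump -e -n -i "$IFACE" "$FRAG_FILTER"
}

echo "capture command: tcpdump -e -n -i $IFACE '$FRAG_FILTER'"
```

Calling capture_fragments as root and leaving it running until the next burst of log messages should show the source MAC of whatever is sending fragments.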

And now there’s this one:

(starfire:18:53:EDT)% dmesg|tail
[26897673.366452] neighbour: ndisc_cache: neighbor table overflow!
[26897673.366461] neighbour: ndisc_cache: neighbor table overflow!
[26897673.366475] neighbour: ndisc_cache: neighbor table overflow!
[26897673.366828] neighbour: ndisc_cache: neighbor table overflow!
[26897673.366839] neighbour: ndisc_cache: neighbor table overflow!
[26897673.366850] neighbour: ndisc_cache: neighbor table overflow!
[26897674.390436] neighbour: ndisc_cache: neighbor table overflow!
[26897674.390448] neighbour: ndisc_cache: neighbor table overflow!
[26897674.390460] neighbour: ndisc_cache: neighbor table overflow!
[26897674.390831] neighbour: ndisc_cache: neighbor table overflow!

I am getting this on a /core/ router that is the first-hop for a few segments and is transit as well. I’m pretty sure nothing is overflowing:

(starfire:19:02:EDT)% ip -6 nei|wc -l
43
(starfire:19:03:EDT)% ip -4 nei|wc -l
41

This also happens after a few hundred days of uptime. I would love to debug it, but the log message doesn’t tell me what the limit is or what my current usage is. It also doesn’t tell me which entry was added last and (presumably?) dropped. So, I can’t really debug this at all. There seems to be no forwarding impact, so I guess I’ll ignore it. Searching the web indicates that this either hard locks the CPU (not for me) or is a bug.
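Since the log line omits both numbers, they have to be pulled by hand. The sketch below compares the IPv6 neighbor count against gc_thresh3, which the kernel documentation describes as the maximum number of non-permanent neighbor entries. It assumes a Linux host with iproute2. The helper name usage_pct is made up, and whether gc_thresh3 is the limit actually being hit here is an assumption.

```shell
#!/bin/sh
# Sketch: compare IPv6 neighbor-cache usage against the kernel's hard limit.
THRESH_DIR=/proc/sys/net/ipv6/neigh/default

usage_pct() {
    # $1 = entries in use, $2 = limit; prints an integer percentage (rounded down)
    echo $(( $1 * 100 / $2 ))
}

if [ -r "$THRESH_DIR/gc_thresh3" ] && command -v ip >/dev/null 2>&1; then
    limit=$(cat "$THRESH_DIR/gc_thresh3")
    # "nud all" also lists entries in states that the default view hides.
    used=$(ip -6 neigh show nud all 2>/dev/null | wc -l)
    if [ "$limit" -gt 0 ]; then
        echo "ndisc_cache: $used of $limit entries ($(usage_pct "$used" "$limit")% of gc_thresh3)"
    fi
fi
```

If the percentage is nowhere near 100 while the message is still being logged, that would at least support the "it’s a bug" theory.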

It would be really nice if printk messages would provide a little more information here.

ip monitor.. broke?

In both $dayjob and my personal networks, I use iproute2 extensively as it’s the main tool to interface with Linux’s Netlink socket API from the command line. In a pinch, to see live route & IP neighbor (ARP) or IPv6 neighbor (NDP) changes, I instinctively run ip mon and sit back expecting to see a bunch of messages scroll on the screen.

Prior to the latest update, it’d result in something like this for a lightly busy first hop router with constant ARP and NDP updates:

(trill:20:47:EDT)% ip -ts mon
[2025-04-27T20:47:30.873841] 10.3.6.224 dev br0 lladdr c2:b7:a7:94:d8:19 PROBE
[2025-04-27T20:47:30.876208] 10.3.6.224 dev br0 lladdr c2:b7:a7:94:d8:19 REACHABLE
[2025-04-27T20:47:31.113926] 10.3.6.108 dev br0 lladdr 70:61:be:36:58:ed REACHABLE
[2025-04-27T20:47:31.129783] 10.3.6.116 dev br0 lladdr 80:6a:10:18:05:cd PROBE
[2025-04-27T20:47:31.129850] 2620:6:2003:106:2ecf:67ff:fe19:16b4 dev br0 lladdr 2c:cf:67:19:16:b4 PROBE
[2025-04-27T20:47:31.130962] 2620:6:2003:106:2ecf:67ff:fe19:16b4 dev br0 lladdr 2c:cf:67:19:16:b4 REACHABLE
[2025-04-27T20:47:31.135276] 10.3.6.116 dev br0 lladdr 80:6a:10:18:05:cd REACHABLE
[2025-04-27T20:47:31.229838] 10.3.6.115 dev br0 FAILED
[2025-04-27T20:47:31.545864] 10.3.6.138 dev br0 FAILED
[2025-04-27T20:47:31.870086] 2620:6:2003:106:ba27:ebff:fe7c:788b dev br0 FAILED
[2025-04-27T20:47:31.870442] 2620:6:2003:106:eee:99ff:fe22:4b21 dev br0 FAILED
[2025-04-27T20:47:31.870839] 2620:6:2003:106:d272:dcff:febf:261c dev br0 FAILED
[2025-04-27T20:47:31.871257] 2620:6:2003:106:dea9:4ff:fe8b:dd95 dev br0 FAILED
[2025-04-27T20:47:31.871685] 2620:6:2003:106:662:73ff:fe66:9b4 dev br0 FAILED
[2025-04-27T20:47:31.901816] 2620:6:2003:106:52a6:d8ff:feb6:17f7 dev br0 lladdr 50:a6:d8:b6:17:f7 STALE
[2025-04-27T20:47:31.902021] 10.3.6.150 dev br0 lladdr 96:be:a5:e3:4e:c4 STALE
[2025-04-27T20:47:31.902171] 10.3.6.145 dev br0 lladdr c8:ff:77:63:bf:dd STALE
[2025-04-27T20:47:31.902342] 10.3.6.136 dev br0 lladdr 14:91:38:79:c9:c8 STALE
[2025-04-27T20:47:32.121889] 10.3.6.149 dev br0 FAILED
[2025-04-27T20:47:32.157843] 10.3.6.4 dev br0 lladdr b8:27:eb:d8:5f:69 PROBE
[2025-04-27T20:47:32.158468] 10.3.6.4 dev br0 lladdr b8:27:eb:d8:5f:69 REACHABLE
[2025-04-27T20:47:32.925806] 10.3.6.153 dev br0 lladdr 00:24:b1:0b:73:20 PROBE
[2025-04-27T20:47:32.927484] 10.3.6.153 dev br0 lladdr 00:24:b1:0b:73:20 REACHABLE
[2025-04-27T20:47:33.597925] 10.3.6.119 dev br0 FAILED
[2025-04-27T20:47:33.694007] 10.3.6.104 dev br0 lladdr 08:84:9d:d2:58:7c PROBE
[2025-04-27T20:47:33.694199] fe80::a84:9dff:fed2:587c dev br0 lladdr 08:84:9d:d2:58:7c router PROBE
[2025-04-27T20:47:33.696668] 10.3.6.104 dev br0 lladdr 08:84:9d:d2:58:7c REACHABLE
[2025-04-27T20:47:33.697061] fe80::a84:9dff:fed2:587c dev br0 lladdr 08:84:9d:d2:58:7c router REACHABLE
[2025-04-27T20:47:33.949761] fe80::14df:53d6:630:4d33 dev br0 lladdr c2:b7:a7:94:d8:19 STALE
[2025-04-27T20:47:33.949915] 2620:6:2003:106:96ee:f793:c700:7632 dev br0 lladdr 24:11:53:ce:04:2b STALE
[2025-04-27T20:47:33.950042] 10.3.6.113 dev br0 lladdr c8:ff:77:b6:a9:d9 PROBE
[2025-04-27T20:47:33.950142] 10.3.7.197 dev lxcbr0 lladdr 00:50:56:1a:ad:cf PROBE proto zebra
[2025-04-27T20:47:33.950222] 10.3.7.197 dev lxcbr0 lladdr 00:50:56:1a:ad:cf REACHABLE proto zebra
[2025-04-27T20:47:33.950445] 10.3.6.113 dev br0 lladdr c8:ff:77:b6:a9:d9 REACHABLE
[2025-04-27T20:47:34.397842] 2620:6:2003:106:6aff:77ff:feb6:a9d9 dev br0 FAILED
[2025-04-27T20:47:34.397982] 2620:6:2003:106:daa9:4ff:fe8b:dd95 dev br0 FAILED
[2025-04-27T20:47:34.398075] 2620:6:2003:106:f281:73ff:feeb:946e dev br0 FAILED
[2025-04-27T20:47:34.398156] 2620:6:2003:106:f218:98ff:feec:e96a dev br0 FAILED

And this for a router with a view of the IPv6 DFZ:

(daedalus:20:50:EDT)% ip -ts -6 mon
[2025-04-27T20:50:13.332562] 2402:e580:745a::/48 nhid 655447 proto bgp metric 20 pref medium
	nexthop via fe80::a03:449 dev sit1 weight 1 
	nexthop via fe80::a03:444 dev sit2 weight 1 
[2025-04-27T20:50:13.565449] 2a03:eec0:3212::/48 nhid 125 via fe80::a03:444 dev sit2 proto bgp metric 20 pref medium
[2025-04-27T20:50:14.052933] 2a12:a580::/29 nhid 627585 proto bgp metric 20 pref medium
	nexthop via fe80::a03:449 dev sit1 weight 1 
[2025-04-27T20:50:14.052991] 2a06:de01:861::/48 nhid 80 via fe80::a03:449 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T20:50:14.053030] 2605:9cc0:c05::/48 nhid 80 via fe80::a03:449 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T20:50:14.053162] 2a06:de05:63d6::/48 nhid 80 via fe80::a03:449 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T20:50:15.155490] fe80::2a7:42ff:fe45:bc73 dev eth0 lladdr 00:a7:42:45:bc:73 router PROBE 
[2025-04-27T20:50:15.156345] fe80::2a7:42ff:fe45:bc73 dev eth0 lladdr 00:a7:42:45:bc:73 router REACHABLE 
[2025-04-27T20:50:15.683859] 2a06:de05:61c4::/48 nhid 81 via fe80::a03:7ee dev sit4 proto bgp metric 20 pref medium
[2025-04-27T20:50:15.861581] 2a03:eec0:3212::/48 nhid 125 via fe80::a03:444 dev sit2 proto bgp metric 20 pref medium
[2025-04-27T20:50:16.890712] Deleted 2605:9cc0:c05::/48 nhid 80 via fe80::a03:449 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T20:50:16.890771] Deleted 2a06:de01:861::/48 nhid 80 via fe80::a03:449 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T20:50:17.004830] 2a03:eec0:3212::/48 nhid 627631 proto bgp metric 20 pref medium
	nexthop via fe80::a03:7ee dev sit4 weight 1 
[2025-04-27T20:50:17.129794] 2a06:de05:61c4::/48 nhid 387 proto bgp metric 20 pref medium
	nexthop via fe80::a03:449 dev sit1 weight 1 
	nexthop via fe80::a03:7ee dev sit4 weight 1 
[2025-04-27T20:50:17.193829] 2a03:eec0:3212::/48 nhid 81 via fe80::a03:7ee dev sit4 proto bgp metric 20 pref medium
[2025-04-27T20:50:17.247879] 2a03:eec0:3212::/48 nhid 125 via fe80::a03:444 dev sit2 proto bgp metric 20 pref medium
[2025-04-27T20:50:17.489729] 2a12:a580::/29 nhid 88 via fe80::a03:443 dev sit5 proto bgp metric 20 pref medium
[2025-04-27T20:50:17.681168] 2a03:eec0:3212::/48 nhid 125 via fe80::a03:444 dev sit2 proto bgp metric 20 pref medium
[2025-04-27T20:50:17.961734] Deleted 2a06:de05:63d6::/48 nhid 80 via fe80::a03:449 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T20:50:18.357769] 2a03:eec0:3212::/48 nhid 125 via fe80::a03:444 dev sit2 proto bgp metric 20 pref medium
[2025-04-27T20:50:18.357974] 2a0c:b641:302::/47 nhid 441 proto bgp metric 20 pref medium
	nexthop via fe80::a03:449 dev sit1 weight 1 
	nexthop via fe80::a03:444 dev sit2 weight 1 
[2025-04-27T20:50:18.585064] 2a0c:b641:302::/47 nhid 441 proto bgp metric 20 pref medium
	nexthop via fe80::a03:449 dev sit1 weight 1 
	nexthop via fe80::a03:444 dev sit2 weight 1 
[2025-04-27T20:50:18.656761] 2a0c:b641:302::/47 nhid 441 proto bgp metric 20 pref medium
	nexthop via fe80::a03:449 dev sit1 weight 1 
	nexthop via fe80::a03:444 dev sit2 weight 1

So, after the latest Debian package update of iproute2 (6.14.0-3), I was surprised to find that ip mon appeared to not work anymore:

(concorde:19:49:CDT)% ip mon             
Failed to add ipv4 mcaddr group to list
(concorde:19:51:CDT)% ip -4 mon
Failed to add ipv4 mcaddr group to list
(concorde:19:51:CDT)% ip -6 mon
Failed to add ipv6 mcaddr group to list
(concorde:19:51:CDT)% ip mon all
Failed to add ipv4 mcaddr group to list
(concorde:19:51:CDT)% ???

It turns out that one now has to specify the type (nei, route, etc.) due to a patch that enabled multicast functionality for iproute2’s monitor command. I couldn’t get the multicast functionality working but I can replicate the old behavior with the additional arguments:

(concorde:19:57:CDT)% ip -ts -6 mon r
[2025-04-27T19:57:22.103476] 2a0f:7803:faf7::/48 nhid 135 via fe80::ae88:6002 dev sit1 proto bgp metric 20 pref medium
[2025-04-27T19:57:22.156228] 2001:661:4000::/35 nhid 36 via fe80::fc00:1ff:fe95:fbf8 dev eth0 proto bgp metric 20 pref medium
[2025-04-27T19:57:22.157505] 2a0f:9400:6110::/48 nhid 36 via fe80::fc00:1ff:fe95:fbf8 dev eth0 proto bgp metric 20 pref medium
[2025-04-27T19:57:22.274238] 2a06:de00:de03::/48 nhid 36 via fe80::fc00:1ff:fe95:fbf8 dev eth0 proto bgp metric 20 pref medium
[2025-04-27T19:57:22.385085] 2a0f:7803:faf7::/48 nhid 36 via fe80::fc00:1ff:fe95:fbf8 dev eth0 proto bgp metric 20 pref medium
[2025-04-27T19:57:22.578524] 2804:40:4000::/34 nhid 36 via fe80::fc00:1ff:fe95:fbf8 dev eth0 proto bgp metric 20 pref medium
[2025-04-27T19:57:23.095034] 2a06:de05:6151::/48 nhid 138 via fe80::a03:449 dev sit4 proto bgp metric 20 pref medium
[2025-04-27T19:57:23.095687] 2001:67c:20fc::/48 nhid 138 via fe80::a03:449 dev sit4 proto bgp metric 20 pref medium
[2025-04-27T19:57:23.095991] 2400:fc00:87e0::/44 nhid 138 via fe80::a03:449 dev sit4 proto bgp metric 20 pref medium
[2025-04-27T19:57:23.096256] 2a06:de01:97e::/48 nhid 138 via fe80::a03:449 dev sit4 proto bgp metric 20 pref medium

The fact that even ip mon all doesn’t work with this patch and still spits out the mcaddr error seems like a bug, to me, and I might submit a bug report if nobody has done so already.

Or, has everyone been always specifying route or nei and I’ve been the only idiot expecting results when omitting those objects?
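For what it’s worth, ip monitor accepts a list of objects, so something close to the old catch-all view can be had by naming the ones of interest. Here is a small sketch of a shell wrapper. The function name ipmon is made up, and whether this avoids the mcaddr error on every kernel and iproute2 combination is an assumption based on the behavior shown above.

```shell
# Hypothetical helper: watch route and neighbor events together.
# Options such as -4 or -6 are passed through ahead of the monitor subcommand.
ipmon() {
    ip -ts "$@" monitor route neigh
}
```

With this in a shell profile, ipmon -6 should behave much like ip -ts -6 mon used to, minus the object types that aren’t listed.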

Verizon Fios.. Latency Tiers?

My parents and I are both lucky to be in a Verizon Fios service area and get FTTH Internet, which is generally far superior to DOCSIS or xDSL when it comes to speed but also, more importantly, latency and jitter. My parents have had the service for about two decades and I’ve had mine since 2021, when I moved to VA and I started with the Gigabit (940/880 Mbps, actually) speed tier. My parents, however, have stuck with the 75/75 Mbps since they really haven’t felt the need for more bandwidth.

I was visiting them this weekend and along with installing a new Mac mini M4 for my mom as a late Christmas present, also helped them install their new Fios TV+ to replace their old (and I guess soon unsupported?) set top boxes. The Fios TV+ service makes use of Google Stream TV hardware & software on each TV, which all connect to a main Video Media Server through a combination of MoCA and Wi-Fi. The VMS receives ATSC through coax from the ONT. Anyway, this generally requires a Verizon-branded router, even if all it ends up doing is bridging MoCA and Wi-Fi. As I do in my home, I put the router behind my own router since I use home-grown Debian-based routers running various VPNs and iptables (see PCN for more details). Regardless, I replaced their ancient Actiontec MoCA router with the new G3100. The installation was pretty easy and now my parents have two Wi-Fi networks in their home: the main one and one dedicated for the Stream TV boxes.

When we activated the first Stream TV box and the VMS, their Internet service was automatically upgraded to the 940/880 Mbps service without a price difference (no ONT reboot required). I am not completely clear on what happened on the billing side of things but my dad seemed happy.

Various standard speed tests confirmed the new bandwidth tier was working as expected. I then looked at SmokePing just for kicks, not expecting to see anything different but got a surprise:

All SmokePing targets saw the latency and jitter drop.

This was unexpected for two reasons:

The link speed on my router was 1000Mbps/full-duplex before and after the upgrade. Although, even if it was 100Mbps/full-duplex before, that would not explain the drastic latency decrease of ~9 ms to ~5 ms since a 56-byte ICMP packet incurs 4.48 μs serialization delay at 100 Mbps and 0.448 μs at 1000 Mbps. That’s not anywhere near the millisecond range.

From what I have seen, bandwidth tiers on residential Internet services are implemented as policers (token bucket) or shapers (leaky bucket), neither of which kicks in until the traffic hits the limit, which was 75 Mbps symmetric before and 940/880 Mbps after. There certainly wasn’t much of anything even close to 75 Mbps between 1000 and 1200 local time (graph is cut off at 10 Mbps to highlight this):

So, link speed was always 1000Mbps and we weren’t hitting the limiters prior to the bandwidth upgrade. Why did the latency change?
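The serialization-delay arithmetic in the first point is easy to check. This is a quick sketch that, like the numbers above, counts only the 56-byte payload and ignores ICMP, IP, and Ethernet overhead. The function name is made up for illustration.

```shell
#!/bin/sh
# Serialization delay in microseconds: bits on the wire divided by the link rate.
# With the rate in Mbps (10^6 bits/s), bits / Mbps comes out directly in microseconds.
serialization_delay_us() {
    # $1 = packet size in bytes, $2 = link speed in Mbps
    awk -v bytes="$1" -v mbps="$2" 'BEGIN { printf "%.3f\n", bytes * 8 / mbps }'
}

serialization_delay_us 56 100    # prints 4.480
serialization_delay_us 56 1000   # prints 0.448
```

Even adding generous header overhead keeps both results three orders of magnitude below the ~4 ms change seen in SmokePing.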

None of the hardware between my SmokePing node and the Fios network changed during this upgrade and the software (and uptime of the operating systems) didn’t change. My only explanation is that Verizon may be using something other than normal policers and shapers for their bandwidth tiers, which impacts unloaded latency numbers.

If this is all true and there’s no other explanation for the latency change, this indirectly creates latency as well as bandwidth tiers. Latency is especially important for realtime applications such as A/V and gaming as well as the time it takes for TCP to increase its sliding window. The last point can impact page load times when multiple short-lived TCP connections are used to pull various page components.

Am I the only one who’s seen this?

2024 Review and Stuff

I forgot to do a 2024 review. It’ll be short and sweet and mostly visual but here it goes.

I flew a lot for work and pleasure:

We’re down to 1 dog, now (rip Henry Kamichoff on Christmas Day):

I doubled down on Intel for my last two “main computer” upgrades:

(destiny:18:49:EST)% ssh vega lscpu|grep Model.name
Model name:                           Intel(R) Core(TM) i9-14900
(destiny:18:49:EST)% lscpu|grep Model.name         
Model name:                           Intel(R) Core(TM) i9-14900KS

Next time it’ll be AMD-something-3D, I suppose. I’d really like it to be ARM, though. Maybe RISC-V?

I went to a few EDM shows. LSR/CITY v3 in Austin, TX:

Ultra Music Festival in Miami, FL:

Eric Prydz in Washington D.C.:

I already have Gareth Emery’s CYBERPUNK lined up for 2025!

Did I mention travel?

Seattle, WA, USA (6x times)
Cupertino, CA, USA (2x times)
Dublin, Ireland (2x times)
Sarasota, FL, USA (2x times)
North Brunswick, NJ, USA (2x times)
Sedona, AZ, USA
Miami, FL, USA
New York, NY, USA
Austin, TX, USA
Chicago, IL, USA
Boston, MA, USA
United Kingdom (England and Scotland)

We stayed home for the holidays, though. 2025 may be different.

That’s really about it, though. If I try to add more to this entry I’ll just put off publishing it.

A Tiny 2023 Review

I’m not all that interested in writing long year in review (YiR) articles anymore, so I’ll just create some bullets and a bunch of photos. The bullets come first.

I had three different managers at my place of employment.
I got on a plane to travel out of my home area a whopping 17 times, 7 for work and 10 of a personal nature. Notable destinations were Israel (1H of the year), Ireland, Las Vegas, Chicago, and of course Seattle & the Bay Area.
I switched my primary vehicle to a non-Tesla EV.
I was promoted at my place of employment.
I attended concerts by Gareth Emery and P!nk.
My wife and I did not travel for Thanksgiving, Christmas, or New Years.
I returned to the office. My commute time ranges from 40 to over 90 minutes depending on my mode of transportation and various other timings.
I started drinking espresso shots.
I upgraded my main workstation to an Intel Core i9-13900KF CPU with 128GiB of RAM and a Nvidia GeForce RTX 3060 GPU.
I had an irrigation system installed for our end unit townhome. This installation resulted in 1 week of no running water in our home due to a valve that is part of the fire suppression system, which I blame squarely on the builder.
I gained 4.9 kg.
I saw three movies.

Here are some images:

I visited Washington D.C. a couple of times since it’s nearby.

The Sphere had just been turned on when I was in Las Vegas

DC Metro after the P!nk concert at Nationals Park

“Cask Mates” at the 1608 Bar in Quebec City, Canada

Intel Core i9-13900KF with Noctua cooler and Nvidia RTX 3060 GPU

The Mediterranean Sea from Carmel Beach in Haifa, Israel

That’s about it, I suppose. In actuality I got tired of processing photos.

I also started listening to melodic techno instead of a steady stream of trance and house music. I also have a whopping 4x EDM shows scheduled through March 2024 already!

FreeBSD 14.0

FreeBSD isn’t dead. Nope.

I run this webserver (dax.prolixium.com, a VirtualBox VM running on a bare metal server that I lease from Vultr in Parsippany, NJ) on FreeBSD as well as another test VirtualBox VM at home, trance.prolixium.com. Although it’s a less-than-scientific development environment, I usually try FreeBSD upgrades on trance before upgrading dax.

I did the trance upgrade from 13.2 over the weekend during some evening down time I had while in Sarasota, FL (visiting a family member). I did a source upgrade of the base system with a binary package upgrade. I used to do a 100% source upgrade, which included FreeBSD ports. However, over time the /build/ dependencies spiraled out of control (think multiple versions of llvm..) so I switched to binary packages.

The first quirk I ran into was that the first installworld failed with an ld-elf.so.1: /usr/bin/make: Undefined symbol "__libc_start1@FBSD_1.7" error. I ran installworld a second time and it worked. I’m guessing there’s some race condition or dependency error (I am using make -j1). Some folks on Lily indicated this is par for the course, although I haven’t encountered it before and I’ve been running and upgrading FreeBSD since 4.8. The other quirk is that openssh-portable seems to be gone from the available binary packages. As a result, pkg upgrade -f didn’t reinstall it to link against the new libs, and I ran delete-old-libs. After the reboot, sshd was not running due to a libcrypto mismatch. I had to VNC in and switch to the OpenSSH from base. The only reason I used openssh-portable originally was that the OpenSSH version in base always lagged behind considerably, but this doesn’t seem to be the case anymore since it’s 9.5. I’m still curious why openssh-portable was removed from the available binary packages; a quick web search didn’t turn up anything interesting. I still see it in ports so I guess I could still build it from source if I wanted.

The trance VM runs WireGuard and FRR since I’ve had problems with that combination in the past (also OpenVPN and Quagga). I figured it’s a good soak and /fairly/ representative of what I run on dax, which I suppose I’ll upgrade in the next couple of weeks.

MPV, PulseAudio, and Laggy Video

I switched over to MPV from MPlayer a year or two ago and haven’t looked back. However, after I upgraded my Intel Core i9-9900K with an RTX 2060 to an i9-13900KF with an RTX 3060 earlier this year, I noticed videos played with MPV were very laggy, to the point where the FPS would be much lower and it’d have to drop frames to keep up with the audio. VLC and browser-based video were unaffected. It was just MPV.

I did all sorts of debugging, which included changing video drivers (--vo), decoders (--vd), and even playing with CPU scaling on the system (to the point where I even tried writing 0 to /dev/cpu_dma_latency) but nothing helped. The odd part is that some video files weren’t affected by this, specifically videos from my iPhone (H.265 in MOV container). I mostly switched to VLC, as a result, and moved on with life.

Today I came across an article where someone was troubleshooting a similar laggy video issue with MPV and even though the problem described in the article was not related to mine, one of the troubleshooting steps helped me solve mine.

I run PulseAudio because some things like to use it but if I have the choice, I tell applications to run directly off the ALSA API. MPV uses PulseAudio by default over ALSA. It turns out that there is a slight difference in the audio controller on the motherboard on my i9-13900KF system (MSI MPG Z790 Carbon Wi-Fi) that causes outputting to PulseAudio to result in the horrible latency and lag. The fix was simple: --ao=alsa (or ao=alsa in mpv.conf).

Yes, I still feel like a jive turkey after all of these years:

Possible Problems with udev on Raspberry Pi

I have a dozen or so Raspberry Pis on my network that are used as routers, environmental sensors, looking glasses, and lab devices. They vary in model, but I perform periodic upgrades to keep them running, which some have been doing for over 10 years now. I’ve recently realized I should be using SD cards with higher write counts on the RPis that run SmokePing so those are getting upgraded with priority. Many of these RPis are remote and headless so performing upgrades can be tricky. I’ve bricked one or two of them over the years and the fix was usually simple (OpenVPN / WireGuard issues or software bugs) but I’ve come across one recently that I can’t make heads or tails of!

I visited my parents in NJ last weekend and did a replace & clone of the SD cards in one of their RPis (a 3 B unit) that acts as a router on a stick (two VLAN-tagged interfaces running off eth0). The clone worked fine and I decided to do some upgrades before heading back to VA. I upgraded *udev*, *dbus*, *systemd*, and *raspberry*, which usually pulls in all the packages that typically require or recommend a reboot, which I figured I should do when I’m local. I track the equivalent Debian testing release on Raspbian / Raspberry Pi OS, which is currently bookworm. Unfortunately, this appeared to brick the box upon reboot. However, what was weird is that networking came up about 2 minutes after the reboot, which is very slow, but that’s it. SSH never started and neither did FRR (Free Range Routing, which starts OSPFv2, OSPFv3, and BGP) or the DHCP relay so while I could ping the interfaces the box failed to perform its most basic function. It seems I had a problem.

I connected the RPi to a TV and USB keyboard to debug further and found that some (actually all, I’d find out later) block devices were not being registered correctly with udev, causing systemd to drop to emergency mode:

At first I thought that fsck was just taking a while and needed more time so I added x-systemd.device-timeout=300s to /etc/fstab for both of the device nodes, /dev/mmcblk0p1 (boot) and /dev/mmcblk0p2 (root). Both are specified as raw device nodes and not UUIDs in my /etc/fstab file, which seems alright for now but probably something I should change in the future in case device nodes start moving around or getting renamed. I had to make this change by connecting the SD card to another Linux box in the house and mounting the filesystem.

This didn’t do anything. I changed the timeout a few more times and eventually gave up on this approach.

This is where things get annoying. I also tried to hit Control-D as specified but that just resulted in it sitting there for 90 seconds and returning to the emergency mode with the same prompt. I also tried to enter my root password (yes, it’s set!) but it wouldn’t take the password. I verified the password hash on another working RPi and could su to root just fine with the password.

So, I had no way of getting this RPi booted, using hacks or otherwise. The Raspberry Pi bootloader isn’t like GRUB and doesn’t have an interactive menu where one can play with the kernel command line options on the fly. It literally has no menu or interface and just boots one of the /boot/kernel*.img files, depending on the platform. If the boot fails, the RPi will just hang and display a color wheel on the screen. I thought about editing /boot/cmdline.txt and adding init=/bin/sh to the command line but I figured that would just bypass the problem and not allow me to actually figure out what was going on.

Short on time, I re-cloned the SD card from the original (unaltered) image I had saved on another machine (yes, I thought ahead!) and got the router back up & running without doing any upgrades.

After I returned home I re-created the setup with a spare RPi 3 B+ (similar hardware, although I found that a 1 B exhibited the problem too) and the original SD card image. Instead of upgrading all the reboot-required packages at once I did a few at a time. I started with *udev* and this triggered the issue. Here’s the exact upgrade path:

The following additional packages will be installed:
  libblkid1
The following packages will be upgraded:
  libblkid1 libudev1 udev
3 upgraded, 0 newly installed, 0 to remove and 491 not upgraded.
Need to get 1,789 kB of archives.
After this operation, 1,041 kB of additional disk space will be used.
Do you want to continue? [Y/n] 
Get:1 http://mirror.umd.edu/raspbian/raspbian testing/main armhf libblkid1 armhf 2.38.1-5 [131 kB]
Get:2 http://raspbian.raspberrypi.org/raspbian testing/main armhf udev armhf 252.6-1+rpi1 [1,559 kB]
Get:3 http://raspbian.raspberrypi.org/raspbian testing/main armhf libudev1 armhf 252.6-1+rpi1 [99.1 kB]
Fetched 1,789 kB in 2s (1,137 kB/s)
(Reading database ... 59747 files and directories currently installed.)
Preparing to unpack .../libblkid1_2.38.1-5_armhf.deb ...
Unpacking libblkid1:armhf (2.38.1-5) over (2.36-3) ...
Setting up libblkid1:armhf (2.38.1-5) ...
(Reading database ... 59748 files and directories currently installed.)
Preparing to unpack .../udev_252.6-1+rpi1_armhf.deb ...
Unpacking udev (252.6-1+rpi1) over (247.2-5+rpi1) ...
Preparing to unpack .../libudev1_252.6-1+rpi1_armhf.deb ...
Unpacking libudev1:armhf (252.6-1+rpi1) over (247.2-5+rpi1) ...
Setting up libudev1:armhf (252.6-1+rpi1) ...
Setting up udev (252.6-1+rpi1) ...
Processing triggers for libc-bin (2.36-8+rpi1) ...
Processing triggers for man-db (2.9.3-2) ...
Processing triggers for initramfs-tools (0.139) ...

I was able to view system.journal with journalctl --file on another machine to look at the logs and found some interesting stuff. Mainly, it looks like udev is running into errors with a rule specifying ID_SEAT for every single device node on the machine:

Apr 21 11:45:20 mercuryold systemd-udevd[159]: Using default interface naming scheme 'v252'.
Apr 21 11:45:20 mercuryold systemd[1]: Started systemd-udevd.service - Rule-based Manager for Device Events and Files.
Apr 21 11:45:20 mercuryold (udev-worker)[160]: vcs2: /usr/lib/udev/rules.d/73-seat-late.rules:13 Failed to import properties 'ID_SEAT' from parent: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[160]: vcs2: Failed to process device, ignoring: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[161]: vcsu2: /usr/lib/udev/rules.d/73-seat-late.rules:13 Failed to import properties 'ID_SEAT' from parent: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[161]: vcsu2: Failed to process device, ignoring: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[162]: vcsa2: /usr/lib/udev/rules.d/73-seat-late.rules:13 Failed to import properties 'ID_SEAT' from parent: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[162]: vcsa2: Failed to process device, ignoring: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[161]: vcsa3: /usr/lib/udev/rules.d/73-seat-late.rules:13 Failed to import properties 'ID_SEAT' from parent: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[161]: vcsa3: Failed to process device, ignoring: Operation not permitted
[...]

mmcblk0 and friends are included here:

Apr 21 11:45:20 mercuryold (udev-worker)[165]: mmcblk0: /usr/lib/udev/rules.d/73-seat-late.rules:13 Failed to import properties 'ID_SEAT' from parent: Operation not permitted
Apr 21 11:45:20 mercuryold (udev-worker)[165]: mmcblk0: Failed to process device, ignoring: Operation not permitted

And then, of course, this is what makes systemd actually unhappy:

Apr 21 11:48:46 mercuryold systemd[1]: dev-mmcblk0p1.device: Job dev-mmcblk0p1.device/start timed out.
Apr 21 11:48:46 mercuryold systemd[1]: Timed out waiting for device dev-mmcblk0p1.device - /dev/mmcblk0p1.
Apr 21 11:48:46 mercuryold systemd[1]: Dependency failed for boot.mount - /boot.
Apr 21 11:48:46 mercuryold systemd[1]: Dependency failed for local-fs.target - Local File Systems.

So, something was busted in udev. I took a peek at 73-seat-late.rules and it’s pretty stock and the same on all of my systems:

#  SPDX-License-Identifier: LGPL-2.1-or-later
#
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

ACTION=="remove", GOTO="seat_late_end"

ENV{ID_SEAT}=="", ENV{ID_AUTOSEAT}=="1", ENV{ID_FOR_SEAT}!="", ENV{ID_SEAT}="seat-$env{ID_FOR_SEAT}"
ENV{ID_SEAT}=="", IMPORT{parent}="ID_SEAT"

ENV{ID_SEAT}!="", TAG+="$env{ID_SEAT}"
TAG=="uaccess", ENV{MAJOR}!="", RUN{builtin}+="uaccess"

LABEL="seat_late_end"

Line 13 is:

ENV{ID_SEAT}=="", IMPORT{parent}="ID_SEAT"

I did some searching and didn’t find any hits on any search engine that reference this type of error or failure. What’s worse is that I checked the udev versions on some of my other RPis and they’re the same as the supposedly broken version that I upgraded to, 252.6-1+rpi1. That version is also the most current version in Debian (minus experimental).

I then tried to see if I could get a shell on the machine somehow to troubleshoot it in the bad state. I tried to get sshd to start earlier in the boot process, but messing with the dependency directives in ssh.service and sshd.service a few times didn’t change anything, so I gave up on it for now.

(destiny:19:23:EDT)% head ./system/sshd.service
[Unit]
Description=OpenBSD Secure Shell server
Documentation=man:sshd(8) man:sshd_config(5)
#After=network.target auditd.service
After=systemd-remount-fs.service
ConditionPathExists=!/etc/ssh/sshd_not_to_be_run

[Service]
EnvironmentFile=-/etc/default/ssh
ExecStartPre=/usr/sbin/sshd -t

(the above didn’t do squat but I’m probably doing it wrong or I need to manually run whatever systemctl edit usually does after one changes a unit file)

I’m at a stopping point for this now. I suppose the next thing I might try is re-cloning the SD card and doing a full upgrade, instead of just udev or just the reboot-required packages, to see if a missed dependency somewhere is behind the breakage. That’s a long shot, especially since any udev dependency would be either kernel- or systemd-related, and I upgraded those and it didn’t help. I also need to figure out why entering my root password doesn’t get me into the system; that’s actually more troubling than the other issue.

For those who are reading this post, have you seen this before or have any suggestions to try?

Update on 2023-05-01

I ended up figuring this out last week, mostly. I still don’t know why just upgrading the udev package bricked the RPi but I do know that the raspberrypi-kernel package will fail to install due to insufficient disk space but not return an error code to dpkg, so if someone isn’t watching the terminal closely, this could be missed.

This was part of my problem. /boot was only 60 MiB on this machine (it was an older image, my newer RPi installs have a 253 MiB /boot) and the kernel install failed but the modules in /lib/modules succeeded. The existing kernel image was untouched but its modules were no longer present. So, upon reboot, lots of things didn’t work at all since there were no loadable modules.
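
Since dpkg won’t flag this, a couple of cheap checks around the upgrade would catch it. Here’s a rough sketch, not something taken from the original troubleshooting: free_mib, enough_space, and modules_present are made-up helper names, and the directory arguments are only there so the checks can be pointed somewhere other than the live system.

```shell
# free_mib DIR: print the space available on DIR's filesystem, in whole MiB.
free_mib() {
    # -P forces one output line per filesystem; column 4 is "Available" in KiB.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# enough_space DIR MIN_MIB: succeed only if DIR has at least MIN_MIB MiB free.
enough_space() {
    avail=$(free_mib "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# modules_present [KVER] [MODULES_DIR]: succeed if a module tree exists for
# kernel version KVER (default: the running kernel).
modules_present() {
    [ -d "${2:-/lib/modules}/${1:-$(uname -r)}" ]
}
```

Before the upgrade, something like enough_space /boot 128 || echo "not enough room in /boot" would do (the 128 is a guess; all that’s known is that the install needs much more than 60 MiB). After the upgrade and before rebooting, modules_present || echo "no modules for $(uname -r)" would have caught the state described above, where the old kernel image was still in place but its modules were gone.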

To avoid the nastiness of resizing /boot, I ended up just unmounting /boot, mounting it at /realboot, rsync’ing contents over to /boot (which is actually temporarily part of /), doing the kernel upgrade, then rsync’ing /boot back to /realboot, and unmounting & remounting things again. This only works because during the install of raspberrypi-kernel, you need much more than 60 MiB in /boot, but when the upgrade is finished, 60 MiB is sufficient (barely):

Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p1   60M   56M  4.8M  93% /boot
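
For reference, the shuffle described above looks roughly like this as a script. It’s a sketch reconstructed from the description rather than a saved transcript: the run wrapper and the DRY_RUN variable are made up for illustration, the rsync flags assume /boot is FAT (so there is no ownership or permission data to preserve), and the apt-get line is an assumption about how the kernel upgrade gets invoked.

```shell
# Device and temporary mount point as named in the post; override if needed.
BOOT_DEV=${BOOT_DEV:-/dev/mmcblk0p1}
REALBOOT=${REALBOOT:-/realboot}

# run CMD...: print the command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# boot_shuffle: do the kernel upgrade with /boot temporarily living on /.
# The && chain stops at the first step that fails.
boot_shuffle() {
    run umount /boot &&
    run mkdir -p "$REALBOOT" &&
    run mount "$BOOT_DEV" "$REALBOOT" &&
    # Copy the real /boot into the now-unmounted /boot directory on /.
    run rsync -rt "$REALBOOT"/ /boot/ &&
    # Assumption: the upgrade is a plain apt-get install of the package.
    run apt-get install raspberrypi-kernel &&
    # Copy the result back; --delete drops files the upgrade removed.
    run rsync -rt --delete /boot/ "$REALBOOT"/ &&
    run umount "$REALBOOT" &&
    # Relies on the existing /etc/fstab entry for /boot.
    run mount /boot
}
```

Running DRY_RUN=1 first and then calling boot_shuffle prints the eight steps without touching anything. The copies left in the root filesystem’s /boot directory end up hidden under the mount afterwards and would need to be cleaned up separately if the space matters.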

The fix here is to not upgrade udev by itself (I only did this as a test) but always upgrade that and the kernel at the same time. So, if I had done the /boot trick originally, I wouldn’t have had any issues.

Good Night, JNCIE

About a week ago I decided to throw in the towel on my Juniper Networks certifications, namely the only one I have left, the JNCIE-SP.

I passed the JNCIE-M lab in Herndon, VA, USA on 2010-09-24 (age: 29) and it was converted to the JNCIE-SP on 2012-05-19. I spent countless hours building and manipulating a Junos (when did they change it from JUNOS?) virtual lab with Olives and tons of logical routers (nowadays called LSYS) and more VLAN tags than a single fxp0 control plane interface was ever meant to carry to learn the ins and outs of things like MPLS TE, multicast, and BGP. I did most of the lab work and studying at the Panera Bread in Ballantyne, which was a short walk from the condominium I owned in Charlotte, NC between 2005 and 2014. No, I didn’t gain any weight during this period because it was typically after work and also after a trip to the Mecklenburg Aquatic Center, where I swam countless laps.

My lab setup started with a bare metal eMachines Celeron-based PC running a version (8.4?) of Junos Olive, which consisted of the Junos control plane and the FreeBSD kernel as a data plane(-ish). It was a horrible hack and the only way to add more nodes was to create logical-routers and connect them together with VLAN tags on the only interfaces that existed on the image, which were from the fxp driver in FreeBSD. I also made use of some MX240s that were set aside for “testing” in the lab at my job at the time (Time Warner Cable) to gain exposure to some data plane things that the Olive didn’t support like CoS.

I passed the JNCIP-M lab in Sunnyvale, CA, USA on 2009-09-16 and found it fairly easy (I was done in under half the time allocated). The JNCIE-M was quite a bit more complex (IS-IS, MPLS-TE, SONET, and other stuff) and if I remember correctly I took most of the time allocated and then YOLO’ed the submit button. Here’s a photo of the office on the day of the exam:

Juniper Networks office in Herndon, VA, USA

I ended up passing it on the first try. This was in stark contrast to my experience with the JNCIE-SEC, which I failed twice and decided to not pursue further.

The Junos lab has changed a bit over the years but is still technically running! It’s now powered by vMX and some SRX hardware running in both packet and flow mode and is documented here (yes, it’s up to date as of this writing!). As for the non-Junos systems, VMs changed to LXCs, and Dynamips changed to IOSv, NX-OSv, and IOS XRv. No, I never used GNS3 or anything like that. The lab is fully built from my own scripts that call KVM, VirtualBox, and Linux’s networking and container things like bridges and LXCs.

For a while I carried both the JNCIE-SP and JNCIP-SEC. I ended up giving up on the JNCIP-SEC in 2019 and that certification expired on 2019-07-16. The last time I really touched firewalls in a production environment was in 2013 so that certification was fairly useless. I kept recertifying the JNCIE-SP until 2023. My scores kept falling since I didn’t really work with any Juniper Networks gear past 2014 or 2015. Although I initially scheduled the JNCIP-SP for May of 2023, I ended up cancelling it because I figured dragging it on and on wasn’t worth the expense (my current employer would still continue to pay the $400/exam) or the studying.

I suppose it’s the end of an era. Although, I still have lots of bare-metal Junos-based boxes at home that I find myself tweaking periodically. I suppose it’s 4x EX2200-Cs, 3x SRX210s, and 1x SRX300. I’m not counting the NetScreen-5 that I still keep online, which runs an ancient version of ScreenOS (NetScreen was acquired by Juniper Networks in 2004).

Nowadays my day job consists of some amount of traditional networking, project management, and business development. The days of being a command-line warrior are mostly behind me.

I will still be a fully-certified JNCIE until 2023-08-17, when I will be designated a JNCIE Emeritus, which practically means nothing. Effectively, this date will be when the sun sets on my JNCIE certification (AKA good night).

Core i9-13900KF Upgrade

I finally decided to upgrade my Linux workstation this past week. I went for the following specifications:

Intel Core i9-13900KF CPU
MSI GeForce RTX 3060 GPU
MSI MPG Z790 Carbon Motherboard
128GiB Corsair Dominator Platinum DDR5 5200MHz RAM
Noctua NH-D15S CPU Cooler
Samsung NVMe 980 PRO 2TB SSD

And, I reused the following from my previous build (originally a Core i9-9900K):

Corsair Carbide 200R Case
Corsair RM850x Power Supply
Crucial SATA MX500 4TB SSD
2x Western Digital WDC WD40EZRZ-22GXCB0 4TB HDD via USB
Pioneer SATA XL BD-RW
LG SATA BD-DR
Hauppauge PCI-e 4x DVB/HDTV Tuner
USB 3.0 PCI-e Adapter

I didn’t go for water cooling and also used the NA-RC7 “low noise” adapter to make the CPU fan spin slower and therefore not make as much noise. I wasn’t going to overclock so I figured this would be fine since the NH-D15S was a beast of a heatsink. I don’t game at all but wanted a mid-range GPU in case I decided to do anything more interesting than Google Earth and I picked the i9-13900KF because it has the best single-thread performance (the criteria for my last build, too):

I still don’t know why the KF is faster than the K variant. K means unlocked multiplier and F means without integrated graphics. Supposedly since there is no heat & power consumed by the GPU on the KF series, it can overclock more than the K variant. I’m not sure if it fully explains it, though. The i9-12900K slightly edges out the i9-12900KF farther down the list but that is well within the margin of error of PassMark’s testing, I’d think.

The machine doesn’t actually sit on my desk so I don’t care about any kind of flashy RGB stuff but it seemed to be impossible to find premium RAM that didn’t have some LEDs on it (in addition to the motherboard). Here’s the Corsair RAM being “blingy”:

The build went fine, overall. I think I could have done a better job applying thermal paste but meh. The MSI BIOS quickly indicated that all my stuff was working as expected. I did notice that the RAM frequency was 4000 MHz but the RAM itself was spec’ed at 5200 MHz. I found out later that the 5200 MHz is a [sanctioned] OC specification (Intel XMP) so I’m fine with 4000 MHz as long as things are stable.

I did end up changing to legacy boot because I didn’t see any reason to change from grub-pc to grub-efi (I have no use for secure boot). The MSI BIOS flipped some other options when I did that:

Flipped on legacy boot mode and some other options came with it

I initially booted my Debian install from the original NVMe SSD connected by a USB converter, which surprisingly went very well (albeit a bit slow). I then used a Knoppix live ~~DVD~~USB to clone the first few MiB of the disk (for GRUB) and then recreated all filesystems for the 2TB SSD and rsync’ed content over (and.. forgetting the -p in rsync, so I had to flip the setuid bit on ping and mtr!).
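
For what it’s worth, rsync’s -a implies -p (along with -r, -l, -t, -g, -o, and -D), so using it avoids the permissions slip. If the copy has already been made without it, the setuid bits that went missing can be found by comparing the two trees. A minimal sketch, assuming both the old and new root filesystems are mounted somewhere; missing_setuid is a made-up name and the mount points in the usage note are hypothetical:

```shell
# missing_setuid OLD_ROOT NEW_ROOT: list files that are setuid under OLD_ROOT
# but exist without the setuid bit under NEW_ROOT.
missing_setuid() {
    new=${2%/}
    # -xdev keeps find on one filesystem; -perm -4000 matches the setuid bit.
    (cd "$1" && find . -xdev -type f -perm -4000) | while IFS= read -r rel; do
        f="$new/${rel#./}"
        # -u is true when the setuid bit is set on the file.
        if [ -f "$f" ] && [ ! -u "$f" ]; then
            echo "$f"
        fi
    done
}
```

Something like missing_setuid /mnt/old /mnt/new prints one path per line; each one can then be reviewed and fixed with chmod u+s.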

The Noctua cooler works fairly well although things get pretty toasty if I load up 32 processes of burnP6 and let it sit for a few minutes:

There are a few interesting things above that I noticed after the fact. First, the way the 16x E-cores vs. 8x P-cores are enumerated in Linux is interesting. The P-cores are listed first and are core IDs 0,4,8,12,16,20,24,28. The E-cores are 32 through 47. I don’t know why the P-cores skip 4x IDs but the sysfs enumeration is even weirder because it breaks out threads, which are only supported in the P-cores.

(destiny:20:57:EST)% for i in $(seq 0 31); do echo -n "${i}: "; echo -n "Core ID #"; cat /sys/devices/system/cpu/cpu${i}/topology/core_id; done         
0: Core ID #0 // P-core
1: Core ID #0 // P-core
2: Core ID #4 // P-core
3: Core ID #4 // P-core
4: Core ID #8 // P-core
5: Core ID #8 // P-core
6: Core ID #12 // P-core
7: Core ID #12 // P-core
8: Core ID #16 // P-core
9: Core ID #16 // P-core
10: Core ID #20 // P-core
11: Core ID #20 // P-core
12: Core ID #24 // P-core
13: Core ID #24 // P-core
14: Core ID #28 // P-core
15: Core ID #28 // P-core
16: Core ID #32 // E-core
17: Core ID #33 // E-core
18: Core ID #34 // E-core
19: Core ID #35 // E-core
20: Core ID #36 // E-core
21: Core ID #37 // E-core
22: Core ID #38 // E-core
23: Core ID #39 // E-core
24: Core ID #40 // E-core
25: Core ID #41 // E-core
26: Core ID #42 // E-core
27: Core ID #43 // E-core
28: Core ID #44 // E-core
29: Core ID #45 // E-core
30: Core ID #46 // E-core
31: Core ID #47 // E-core
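
The P-core/E-core comments in the listing above can also be derived instead of added by hand. A minimal sketch, assuming a single socket with Hyper-Threading enabled, so that any core_id shared by two logical CPUs belongs to a P-core (with Hyper-Threading off, everything would be labeled an E-core). label_cores is a made-up name, and the optional directory argument exists only so the function can be pointed at a copy of the sysfs tree:

```shell
# label_cores [CPU_DIR]: print "N: Core ID #ID // P-core|E-core" for each
# logical CPU that exposes topology/core_id.
label_cores() {
    dir=${1:-/sys/devices/system/cpu}
    for path in "$dir"/cpu[0-9]*; do
        # Offline CPUs have no topology directory; skip them.
        [ -r "$path/topology/core_id" ] || continue
        echo "${path##*/cpu} $(cat "$path/topology/core_id")"
    done | sort -n | awk '
        # Remember every (cpu, core_id) pair and count CPUs per core_id.
        { cpu[NR] = $1; id[NR] = $2; count[$2]++ }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s: Core ID #%s // %s\n", cpu[i], id[i],
                       (count[id[i]] > 1 ? "P-core" : "E-core")
        }'
}
```

Calling label_cores with no arguments reads the live /sys/devices/system/cpu tree.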

I’ve annotated which is a P-core vs. E-core. I’m still not clear on how the Linux kernel really decides what tasks to throw at E-cores vs. P-cores and while looking at htop as I use the workstation it seems that everything’s just treated equally. Maybe it’s because the INTEL_HFI stuff is not fully integrated yet. I did notice that the 6.0.12 kernel that’s current on Debian testing at time of writing does not have INTEL_HFI_THERMAL enabled, which might help (or make things worse since the E-cores run at a lower clock speed?). I’ve played around with turning on / off all of the E-cores and most of the P-cores (minus cpu0, which is a P-core, and cannot be disabled) but haven’t really concluded anything concrete about powersaving vs. performance.
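
Turning cores on and off goes through each logical CPU’s online file in sysfs. A rough sketch of a helper for flipping a range at once; set_cpu_online is a made-up name, the optional directory argument is only there for trying it against a scratch tree, and writes to the real files need root:

```shell
# set_cpu_online STATE FIRST LAST [CPU_DIR]: write STATE (0 = offline,
# 1 = online) to the "online" file of logical CPUs FIRST through LAST.
set_cpu_online() {
    state=$1
    n=$2
    last=$3
    dir=${4:-/sys/devices/system/cpu}
    while [ "$n" -le "$last" ]; do
        f="$dir/cpu$n/online"
        if [ -w "$f" ]; then
            echo "$state" > "$f"
        else
            # Typically the case for cpu0, which cannot be disabled.
            echo "cpu$n: no writable online file, skipping" >&2
        fi
        n=$((n + 1))
    done
}
```

Going by the enumeration above, set_cpu_online 0 16 31 would take all sixteen E-cores offline and set_cpu_online 1 16 31 would bring them back.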

Second, this is the first time I’ve seen a core on a desktop PC of mine reach 100°C. I’m guessing that this resulted in some throttling (cpuinfo shows 5478.906 MHz for that core ID so I’m not sure how much). Maybe if I had opted for water cooling (or removed the “low noise” adapter!) it wouldn’t have gotten so hot.
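
One way to stop guessing about throttling is to read the kernel’s per-core throttle counters. A sketch, assuming the thermal_throttle entries that Intel systems usually expose under each CPU’s sysfs directory are present on this kernel; throttle_counts is a made-up name, and the optional directory argument is only for pointing it at a copy of the tree:

```shell
# throttle_counts [CPU_DIR]: print the core thermal throttle event count for
# each logical CPU that exposes one, in CPU-number order.
throttle_counts() {
    dir=${1:-/sys/devices/system/cpu}
    for path in "$dir"/cpu[0-9]*; do
        f="$path/thermal_throttle/core_throttle_count"
        [ -r "$f" ] || continue
        echo "${path##*/cpu} $(cat "$f")"
    done | sort -n | awk '{ printf "cpu%s: %s throttle events\n", $1, $2 }'
}
```

Comparing the output before and after a burnP6 run would show whether the counters moved, and on which cores.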

While I’m not going to use this system for gaming I did notice that the RTX 3060 is crippled and will detect ETH mining:

(destiny:21:09:EST)% lspci|grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)

Apparently only the first RTX 30xx cards produced did not have this restriction but all of the current ones do. I don’t really care but I don’t like the hardware I buy to be encumbered for silly reasons.

All-in-all this feels like a good upgrade and should last 4-5 years like my last i9-9900K build, which was done toward the end of 2018.