It said that unlike Arping 2.09, in Arping 2.11 the ARP cache was not updated after a successful reply. I thought that was odd, since there’s no code to touch the ARP cache, neither read nor write. Surely this behaviour hasn’t changed?
I tried to reproduce the behaviour and sure enough, with Arping 2.09 the ARP cache is updated, while with 2.11 it’s not.
$ arp -na | grep 192.168.0.123
$ # --- First try Arping 2.11 ---
$ sudo ./arping-2.11 -c 1 192.168.0.123
ARPING 192.168.0.123
60 bytes from 00:22:33:44:55:66 (192.168.0.123): index=0 time=1.188 msec
--- 192.168.0.123 statistics ---
1 packets transmitted, 1 packets received, 0% unanswered (0 extra)
$ arp -na | grep 192.168.0.123
$ # --- Ok, that didn't change the ARP cache. Now try 2.09 ---
$ sudo ./arping-2.09 -c 1 192.168.0.123
ARPING 192.168.0.123
60 bytes from 00:22:33:44:55:66 (192.168.0.123): index=0 time=794.888 usec
--- 192.168.0.123 statistics ---
1 packets transmitted, 1 packets received, 0% unanswered (0 extra)
$ arp -na | grep 192.168.0.123
? (192.168.0.123) at 00:22:33:44:55:66 [ether] on wlan0
How could that be? I suspected that maybe the kernel saw the ARP reply, and snooped it into the ARP table. But I quickly confirmed that the packets going over the wire were the same for 2.09 and 2.11 (as they should be).
So what changed between 2.09 and 2.11?
$ git log --pretty=oneline arping-2.09..arping-2.11 | wc -l
43
Ugh. Before doing a bisection I skimmed through the descriptions. Most were comments, compile fixes and documentation. The only functionality changes were
- Switching to clock_gettime() (various patches). Read “gettimeofday() should never be used to measure time” for why.
- Switching to
- Adding support to use getifaddrs() to find the correct output interface
Well, the first two don’t look suspicious, so either it’s the getifaddrs() change or some minor change that shouldn’t have affected behaviour.
Between Arping 2.09 and 2.10 I changed the interface finding code from an ugly hack of running /sbin/ip route get 188.8.131.52 to get the outgoing interface from the routing table. Since the output of the various “show me the routing table” commands is different in different OSs, I had to implement this subprocess (ugly) and parsing (ugly) several times. The new implementation uses getifaddrs() to traverse the interfaces programmatically.
The old code was still there as a fallback. It would never actually get used unless there’s a Linux system out there that doesn’t have getifaddrs(). It seems it was added to glibc 2.3 back when Arping was two years old. Anyway, it was trivial to temporarily switch interface selection back to the old method. I confirmed that this was indeed what caused this change of behaviour.
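To make the old approach a bit more concrete, here is a rough sketch of the kind of output parsing it involved. This is not Arping's actual code: parse_route_dev is a made-up helper name, and it assumes the Linux iproute2 output format, where the interface name follows the word "dev".

```shell
#!/bin/sh
# Sketch only: parse_route_dev is a hypothetical name, not taken from Arping.
# Assumes iproute2-style output such as:
#   192.168.0.123 dev wlan0 src 192.168.0.100

# Read `ip route get` output on stdin and print the outgoing interface,
# i.e. the word that follows the first "dev". Prints nothing if none is found.
parse_route_dev() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Example (Linux with iproute2 only):
#   ip route get 192.168.0.123 | parse_route_dev
```

Every OS with a differently formatted routing command would need its own variant of this parsing, which is what made the old method so unpleasant to maintain.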
Surely ip route get doesn’t send an ARP request and populate the ARP cache when it gets the reply? No. So if ip route get 192.168.0.123 doesn’t do it, and arping-2.11 192.168.0.123 doesn’t do it, then surely ip route get 192.168.0.123 ; arping-2.11 192.168.0.123 doesn’t do it?
Yes, yes it does.
Running ip route get 192.168.0.123 followed by arping-2.11 192.168.0.123 causes 192.168.0.123 to show up in the ARP cache. And it doesn’t matter if ip route get is run as an ordinary user or as root! (arping of course has to run as root or have the required capabilities.)
Only the exact address given to ip route get will be “open to be filled” by the second command, so it seems to be per address, and it seems that ip route get modifies state in the kernel.
$ arp -na | grep 192.168.0.123
$ sudo ./arping-2.11 -i wlan0 -q -c 1 192.168.0.123
$ arp -na | grep 192.168.0.123
$ # --- Ok, still no entry in the ARP cache. Now try running both commands ---
$ ip route get 192.168.0.123 ; sudo ./arping-2.11 -i wlan0 -q -c 1 192.168.0.123
192.168.0.123 dev wlan0 src 192.168.0.100
    cache mtu 1500 advmss 1460 hoplimit 64
$ arp -na | grep 192.168.0.123
? (192.168.0.123) at 00:22:33:44:55:66 [ether] on wlan0
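If you want to repeat the experiment, the session above can be wrapped up roughly as follows. This is only a sketch under a few assumptions: the helper names and the ARPING variable are made up, the arp -na output format is the net-tools one shown above, and unresolved entries are assumed to be printed as "<incomplete>". It uses only the arping flags that appear in the session.

```shell
#!/bin/sh
# Sketch only. arp_cache_has, reproduce and $ARPING are hypothetical names.

# Succeed if the `arp -na` output on stdin holds a resolved entry for IP $1.
arp_cache_has() {
    # -F matches the address literally, parentheses included, so that
    # 192.168.0.12 does not also match 192.168.0.123.
    # Assumption: unresolved entries are printed with "<incomplete>".
    grep -F "($1) at " | grep -qv '<incomplete>'
}

# Repeat the experiment for target $1 on interface $2.
# Needs Linux, sudo, net-tools and iproute2; start with no cache entry.
reproduce() {
    target=$1
    iface=$2
    arping=${ARPING:-./arping-2.11}

    # Step 1: arping on its own, with the interface given explicitly.
    sudo "$arping" -i "$iface" -q -c 1 "$target"
    if arp -na | arp_cache_has "$target"; then
        echo "arping alone: cached"
    else
        echo "arping alone: not cached"
    fi

    # Step 2: ip route get first, then the same arping invocation.
    ip route get "$target" >/dev/null
    sudo "$arping" -i "$iface" -q -c 1 "$target"
    if arp -na | arp_cache_has "$target"; then
        echo "ip route get + arping: cached"
    else
        echo "ip route get + arping: not cached"
    fi
}
```

Calling reproduce 192.168.0.123 wlan0 on a machine where the target is not yet in the ARP cache runs both steps from the session and reports whether the entry appeared after each one.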
I closed the bug since it’s working as intended.
I have not dived into the kernel source to find the reason for this, but I may come back and update this post if and when I do.