I’ve been coding more on my Rust SDR framework, and want to improve my ability to send/receive data packets efficiently and reliably.
There are two main ways I learn to do this better: designing a new protocol, and making the best implementation possible for an existing one. This post is about the latter.
AX.25 and APRS
First a detour, or background.
Layer 3 (IP stack equivalent: IP itself) consists of the ability to add, in addition to source and destination, a variable number of intermediate repeaters. This allows limited source routing. In APRS the repeaters are usually not named; instead it uses “virtual” hops.
Layer 4 (IP stack equivalent: TCP and UDP) allows both connected and disconnected communication channels. In my experience connected AX.25 works better over slow simplex radio than TCP. If TCP was ever optimized for high delay low bandwidth, it’s not anymore.
For the physical layer, there are three main “modems”:
1200 baud Bell 202, used on 144.800 MHz in region 1, and 144.390 MHz in region 2, for APRS. It’s used by BBSs and other applications too. This is by far the most common amateur radio modem, and is often just called “packet”. Anything that supports “APRS” will support 1200 baud.
9600 baud G3RUH. This is implemented in some radios that already support 1200 baud, such as Kenwood TH-D74 and Yaesu FT5D. There are also dedicated hardware TNCs for it.
I say “baud” (symbols per second), but since all these use one bit per symbol, you can equally call it “bps”.
300 and 1200 bps both use two audio tones. They’re almost exactly as simple as you think they are. The current tone is FM modulated, and transmitted. It has a very distinct sound if you hear it over the air.
The receiver then FM demodulates as usual, and then does another demodulation, this time of the audio frequency. Because the second demodulation outputs binary, the second demodulation is usually called FSK demodulation instead of FM demodulation. So it’s FSK inside FM.
It’s worth digging into detail about how this works, to see how it’s different from 9600bps.
An FM radio can be seen as a device that takes an analog signal input X, and the tuned frequency Y, and outputs a solid carrier at Y+X. If the analog signal X is, say, the constant positive number 6, then it outputs a solid carrier (not an FM modulated signal) on frequency Y+6. Units don’t matter for this explanation, it only matters that a constant number as input means a constant pure carrier at some frequency.
When the analog input is speech, which it usually is, or a sine wave (solid tone), this means that the transmitted carrier frequency is something like Y + amplitude*sin(t) (for a sine wave), or Y + s[n] (for the speech sound wave s).
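The model above can be sketched in a few lines of Python. This is a minimal illustration, not code from RustRadio: the names `fm_modulate` and `phase_steps`, the 1000 samples per second rate, and working at baseband (complex samples relative to the tuned frequency Y) are all assumptions made for the example.

```python
import cmath
import math

def fm_modulate(signal, sample_rate, hz_per_unit=1.0):
    """Return complex baseband samples whose instantaneous frequency
    offset from the tuned frequency Y is hz_per_unit * signal[n]."""
    phase = 0.0
    out = []
    for x in signal:
        # Frequency is the rate of change of phase, so integrate it.
        phase += 2 * math.pi * hz_per_unit * x / sample_rate
        out.append(cmath.exp(1j * phase))
    return out

def phase_steps(samples):
    """Phase advance between consecutive samples, in radians."""
    return [cmath.phase(b * a.conjugate()) for a, b in zip(samples, samples[1:])]

# Constant input 6 -> constant phase step -> pure carrier at Y+6.
carrier = fm_modulate([6.0] * 100, sample_rate=1000)
steps = phase_steps(carrier)
```

Every entry in `steps` comes out as the same value, 2π·6/1000 radians per sample, which is exactly what a pure carrier 6 Hz above the tuned frequency looks like. Measuring the phase step between samples is also the basic idea behind FM (quadrature) demodulation.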
So 300/1200 baud modems are:
    [ 0 or 1 ]
        |
        V
    [ audio frequency A or B ]
        |
        V
    [ FM modulator ]
        |
        V
    [signal around Y Hz]
9600 bps is different. It’s direct FSK over the air. This can’t be sent as audio to most radios, since there is no audio spectrum in the first place.
    [ 0 or 1 ]
        |
        V
    [ FM modulator ]
        |
        V
    [signal either at Y+X or Y-X Hz]
Of course it has to be filtered a bit before going out, since instant transitions between two frequencies would splatter too much, but that’s a detail not worth getting into further here.
This may explain why 9600 baud is much more rare. You have to trick the radio into accepting an input signal that is definitely not audio, and treat it as audio, without filtering away what looks like high frequencies. The input doesn’t have “frequencies”; it’s not audio!
For example, if you send a bunch of +X values in a row to the radio, the radio will just “think” that the microphone has a different reference to ground, and it’ll treat it as 0. I say “think”, but in hardware this could just be a capacitor, which will filter this offset. (I’m glossing over details)
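To see what such a capacitor does to a run of identical symbols, here is a rough sketch using a single-pole high-pass (DC blocking) filter. The function name `dc_block`, the `alpha` value, and X = 1.0 are arbitrary assumptions for illustration; a real radio’s audio path is more complicated than this.

```python
def dc_block(samples, alpha=0.99):
    """Single-pole high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

# A long run of the same symbol (+X, here X = 1.0).
filtered = dc_block([1.0] * 1000)
first, last = filtered[0], filtered[-1]
```

The first output sample is close to +X, but by the end of the run the output has decayed to nearly 0, even though the input never changed. That is fine for speech, and fatal for a modem that encodes a bit as “stay at +X”.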
Put another way: Just because something is values that can be put inside a .wav file, that doesn’t make it audio.
Some people do hardware modifications to their radios to make them accept this direct binary input to the FM modulator. Most of the documentation out there about 9600 packet radio assumes you’ll be doing that, which makes this a bit confusing.
This is not what interests me. Either I’ll use a radio with 9600 built in, or I’ll use a software defined radio which already gives me full spectrum access.
The lovely Kenwood TH-D74 provides a whole 9600 baud modem (TNC), so that you can send whole AX.25 frames and it’ll modulate and transmit them for you. No modding or fake analog signal trickery needed.
My goal today
I want to improve my 9600bps receiver code. In order to know when it gets better, I have to have sample inputs. For 1200bps there’s a standard CD with over a thousand packets captured over the air, from various radios.
Even if this CD didn’t exist, it’d be easy to create one. Just ask someone in San Francisco to record 144.390MHz for a bit. It was basically fully saturated with various beacons, when I was last there.
For 9600, there’s nothing. Nothing that I could find, at least.
WB2OSZ has a great doc on demodulating 9600bps, where he seems to have started a recording at home, and then driven around beaconing data.
So let’s do that.
- Step 1: Record data.
- Step 2: compare and iterate on improving implementations.
First I generated some packets, using my ax25ms project:
    mkdir data
    for i in $(seq -w 10000); do
      ~/ax25ms/generate M0THC-1 aprs msg M0THC-2 "Decode test 2023-11-18 seq $i" > data/$i.bin
    done
    mkdir kiss
    for f in data/*; do
      ./kiss_encode.py < "$f" > "kiss/$(basename "$f")"
    done
Another detour: The kiss_encode.py script was, except for the last line, all written by GPT-4. Not like I can’t write it, or haven’t written it before. But you know what’s faster than looking up the constants and writing it? Not doing it.
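For reference, KISS framing itself is tiny. The sketch below is not the kiss_encode.py used here (that script isn’t shown); it is a minimal encoder written from the KISS constants, with the function name `kiss_encode` and the `port` parameter chosen for illustration. It assumes the input is one AX.25 frame without the FCS, since the TNC computes that itself.

```python
FEND = 0xC0   # frame delimiter
FESC = 0xDB   # escape byte
TFEND = 0xDC  # follows FESC to represent a literal FEND
TFESC = 0xDD  # follows FESC to represent a literal FESC

def kiss_encode(frame: bytes, port: int = 0) -> bytes:
    """Wrap one AX.25 frame (without FCS) as a KISS data frame."""
    # Command byte: high nibble is the TNC port, low nibble 0 means "data".
    out = bytearray([FEND, (port & 0x0F) << 4])
    for b in frame:
        if b == FEND:
            out += bytes([FESC, TFEND])
        elif b == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(b)
    out.append(FEND)
    return bytes(out)

encoded = kiss_encode(b"\x01\xc0\xdb\x02")
```

Here `encoded` is `c0 00 01 db dc db dd 02 c0`: the delimiters and command byte wrap the payload, and the two special bytes inside it are escaped.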
Now I have the data ready to send to a modem that takes packets using KISS. As you may guess by now, I used the D74.
To send the packets I put a Raspberry Pi Zero and a battery pack in my backpack, and sent packets to the radio via Bluetooth.
Yet another detour: when in the field, the easiest way to SSH to a Raspberry Pi is to SSH over Bluetooth. I have it set up on all my Raspberry Pis.
To connect to the radio and send packets I started two loops, in case Bluetooth decided to be a problem:
    # Keep reconnecting if it disconnects.
    while true; do
      sudo rfcomm connect /dev/rfcomm0 24:71:89:XX:XX:XX 2
      sleep 2
    done

    # Send packets every 10 seconds.
    for i in kiss/*.bin; do
      cat "$i" > /dev/rfcomm0
      sleep 10
    done
I arbitrarily chose 144.390 MHz to send on, because I didn’t want everyone on 144.800 MHz to interfere with my test. I also made sure to put the volume up, so I could hear anyone asking me to stop.
Then, because I don’t live in a car dependent dystopia, I took a walk.
I set up a simple GNU Radio receiver using a USRP B200 connected to a Diamond VX30 on my roof, filtering/downsampling it down to 50ksps.
With SDRs, remember to avoid tuning exactly onto the signal you’re interested in. The B200 is very good in this regard, but it’s still better to tune a bit off frequency, and then frequency translate in software.
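The software half of that is just a multiplication by a rotating complex exponential. Below is a minimal sketch, assuming a hypothetical 1 Msps capture tuned 100 kHz below the signal; the name `freq_translate` and all the numbers are made up for illustration, not taken from my flowgraph. A real receiver would follow this with a low-pass filter and decimation.

```python
import cmath
import math

def freq_translate(samples, shift_hz, sample_rate):
    """Shift complex samples by shift_hz (negative moves the signal down)."""
    step = 2 * math.pi * shift_hz / sample_rate
    return [s * cmath.exp(1j * step * n) for n, s in enumerate(samples)]

# Hypothetical numbers: SDR tuned 100 kHz below the signal, so the signal
# sits at +100 kHz in the capture. Shift it by -100 kHz to center it.
rate = 1_000_000
tone = [cmath.exp(2j * math.pi * 100_000 * n / rate) for n in range(50)]
centered = freq_translate(tone, -100_000, rate)
```

After the shift, the test tone sits at 0 Hz, so every sample in `centered` is (to within rounding) the constant 1+0j. Any DC spike from the hardware, which started at 0 Hz, has moved 100 kHz away where the filter can remove it.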
Did I mention Bluetooth sucks?
Bluetooth randomly disconnected sometimes. And the radio sometimes rebooted. And sometimes the radio appeared fine, but ignored all requests for transmissions.
So while walking around I had to keep an eye on the radio, making sure the TX lamp lit approximately every 10 seconds. If it didn’t, then I restarted the radio and waited for it to start working again.
A perfect recording would have a packet exactly every 10 seconds, to make it easier to zoom in where there should be a packet, and see why it’s not decoding. Maybe another time…
I created two big captures. One 51 minutes, one 38 minutes. The latter suffers less from packets failing to send (since I’d spotted the problem), so it’s the more interesting one.
I don’t know how many packets are actually decodable. Since I tried to send every 10s, there are at most 306 and 233, respectively. But because of the aforementioned problems, it’s likely far fewer.
    $ ls -l aprs_capture_test*c32
    -rw-r--r-- 1 thomas thomas 1225467648 Nov 18 15:30 aprs_capture_test1.c32
    -rw-r--r-- 1 thomas thomas  932427072 Nov 18 16:24 aprs_capture_test2.c32
    $ sha1sum aprs_capture_test*
    315b15a97e7d7a63205cd840802b84bc7fda07f6  aprs_capture_test1.c32
    ba5f6526cdfe67938b4c000b415fc0ef32bf02a7  aprs_capture_test2.c32
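The upper bounds of 306 and 233 packets follow directly from the file sizes. A small worked example, assuming the .c32 files are raw complex samples stored as two 32-bit floats each (8 bytes per sample) at the 50ksps rate of the capture; the function names are just for this illustration:

```python
BYTES_PER_SAMPLE = 8   # assumption: complex = two 32-bit floats
SAMPLE_RATE = 50_000   # samples per second, as used for the capture
SEND_INTERVAL = 10     # seconds between transmissions

def capture_seconds(file_size: int) -> float:
    """Length of a capture in seconds, given its size in bytes."""
    return file_size / BYTES_PER_SAMPLE / SAMPLE_RATE

def max_packets(file_size: int) -> int:
    """Upper bound on packets sent, at one packet per SEND_INTERVAL."""
    return int(capture_seconds(file_size) // SEND_INTERVAL)

test1 = max_packets(1225467648)  # about 3064 seconds, or 51 minutes
test2 = max_packets(932427072)   # about 2331 seconds, or 39 minutes
```

This gives 306 for the first file and 233 for the second.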
If you want the data, I created a torrent for it.
I tested four decoders, two of which I’d written myself.
- GNU Radio with grsat
- Direwolf, the state of the art for these modems.
- My streamed implementation, which has a very primitive clock recovery.
- My Whole Packet Clock Recovery (WPCR) implementation. Check out this video on WPCR for an explanation. It’s pretty amazing.
This is the performance of the four implementations, both in terms of CPU usage, and ability to decode.
These results are true as of commit 339affb506a96e9633bb28349d166b198ff72223 of RustRadio, and Direwolf dev at commit 2260df15a554131b3c24209a7ed17ed509009fec.
Sorted from best to worst decoder, ignoring CPU performance.
| Method | File 1 packets | File 1 CPU time | File 2 packets | File 2 CPU time |
|---|---|---|---|---|
| Direwolf -F 1 -P + | 78 | 2m40s | 174 | 2m0s |
| Direwolf -F 1 | 77 | 24.6s | 169 | 18.5s |
| WPCR threshold 0.000001 | 73 | 6.2s | 134 | 5.4s |
| GNU Radio with grsat | 39 | 50s | 84 | 37s |
WPCR is surprisingly good! Its main downside is that it requires a magic value: the burst power threshold. But because this is software, we could just run many decoders, each with a different threshold. After all, running multiple decoders at once is what Direwolf does with -P +.
I’m sure the main problem with Streamed is the clock recovery. It needs to be WAY smarter. I want to write something like the one described by Andy Walls.
But now I have something to iterate on!
    $ time ./grsat_decode.py > decoded1
    [… wait for CPU to die down, then press enter …]
    real    1m21.621s
    user    0m47.201s
    sys     0m2.479s
    $ grep -c '0000: 82 a0 b4 60 60 62 60 9a 60 a8 90 86 40 e3 03 f0' decoded1
    39
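The hex string in that grep is the AX.25 header that every test packet starts with. As a sketch of how it decodes (the helper name `decode_address` is made up for this example): each address is six ASCII characters shifted left by one bit, followed by a byte carrying the SSID. The lowest bit of that last byte marks the end of the address list.

```python
def decode_address(field: bytes) -> str:
    """Decode one 7-byte AX.25 address field into CALL-SSID."""
    # Callsign characters are stored shifted left one bit, space padded.
    call = "".join(chr(b >> 1) for b in field[:6]).rstrip()
    ssid = (field[6] >> 1) & 0x0F
    return f"{call}-{ssid}"

header = bytes.fromhex("82 a0 b4 60 60 62 60 9a 60 a8 90 86 40 e3 03 f0")
dst = decode_address(header[0:7])
src = decode_address(header[7:14])
last_address = bool(header[13] & 1)   # set: no repeaters follow
control, pid = header[14], header[15]
```

This yields destination `APZ001-0` and source `M0THC-1`, with no repeaters in the path, followed by control byte 0x03 (UI frame) and PID 0xf0 (no layer 3). So the grep counts exactly the frames that came from my test transmitter.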
Most of the time goes to juggling PDUs. The initial FftFilter helps marginally (adds two decodes for file2).
    $ cargo build \
        -F fast-math \
        --release \
        --example ax25-9600-rx \
        --example ax25-9600-wpcr
    $ time ./target/release/examples/ax25-9600-wpcr \
        -o packets \
        -r aprs_capture_test1.c32 \
        -v 2 \
        --sample_rate 50000 \
        --threshold 0.000001
    […]
    Block name                 Seconds  Percent
    ------------------------------------------
    FileSource                   0.510    8.26%
    FftFilter                    1.060   17.15%
    RationalResampler            0.278    4.49%
    Tee                          0.084    1.37%
    ComplexToMag2                0.048    0.78%
    SinglePoleIIRFilter<T>       0.313    5.06%
    QuadratureDemod              0.398    6.43%
    Burst Tagger                 0.467    7.56%
    StreamToPdu                  2.807   45.42%
    Midpointer                   0.070    1.13%
    WPCR                         0.087    1.41%
    VecToStream                  0.002    0.03%
    BinarySlicer                 0.002    0.03%
    NrziDecode                   0.001    0.02%
    Descrambler                  0.001    0.02%
    HDLC Deframer                0.004    0.06%
    PDU Writer                   0.048    0.77%
    ------------------------------------------
    All blocks                   6.181   99.96%
    Non-block time               0.002    0.04%
    Elapsed seconds              6.183  100.00%
    2023-11-18T19:23:17+00:00 - INFO - HDLC Deframer: Decoded 75 (incl 0 bitfixes), CRC error 1
    2023-11-18T19:23:17+00:00 - INFO - PDU Writer: wrote 75
    real    0m6.188s
    user    0m5.837s
    sys     0m0.312s
(then some manual confirmation of correct decode, which is why it’s 73 and not 75 in the summary. Similar below for 33 instead of 42)
    $ time ./target/release/examples/ax25-9600-rx \
        -o packets \
        -r aprs_capture_test1.c32 \
        -v 2 \
        --sample_rate 50000
    […]
    Block name            Seconds  Percent
    -------------------------------------
    FileSource              0.545   14.10%
    FftFilter               1.044   27.02%
    RationalResampler       0.278    7.20%
    QuadratureDemod         0.400   10.36%
    ZeroCrossing            1.153   29.84%
    BinarySlicer            0.008    0.21%
    NrziDecode              0.004    0.11%
    Descrambler             0.082    2.12%
    HDLC Deframer           0.337    8.72%
    PDU Writer              0.012    0.31%
    -------------------------------------
    All blocks              3.865   99.97%
    Non-block time          0.001    0.03%
    Elapsed seconds         3.866  100.00%
    2023-11-18T19:28:10+00:00 - INFO - HDLC Deframer: Decoded 42 (incl 9 bitfixes), CRC error 4892
    2023-11-18T19:28:10+00:00 - INFO - PDU Writer: wrote 42
    real    0m3.869s
    user    0m3.626s
    sys     0m0.236s
First the input needs to be converted to wav, using
Then I used atest from Direwolf:
    $ src/atest -F 1 -B 9600 aprs_capture_test2.wav
    […]
    169 packets decoded in 18.511 seconds. 125.9 x realtime
-F 1 means try to fix one bit error. -P + means try multiple decoders, per this doc.
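The idea behind fixing one bit error can be sketched as a brute force search: if the frame check sequence fails, flip each bit in turn and see if the checksum then matches. The code below is an illustration of that idea only, not how Direwolf or RustRadio implement it, and the function names are made up. It assumes a deframed AX.25 frame whose last two bytes are the FCS (CRC-16/X-25, low byte first).

```python
def crc16_x25(data: bytes) -> int:
    """CRC-16/X-25, the FCS used by AX.25 (reflected polynomial 0x8408)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def append_fcs(payload: bytes) -> bytes:
    """Append the FCS, low byte first."""
    return payload + crc16_x25(payload).to_bytes(2, "little")

def fcs_ok(frame: bytes) -> bool:
    """True if the last two bytes are the correct FCS for the rest."""
    return len(frame) > 2 and append_fcs(frame[:-2]) == frame

def fix_one_bit(frame: bytes):
    """Return a frame with a valid FCS, flipping at most one bit, else None."""
    if fcs_ok(frame):
        return frame
    for i in range(len(frame) * 8):
        candidate = bytearray(frame)
        candidate[i // 8] ^= 1 << (i % 8)
        if fcs_ok(bytes(candidate)):
            return bytes(candidate)
    return None

good = append_fcs(b"Decode test")
damaged = bytearray(good)
damaged[3] ^= 0x10                      # simulate one bit error on the air
repaired = fix_one_bit(bytes(damaged))
```

Here `repaired` equals `good` again. The cost is one CRC pass per bit in the frame, and every extra bit you allow yourself to flip raises the risk of “repairing” noise into a valid-looking packet, which is why decodes still need a sanity check afterwards.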