15 - MR18 Deep Dive: UART Scripts

Hex-encoding 7 million bytes because the PHY won’t cooperate

Slow grind, patience on the line / twenty minutes on the wire just to make it mine

Post 6 described the hex-over-UART transfer protocol at a high level—hex-encode every byte, pipe it through an awk decoder on the MR18, pray nothing goes wrong for 20 minutes straight. This post rips open send_binary.py and uart_transfer.py and walks through the actual Python. Two scripts, two transfer sizes, same dumb protocol, very different engineering challenges.

Why UART Is the Only Option

Quick recap for anyone who skipped post 6: the MR18 is in OpenWrt failsafe mode. Ethernet TX works—the board sends ARP requests just fine. Ethernet RX is completely broken because the AR8035 PHY’s RGMII RX clock delay isn’t configured (the whole reason we wrote the bare-metal C fix in post 14). JTAG is a write-only channel for data—you can load binaries into RAM but you can’t use it as a general-purpose communication channel once Linux is running.

That leaves UART. The serial console. 115200 baud, 8N1, the same connection I’ve been using for console output since post 1. TX and RX both work because UART doesn’t go through the PHY. It goes through the SoC’s built-in UART peripheral directly to the FTDI chip on the ESP-Prog. No PHY, no RGMII, no clock delay problems. Just bits on a wire at 115,200 bits per second.

The downside: 115200 baud is about 11,520 bytes per second of raw throughput. And we can’t even use all of that because we’re hex-encoding everything (more on why in a second), which cuts effective throughput in half. So we’re looking at roughly 5,760 bytes per second of actual binary data. For the 5.5 KB ar8035-fix binary, that’s under 2 seconds. For the 7 MB sysupgrade image… well, get comfortable.

The Hex-Over-Serial Protocol

You can’t send raw binary over a serial console. The tty layer sitting between the serial port and the shell interprets control characters—0x03 is Ctrl-C (SIGINT), 0x04 is Ctrl-D (EOF), 0x0A/0x0D get mangled by CR/LF conversion, 0x1A is Ctrl-Z (SIGTSTP). Send a binary that happens to contain 0x03 and you just killed the awk decoder with SIGINT. Hilarious the first time, infuriating the second.

The solution is braindead simple: hex-encode every byte. 0xFF becomes the two ASCII characters "ff". 0x03 becomes "03". Every byte in the binary is represented by exactly two printable hex characters, none of which are control characters, none of which trigger tty interpretation. It doubles the data size but it’s completely safe.
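To make that concrete, here is a small sketch. The helper name tty_hazards and the sample payload are made up for illustration, and the table only covers the bytes called out above, not every special character a tty knows about. It flags the bytes the tty would act on, then shows that the hex form of the same payload contains none of them.

```python
# Bytes the tty layer acts on, as listed above.
# This is the list from this post, not an exhaustive termios table.
TTY_SPECIAL = {0x03: "Ctrl-C", 0x04: "Ctrl-D", 0x0A: "LF", 0x0D: "CR", 0x1A: "Ctrl-Z"}

def tty_hazards(payload):
    """Return (offset, name) for every byte the tty would interpret."""
    return [(i, TTY_SPECIAL[b]) for i, b in enumerate(payload) if b in TTY_SPECIAL]

sample = bytes([0x7F, 0x45, 0x03, 0xFF, 0x0A])  # hypothetical payload
hex_text = sample.hex()                          # two printable chars per byte

print(tty_hazards(sample))             # the raw payload has two dangerous bytes
print(tty_hazards(hex_text.encode()))  # the hex form has none
```

The cost is exactly what the paragraph above says: five bytes in, ten characters out.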

The decoder runs on the MR18 as a busybox awk one-liner:

AWK_DECODE = (
    "awk 'BEGIN{h=\"0123456789abcdef\"}"
    "{for(i=1;i<=length($0);i+=2)"
    "printf\"%c\",(index(h,tolower(substr($0,i,1)))-1)*16+"
    "(index(h,tolower(substr($0,i+1,1)))-1)}'"
)

It reads hex pairs from stdin, looks up each nibble in the hex alphabet string, multiplies the high nibble by 16, adds the low nibble, and printf "%c" emits the resulting byte. Awk’s index() is 1-based so the -1 adjusts to 0-based. This runs entirely in busybox awk—no external dependencies, nothing to install, works out of the box in failsafe mode. Pipe hex in, get binary out.
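If awk isn’t your first language, the same lookup logic reads like this in Python. This is only a mirror of the one-liner for explanation (the function name awk_style_decode is mine, and nothing like it runs on the MR18). It is also stricter than the awk version: a non-hex character or an odd-length line raises an exception here, where awk would quietly emit a wrong byte.

```python
HEX_ALPHABET = "0123456789abcdef"

def awk_style_decode(line):
    """Decode one line of hex text using the same nibble lookup as the awk one-liner."""
    out = bytearray()
    for i in range(0, len(line), 2):
        # awk's index() is 1-based, hence its "-1"; str.index() is already 0-based.
        high = HEX_ALPHABET.index(line[i].lower())
        low = HEX_ALPHABET.index(line[i + 1].lower())
        out.append(high * 16 + low)
    return bytes(out)

# Round trip over every possible byte value.
payload = bytes(range(256))
assert awk_style_decode(payload.hex()) == payload
```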

send_binary.py: The Quick One (5.5 KB)

This script transfers the ar8035-fix binary—5,592 bytes, under 2 seconds of actual transfer time. It’s the warm-up act. The script is 132 lines and most of that is error handling.

Two utility functions do the heavy lifting for serial communication:

import time

def drain(ser):
    """Read and discard everything in the serial buffer."""
    while ser.in_waiting:
        ser.read(ser.in_waiting)
        time.sleep(0.05)

def wait_prompt(ser, timeout=5.0):
    """Block until we see a shell prompt (# or $)."""
    buf = b""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if ser.in_waiting:
            buf += ser.read(ser.in_waiting)
            if b"#" in buf or b"$" in buf:
                return buf.decode(errors="replace")  # bytes to string, replace any non-UTF8 garbage
        time.sleep(0.05)
    raise TimeoutError("no prompt")

drain() clears any stale output from the serial buffer. wait_prompt() blocks until the shell prompt appears—you need this after every command to know the MR18 is ready for the next one. Without it you’d be sending commands into a shell that’s still executing the previous one.

The transfer sequence:

1. Verify the shell is alive: send echo alive\n, wait for the echo back. If this times out, the serial connection is broken and there’s no point continuing.
2. Start the awk decoder: send the awk command with its output redirected to the target file (> /tmp/ar8035-fix). The shell starts awk, which blocks reading from stdin, waiting for hex data.
3. Send hex in 512-byte chunks: read 512 bytes from the binary, hex-encode them (1,024 hex characters), write the hex string plus a newline to the serial port. The newline triggers awk to process that line. Repeat until the whole file is sent.
4. Send EOF: write 0x04 (Ctrl-D) to signal end-of-input to awk. Awk finishes, the shell returns to the prompt.
5. Verify: run wc -c /tmp/ar8035-fix and check the byte count matches. Run md5sum /tmp/ar8035-fix and check the hash. If either mismatches, abort.
6. Execute: chmod +x /tmp/ar8035-fix then /tmp/ar8035-fix. Capture the output, which should show the PHY ID and register values.
7. Confirm the fix worked: check rx_packets on eth0. If it’s incrementing, Ethernet RX is alive. The PHY fix worked.
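The chunk-and-EOF part of that sequence can be sketched as follows. This is a simplified stand-in, not the code from send_binary.py: the function name send_hex is made up, and an io.BytesIO object plays the serial port so the sketch runs without hardware. Anything with a write() method would do, including a pyserial port.

```python
import io

CHUNK_SIZE = 512  # bytes of binary per line, as in the steps above

def send_hex(port, data, chunk_size=CHUNK_SIZE):
    """Write data as newline-terminated hex lines, then Ctrl-D. Returns the line count."""
    lines = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        port.write(chunk.hex().encode("ascii") + b"\n")  # one line per chunk
        lines += 1
    port.write(b"\x04")  # Ctrl-D at the start of a line: EOF on awk's stdin
    return lines

fake_port = io.BytesIO()                    # stand-in for the serial port
n_lines = send_hex(fake_port, bytes(5592))  # same size as the ar8035-fix binary
wire = fake_port.getvalue()
print(n_lines, "lines,", len(wire), "bytes on the wire")
```

For a 5,592-byte payload that comes to 11 lines: ten full chunks plus one short one.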

For 5.5 KB this whole process takes about 10 seconds including verification. The transfer itself is a rounding error—it’s the command-response overhead that dominates at this size.

uart_transfer.py: The Long One (7 MB)

This is the big one. 7,340,032 bytes of sysupgrade image, hex-encoded to 14,680,064 bytes, shoved through a 115200 baud serial port. The script is 245 lines and every extra line over send_binary.py exists because of the scale difference. Problems that don’t exist at 5.5 KB become showstoppers at 7 MB.

Phase 0: The Pre-Test

Before committing to a 20-minute transfer, the script runs a sanity check:

# Send 32 known bytes (0x00 through 0x1f) through the full pipeline
test_data = bytes(range(0x20))
test_hex = test_data.hex()
# ... start awk decoder writing to /tmp/pretest ...
# ... send test_hex ...
# ... Ctrl-D, verify md5sum ...

32 bytes. Takes less than a second. The hex is sent through awk, written to a file, and verified via md5sum. If this fails—awk isn’t working, the tty is mangling data, the shell is in a weird state—we find out immediately instead of 20 minutes from now. I added this after losing a 15-minute transfer to a tty configuration issue that a 1-second pre-test would have caught. The bitterness was motivating.
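Here is a minimal sketch of the host-side half of the pre-test, assuming the expected hash is computed locally with hashlib. The helper name build_pretest is hypothetical, and the serial plumbing elided in the snippet above stays elided here.

```python
import hashlib

def build_pretest():
    """Return (hex_line, expected_md5, expected_size) for the 32-byte pre-test."""
    test_data = bytes(range(0x20))  # 0x00 .. 0x1f
    return test_data.hex(), hashlib.md5(test_data).hexdigest(), len(test_data)

hex_line, expected_md5, expected_size = build_pretest()
print(hex_line)        # 64 hex characters, sent as a single line
print(expected_md5)    # compare against md5sum output from the MR18
```

The 0x00 through 0x1f range is a nice choice for a test pattern because it is exactly the set of ASCII control characters, including the 0x03 and 0x04 that started this whole mess. If the decoded file hashes correctly, the pipeline handles the worst-case bytes.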

The Echo Drainer Problem

Here’s something that doesn’t matter at 5.5 KB but absolutely murders you at 7 MB: every character you send gets echoed back.

When you type on a serial console, each character is echoed so you can see what you’re typing. (Strictly speaking, while awk is running it’s the tty layer on the MR18 doing the echoing, not the shell, but the effect is the same.) When a Python script sends 14,680,064 bytes of hex data, all 14,680,064 bytes come right back. That’s 14 MB of echo data flowing from the MR18 to the host. The host’s serial receive buffer is typically 4 KB. Without active draining, the buffer fills up, hardware flow control kicks in (if you’re lucky) or data gets dropped (if you’re not), and the whole transfer stalls or corrupts.

The fix is a background thread that continuously reads and discards the echo:

def echo_drainer(ser, stop_event):
    """Background thread: read and discard serial echo to prevent buffer stall."""
    while not stop_event.is_set():
        if ser.in_waiting:
            ser.read(ser.in_waiting)
        else:
            time.sleep(0.01)

Start it before the transfer, stop it after. The thread doesn’t do anything with the data. It just makes sure the receive buffer never fills up. It’s a garbage disposal for echo. This is the kind of problem you only discover when you scale up. The 5.5 KB transfer works fine without draining: it produces only about 11 KB of echo in total, which evidently never backs up far enough to matter. At 7 MB it’s a hard requirement.
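The start/stop wiring looks roughly like this. It’s a sketch under assumptions: FakeSerial is a hypothetical stand-in that implements only in_waiting and read(), so the example runs without a serial port, and the loop that feeds it simulates echo arriving from the MR18. The echo_drainer function is repeated so the block is self-contained.

```python
import threading
import time

class FakeSerial:
    """Hypothetical stand-in for a serial port: only in_waiting and read()."""
    def __init__(self):
        self._buf = bytearray()
        self._lock = threading.Lock()

    def feed(self, data):
        with self._lock:
            self._buf += data

    @property
    def in_waiting(self):
        with self._lock:
            return len(self._buf)

    def read(self, n):
        with self._lock:
            out = bytes(self._buf[:n])
            del self._buf[:n]
            return out

def echo_drainer(ser, stop_event):
    """Same loop as above: read and discard whatever has arrived."""
    while not stop_event.is_set():
        if ser.in_waiting:
            ser.read(ser.in_waiting)
        else:
            time.sleep(0.01)

ser = FakeSerial()
stop = threading.Event()
drainer = threading.Thread(target=echo_drainer, args=(ser, stop), daemon=True)
drainer.start()                        # start before the transfer
try:
    for _ in range(100):
        ser.feed(b"deadbeef" * 128)    # simulated echo arriving
    deadline = time.time() + 2.0       # give the drainer a moment to catch up
    while ser.in_waiting and time.time() < deadline:
        time.sleep(0.01)
finally:
    stop.set()                         # stop after the transfer, even on error
    drainer.join(timeout=1.0)
```

The try/finally is there so a failed transfer doesn’t leave the thread running, and the Event gives the loop a clean way to exit instead of being killed mid-read.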

Phase 1: The Full Transfer

Same protocol as send_binary.py—512-byte chunks, hex-encoded, newline-terminated. But with progress reporting because staring at a silent terminal for 20 minutes is psychological torture:

for i, chunk_start in enumerate(range(0, len(data), CHUNK_SIZE)):
    chunk = data[chunk_start:chunk_start + CHUNK_SIZE]
    ser.write(chunk.hex().encode() + b"\n")  # hex-encode the bytes, add newline so awk processes the line

    if (i + 1) % 500 == 0:
        elapsed = time.time() - t_start
        done = chunk_start + len(chunk)
        rate = done / elapsed / 1024
        pct = done / len(data) * 100
        eta = (len(data) - done) / (done / elapsed)
        print(f"  {pct:5.1f}%  {rate:.1f} KB/s  ETA {eta:.0f}s")

Every 500 chunks (256,000 bytes, or 250 KB of binary data), print a progress line with percentage, transfer rate in KB/s, and estimated time remaining. At 115200 baud you’re getting roughly 5-6 KB/s effective binary throughput. The ETA starts around 21 minutes and slowly ticks down. It’s not fast. It’s not supposed to be fast. It’s supposed to work.
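To see what the first progress line should look like, here is the same arithmetic pulled into a function and fed idealized numbers. The function name progress is mine, and the elapsed time is an assumption: it pretends the link runs at exactly the 5,760 binary bytes per second worked out earlier.

```python
def progress(done, total, elapsed):
    """Return (percent, KB/s, ETA in seconds), computed the same way as the loop above."""
    rate_bytes = done / elapsed                # binary bytes per second so far
    return done / total * 100, rate_bytes / 1024, (total - done) / rate_bytes

TOTAL = 7_340_032
done = 500 * 512                               # first report: 500 chunks of 512 bytes
pct, rate, eta = progress(done, TOTAL, elapsed=done / 5760)
print(f"  {pct:5.1f}%  {rate:.1f} KB/s  ETA {eta:.0f}s")
```

With those inputs the first report lands at about 3.5%, 5.6 KB/s, and an ETA near 1230 seconds. A real run won’t match to the digit, but it gives you a baseline for spotting a link that’s running slow.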

EOF and Verification

After the last chunk, send Ctrl-D (0x04) to close awk’s stdin. Stop the echo drainer thread. Wait for the shell prompt. Then verify:

EXPECTED_MD5 = "53e272bed2041616068c6958fe28a197"

Run wc -c /tmp/fw.bin—should be 7,340,032 bytes. Run md5sum /tmp/fw.bin—should match EXPECTED_MD5. If either check fails after 20 minutes of transfer, the script prints an error and does NOT proceed with sysupgrade. Flashing a corrupt image to NAND would brick the device, and then I’d need to do the entire JTAG song and dance again. No thanks.
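A sketch of that gate, assuming the command output comes back as text with the echoed command line and the next prompt mixed in. The function names, the sample captures, and the prompt string in them are made up for illustration; only EXPECTED_MD5 and the byte count come from the script.

```python
import re

EXPECTED_MD5 = "53e272bed2041616068c6958fe28a197"
EXPECTED_SIZE = 7_340_032

def parse_wc(output):
    """Pull the byte count out of `wc -c <file>` output ("<count> <path>")."""
    match = re.search(r"^(\d+)\s+\S", output, re.MULTILINE)
    if not match:
        raise ValueError("no byte count in wc output")
    return int(match.group(1))

def parse_md5(output):
    """Pull the 32-character hash out of `md5sum <file>` output."""
    match = re.search(r"^([0-9a-f]{32})\s+\S", output, re.MULTILINE)
    if not match:
        raise ValueError("no hash in md5sum output")
    return match.group(1)

def image_ok(wc_output, md5_output):
    """True only if both the size and the hash match; anything else blocks sysupgrade."""
    return parse_wc(wc_output) == EXPECTED_SIZE and parse_md5(md5_output) == EXPECTED_MD5

# Hypothetical captures: echoed command, result line, then the next prompt.
wc_capture = "wc -c /tmp/fw.bin\r\n7340032 /tmp/fw.bin\r\nroot@(none):/# "
md5_capture = ("md5sum /tmp/fw.bin\r\n"
               "53e272bed2041616068c6958fe28a197  /tmp/fw.bin\r\nroot@(none):/# ")
print(image_ok(wc_capture, md5_capture))
```

Anchoring both patterns to the start of a line keeps them from matching the echoed command itself, which is the usual way string matching on serial output goes wrong.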

If both checks pass, the script automatically launches sysupgrade:

ser.write(b"sysupgrade -n /tmp/fw.bin\n")

The -n flag means “don’t preserve config”—wipe everything, clean flash. The device reboots, 90 seconds pass, and either you see the OpenWrt banner on the serial console or you start questioning your life choices.

The Math

Let’s work through the transfer time because it’s satisfying in a depressing way:

Binary size: 7,340,032 bytes
Hex-encoded size: 7,340,032 x 2 = 14,680,064 bytes
Baud rate: 115,200 bits/sec
Raw byte rate: 115,200 / 10 = 11,520 bytes/sec (8 data bits + start + stop = 10 bits per byte on the wire)
Transfer time: 14,680,064 / 11,520 = 1,274 seconds = ~21.2 minutes
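The same arithmetic as a function, so you can plug in other sizes or baud rates. One assumption beyond the list above: it also counts the newline sent after each 512-byte chunk, which the hand calculation ignores. The function name transfer_seconds is made up.

```python
def transfer_seconds(size, baud=115_200, chunk=512):
    """Wire time for hex-over-UART at 8N1: 10 bits per character on the wire."""
    hex_chars = size * 2
    newlines = -(-size // chunk)  # one newline per chunk (ceiling division)
    return (hex_chars + newlines) * 10 / baud

for name, size in [("ar8035-fix", 5_592), ("sysupgrade image", 7_340_032)]:
    print(f"{name}: {transfer_seconds(size):.1f} s")
```

The newlines add 14,336 characters for the 7 MB image, which is a bit over one extra second on top of the 1,274. Not enough to change the story.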

In practice the transfer runs right at that pace, call it 20 minutes in round numbers. The 21.2-minute figure is a floor set by the wire, so anything the MR18 does can only add to it: awk processing isn’t instantaneous and adds a tiny amount of backpressure per chunk, but the serial buffer smooths most of that out. The progress output shows a consistent 5.5-5.8 KB/s throughout the transfer. No slowdown, no speedup, just a steady drip of hex characters at the maximum rate the wire can carry them.

Twenty minutes to transfer 7 MB. In 2026. Over a protocol that’s been fundamentally unchanged since the 1960s. There’s something almost poetic about that. Or maybe just stupid. Depends on whether the md5sum matches at the end.

Why Two Scripts

You might wonder why I didn’t just use uart_transfer.py for the ar8035-fix binary too. Two reasons. First, send_binary.py was written first, when I only needed to transfer the 5.5 KB fix. It’s simpler—no echo drainer, no pre-test, no progress reporting—because it doesn’t need any of that at 5.5 KB. Second, send_binary.py handles the post-transfer execution and rx_packets verification, which is specific to the PHY fix workflow. uart_transfer.py handles sysupgrade, which is a completely different post-transfer action. Different payloads, different post-transfer behavior, two scripts. I could have unified them but the complexity cost wasn’t worth it for code that runs exactly once per flash cycle.

Both scripts are held together with time.sleep() calls and string matching on serial output. They’re not elegant. They don’t need to be. They need to reliably transfer data over a 115200 baud serial link and verify the result, and they do. Every time. Even at 3am. Even after the twentieth power cycle. That’s all I ask of them.