RTL8814AU: chip-init + TX descriptor + USB-layer parity with aircrack-ng OOT driver #25

Merged: josephnef merged 3 commits into master from feat/8814-oot-parity on May 22, 2026

@josephnef
Collaborator

Summary

A batch of 12+ chip-init / TX-descriptor / USB-layer fixes for the 8814AU,
each verified against the working aircrack-ng OOT kernel module as an
oracle (loaded via the morrownr fork on host Arch 6.18-lts, transmits
real frames at 96%+ delivery to nearby APs).

Three observability vectors made each fix concrete and falsifiable:

  1. /proc/net/8814au/<iface>/read_reg — OOT post-init chip register dump
  2. tshark -i usbmon4 — OOT TX URB wire bytes during aireplay-ng injection
  3. Devourer matching-format diagnostic dumps added in this PR

Commits

  1. 8814AU: chip-init parity with aircrack-ng OOT driver (7709e7b)
    Six register-state mismatches realigned to match OOT readback
    byte-for-byte:

    • REG_CR: add ENSEC | CALTMR_EN (was 0x00FF, OOT 0x06FF)
    • REG_FWHW_TXQ_CTRL: port _InitRetryFunction_8814A (adds
      EN_AMPDU_RTY_NEW bit 7 of byte 0 + REG_ACKTO = 0x80); clear
      internal EN_BCN_FUNCTION bit 6 of byte 2; skip the BIT12
      "ack for xmit mgmt frames" OR on 8814. Was 0x03711F00;
      OOT 0x03310F80.
    • Skip REG_EARLY_MODE_CONTROL_8812+3=0x01 (Pretx_en for WEP/TKIP)
      and REG_TX_RPT_TIME=0x3DF0 on 8814 — both explicitly
      /* commented out */ in upstream's 8814 path.
    • BCNQ_PAGE_NUM_8814 0x08 → 0x0A. OOT boundary regs read back
      0x07F6 (= 2048 - 10), not the documented 2048 - 8.
    • Re-apply DROP_DATA_EN post-fwdl (was lost during chip reset
      in fwdl).
    • REG_FIFOPAGE_INFO_{1..5}_8814A: write both 16-bit halves (port
      0 + port 1).
    • Plus a post-init diagnostic dump (gated on CHIP_8814A) that
      surfaces every register relevant to TX in one log line each, for
      fast future diffs against the OOT-driver oracle.
  2. 8814AU: TX descriptor + test-frame parity with OOT driver (e9837b2)

    • FIRST_SEG=0 (was 1) — upstream sets only LAST_SEG=1 for single-
      fragment frames
    • MACID=0 (was 1) — upstream uses bmc_camid for broadcast
    • HWSEQ_EN=1 + descriptor SEQ cleared (was HWSEQ_EN=0 + manual
      SEQ) — chip auto-fills 802.11 SEQ under HWSEQ_EN=1
    • Remove RETRY_LIMIT_ENABLE=1 + DATA_RETRY_LIMIT=0 — that combo
      means "give up after 0 retries"
    • SPE_RPT=1 (was 0) — OOT enables Special TX Report on every TX
    • txdemo test-frame FC 08 01 (data + ToDS=1) → 40 00 (mgmt
      probe-req) — data+ToDS requires an AP context the chip doesn't
      have in monitor mode
  3. 8814AU: USB transfer-layer parity with OOT driver (9ac95d2)

    • LIBUSB_TRANSFER_ADD_ZERO_PACKET on every TX URB (upstream
      OOT's URB_ZERO_PACKET equivalent)
    • libusb_clear_halt(EP 0x02) before first send (defensive)
    • One-shot pre-1st-TX register dump + one-shot first-TX hex dump
    • DEVOURER_TX_EP=0xNN override hook (diagnostic infrastructure)
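
As an illustration of the DEVOURER_TX_EP=0xNN hook, here is a minimal sketch of how such an override might be parsed. The helper names, the fallback of 0x02, and the validation rule (OUT endpoint addresses 0x01..0x0F only) are assumptions made for this example, not devourer's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper, not devourer's actual code.
// `text` is the raw environment value (null when the variable is unset).
// Returns `fallback` unless `text` parses fully as an OUT endpoint address.
inline std::uint8_t parse_tx_ep_override(const char* text, std::uint8_t fallback) {
    assert((fallback & 0x80) == 0 && "fallback must itself be an OUT endpoint");
    if (text == nullptr || *text == '\0') return fallback;
    char* end = nullptr;
    const unsigned long value = std::strtoul(text, &end, 0);  // base 0: accepts 0xNN or decimal
    if (end == text || *end != '\0') return fallback;         // no digits, or trailing junk
    if (value == 0 || value > 0x0F) return fallback;          // OUT endpoints are 0x01..0x0F
    return static_cast<std::uint8_t>(value);
}

// Reads DEVOURER_TX_EP; 0x02 is assumed here as the default TX endpoint.
inline std::uint8_t tx_endpoint_from_env() {
    return parse_tx_ep_override(std::getenv("DEVOURER_TX_EP"), 0x02);
}
```

Rejecting malformed values instead of aborting keeps a typo in the environment from changing which endpoint a normal run uses.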

Known limitation: TX still NAKs

Even with every observable difference between our path and the OOT
driver eliminated:

  • 6977 bulk-OUT URBs submitted in a 25s window during devourer TX
  • Every URB completes with status=-2 (-ENOENT, i.e. unlinked/cancelled
    on the host side), data_len=0
  • The OOT driver issuing byte-equivalent URBs from kernel-side
    succeeds at 96%+ injection rate on the same physical hardware

The remaining gate is below libusb's observability — chip-internal
USB SuperSpeed controller state, a sequence-sensitive vendor write
hidden in OOT's init flow, or kernel-native URB scheduling semantics
that libusb_submit_transfer doesn't replicate.

This PR lands what's been verified. 8814AU TX validation continues
as a follow-up.

Test plan

  • Build clean on Linux (Arch 6.18-lts, gcc 15.2.1)
  • 8812 RX regression: 2000+ frames received in 15s on channel 36
  • 8814 RX regression: post-init dump confirms chip enters monitor
    mode correctly; init-state matches OOT-driver oracle byte-for-
    byte
  • 8814 TX end-to-end: deferred (chip NAKs; gate is post-init)

🤖 Generated with Claude Code

josephnef and others added 3 commits May 22, 2026 11:45

8814AU: chip-init parity with aircrack-ng OOT driver (7709e7b)

Compared devourer's post-init chip state against the aircrack-ng/8814au
kernel module (loaded via morrownr fork on Arch 6.18-lts) using its
/proc/net/8814au/<iface>/read_reg debugfs interface, then realigned six
init steps to match the OOT-driver readback byte-for-byte.

1. REG_CR (0x100) — gain ENSEC | CALTMR_EN on 8814.
   OOT post-init readback: 0x06FF. Ours was 0x00FF — missing BIT9
   (ENSEC, security engine enable) and BIT10 (CALTMR_EN, 32k cal
   timer). Both are in upstream _InitPowerOn_8814AU's OR-mask but
   our 8814 path skips InitPowerOn (fwdl handles it instead), so
   these never get set. Add them to the post-init force-write.

2. REG_FWHW_TXQ_CTRL (0x420) — port _InitRetryFunction_8814A.
   OOT post-init: 0x03310F80; ours was 0x03711F00. The missing bits
   come from upstream's _InitRetryFunction_8814A that we never
   ported: set EN_AMPDU_RTY_NEW (bit 7 of byte 0) and write
   REG_ACKTO = 0x80. Also clear BIT6 of byte 2 (an internal
   EN_BCN_FUNCTION-within-TXQ flag, not the BIT3 of REG_BCN_CTRL
   that uses the same symbolic name in hal_com_reg.h), and skip
   the BIT12 (`ack for xmit mgmt frames`) OR — OOT-driver readback
   shows BIT12=0 in byte 1.

3. Pretx_en + tx_rpt — 8812-only.
   Upstream rtl8814au's usb_halinit.c explicitly comments out both
   REG_EARLY_MODE_CONTROL_8812+3=0x01 (Pretx_en, for WEP/TKIP SEC)
   and REG_TX_RPT_TIME=0x3DF0 in the 8814 path. Gate them.

4. BCNQ_PAGE_NUM 0x08 → 0x0A.
   OOT post-init boundary registers (REG_TXPKTBUF_BCNQ_BDNY_8814A,
   MGQ_PGBNDY_8814A, FIFOPAGE_CTRL_2_8814A) read back 0x07F6
   (= 2038 = 2048 - 10), not the 0x07F8 we derived from upstream's
   documented BCNQ_PAGE_NUM_8814 = 0x08. Use 0x0A to match the
   silicon-observed value.

5. DROP_DATA_EN re-apply post-fwdl on 8814.
   Our _InitHardwareDropIncorrectBulkOut_8812A runs pre-fwdl and
   the chip-side reset during fwdl clobbers the bit back to 0. OOT
   driver applies it post-fwdl; do the same when chip is 8814.

6. REG_FIFOPAGE_INFO_{1..5}_8814A — write both 16-bit halves.
   OOT readback shows e.g. FIFOPAGE_INFO_1 = 0x00200020 (HPQ count
   replicated in low and high half = port 0 and port 1). Our path
   wrote only the low half. Write a duplicated 32-bit word so both
   halves are populated. The chip-side readback masks one half
   regardless of our write, but the dup form is what upstream does
   and the operation is idempotent on the configured half.
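
The register values quoted in items 1, 2, 4 and 6 can be checked with plain bit arithmetic. The sketch below only reproduces the numbers stated above; the function names are made up for this example, and the total of 2048 pages is taken from the "2048 - 10" expression in item 4.

```cpp
#include <cstdint>

constexpr std::uint32_t bit(unsigned n) { return std::uint32_t{1} << n; }

// Item 1: REG_CR gains ENSEC (BIT9) and CALTMR_EN (BIT10).
constexpr std::uint16_t fix_reg_cr(std::uint16_t cr) {
    return static_cast<std::uint16_t>(cr | bit(9) | bit(10));
}

// Item 2: REG_FWHW_TXQ_CTRL. Set bit 7 of byte 0, clear BIT12 (byte 1),
// clear bit 6 of byte 2 (absolute bit 22).
constexpr std::uint32_t fix_fwhw_txq_ctrl(std::uint32_t v) {
    v |= bit(7);
    v &= ~bit(12);
    v &= ~bit(22);
    return v;
}

// Item 4: page boundary = total TX pages minus the beacon-queue pages.
constexpr std::uint16_t bcnq_boundary(std::uint16_t total_pages, std::uint16_t bcnq_pages) {
    return static_cast<std::uint16_t>(total_pages - bcnq_pages);
}

// Item 6: replicate a 16-bit page count into both halves of the 32-bit word.
constexpr std::uint32_t dup_halves(std::uint16_t count) {
    return (std::uint32_t{count} << 16) | count;
}

static_assert(fix_reg_cr(0x00FF) == 0x06FF, "REG_CR matches OOT readback");
static_assert(fix_fwhw_txq_ctrl(0x03711F00) == 0x03310F80, "FWHW_TXQ_CTRL matches OOT readback");
static_assert(bcnq_boundary(2048, 0x0A) == 0x07F6, "observed boundary");
static_assert(bcnq_boundary(2048, 0x08) == 0x07F8, "previously derived boundary");
static_assert(dup_halves(0x0020) == 0x00200020, "FIFOPAGE_INFO_1 example");
```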

Add a post-init diagnostic dump (HalModule.cpp, end of rtl8812au_hal_init,
gated on CHIP_8814A) that vendor-reads REG_CR / FWHW_TXQ_CTRL /
TXDMA_OFFSET_CHK / FIFOPAGE_CTRL_2 / MGQ_PGBNDY / FIFOPAGE_INFO_1/5 /
MCUFWDL / TXDMA_STATUS / HWSEQ_CTRL / TCR / RCR so subsequent
investigations can diff our post-init state against the OOT-driver
oracle in one log line each. The dump is split into per-register log
lines because the Logger's homegrown format helper truncates lines with
too many placeholders.
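
A one-register-per-line dump could look roughly like the sketch below. It is not the code added to HalModule.cpp: `RegSpec`, `format_reg_line`, `dump_registers` and the `read32` callback are hypothetical names, and only the two register addresses given in this commit message (0x100 and 0x420) are used.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

struct RegSpec {
    const char*   name;
    std::uint16_t addr;
};

// One register per line, fixed-width hex, so two dumps can be diffed textually.
inline std::string format_reg_line(const RegSpec& reg, std::uint32_t value) {
    char buf[96];
    std::snprintf(buf, sizeof buf, "%s (0x%04X) = 0x%08X", reg.name,
                  static_cast<unsigned>(reg.addr), static_cast<unsigned>(value));
    return buf;
}

// `read32` stands in for whatever vendor-read primitive the HAL exposes.
inline std::vector<std::string> dump_registers(
        const std::vector<RegSpec>& regs,
        const std::function<std::uint32_t(std::uint16_t)>& read32) {
    assert(read32 && "a register reader is required");
    std::vector<std::string> lines;
    lines.reserve(regs.size());
    for (const RegSpec& reg : regs) {
        lines.push_back(format_reg_line(reg, read32(reg.addr)));
    }
    return lines;
}
```

Formatting each line with a fixed number of placeholders sidesteps the truncation problem described above, since no single format call grows with the register list.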

After all six register-state fixes, devourer's post-init readback
matches the OOT driver byte-for-byte for every register examined.
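
The byte-for-byte comparison against the oracle amounts to a keyed diff of two register snapshots. A minimal sketch, assuming both dumps have already been parsed into address-to-value maps (the parsing step and all names here are hypothetical):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct RegMismatch {
    std::uint16_t addr;
    std::uint32_t ours;
    std::uint32_t oracle;
    std::uint32_t diff_bits;  // XOR of the two values: set bits are the ones that differ
};

using RegSnapshot = std::map<std::uint16_t, std::uint32_t>;

// Reports every register present in both snapshots whose values differ.
inline std::vector<RegMismatch> diff_registers(const RegSnapshot& ours,
                                               const RegSnapshot& oracle) {
    std::vector<RegMismatch> out;
    for (const auto& [addr, expected] : oracle) {
        const auto it = ours.find(addr);
        if (it == ours.end()) continue;  // not sampled on our side
        if (it->second != expected) {
            out.push_back({addr, it->second, expected, it->second ^ expected});
        }
    }
    return out;
}
```

Fed the pre-fix values from items 1 and 2, this reports 0x0600 for REG_CR (BIT9 and BIT10) and 0x00401080 for REG_FWHW_TXQ_CTRL (bits 7, 12 and 22), the same bits the fixes touch.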

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

8814AU: TX descriptor + test-frame parity with OOT driver (e9837b2)

Captured the wire bytes of a known-working OOT-driver TX URB via
usbmon (host kernel module from aircrack-ng/morrownr fork, aireplay-ng
test injection at 96%+ delivery to real APs) and compared field-by-
field against devourer's first bulk-OUT. Five descriptor fields plus
the test-frame FC were diverging.

Descriptor fixes (src/RtlJaguarDevice.cpp::send_packet):

- FIRST_SEG=0 (was 1). Upstream rtl8814a_xmit.c sets only LAST_SEG=1
  for a single-fragment frame. byte 3 = 0x85 (OOT) vs 0x8D (ours).
- MACID=0 (was 1). Upstream uses MACID = bmc_camid = 0 for the
  default broadcast CAM entry; MACID=1 references a STA entry that
  doesn't exist in monitor mode. byte 4 = 0x00 (OOT) vs 0x01 (ours).
- HWSEQ_EN=1 + descriptor SEQ field unset (was HWSEQ_EN=0 + manual
  SEQ from frame). Chip auto-fills the 802.11 SEQ number under
  HWSEQ_EN=1. OOT sets bit 15 of word 8 (byte 33 = 0x80); we had
  it at byte 37 = 0x80 (SEQ field instead).
- Remove RETRY_LIMIT_ENABLE=1 + DATA_RETRY_LIMIT=0. That combo
  means "give up after 0 retries" — the chip drops the frame and
  never attempts TX. byte 18 = 0x00 (OOT) vs 0x02 (ours).
- SPE_RPT=1 (was 0). OOT enables Special TX Report on every TX;
  chip uses this to signal TX-complete handshakes. byte 10 = 0x08
  (OOT) vs 0x00 (ours).
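
The field-by-field comparison can be approximated by a byte-level diff of the two descriptors that reports each differing offset together with its XOR mask. This is a sketch with hypothetical names; it does not parse usbmon output, it only compares two buffers that have already been extracted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ByteDiff {
    std::size_t  offset;
    std::uint8_t ours;
    std::uint8_t oracle;
    std::uint8_t mask;  // bits that differ at this offset
};

// Compares the common prefix of two descriptor buffers.
inline std::vector<ByteDiff> diff_descriptor(const std::vector<std::uint8_t>& ours,
                                             const std::vector<std::uint8_t>& oracle) {
    std::vector<ByteDiff> out;
    const std::size_t n = ours.size() < oracle.size() ? ours.size() : oracle.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (ours[i] != oracle[i]) {
            out.push_back({i, ours[i], oracle[i],
                           static_cast<std::uint8_t>(ours[i] ^ oracle[i])});
        }
    }
    return out;
}
```

With the byte values listed above, byte 3 (0x8D vs 0x85) yields mask 0x08, which is the single FIRST_SEG bit, and byte 18 (0x02 vs 0x00) yields mask 0x02 for RETRY_LIMIT_ENABLE.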

Test frame (txdemo/main.cpp):

- FC 0x08 0x01 → 0x40 0x00. The previous bytes parsed as a DATA
  frame with ToDS=1 — which requires an AP context the chip
  doesn't have in monitor mode. Use a probe-request style mgmt
  frame instead.
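
To make the frame-control change concrete, the two FC byte pairs can be decoded with the standard 802.11 layout (protocol version in bits 0-1, type in bits 2-3, subtype in bits 4-7 of the first byte; ToDS and FromDS in bits 0 and 1 of the second). The struct and function names are illustrative only.

```cpp
#include <cstdint>

struct FrameControl {
    unsigned version;
    unsigned type;     // 0 = management, 1 = control, 2 = data
    unsigned subtype;  // for management frames, 4 = probe request
    bool     to_ds;
    bool     from_ds;
};

constexpr FrameControl decode_fc(std::uint8_t b0, std::uint8_t b1) {
    return FrameControl{b0 & 0x03u, (b0 >> 2) & 0x03u, (b0 >> 4) & 0x0Fu,
                        (b1 & 0x01u) != 0, (b1 & 0x02u) != 0};
}

// Old test frame: data frame with ToDS=1.
static_assert(decode_fc(0x08, 0x01).type == 2, "0x08 is a data frame");
static_assert(decode_fc(0x08, 0x01).to_ds, "0x01 sets ToDS");
// New test frame: management frame, subtype 4 (probe request), no DS bits.
static_assert(decode_fc(0x40, 0x00).type == 0, "0x40 is a management frame");
static_assert(decode_fc(0x40, 0x00).subtype == 4, "subtype 4 is probe request");
static_assert(!decode_fc(0x40, 0x00).to_ds, "ToDS clear");
```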

After these, devourer's TX URB bytes match the OOT-driver wire
trace byte-for-byte except for naturally frame-content-dependent
fields (PKT_SIZE, RATE_ID, checksum, SW_DEFINE counter).

TX still NAKs at the chip's USB controller — see the follow-up
USB-layer commit and the gap described there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

8814AU: USB transfer-layer parity with OOT driver (9ac95d2)

Three send_packet refinements derived from comparing the OOT driver's
URB submission to ours.

- LIBUSB_TRANSFER_ADD_ZERO_PACKET on every TX URB. Upstream
  rtl8814a/usb/rtl8814au_xmit.c sets URB_ZERO_PACKET on every TX
  URB (kernel-side equivalent). Without it the chip's SuperSpeed
  bulk OUT controller may wait indefinitely for transfer-end
  signaling on multiples of the EP max packet size.
- libusb_clear_halt(EP 0x02) before the first send. Fwdl can leave
  the EP in a halted state; clearing it defensively before TX is a
  no-op when the EP is fine.
- One-shot pre-1st-TX register dump + DEVOURER_TX_EP=0xNN override
  hook (diagnostic). The dump prints CR / TXPAUSE / TXDMA_OFFSET_CHK
  / FWHW_TXQ_CTRL / MCUFWDL right before the first bulk OUT so any
  state clobber between init-end and TX-start is visible. The env-var
  override lets EP-bisection diagnostics (0x02=HQ, 0x03=NQ, 0x04=LQ
  on 8814's 3-OUT-EP layout) run without a rebuild.
- One-shot first-TX hex dump of the bulk-OUT bytes for offline
  comparison against usbmon traces.
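
For context on the zero-packet flag: both URB_ZERO_PACKET and LIBUSB_TRANSFER_ADD_ZERO_PACKET only add a terminating zero-length packet when the transfer length is an exact multiple of the endpoint's wMaxPacketSize, because otherwise the final short packet already marks the end of the transfer. The predicate below is a sketch of that rule, with a made-up function name; it does not call libusb.

```cpp
#include <cstddef>

// True when a bulk OUT transfer of `length` bytes would end on a full-size
// packet, so a zero-length packet is needed to signal end-of-transfer.
constexpr bool zero_packet_would_be_sent(std::size_t length, std::size_t max_packet) {
    return max_packet != 0 && length != 0 && length % max_packet == 0;
}

// Bulk wMaxPacketSize is 512 at high speed and 1024 at SuperSpeed.
static_assert(zero_packet_would_be_sent(1024, 1024), "exact multiple needs a ZLP");
static_assert(zero_packet_would_be_sent(2048, 1024), "exact multiple needs a ZLP");
static_assert(!zero_packet_would_be_sent(1000, 1024), "short final packet ends the transfer");
```

Under this rule the flag changes nothing on the wire for transfers shorter than one max-size packet, so it is best read as a parity measure that matters only for frames whose total length lands on a packet boundary.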

Status note: even with all init-state, descriptor, and USB-layer
fixes from this PR aligned to the OOT-driver oracle, the chip
NAKs every devourer bulk OUT URB (usbmon: 6977 URBs submitted, all
complete with status=-2/ENOENT/cancelled, data_len=0). The OOT
driver issuing byte-equivalent URBs from the kernel side succeeds.
The remaining gate is somewhere we can't observe from libusb's
side — chip-internal USB SuperSpeed controller state, a sequence-
sensitive vendor write, or kernel-native URB scheduling semantics
that libusb async doesn't replicate. RX path on 8814 unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@josephnef josephnef merged commit 67589c0 into master May 22, 2026
5 checks passed
@josephnef josephnef deleted the feat/8814-oot-parity branch May 22, 2026 08:50
josephnef added a commit that referenced this pull request May 22, 2026
## Summary

Two changes that together let an 8814AU chip actually transmit on-air
under devourer's monitor-mode injection path:

### 1. TX descriptor byte-identical to kernel-driver

Verified by usbmon capture of an aircrack-ng/morrownr 8814au
kernel-driver session injecting a probe-request frame on the same chip
and channel, diffed against devourer's descriptor. Eight fields
differed:

| Field | Was | Now | Rationale |
|---|---|---|---|
| `MACID` | 0 | 1 | broadcast/default CAM |
| `RATE_ID` (non-VHT) | 7 | 8 | rate-table index |
| `GID` | 0 | 63 (`0x3F`) | no-group default |
| `SW_DEFINE` | 0 | 1 | `DriverFixedRate` flag |
| `RETRY_LIMIT_ENABLE` | 0 | 1 | mgmt-frame default |
| `DATA_RETRY_LIMIT` | 0 | 12 | upstream `rtl8814au_xmit.c:267` |
| `SPE_RPT` | 1 | 0 | kernel does not set |
| `DISABLE_FB` | 1 | 0 | kernel does not set |

Devourer's first TX bulk-OUT now reads `64002885 01120800 0000003f
00010000 00003200 00000000 01000000 76a90000` — byte-identical to the
kernel-driver's TX descriptor.
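
As a cross-check, the hex dump above can be decoded back into the fields from the table. The sketch below is not devourer code, and the bit positions are assumptions: they follow the usual Realtek little-endian descriptor layout as understood here, and are corroborated only by the table values and by PKT_SIZE + OFFSET adding up to the 140-byte URB length.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Turns "64002885 01120800 ..." into raw bytes, ignoring non-hex characters.
inline std::vector<std::uint8_t> parse_hex(const std::string& text) {
    std::vector<std::uint8_t> bytes;
    int high = -1;
    for (char c : text) {
        int nibble;
        if (c >= '0' && c <= '9') nibble = c - '0';
        else if (c >= 'a' && c <= 'f') nibble = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') nibble = c - 'A' + 10;
        else continue;
        if (high < 0) {
            high = nibble;
        } else {
            bytes.push_back(static_cast<std::uint8_t>((high << 4) | nibble));
            high = -1;
        }
    }
    return bytes;
}

struct TxDescFields {
    unsigned pkt_size = 0, offset = 0, macid = 0, rate_id = 0, gid = 0;
    unsigned spe_rpt = 0, retry_limit_enable = 0, data_retry_limit = 0, sw_define = 0;
};

// Assumed field positions (byte index, bit range); see the note above.
inline TxDescFields decode_tx_desc(const std::vector<std::uint8_t>& d) {
    assert(d.size() >= 32 && "need at least the first eight 32-bit words");
    TxDescFields f;
    f.pkt_size           = static_cast<unsigned>(d[0]) | (static_cast<unsigned>(d[1]) << 8);
    f.offset             = d[2];                    // descriptor length in bytes
    f.macid              = d[4] & 0x7Fu;
    f.rate_id            = d[6] & 0x1Fu;
    f.spe_rpt            = (d[10] >> 3) & 0x01u;
    f.gid                = d[11] & 0x3Fu;
    f.retry_limit_enable = (d[18] >> 1) & 0x01u;
    f.data_retry_limit   = (d[18] >> 2) & 0x3Fu;
    f.sw_define          = d[24];                   // low byte only
    return f;
}
```

Decoding the quoted dump this way gives MACID=1, RATE_ID=8, GID=63, SPE_RPT=0, RETRY_LIMIT_ENABLE=1, DATA_RETRY_LIMIT=12 and SW_DEFINE=1, which agrees with the "Now" column.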

### 2. Opt-in `DEVOURER_OOT_REPLAY=1`

Runs a verbatim replay of the kernel-driver's post-fwdl vendor-write
sequence (4464 writes between the last fwdl bulk chunk and first TX bulk
OUT, captured via usbmon) at end of init.

Even after PRs #25/#26/#27, devourer's HAL init leaves the chip in a
state that diverges from the kernel driver in many small ways, which
together wedge the chip's USB controller: bulk OUT EP 0x02 NAKs every
TX URB. With the replay applied, devourer's chip state matches the
kernel driver's byte-for-byte (verified via live pyusb register dump)
and TX URBs drain.
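
Conceptually, the replay is an ordered walk over the captured writes. The sketch below shows one way that could be structured; `VendorWrite`, `replay_enabled`, `replay_writes` and the write callback are hypothetical, the rule that only the value `1` enables the replay is an assumption, and the real trace format is not shown in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// One captured register write: target address, width in bytes, value.
struct VendorWrite {
    std::uint16_t addr;
    std::uint8_t  len;
    std::uint32_t value;
};

// Assumed gating rule: only the exact string "1" turns the replay on.
inline bool replay_enabled(const char* env_value) {
    return env_value != nullptr && std::strcmp(env_value, "1") == 0;
}

using WriteFn = std::function<bool(const VendorWrite&)>;

// Issues the writes strictly in capture order, since the sequence may be
// order-sensitive. Stops at the first failure; returns how many succeeded.
inline std::size_t replay_writes(const std::vector<VendorWrite>& trace, const WriteFn& write) {
    assert(write && "a write primitive is required");
    std::size_t done = 0;
    for (const VendorWrite& w : trace) {
        if (!write(w)) break;
        ++done;
    }
    return done;
}
```

A caller would pass `std::getenv("DEVOURER_OOT_REPLAY")` to `replay_enabled` and compare the return value of `replay_writes` against the trace size to detect a partial replay.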

**Authoritative usbmon capture, 5-second steady-state TX window:**

```
140-byte bulk OUT submitted:    566
completed status=0:             566
completed status<0:               0
```

(Repeatable across multiple runs.)
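
The counts above are a tally of URB completion events by status. A minimal sketch of that tally, assuming the capture has already been reduced to a list of completion records (the record type and function name are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct UrbCompletion {
    int         status;         // 0 on success, negative errno otherwise
    std::size_t actual_length;  // bytes actually transferred
};

// Counts completions per status code.
inline std::map<int, std::size_t> tally_by_status(const std::vector<UrbCompletion>& completions) {
    std::map<int, std::size_t> counts;
    for (const UrbCompletion& c : completions) {
        ++counts[c.status];
    }
    return counts;
}
```

A healthy window shows a single key, 0, whose count equals the number of submissions; any negative key indicates URBs that did not drain.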

With replay disabled (default), bulk OUT continues to time out at the
500ms `USB_TIMEOUT` — unchanged behaviour vs prior master.

### Why opt-in and not default-on

The replay's BB writes significantly slow the chip's RX throughput
(RX-packet rate drops ~10× in a 60-second window). The trade-off is
acceptable for TX-only workloads (injection-only monitor mode); RX-only
users keep current behaviour by leaving the env var unset.

### Long-term path

Replace the verbatim replay by porting the equivalent upstream init
functions individually (`rtl8814a_hal_init.c` + `usb_halinit.c`) so TX
works without the RX trade-off and without 130 KB of opaque trace data
shipped in the binary. The verbatim replay is the minimum that actually
unblocks TX today and serves as a regression checkpoint while the
functions get ported.

## How to use

```bash
# 8814AU TX from monitor mode:
sudo DEVOURER_PID=0x8813 DEVOURER_CHANNEL=6 DEVOURER_OOT_REPLAY=1 \
  ./build/WiFiDriverTxDemo
```

## Verification done

- [x] Build green on macOS + Arch Linux 6.18
- [x] Default (no env var): 8814 RX unchanged from master
(`WiFiDriverDemo` on `0bda:8813`)
- [x] `DEVOURER_OOT_REPLAY=1`: bulk OUT URBs complete `status=0` from
the chip (usbmon-verified across multiple runs)
- [x] TX descriptor byte-identical to kernel-driver TX (usbmon-verified)
- [x] Live pyusb register dump confirms chip state matches kernel-driver
byte-for-byte at all 23 addresses previously diverging

## Not verified

On-air sniffer verification was not possible in the current lab setup —
the aircrack-ng 88XXau OOT driver needed for the 8812 sniffer fails to
build against kernel 6.18. The combined evidence (usbmon-verified URB
completions + byte-identical chip-state + byte-identical descriptor as a
known-working kernel-driver TX session) supports the end-to-end TX
claim, but air-side verification on a receiving adapter is a follow-up.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>