feat(fleetnode): discovery scan + ReportDiscoveredDevices agent by ankitgoswami · Pull Request #256 · block/proto-fleet

ankitgoswami · 2026-05-18T22:55:58Z

Summary

Adds the on-prem fleet-node agent's control-stream consumer (RFC 0001 phase 2). After enrollment is CONFIRMED, the agent opens ControlStream, executes the pairing.v1.DiscoverRequest the server dispatches, streams results via ReportDiscoveredDevices correlated by command_id, and acks.

Pairs with #235 (server-side ControlStream handler, DiscoverOnFleetNode admin RPC, in-memory registry). Wire contracts merged in #257. Until #235 lands, the agent's control loop receives Unimplemented and reconnects with backoff — degrades quietly. PR order between #235 and this one doesn't matter.

Agent invocation

fleetnode run — no scan flags needed. Plugins resolved from <exe-dir>/plugins; nmap from <exe-dir>/nmap or PATH at scan time. State, enrollment, and heartbeat configuration via existing flags.

Control loop (`control.go`)

runControlLoop opens the stream and reconnects with exponential backoff (1s → 30s cap). Permanent errors (CodeNotFound, ErrBeginAuthRejected) propagate so the daemon exits rather than spinning.
runControlSession performs Hello → Accepted handshake, then receives ControlCommands. A watcher goroutine closes the stream on ctx.Done because http2.pipe.Read blocks on a sync.Cond that ctx alone cannot unblock.
handleCommand decodes the payload as pairing.v1.DiscoverRequest and dispatches:
- IPListModeRequest → bounded-concurrency probes via the plugin manager (32 in-flight, 3s per probe).
- NmapModeRequest → runNmapDiscovery shells out via Ullaakut/nmap/v3, then runs plugin probes against every open host:port (16 in-flight). 10-minute overall scan budget.
- MDNSModeRequest → rejected via ack error_message.
streamReports chunks at the server's max_items = 1024 limit and sets command_id so the operator stream wakes up.
Runs alongside the heartbeat loop. sync.Mutex guards SessionToken; refreshAndSave handshakes against a shallow State copy so the 2-RPC refresh does not stall token reads.

Plugins

Loaded from <exe-dir>/plugins (fixed; no CLI flag). Missing dir → heartbeat-only mode. Present-but-unsafe → hard error: the plugin manager execs every file there as the agent uid, so non-owner write capability is RCE-equivalent. Unix checkPluginsDirPerms refuses non-root/non-self owners or any group/world write bits. Windows production deployments must install the binary under an Administrator-only directory (e.g., %ProgramFiles%\fleetnode\) so plugins\ inherits a safe ACL.

Nmap

<exe-dir>/nmap if present, otherwise nmap from PATH at scan time. Server-supplied targets pass through validateNmapTarget, which accepts only IP / CIDR / A.B.C.D-N range / hostname. Leading dashes, whitespace, and shell metacharacters are rejected — a compromised server cannot pivot via flags like -iL/etc/passwd or -oN /tmp/loot.

Transport security

newGatewayHTTPClient is scheme-aware: http:// uses h2c (loopback streaming), https:// uses a TLS-validating http2.Transport. A shared AllowHTTP + DialTLSContext shim would silently downgrade https:// to plaintext, defeating ValidateServerURL's non-loopback HTTPS requirement.

Identifier synthesis

Production SDK drivers leave DeviceIdentifier empty. The agent synthesizes:

mac:<addr> when MAC is exposed
serial:<sn> when only serial is exposed
auto:<uuid> otherwise (server reconciles by (fleet_node, ip, port))

Docs

server/cmd/fleetnode/README.md covers overview, subcommands, state/lock semantics, plugins, nmap, control-stream flow, build, run, enrollment, security model, development, troubleshooting.

Tests

control_test.go: happy path (probe → report → ack with correlated command_id), MDNS rejection ack, reconnect after stream EOF.
nmap_test.go: TestValidateNmapTarget (21 cases covering IP/CIDR/range/hostname acceptance and -iL, whitespace, null-byte, shell-metachar, leading-dash rejection); resolver tests.
plugins_dir_test.go: binary-adjacent default discovery, missing-default-returns-empty, present-but-unsafe hard error.
client_test.go: TLS round-trip with pinned cert succeeds; untrusted cert fails with x509; non-http(s) scheme rejected; redirects refused.
run_test.go: stubGatewayClient satisfies the extended gatewayClient interface (ReportDiscoveredDevices, ControlStream).

Test plan

go build ./server/...
go vet ./server/...
golangci-lint run -c .golangci.yaml ./cmd/fleetnode/... ./internal/fleetnodebootstrap/...
go test ./server/cmd/fleetnode/... ./server/internal/fleetnodebootstrap/... -race -count=1
E2E (post-feat(fleetnode): discovery + pairing E2E (server) #235): fleetnode enroll, confirm, then fleetnode run. Verify control stream opened log. With the fake antminer on :4028, hit DiscoverOnFleetNode from the operator side with an ipList request and confirm streaming DiscoverResponse containing the device + a discovered_device row with discovered_by_fleet_node_id set. Re-run with an nmap request and confirm the same.

🤖 Generated with Claude Code

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e357c802ad

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-18T22:58:21Z

    rpc UploadTelemetry(stream UploadTelemetryRequest) returns (UploadTelemetryResponse);
    rpc UploadEvents(stream UploadEventsRequest) returns (UploadEventsResponse);
    rpc UploadHeartbeat(UploadHeartbeatRequest) returns (UploadHeartbeatResponse);
+    rpc ReportDiscoveredDevices(ReportDiscoveredDevicesRequest) returns (ReportDiscoveredDevicesResponse);


Implement the discovered-device RPC before enabling scans

With this RPC added to the gateway contract, the fleetnode daemon now calls ReportDiscoveredDevices, but the fleetd handler still only embeds UnimplementedFleetNodeGatewayServiceHandler and a repo-wide search found no concrete ReportDiscoveredDevices method outside generated/test code. In any environment running discovery (--scan-cidr with ports/plugins), every scan report will receive CodeUnimplemented from the server and sendDiscoveryChunk logs then drops the batch, so discovered devices are never persisted or pairable.

Useful? React with 👍 / 👎.

implemented in #235

Copilot

Pull request overview

Implements the on-prem fleet-node agent side of RFC 0001 phase 2: a periodic discovery scan that probes user-supplied CIDRs with plugin-loaded discoverers and batches results into the new ReportDiscoveredDevices gateway RPC, running in parallel with the existing heartbeat loop. Adds the matching protos (also duplicated in #235) and exports ipscanner.GenerateIPsFromCIDR so the agent reuses the upstream CIDR enumerator.

Changes:

New fleetnode run flags (--scan-cidr, --scan-port, --discovery-interval) and a discovery goroutine with bounded-concurrency probes, 60s tick timeout, 1024-per-batch chunking, and auto:/mac:/serial: identifier synthesis.
Heartbeat and discovery goroutines share RunCmd.stateMu; refreshAndSave handshakes against a shallow State copy so token reads don't block on the 2-RPC refresh.
ipscanner.generateIPsFromCIDR is exported as GenerateIPsFromCIDR and consumed by the agent's enumerateCIDR, which layers a 4096-host cap on top of the existing 16-host-bit safety cap.

Reviewed changes

Copilot reviewed 10 out of 17 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
proto/fleetnodegateway/v1/fleetnodegateway.proto	Adds `ReportDiscoveredDevices` RPC and `DiscoveredDeviceReport` message with validation.
proto/fleetnodeadmin/v1/fleetnodeadmin.proto	Adds pairing/unpairing/list admin RPCs (duplicated in #235).
server/cmd/fleetnode/run.go	Adds discovery flags, plugin bootstrap, parallel heartbeat/discovery goroutines, mutex-guarded session refresh.
server/cmd/fleetnode/discovery.go	New file: probe orchestration, chunked report sending, identifier synthesis, plugin discoverer wiring.
server/cmd/fleetnode/discovery_test.go	New table-driven and integration tests covering CIDR enumeration, batching, error paths, and parallel heartbeat+discovery.
server/cmd/fleetnode/run_test.go	Extends `stubGatewayClient` with discovery call tracking.
server/cmd/fleetnode/fake_gateway_test.go	Adds `ReportDiscoveredDevices` handler to the fake gateway.
server/internal/domain/ipscanner/network_utils.go	Exports `GenerateIPsFromCIDR` and clarifies its doc comment.
server/internal/domain/ipscanner/scanner.go	Updates internal caller to the exported name.
server/internal/domain/ipscanner/network_utils_test.go	Renames test references to the exported function.
server/generated/grpc/fleetnodegateway/v1/*.go	Regenerated for the new RPC and messages.
server/generated/grpc/fleetnodeadmin/v1/*.go	Regenerated for the new pairing RPCs.
client/src/protoFleet/api/generated/*/.ts	Regenerated TS bindings mirroring the proto additions.
server/sdk/v1/python/proto_fleet_sdk/generated/pb/driver_pb2_grpc.py	Regenerated GRPC version constant update.

github-actions · 2026-05-18T22:58:59Z

🔐 Codex Security Review

Note: This is an automated security-focused code review generated by Codex.
It should be used as a supplementary check alongside human review.
False positives are possible - use your judgment.

Scope summary

Reviewed pull request diff only (19a03050f8be5f51375a7b83c1e6f93d9e820873...f7a9656b3efe8f40bf0267355a6fdf6e1f44a7d8, exact PR three-dot diff)

Model: gpt-5.5

💡 Click "edited" above to see previous reviews for this PR.

Review Summary

Overall Risk: HIGH

Findings

[HIGH] Fleetnode Executes Discovery Against Unrestricted Targets

Category: Network Discovery
Location: server/cmd/fleetnode/control.go:168
Description: The new control path accepts server-supplied DiscoverRequest targets, resolves IP-list hostnames, and probes them without enforcing an agent-side network policy. This allows loopback, IPv4 link-local metadata addresses such as 169.254.169.254, public IPs, and arbitrary private networks. The Nmap path similarly accepts any syntactically valid IP/CIDR/hostname, including very large CIDRs, before passing it to Nmap.
Impact: A compromised fleet server, server-side authorization bug, or abused discovery workflow can turn enrolled fleet nodes into SSRF/lateral-scanning pivots inside operator networks, or trigger expensive scans from the agent host.
Recommendation: Enforce target policy on the agent, not only server-side. Resolve hostnames first, then reject loopback, link-local/metadata, multicast, and public ranges by default; cap CIDR host counts; and allow only locally configured/approved subnets unless the operator explicitly opts in.

[MEDIUM] Windows Plugin Loading Skips ACL and File Validation

Category: Plugin | Infrastructure
Location: server/cmd/fleetnode/plugins_dir_windows.go:25
Description: The Unix path rejects unsafe plugin directories/files because the plugin manager executes binaries from <exe-dir>/plugins, but the Windows implementation only checks that the directory exists and makes per-file validation a no-op.
Impact: If a Windows install places plugins\ somewhere writable by non-admin users, an attacker can plant or replace a plugin and get code execution as the fleetnode service account.
Recommendation: Fail closed on Windows unless ACLs can be verified. Use golang.org/x/sys/windows to reject writable ACEs for unprivileged principals, reject reparse points/symlinks, and require an Administrator/SYSTEM/service-owned install path.

Notes

No SQL, frontend rendering, Rust ASIC plugin, pool/wallet, or protobuf wire-format changes were present in the reviewed diff. The changed TLS/h2c client split and redirect rejection look security-positive.

_{Generated by Codex Security Review |

Triggered by: @ankitgoswami |

Review workflow run}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9216c9d33f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-18T23:12:50Z

    rpc UploadTelemetry(stream UploadTelemetryRequest) returns (UploadTelemetryResponse);
    rpc UploadEvents(stream UploadEventsRequest) returns (UploadEventsResponse);
    rpc UploadHeartbeat(UploadHeartbeatRequest) returns (UploadHeartbeatResponse);
+    rpc ReportDiscoveredDevices(ReportDiscoveredDevicesRequest) returns (ReportDiscoveredDevicesResponse);


Add discovery reports to fleet-node auth

This new RPC is called by the fleetnode agent with its session token via NewAuthenticatedGatewayClient, but server/internal/handlers/interceptors/config.go only lists telemetry/events/heartbeat/control in FleetNodeAuthenticatedProcedures. For ReportDiscoveredDevices, FleetNodeAuthInterceptor therefore never resolves the fleet-node session and the normal auth interceptor treats the bearer value as a user API key, so discovery reports are rejected before they can reach the handler; sendDiscoveryChunk then drops Unauthenticated batches and the feature never persists scanned devices.

Useful? React with 👍 / 👎.

Replace the autonomous CIDR-scanning daemon with a control-stream consumer. The agent opens a ControlStream after enrollment is CONFIRMED, sends a Hello, waits for Accepted, then runs each DiscoverRequest the server dispatches as a ControlCommand and streams results back via ReportDiscoveredDevices with the matching command_id. A ControlAck closes each command. - New control.go: runControlLoop with exponential backoff reconnect, runControlSession handshake + receive pump, handleCommand decode + probe + report + ack, runProbes with bounded concurrency, streamReports chunked at server's max, plus the discoverer interface + pluginDiscoverer implementation (moved from the old discovery.go). - Only IPList mode is accepted on the agent. The server expands CIDR/IPRange into IPList before dispatching (see PR B). MDNS and Nmap are rejected with a descriptive ack error. - run.go: drop --scan-cidr/--scan-port/--discovery-interval and the autonomous discovery loop. Add --plugins-dir-driven control loop alongside the heartbeat loop in the same goroutine pattern. Permanent errors (NotFound, ErrBeginAuthRejected) propagate so the daemon exits rather than reconnecting forever. Tests cover happy path (probe + report + ack with correlated command_id), MDNS rejection, and reconnect after stream EOF. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add zero-config defaults so \`fleetnode run\` works out of the box for the shipped layout, with safe fallback behavior when the layout isn't there. --plugins-dir - Default to <dir-of-binary>/plugins. Missing dir means the control loop stays off (heartbeat only) instead of failing — operators can use the agent without bundling plugins. - A present-but-unsafe dir is a hard error rather than a silent downgrade: the plugin manager execs anything in the dir as the agent's user, so any non-owner write capability there is RCE-equivalent. - Perm check on Unix: owner must be root or the running uid; no group- or world-writable bits. Windows relies on the MSI installer placing the dir under an Administrator-only path and only checks existence here. --nmap-path - Adds nmap support to the agent. NmapModeRequest now executes via the Ullaakut nmap wrapper, then the plugin probe runs against each open host:port (mirrors combined-mode pairing.service). - Default path resolution: explicit flag wins; otherwise prefer <dir-of-binary>/nmap (for the bundled-binary install) if executable; otherwise return "nmap" and let exec.LookPath resolve from PATH at scan time. - Flag must be absolute and point at an executable file. handleCommand now switches on the request mode: IPList runs the existing plugin probe sweep, Nmap runs the new nmap path, and MDNS is rejected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drop the Nmap rejection in DiscoverOnFleetNode. With the agent shipping a bundled nmap binary (#256), NmapModeRequest now passes through unchanged so the agent can execute it. MDNS still rejected; the agent doesn't run an mDNS listener. Server-side validation: require nmap.target to be non-empty so an all-zero NmapModeRequest is caught before round-tripping through the control stream. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Apply ce-simplify findings: - Extract fanOutProbes helper so the bounded-concurrency probe loop is shared between IPList mode (runProbes inline) and the nmap follow-up loop (runNmapDiscovery). The two had identical sem/wg/mu structure with different concurrency caps; now both use the same helper with their own constants. - Introduce endpoint{ip, port} struct so IPList ports-x-ips and nmap's open-host list both feed fanOutProbes through one type. - Drop fleetNodeID parameter from streamReports / handleCommand. The ID was used only in log lines; runControlLoop now enriches the logger with fleet_node_id once per session and downstream callers inherit it. - Use nmap.Open constant instead of "open" string literal. - Cache executableDir() once in run() instead of calling it twice (once for resolvePluginsDir, once for resolveNmapPath). - Reset backoff after a stable control session. Previously a single successful reconnect followed by an immediate disconnect kept growing backoff toward the 30s cap; now a session that lives past stableSessionThreshold resets backoff to 1s on the next failure. - Strip narrative doc comments (control loop, nmap, plugins-dir) that restated the implementation. Keep WHY comments where the rationale isn't obvious (plugin-manager exec security, nmap PATH-fallback contract, plugins-dir RCE). Build, vet, lint, and tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`just build-fleetnode` now wires up the layout the agent expects out of the box: server/plugins/ is already adjacent (populated by `just build-plugins`), and this recipe additionally symlinks server/nmap from whatever is on PATH so the agent's NmapMode probe path works without --nmap-path. Both have soft warnings rather than hard failures when the underlying tools aren't present (plugins not yet built, nmap not installed), so developers without the full stack can still build the binary. .gitignore picks up server/nmap (server/fleetnode and server/plugins/* were already ignored). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous build-fleetnode recipe dropped the binary at server/fleetnode and relied on server/plugins/ being adjacent. That breaks the common dev workflow: `just dev` cross-compiles plugins to linux/arm64 inside server/plugins/ for the Docker server container, and the native-host agent can't exec ELF binaries (`exec format error` on every load). Move the agent's full local layout into server/.fleetnode/: - ./.fleetnode/fleetnode (native binary) - ./.fleetnode/plugins/ (native Go plugins via _build-go-plugins-native) - ./.fleetnode/nmap (symlink to system nmap) `server/plugins/` is left untouched for `just dev`'s container, and the agent's binary-adjacent default still Just Works (`./.fleetnode/fleetnode run --state-dir=...`) without any flags. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The ControlStream RPC is bidi; HTTP/1.1 can't carry it, so the agent's gateway client now uses an h2c-prior-knowledge HTTP/2 transport that matches the fleetd server's h2c.NewHandler. Without this the agent's control loop hits "HTTP status 505 HTTP Version Not Supported" on every reconnect attempt while unary heartbeats keep working (HTTP/1.1 was fine for unary). Side effects: - http.Client.Timeout is removed (it would cap long-lived streams). Per-call deadlines added: heartbeatTimeout in sendHeartbeat, handshakeStepTimeout in RunHandshake's two RPC calls. Existing per-call timeouts (discoveryReportTimeout, ControlStream ctx) already cover their paths. - Test servers wrapped in h2c.NewHandler so they negotiate HTTP/2 prior knowledge like prod does. - TestGatewayHTTPClient_HasTimeout removed -- the constant it asserted against no longer exists. Production HTTPS support is deferred until there's a target prod deployment; the agent currently only speaks h2c plaintext. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two complementary defenses so the agent doesn't leave go-plugin children running when it dies. 1. Catch SIGHUP alongside SIGINT/SIGTERM. Closing the terminal sends SIGHUP; without it the parent dies uncaught and 'defer cleanup()' never fires, so the plugin manager's Shutdown is skipped and the children are reparented to init. 2. Add a startup reaper. On Unix, before constructing the new plugin manager, scan `ps -eo pid,ppid,command` for processes whose parent is init (ppid == 1) and whose binary lives in the resolved plugins directory; SIGKILL them. Covers the uncatchable cases (SIGKILL, panic, OOM) that the SIGHUP fix can't. Windows is a documented no-op: go-plugin children there inherit a job object that terminates them with the parent. Both run early enough to keep the failure mode the user just hit (stranded proto-plugin blocking the next agent's plugin handshake) from happening again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The ppid==1 filter was too narrow. Real-world cases where it missed: - A previous agent's children whose zombie parent hadn't been collected yet, so the kernel had not re-parented them to init at the moment the new agent's reaper ran. - Children adopted by an arbitrary surviving process tree (e.g. a debug wrapper or shell job) instead of init. The reaper runs before newPluginDiscoverer, so by definition the current agent has not yet spawned anything in pluginsDir. Anything we see whose executable lives there is leftover from a prior run, regardless of ppid. Drop the filter and exclude only the agent's own PID for safety. Concurrent agents on the same pluginsDir are already impossible because of state.lock contention, so widening the reaper here cannot affect a legitimate sibling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three changes to prevent the silent-hang footgun we just hit. 1. Lock acquisition is non-blocking (LOCK_EX|LOCK_NB). On contention, read the owner PID recorded inside the lockfile and either tell the operator who to kill or, if the recorded PID is dead, hint that a subprocess inherited the FD. Previously the lock would block forever with no log output -- the new behavior is a clear error in <100ms. 2. Log "acquiring state lock" at the call site so the agent always has a visible checkpoint between plugin startup and lock acquisition. The user-facing symptom of the earlier hang was zero log output; now there's always a last-line indicating exactly how far startup got. 3. Move the plugin reaper + newPluginDiscoverer INSIDE the state lock. The reaper is broad enough to kill any plugin process in our pluginsDir; if a sibling agent already owns the lock, running the reaper would murder its plugin children. Reaping only after we own the lock guarantees we're the authoritative agent and any leftover plugins are real strays. Verified with a smoke test: two `fleetnode run` invocations against the same state-dir -- second errors in ~60ms with the new message and the first agent's daemon + plugins survive untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The bidi http2 response body parks in sync.Cond.Wait, which is not ctx-aware. So stream.Receive() in the control loop would block past the parent ctx's lifetime even after SIGINT triggered cancellation -- Ctrl+C left the daemon running indefinitely. Wrap the session with a tiny watcher goroutine: when ctx fires, force stream.CloseRequest() + CloseResponse(). That broadcasts the cond and Receive returns with "canceled: http2: response body closed". Session cleans up, control loop exits on its outer ctx check, runLocked returns, plugin manager shuts down. Also drop the plugin-manager grace period from 5s to 2s. Shutdown waits the full timer regardless of whether the plugin already exited, so 5s per plugin was 10s of dead time on every clean exit. 2s keeps the safety net for genuinely stuck plugins while bringing clean shutdown to ~4s total. Smoke-tested: SIGINT now returns in ~4s vs hanging indefinitely. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Some IDE terminal panels and shell setups silently filter or buffer stderr (FD 2), making the agent appear hung when it was logging the whole time. Slog now writes to stdout, matching the convention used by other CLI daemons (caddy, prometheus, vault). Redirecting with `2>&1` and piping through tee still work; the only behavior change is that the structured-log stream is now on FD 1 instead of FD 2. go-plugin's hclog noise (debug `plugin address` lines, plugin manager Shutdown messages) stays on stderr -- those are secondary and we want the agent's own logs to win the visible-by-default slot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Extract testutil.NewH2CServer / testutil.NewH2CClient so the five inline httptest.NewUnstartedServer(h2c.NewHandler(...)) + h2c transport copies across cmd/fleetnode and fleetnodebootstrap test files share one canonical form. - Collapse the symmetric handshake.go WithTimeout pair into a withHandshakeTimeout helper; the previous inline pattern was easy to forget on a future third step. - Swap the defer order in runControlSession so close(done) runs before the stream-close defer. The watcher goroutine now retires via its quiet path on every normal return, avoiding a redundant close race when ctx happens to cancel at the same moment. - Replace the no-op `if err := writeLockOwnerPID(f); err != nil { _ = err }` block with `_ = writeLockOwnerPID(f)` plus a single one-liner explaining the best-effort intent. - Trim narrative comments on newGatewayHTTPClient, runControlSession, reapOrphanedPlugins, and the stdout-logging Run shim. Keep only the non-obvious WHY. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pass through the recent code aggressively removing comments that restated implementation, narrated change history, or padded the WHY into multi-sentence prose. Project preference is default to no comments; non-obvious rationale stays but in one or two lines. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vnish (and the rest of the asic-rs firmware family) live in the Rust asicrs plugin, not the Go antminer plugin. Until now build-fleetnode only staged the Go plugins, so the agent couldn't discover any asic-rs-only firmware on macOS dev. - Parameterize _asicrs-build with outdir (default kept at server/plugins so build-plugins/test-contract are unaffected). - Build natively via local cargo on Darwin instead of docker buildx; the docker path produces a Linux ELF that macOS can't exec. Linux hosts (CI) keep the docker path so they don't need Rust installed. - Add the new dep to build-fleetnode so server/.fleetnode/plugins/ contains asicrs-plugin + asicrs-config.yaml alongside the Go plugins. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…README - HTTPS gateway client is now scheme-aware: TLS-validating http2.Transport for https://, h2c for http://. Prior shared DialTLSContext shim silently bypassed TLS for https://, defeating ValidateServerURL. - validateNmapTarget restricts server-supplied targets to IP/CIDR/range/ hostname; rejects leading dashes and shell metacharacters so a compromised server cannot pivot via nmap flags like -iL/etc/passwd or -oN. - Drop --plugins-dir and --nmap-path CLI flags. Installer always uses <exe-dir>/plugins and <exe-dir>/nmap; operator overrides only widen the attack surface. nmap falls back to PATH for dev machines. - Add server/cmd/fleetnode/README.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Per-file plugin validation: lstat each entry under <exe-dir>/plugins, reject symlinks, require regular file owned by root or current uid with no group/world write. Container-only check missed writable individual binaries and symlinks to elsewhere. - Bound discovery commands: cap IPs at 1024, ports at 10 (matches pairing.MaxPortsPerIP), wall-clock at 10m (matches pairing default). Reject oversized requests with a clear error. - IPv6 nmap: detect IPv6 literal / hostname-only-IPv6 and pass -6 via nmap.WithIPv6Scanning(); reject IPv6 CIDR consistently with pairing service. Extract buildNmapOptions for testability. - Per-RPC timeout on Register: wrap with withHandshakeTimeout so blackholed servers can't hang enroll while it holds the state lock. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The pairing service rejects scoped IPv6, rejects link-local IPv6, canonicalizes IPv6 spelling, and resolves hostnames to IP literals before probing. The fleetnode agent previously passed server-supplied ip_addresses straight through, so hostname inputs probed successfully but then failed ReportDiscoveredDevices buf.validate (ip: true), and scoped or link-local IPv6 inputs reached the dialer with no chance of succeeding. - New helper netutil.NormalizeIPListEntry with an IPListResolver interface so DNS can be stubbed in tests. - Sentinel errors (ErrEmptyTarget, ErrScopedIPv6, ErrLinkLocalIPv6, ErrHostnameUnresolved) so callers can branch on cause when logging skips. - pairing/service.go::DiscoverWithIPList: ~45 lines of inline normalization replaced with a single call to the helper. - cmd/fleetnode/control.go: normalize each ip_address before fan-out, skip failures and log, ack failure if no entry normalizes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous cap on len(ports) didn't prevent a single entry from expanding to thousands of probes: nmap honors range ("1-65535") and comma ("80,443,8080,...") syntax inside a single string. Plugin-default ports were also unchecked. - resolveAndValidatePorts requires every entry to be a decimal TCP port in 1..65535. Range syntax, comma syntax, protocol prefixes ("T:80"), whitespace, and service names are rejected. - Dedupes the result and re-caps at maxPortsPerIP. - Applied uniformly across IpList and Nmap dispatch, and to the plugin-supplied default fallback. - runNmapDiscovery / buildNmapOptions now take the validated port list so they never see the raw request. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 18, 2026 22:55

ankitgoswami requested a review from a team as a code owner May 18, 2026 22:55

github-actions Bot assigned ankitgoswami May 18, 2026

github-actions Bot added javascript Pull requests that update javascript code client server shared labels May 18, 2026

ankitgoswami mentioned this pull request May 18, 2026

feat(fleetnode): discovery + pairing E2E (server) #235

Draft

9 tasks

Copilot started reviewing on behalf of ankitgoswami May 18, 2026 22:56 View session

chatgpt-codex-connector Bot reviewed May 18, 2026

View reviewed changes

Copilot AI reviewed May 18, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 18, 2026

View reviewed changes

ankitgoswami mentioned this pull request May 18, 2026

proto: discovery + pairing wire contracts #257

Merged

5 tasks

ankitgoswami force-pushed the ankitg/fleetnode-agent-discovery branch from 9216c9d to 977b22d Compare May 19, 2026 21:15

github-actions Bot removed javascript Pull requests that update javascript code client shared labels May 19, 2026

ankitgoswami marked this pull request as draft May 19, 2026 21:21

ankitgoswami and others added 2 commits May 19, 2026 15:31

github-actions Bot added the automation label May 19, 2026

ankitgoswami and others added 5 commits May 19, 2026 16:14

ankitgoswami and others added 6 commits May 20, 2026 11:41

github-actions Bot added the documentation Improvements or additions to documentation label May 20, 2026

ankitgoswami and others added 3 commits May 20, 2026 15:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(fleetnode): discovery scan + ReportDiscoveredDevices agent#256

feat(fleetnode): discovery scan + ReportDiscoveredDevices agent#256
ankitgoswami wants to merge 18 commits into
mainfrom
ankitg/fleetnode-agent-discovery

ankitgoswami commented May 18, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 18, 2026

Uh oh!

ankitgoswami May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

github-actions Bot commented May 18, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ankitgoswami commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Agent invocation

Control loop (control.go)

Plugins

Nmap

Transport security

Identifier synthesis

Docs

Tests

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

ankitgoswami May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

github-actions Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔐 Codex Security Review

Review Summary

Findings

[HIGH] Fleetnode Executes Discovery Against Unrestricted Targets

[MEDIUM] Windows Plugin Loading Skips ACL and File Validation

Notes

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ankitgoswami commented May 18, 2026 •

edited

Loading

Control loop (`control.go`)

github-actions Bot commented May 18, 2026 •

edited

Loading