Unify BitWorldWorker and AmongThemNativeWorker step API by sasmith · Pull Request #48 · Metta-AI/bitworld

sasmith · 2026-04-28T17:35:51Z

Summary

Both workers now expose the same step API:

step(action_masks: np.ndarray) -> tuple[frames, rewards]

where action_masks has shape (agent_count,) and dtype uint8, frames has shape (agent_count, ...), and rewards has shape (agent_count,) and dtype float32.
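The contract above can be written down as a small shape and dtype check. This is only a sketch: StepWorker, check_step_contract, and DummyWorker are hypothetical names that do not appear in the PR, and the 16-pixel frame size is a placeholder.

```python
from typing import Protocol, Tuple

import numpy as np


class StepWorker(Protocol):
    """Hypothetical name for the interface both workers are described as sharing."""

    agent_count: int

    def step(self, action_masks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        ...


def check_step_contract(worker: StepWorker, action_masks: np.ndarray):
    """Call worker.step and assert the shapes and dtypes stated in the summary."""
    assert action_masks.shape == (worker.agent_count,)
    assert action_masks.dtype == np.uint8
    frames, rewards = worker.step(action_masks)
    assert frames.shape[0] == worker.agent_count
    assert rewards.shape == (worker.agent_count,)
    assert rewards.dtype == np.float32
    return frames, rewards


class DummyWorker:
    """Stand-in worker used only to exercise the check; not part of the PR."""

    def __init__(self, agent_count: int, frame_pixels: int = 16):
        self.agent_count = agent_count
        self.frame_pixels = frame_pixels  # placeholder frame size

    def step(self, action_masks: np.ndarray):
        frames = np.zeros((self.agent_count, self.frame_pixels), dtype=np.uint8)
        rewards = np.zeros(self.agent_count, dtype=np.float32)
        return frames, rewards


frames, rewards = check_step_contract(DummyWorker(4), np.zeros(4, dtype=np.uint8))
```

A check like this could run against either worker class, since it depends only on agent_count and step.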

Why

BitWorldWorker.step previously took a scalar int and returned (frame, reward_delta), so BitWorldVecEnv._step_env dispatched on worker.agent_count == 1 to pick between the scalar API (BitWorldWorker) and the array API (AmongThemNativeWorker):

if worker.agent_count == 1:
    frame, reward = worker.step(int(action_masks[0]))
    ...
else:
    frames, rewards = worker.step(action_masks)
    ...

That condition is a leaky proxy: it conflates "this is the single-agent worker class" with "there is one agent." It breaks when Among Them is configured with players=1, since the worker is still AmongThemNativeWorker and still expects array-shaped masks. Taking the scalar branch in that configuration trips the "expected N Among Them action masks" guard inside the native worker, which raises a ValueError.
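The failure mode can be reproduced with a toy stand-in. In this sketch FakeNativeWorker and old_step_env are hypothetical; the guard is modeled on the one described above, and its exact condition, message, and frame size are assumptions rather than the real implementation.

```python
import numpy as np


class FakeNativeWorker:
    """Stand-in for AmongThemNativeWorker: always expects array-shaped masks."""

    def __init__(self, agent_count: int, frame_pixels: int = 16):
        self.agent_count = agent_count
        self.frame_pixels = frame_pixels  # placeholder frame size

    def step(self, action_masks):
        masks = np.asarray(action_masks)
        # Assumed guard: one mask per agent, so a bare int (shape ()) is rejected.
        if masks.shape != (self.agent_count,):
            raise ValueError(f"expected {self.agent_count} Among Them action masks")
        frames = np.zeros((self.agent_count, self.frame_pixels), dtype=np.uint8)
        rewards = np.zeros(self.agent_count, dtype=np.float32)
        return frames, rewards


def old_step_env(worker, action_masks):
    """The pre-PR dispatch: agent_count == 1 is used as a proxy for the worker class."""
    if worker.agent_count == 1:
        return worker.step(int(action_masks[0]))  # scalar API
    return worker.step(action_masks)  # array API


worker = FakeNativeWorker(agent_count=1)  # like Among Them with players=1
masks = np.zeros(1, dtype=np.uint8)

try:
    old_step_env(worker, masks)
except ValueError as exc:
    print("old dispatch failed:", exc)

# Passing the array straight through works for any agent_count.
frames, rewards = worker.step(masks)
```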

What changed

BitWorldWorker.step now takes action_masks: np.ndarray of shape (1,), internally unwraps masks[0] to a byte for the connection send, and wraps reward_delta into a 1-element float32 array on return.
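A minimal sketch of that unwrap-and-wrap adapter follows. SingleAgentWorker, FakeConnection, and send_and_receive are hypothetical stand-ins for BitWorldWorker and its connection, whose real interface is not shown in this PR description. Adding a leading axis to the frame is likewise an assumption, made so that the output matches the (1, FRAME_PIXELS) shape mentioned for BitWorldWorker.

```python
import numpy as np


class FakeConnection:
    """Hypothetical transport: one action byte in, (frame, reward_delta) out."""

    def __init__(self, frame_pixels: int = 16):
        self.frame_pixels = frame_pixels  # placeholder frame size
        self.sent = []

    def send_and_receive(self, mask_byte: int):
        self.sent.append(mask_byte)
        frame = np.zeros(self.frame_pixels, dtype=np.uint8)
        return frame, 0.5  # fixed reward_delta, for illustration only


class SingleAgentWorker:
    """Stand-in for BitWorldWorker exposing the array-shaped step API."""

    agent_count = 1

    def __init__(self, conn: FakeConnection):
        self.conn = conn

    def step(self, action_masks: np.ndarray):
        masks = np.asarray(action_masks, dtype=np.uint8)
        if masks.shape != (1,):
            raise ValueError(f"expected 1 action mask, got shape {masks.shape}")
        # Unwrap masks[0] to a plain byte value for the connection send.
        frame, reward_delta = self.conn.send_and_receive(int(masks[0]))
        # Wrap the scalar results back into 1-element batches.
        frames = np.asarray(frame)[np.newaxis, ...]
        rewards = np.array([reward_delta], dtype=np.float32)
        return frames, rewards


worker = SingleAgentWorker(FakeConnection())
frames, rewards = worker.step(np.array([3], dtype=np.uint8))
```

Keeping the conversion inside the worker means the caller never needs to know which worker class it holds.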

BitWorldVecEnv._step_env collapses to a single path:

frames, rewards = worker.step(action_masks)
frames = self._frame_batch(frames, worker)

_frame_batch is unchanged: its existing reshape handles both (1, FRAME_PIXELS) from BitWorldWorker and (agent_count, FRAME_PIXELS) from AmongThemNativeWorker.
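To illustrate why a single reshape covers both cases, here is a hedged sketch. The real _frame_batch body is not shown in this PR, so frame_batch below is a hypothetical stand-in, and the FRAME_PIXELS value is a placeholder rather than the project's constant.

```python
import numpy as np

FRAME_PIXELS = 8 * 8  # placeholder value; the real constant is not given here


def frame_batch(frames, agent_count: int) -> np.ndarray:
    """Normalize worker output to shape (agent_count, FRAME_PIXELS)."""
    return np.asarray(frames).reshape(agent_count, FRAME_PIXELS)


# Single-agent worker output: already (1, FRAME_PIXELS), so the reshape is a no-op.
single = frame_batch(np.zeros((1, FRAME_PIXELS), dtype=np.uint8), agent_count=1)

# Multi-agent worker output: (agent_count, FRAME_PIXELS), also unchanged.
multi = frame_batch(np.zeros((4, FRAME_PIXELS), dtype=np.uint8), agent_count=4)

# A flat buffer with the right number of elements is accepted as well.
flat = frame_batch(np.zeros(4 * FRAME_PIXELS, dtype=np.uint8), agent_count=4)
```

Because reshape only needs the element count to match, the caller does not have to branch on the worker type.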

No behavior change for either env type at agent_count > 1. The players=1 Among Them path is now correct.

Test plan

Smoke: among_them with --players 1 and --num-envs 8. Expected: runs without the "expected N action masks" ValueError.
Smoke: among_them with --players 4 and --num-envs 8. Expected: unchanged behavior, no regression in SPS or rewards.
Smoke: a BitWorldWorker-backed env (e.g., bubble_eats or snake). Expected: runs without regression. (Local env had no torch installed, so this was not run on the prep machine.)

https://claude.ai/code/session_01Tb5Fr1Yu8JxTuD5dSRcswa

Generated by Claude Code

Both workers now expose:

step(action_masks: np.ndarray) -> tuple[frames, rewards]

where action_masks has shape (agent_count,) uint8, frames has shape (agent_count, ...), and rewards has shape (agent_count,) float32.

BitWorldWorker.step previously took a scalar int and returned (frame, reward_delta), so the vec env dispatched on worker.agent_count == 1 to choose between the scalar API (BitWorldWorker) and the array API (AmongThemNativeWorker). That condition was a leaky proxy: it conflated "this is the single-agent worker class" with "there is one agent." It broke when Among Them was configured with players=1, since the worker is still AmongThemNativeWorker and still expects array-shaped masks.

Now BitWorldWorker.step internally unwraps masks[0] to a byte for the connection send and wraps reward_delta into a 1-element float32 array for the return. The vec env's _step_env has a single path:

frames, rewards = worker.step(action_masks)
frames = self._frame_batch(frames, worker)

No behavior change for either env type at agent_count > 1; the 1-player Among Them path is now correct.

Smoke testing requires a torch-enabled environment, and this branch was prepared on a machine without torch; verifying end-to-end on Mac/MPS is left to the reviewer.

sasmith wants to merge 1 commit into master from claude/unify-worker-step-api
