Add partner_blindness_prob (blind-agent robustness knob) #414

Open
eugenevinitsky wants to merge 1 commit into puffer-4 from ev/blind-partners
@eugenevinitsky

Summary

  • Ports the blind-agent robustness feature from vcha/turbostream. Each agent has a per-episode probability partner_blindness_prob of being "blind" — its partner observations are zeroed for the whole episode, making it an unpredictable hazard for surrounding traffic.
  • Blind agents are masked out of the PPO rollout buffer (GIGAFLOW Appendix B.4) so their transitions don't pollute the gradient.
  • Default 0.0 (off), so behavior is unchanged unless you opt in.

Changes

  • config/drive.ini: new partner_blindness_prob = 0.0 knob under a [Robustness features] block.
  • sim/env_fields.h: wires the kwarg through ENV_FIELDS.
  • sim/drive.h: adds partner_blindness_prob to the Drive struct; per-episode sampling in c_reset; early-return in write_partner_obs for blind egos; mask out blind agents in c_step.
  • sim/datatypes.h: adds is_blind_partner flag to Agent.
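
For readers without the diff open, the control flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in sim/drive.h: only `partner_blindness_prob` and `is_blind_partner` come from this PR, while `SketchEnv`, the `sketch_*` functions, the array sizes, the `unsigned char` mask type, and the use of `rand()` are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_AGENTS 4       /* hypothetical sizes, for illustration only */
#define PARTNER_OBS_DIM 6

typedef struct {
    int is_blind_partner;  /* flag name from this PR; other fields omitted */
} Agent;

typedef struct {
    float partner_blindness_prob;  /* knob name from this PR */
    Agent agents[NUM_AGENTS];
    float partner_obs[NUM_AGENTS][PARTNER_OBS_DIM];
    unsigned char masks[NUM_AGENTS];  /* assumed layout: 1 = keep, 0 = drop */
} SketchEnv;

/* Uniform sample in [0, 1); rand() stands in for whatever RNG the sim uses. */
static double sketch_uniform(void) {
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Reset: draw the blind flag once per agent, so it holds for the episode. */
static void sketch_reset(SketchEnv *env) {
    assert(env->partner_blindness_prob >= 0.0f);
    assert(env->partner_blindness_prob <= 1.0f);
    for (int i = 0; i < NUM_AGENTS; i++) {
        env->agents[i].is_blind_partner =
            sketch_uniform() < (double)env->partner_blindness_prob;
    }
}

/* Observation write: blind egos return early and keep an all-zero block. */
static void sketch_write_partner_obs(SketchEnv *env, int i) {
    memset(env->partner_obs[i], 0, sizeof(env->partner_obs[i]));
    if (env->agents[i].is_blind_partner) {
        return;
    }
    for (int k = 0; k < PARTNER_OBS_DIM; k++) {
        env->partner_obs[i][k] = 1.0f;  /* placeholder for real features */
    }
}

/* Step: refresh observations and mask blind agents out of the rollout. */
static void sketch_step(SketchEnv *env) {
    for (int i = 0; i < NUM_AGENTS; i++) {
        sketch_write_partner_obs(env, i);
        env->masks[i] = env->agents[i].is_blind_partner ? 0 : 1;
    }
}
```

With the knob at 0.0 the comparison in `sketch_reset` can never succeed, so no agent is flagged and every mask stays at 1, which matches the "default off" behavior claimed above.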

Test plan

  • ./build.sh --fast builds clean
  • ./build.sh builds clean (torch backend)
  • With partner_blindness_prob = 0.0: training metrics unchanged vs. base
  • With partner_blindness_prob = 0.05: ~5% of agents per episode (in expectation) see zeroed partner obs and contribute mask=0 entries; check the mask distribution in a short rollout
  • Sanity: collision rate doesn't spike with blind agents off
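
One way to do the mask-distribution check from the list above is to count zero entries in a dumped mask buffer and compare the fraction to the configured probability. The helper names below (`masked_fraction`, `fraction_close`) and the flat `unsigned char` buffer are hypothetical and not part of this PR. If masks can also be zero for other reasons (for example inactive agents), the observed fraction would sit above `partner_blindness_prob`, so treat this as a rough sanity check only.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of mask == 0 entries in a flat mask buffer of length n. */
static double masked_fraction(const unsigned char *masks, size_t n) {
    assert(n > 0);
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++) {
        if (masks[i] == 0) {
            zeros++;
        }
    }
    return (double)zeros / (double)n;
}

/* Returns 1 when observed lies within tol of expected, else 0. */
static int fraction_close(double observed, double expected, double tol) {
    double diff = observed - expected;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff <= tol;
}
```

Because blindness is drawn independently per agent, a short rollout will only approximate the configured value, so the tolerance should be loose for small agent counts.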

🤖 Generated with Claude Code

Ports the blind-agent feature from vcha/turbostream. Per-episode probability
that an agent sees zeroed partner observations for the whole episode, making
it an unpredictable hazard for the rest of traffic. Blind agents are masked
out of the PPO rollout buffer (GIGAFLOW Appendix B.4) so they don't pollute
the gradient.

Default 0.0 (off).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 3, 2026 20:47

Copilot AI left a comment

Pull request overview

Adds an opt-in “blind partner observations” robustness feature to the drive simulator, allowing a per-episode fraction of agents to have their partner (other-agent) observations zeroed while excluding their transitions from PPO rollouts to avoid gradient contamination.

Changes:

  • Introduces partner_blindness_prob as a new [env] configuration knob (default 0.0).
  • Wires the new kwarg through ENV_FIELDS into the Drive struct.
  • Implements per-episode sampling of Agent.is_blind_partner, zeros partner observations for blind agents, and sets masks[i]=0 for blind agents in c_step.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| config/drive.ini | Adds the new robustness configuration knob and documentation comments. |
| sim/env_fields.h | Exposes partner_blindness_prob via the centralized env-kwarg field list. |
| sim/datatypes.h | Adds an episode-level Agent.is_blind_partner flag. |
| sim/drive.h | Samples blindness per episode, skips writing partner observations for blind agents, and masks blind agents out of PPO rollouts. |
