
Add simulation report outputs and regional filtering for API v2 alpha#242

Merged
anth-volk merged 27 commits into main from app-v2-migration on Mar 5, 2026

Conversation

@anth-volk
Contributor

@anth-volk anth-volk commented Mar 3, 2026

Fixes #241

Summary

Adds the simulation report infrastructure required for the app v2 migration onto API v2 alpha. This includes regional filtering, geographic impact outputs, demographic poverty breakdowns, and supporting utilities.

Changes

Core

  • Variable metadata: default_value and value_type fields on Variable
  • Region abstraction: New Region class with registry pattern (core/region.py) and Simulation filtering support

Country-specific regions

  • US: States, congressional districts, and places data (countries/us/)
  • UK: Constituencies and local authorities (countries/uk/)

New output classes

  • CongressionalDistrictImpact — per-district budgetary impact (US)
  • ConstituencyImpact — per-constituency budgetary impact (UK)
  • LocalAuthorityImpact — per-local-authority budgetary impact (UK)
  • IntraDecileImpact — within-decile winners/losers with wealth decile support

Poverty extensions

  • poverty_type field on Poverty output
  • Age group, gender, and race breakdown convenience functions

US model

  • Refactored reform application to construction-time (Microsimulation(reform=...))
  • Extracted build_reform_dict / merge_reform_dicts utilities
  • Added entity variables: is_male, race, household_state_income_tax, household_income_decile, household_count_people, congressional_district_geoid

Utilities

  • entity_utils.py — shared entity-relationship builder and household-level dataset filtering
  • parametric_reforms.py — reform dict construction from parameter values

Tests

  • 14 new test files with fixtures covering regions, filtering, entity utils, poverty demographics, geographic impacts, and reform application

Design note: Simulation approach by country

UK simulations use the Scenario structure, while US simulations use Microsimulation directly, because policyengine-us does not support Scenarios at this time.

Test plan

  • Unit tests pass for all new output classes
  • Region filtering correctly subsets datasets by state/district/constituency/local authority
  • Poverty demographic breakdowns produce correct age/gender/race splits
  • US reform application works at construction time
  • Existing tests remain green (no regressions)

🤖 Generated with Claude Code

anth-volk and others added 27 commits March 3, 2026 18:31
- Add default_value (Any) and value_type (type) fields to Variable model
- Update US and UK models to serialize default_value for JSON compatibility:
  - Enum values converted to their .name string
  - datetime.date values converted to ISO format string
  - Primitives (bool, int, float, str) kept as-is
- value_type preserves the original type for downstream consumers
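
The serialization rules listed above could look roughly like the following. This is a minimal sketch, not the code from this commit: the function name serialize_default_value and the FilingStatusExample enum are hypothetical and exist only for illustration.

```python
import datetime
from enum import Enum
from typing import Any


def serialize_default_value(value: Any) -> Any:
    """Convert a variable default into a JSON-compatible value (hypothetical helper)."""
    if isinstance(value, Enum):
        # Enum members are stored by name, e.g. FilingStatusExample.SINGLE -> "SINGLE".
        return value.name
    if isinstance(value, datetime.date):
        # Dates become ISO 8601 strings, e.g. "2024-01-01".
        return value.isoformat()
    # bool, int, float, and str are already JSON-compatible.
    return value


class FilingStatusExample(Enum):
    """Hypothetical enum used only to exercise the helper."""

    SINGLE = "Single"
    JOINT = "Joint"


enum_default = serialize_default_value(FilingStatusExample.SINGLE)
date_default = serialize_default_value(datetime.date(2024, 1, 1))
int_default = serialize_default_value(40)
```

Keeping value_type alongside the serialized value lets a consumer tell a string default apart from an enum name or a date that was turned into a string.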

Closes #226

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add tests verifying US age variable has default_value of 40
- Add tests verifying enum variables have string default_value
- Add tests verifying variables have value_type set
- Add tests verifying UK model variable default_value

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Allow PR checks to run for PRs targeting any branch, not just main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove unused imports
- Sort imports correctly
- Format code with ruff

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The sample_registry fixture was not being discovered by pytest
after linting removed unused imports. Moving fixture imports to
conftest.py is the standard pytest pattern for shared fixtures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add filter_field and filter_value parameters to Simulation class
- Add _build_entity_relationships() to US and UK models for mapping
  persons to all containing entities
- Add _filter_dataset_by_household_variable() to filter datasets while
  preserving entity integrity
- Apply filtering in run() method when filter parameters are set

This enables filtering datasets by household-level variables like
place_fips (US) or country (UK) for regional analysis.
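
As an illustration of what "preserving entity integrity" means here, the sketch below filters households on one field and then keeps only the persons who belong to a retained household. It uses plain lists of dicts rather than the project's dataset classes, and the function name, the household_id key, and the sample place_fips values are assumptions made for the example.

```python
def filter_by_household_variable(households, persons, field, value):
    """Keep households where household[field] == value, plus their members.

    Hypothetical stand-in for the filtering step; real datasets also carry
    other entity tables that would be filtered the same way.
    """
    kept_households = [h for h in households if h[field] == value]
    kept_ids = {h["household_id"] for h in kept_households}
    # Persons are kept only if their household survived the filter,
    # so no person is left pointing at a missing household.
    kept_persons = [p for p in persons if p["household_id"] in kept_ids]
    return kept_households, kept_persons


households = [
    {"household_id": 1, "place_fips": "44000"},
    {"household_id": 2, "place_fips": "57000"},
    {"household_id": 3, "place_fips": "44000"},
]
persons = [
    {"person_id": 10, "household_id": 1},
    {"person_id": 11, "household_id": 1},
    {"person_id": 12, "household_id": 2},
    {"person_id": 13, "household_id": 3},
]

kept_households, kept_persons = filter_by_household_variable(
    households, persons, field="place_fips", value="44000"
)
```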

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add filtering_fixtures.py with US and UK test datasets
- Add 18 unit tests for _build_entity_relationships and
  _filter_dataset_by_household_variable methods
- Tests follow given-when-then pattern
- All tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The US country package uses a shared singleton TaxBenefitSystem, which
means p.update() after Microsimulation construction has no effect on
calculations. This fix:

- Adds reform_dict_from_parameter_values() utility to convert
  ParameterValue objects to the dict format accepted by Microsimulation
- Updates US model.py to build reform dict and pass it at construction
  time instead of using simulation_modifier (p.update) after
- Adds comprehensive unit tests for the utility function and US reform
  application

The UK model continues to use p.update() since policyengine-uk was
refactored to give each simulation its own TaxBenefitSystem instance.
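
A rough sketch of the conversion described above follows. It assumes the reform dict maps a parameter path to a dict of "start.end" period strings and values; the ParameterValueSketch dataclass, the function names with a _sketch suffix, the open-ended end date of 2100-12-31, and the gov.example.* parameter paths are all illustrative assumptions rather than the utilities added in this PR.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ParameterValueSketch:
    """Hypothetical stand-in for a ParameterValue object."""

    parameter_name: str
    value: float
    start_date: date
    end_date: Optional[date] = None


def build_reform_dict_sketch(parameter_values):
    """Build {parameter_name: {"start.end": value}}, or None if there is nothing to apply."""
    if not parameter_values:
        return None
    reform = {}
    for pv in parameter_values:
        # Assumption: an open-ended value runs to a far-future date.
        end = pv.end_date.isoformat() if pv.end_date else "2100-12-31"
        period = f"{pv.start_date.isoformat()}.{end}"
        reform.setdefault(pv.parameter_name, {})[period] = pv.value
    return reform


def merge_reform_dicts_sketch(base, override):
    """Combine two reform dicts; entries in override win on conflicts."""
    if base is None:
        return override
    if override is None:
        return base
    merged = {name: dict(periods) for name, periods in base.items()}
    for name, periods in override.items():
        merged.setdefault(name, {}).update(periods)
    return merged


policy_reform = build_reform_dict_sketch(
    [ParameterValueSketch("gov.example.rate", 0.1, date(2026, 1, 1))]
)
dynamic_reform = build_reform_dict_sketch(
    [ParameterValueSketch("gov.example.threshold", 5000, date(2026, 1, 1), date(2026, 12, 31))]
)
combined = merge_reform_dicts_sketch(policy_reform, dynamic_reform)
```

The combined dict is what would then be handed to the simulation at construction time, instead of mutating parameters afterwards.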

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove unused imports in test_us_reform_application.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract duplicated entity relationship and dataset filtering logic from
US and UK model.py into shared utils/entity_utils.py. Decompose inline
reform dict construction in US run() into single-purpose functions
(build_reform_dict, merge_reform_dicts) in utils/parametric_reforms.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Harmonize poverty type tracking between policyengine.py and the API by
adding poverty_type directly to the Poverty class. Convenience functions
now iterate .items() on the poverty variable dicts to capture both the
type enum and variable name, and include poverty_type in DataFrame output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add AGE_GROUPS dict and calculate_uk/us_poverty_by_age() convenience
functions that compute poverty rates for child (<18), adult (18-64),
and senior (65+) age groups across all poverty types.
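
To make the age split concrete, here is a small sketch of a weighted poverty rate per age group using the bounds named above. The AGE_BOUNDS structure, the function name, and the record keys (age, weight, in_poverty) are assumptions for the example; the actual AGE_GROUPS dict and convenience functions may be shaped differently.

```python
# Inclusive (lower, upper) age bounds; None means no upper bound.
AGE_BOUNDS = {
    "child": (0, 17),
    "adult": (18, 64),
    "senior": (65, None),
}


def poverty_rate_by_age(persons):
    """Return the weighted share of people in poverty for each age group."""
    rates = {}
    for group, (lower, upper) in AGE_BOUNDS.items():
        members = [
            p for p in persons
            if p["age"] >= lower and (upper is None or p["age"] <= upper)
        ]
        total_weight = sum(p["weight"] for p in members)
        poor_weight = sum(p["weight"] for p in members if p["in_poverty"])
        # Guard against empty groups to avoid dividing by zero.
        rates[group] = poor_weight / total_weight if total_weight else 0.0
    return rates


persons = [
    {"age": 5, "weight": 100.0, "in_poverty": True},
    {"age": 10, "weight": 300.0, "in_poverty": False},
    {"age": 30, "weight": 200.0, "in_poverty": True},
    {"age": 50, "weight": 200.0, "in_poverty": False},
    {"age": 70, "weight": 50.0, "in_poverty": False},
]

rates = poverty_rate_by_age(persons)
```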

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add is_male to both UK and US entity_variables so it's available in
simulation output datasets. Add GENDER_GROUPS dict and
calculate_uk/us_poverty_by_gender() convenience functions that compute
poverty rates for male and female groups across all poverty types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add race to US entity_variables so it's available in simulation output
datasets. Add RACE_GROUPS dict and calculate_us_poverty_by_race() that
computes poverty rates for white, black, hispanic, and other racial
groups across all US poverty types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Needed for budget summary computation — captures state-level tax
revenue impact (matching V1's state_tax_revenue_impact field).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…y_variables

Both UK and US models now include these household-level variables,
needed for intra-decile income change distribution computation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
16 tests covering calculate_*_poverty_by_age, calculate_*_poverty_by_gender,
and calculate_us_poverty_by_race — verifying delegation, record counts,
filter_variable assignment, and correct filter kwargs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add CongressionalDistrictImpact output that groups households by
congressional_district_geoid and computes per-district weighted average
and relative income changes. Add geoid to US household entity_variables.
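
The grouping described above can be sketched as follows. This is an illustration under assumptions, not the CongressionalDistrictImpact implementation: the function name and the record keys other than congressional_district_geoid are hypothetical, and households are plain dicts rather than simulation output datasets.

```python
from collections import defaultdict


def district_income_changes(households):
    """Compute weighted average and relative income change per district."""
    totals = defaultdict(lambda: {"weight": 0.0, "baseline": 0.0, "reform": 0.0})
    for h in households:
        t = totals[h["congressional_district_geoid"]]
        t["weight"] += h["weight"]
        t["baseline"] += h["weight"] * h["baseline_income"]
        t["reform"] += h["weight"] * h["reform_income"]

    results = {}
    for geoid, t in totals.items():
        change = t["reform"] - t["baseline"]
        results[geoid] = {
            # Average change per weighted household in the district.
            "average_change": change / t["weight"] if t["weight"] else 0.0,
            # Change as a share of total weighted baseline income.
            "relative_change": change / t["baseline"] if t["baseline"] else 0.0,
        }
    return results


households = [
    {"congressional_district_geoid": 601, "weight": 2.0,
     "baseline_income": 100.0, "reform_income": 110.0},
    {"congressional_district_geoid": 601, "weight": 1.0,
     "baseline_income": 200.0, "reform_income": 200.0},
    {"congressional_district_geoid": 3601, "weight": 1.0,
     "baseline_income": 50.0, "reform_income": 45.0},
]

impacts = district_income_changes(households)
```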

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds ConstituencyImpact output that reweights households using a
pre-computed weight matrix (parliamentary_constituency_weights.h5)
to compute per-constituency average/relative income change.
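
Reweighting differs from grouping in that every household can contribute to every constituency, with a different weight in each. The sketch below assumes the weight matrix has one row per constituency and one column per household; that layout, the function name, and the in-memory lists standing in for the .h5 file are assumptions for illustration only.

```python
def reweighted_income_changes(weight_matrix, baseline_income, reform_income):
    """Compute average and relative income change for each row of weights.

    weight_matrix[c][h] is the weight of household h in constituency c
    (assumed layout).
    """
    results = []
    for row in weight_matrix:
        baseline_total = sum(w * b for w, b in zip(row, baseline_income))
        reform_total = sum(w * r for w, r in zip(row, reform_income))
        weight_total = sum(row)
        change = reform_total - baseline_total
        results.append({
            "average_change": change / weight_total if weight_total else 0.0,
            "relative_change": change / baseline_total if baseline_total else 0.0,
        })
    return results


# Two constituencies, three households.
weight_matrix = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 1.0],
]
baseline_income = [100.0, 200.0, 300.0]
reform_income = [110.0, 200.0, 330.0]

constituency_impacts = reweighted_income_changes(
    weight_matrix, baseline_income, reform_income
)
```

The same calculation would apply to local authorities with a different weight matrix.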

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds LocalAuthorityImpact output that reweights households using a
pre-computed weight matrix (local_authority_weights.h5) to compute
per-local-authority average/relative income change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add IntraDecileImpact output class with compute_intra_decile_impacts()
for computing 5-category income change distributions within deciles.
Uses qcut by default for grouping, with decile_variable parameter for
pre-computed groupings (e.g. household_wealth_decile).

Also adds decile_variable parameter to DecileImpact for the same
purpose, and adds household_wealth_decile to UK entity_variables.
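
A five-category distribution within deciles could be computed along these lines. The commit does not state the category boundaries, so the 5% threshold, the small no-change tolerance, the category names, and the max(|baseline|, 1) denominator are all assumptions. The sketch also takes a pre-computed decile per household (the decile_variable path) rather than deriving deciles with qcut.

```python
from collections import defaultdict

# Hypothetical category names, ordered from largest loss to largest gain.
CATEGORIES = [
    "lose_more_than_5pct",
    "lose_less_than_5pct",
    "no_change",
    "gain_less_than_5pct",
    "gain_more_than_5pct",
]


def categorize_change(baseline, reform, threshold=0.05, tolerance=1e-3):
    """Assign a household to one of five categories by relative income change."""
    # Assumed denominator: avoids dividing by zero for zero-income households.
    relative = (reform - baseline) / max(abs(baseline), 1.0)
    if relative > threshold:
        return "gain_more_than_5pct"
    if relative > tolerance:
        return "gain_less_than_5pct"
    if relative >= -tolerance:
        return "no_change"
    if relative >= -threshold:
        return "lose_less_than_5pct"
    return "lose_more_than_5pct"


def intra_decile_shares(households, decile_key="decile"):
    """Return the weighted share of households in each category, per decile."""
    weights = defaultdict(lambda: dict.fromkeys(CATEGORIES, 0.0))
    for h in households:
        category = categorize_change(h["baseline_income"], h["reform_income"])
        weights[h[decile_key]][category] += h["weight"]

    shares = {}
    for decile, by_category in weights.items():
        total = sum(by_category.values())
        shares[decile] = {
            c: (w / total if total else 0.0) for c, w in by_category.items()
        }
    return shares


households = [
    {"decile": 1, "weight": 1.0, "baseline_income": 100.0, "reform_income": 110.0},
    {"decile": 1, "weight": 1.0, "baseline_income": 100.0, "reform_income": 100.0},
    {"decile": 1, "weight": 2.0, "baseline_income": 100.0, "reform_income": 98.0},
]

shares = intra_decile_shares(households)
```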

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add tests for CongressionalDistrictImpact, ConstituencyImpact,
LocalAuthorityImpact, IntraDecileImpact, and DecileImpact with
decile_variable support. Use model_construct() in convenience
functions to bypass Pydantic validation for programmatic construction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Narrow exception handling in Simulation.ensure() and UK region CSV loaders
- Add column validation in entity_utils._resolve_id_column with clear error
- Add logging throughout (simulation, entity_utils, regions, UK model)
- Refactor UK model PyPI metadata fetch to lazy-load with timeout
- Fix parametric_reforms return type annotation (dict | None)
- Add filter_group field to Poverty and update convenience functions
- Export demographic poverty helpers from outputs/__init__.py
- Add unit tests for Poverty.run() with all filter combinations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@anth-volk anth-volk marked this pull request as ready for review March 3, 2026 20:02
@anth-volk anth-volk requested a review from nikhilwoodruff March 3, 2026 22:20

# Build reform dict from policy and dynamic parameter values.
# US requires reforms at Microsimulation construction time
# (unlike UK which supports p.update() after construction).
Collaborator

Can you please stress test this?

Contributor Author

Had Claude work up this explanation:

Summary

Here's the full picture of why p.update() silently fails for the US:

Root cause: ParameterNode._at_instant_cache

Every ParameterNode in policyengine-core has a _at_instant_cache dict
that stores ParameterNodeAtInstant snapshots. When a formula calls
parameters(period).gov.irs.income.bracket.rates, the parameter tree
walks down the nodes and each node caches a frozen snapshot for that
instant. There were 172,651 of these cached entries after a single
income_tax calculation.

p.update() mutates the underlying parameter tree, but does not
invalidate these cached snapshots. So even after clearing all variable
holder arrays, when formulas re-execute and call parameters(period),
they hit the stale _at_instant_cache and get the old pre-update
values.

Why reform= at construction works: It creates an entirely new
TaxBenefitSystem with the reform baked into the parameter tree from
the start — before any _at_instant_cache entries exist.

Why the UK's simulation_modifier approach works: The UK creates
Microsimulation(dataset=input_data) and then calls p.update() before
any calculate() calls. Since no formulas have run yet, there are no
cached snapshots to go stale.

Why the US can't use the same pattern: The US uses the shared
default_tax_benefit_system_instance singleton. Even on a fresh
Microsimulation(), if any prior instance already computed variables,
the shared TBS's ParameterNode._at_instant_cache contains stale
snapshots.
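
The failure mode can be reproduced with a toy model. The class below is not policyengine-core code; it only mimics the described behaviour of a node that caches per-instant snapshots and an update that mutates the value without clearing that cache.

```python
class ToyParameterNode:
    """Toy stand-in for a parameter node with a per-instant snapshot cache."""

    def __init__(self, rate):
        self.rate = rate
        self._at_instant_cache = {}

    def at_instant(self, instant):
        # The first read for an instant freezes a snapshot of the current value.
        if instant not in self._at_instant_cache:
            self._at_instant_cache[instant] = self.rate
        return self._at_instant_cache[instant]

    def update(self, rate):
        # Mutates the underlying value but leaves cached snapshots untouched.
        self.rate = rate


INSTANT = "2026-01-01"

# Shared-singleton pattern: an earlier calculation has already read the parameter.
shared = ToyParameterNode(0.10)
shared.at_instant(INSTANT)
shared.update(0.20)
stale_value = shared.at_instant(INSTANT)  # still the pre-update snapshot

# Construction-time pattern: the reform is in place before any read.
fresh = ToyParameterNode(0.20)
construction_value = fresh.at_instant(INSTANT)

# Per-simulation pattern: update happens before the first read.
own_instance = ToyParameterNode(0.10)
own_instance.update(0.20)
update_first_value = own_instance.at_instant(INSTANT)
```

In this toy, only the shared node that was read before the update returns the old value, which mirrors why the order of reads and updates matters in the explanation above.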

This fits with what I found when I originally worked on this about one to one and a half months ago. I'm hoping to isolate this issue for now, move ahead with the code we have in this PR, and revisit it later.

@anth-volk anth-volk merged commit 8ce9ffa into main Mar 5, 2026
6 of 7 checks passed
@anth-volk anth-volk deleted the app-v2-migration branch March 5, 2026 18:53

Development

Successfully merging this pull request may close these issues.

Add simulation report outputs and regional filtering for API v2 alpha

2 participants