Simplify calls to GS #4928

Open
mgrange1998 wants to merge 9 commits into facebook:main from mgrange1998:export-D89754292
Conversation

@mgrange1998
Contributor

Summary:
Moves generation exception handling and trial creation from the Orchestrator into the GenerationStrategy, consolidating the flow and reducing the Orchestrator's complexity.

Key changes:

  • Adds GenerationStrategy.gen_handle_exceptions, which wraps gen() and catches known non-fatal exceptions (OptimizationComplete, DataRequiredError, MaxParallelismReachedException, AxGenerationException, OptimizationConfigRequired), returning them as structured reason strings instead of raising.
  • Inlines the trial creation logic (previously in _get_next_trials and _gen_new_trials_from_generation_strategy) directly into generate_candidate_trials, removing both helper methods entirely (~130 lines removed).
  • Removes a now-unnecessary test in test_ax_sweep_orchestrator that tested the deleted _gen_new_trials_from_generation_strategy method.
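
The wrapping behavior described in the first bullet can be illustrated with a minimal sketch. It is not the implementation in this PR: the exception classes below are local stand-ins for the Ax types of the same name, and the function signature, the `(results, reason)` return shape, and the reason-string format are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple


# Local stand-ins for the Ax exception types named in the summary.
class OptimizationComplete(Exception): ...
class DataRequiredError(Exception): ...
class MaxParallelismReachedException(Exception): ...
class AxGenerationException(Exception): ...
class OptimizationConfigRequired(Exception): ...


# Exceptions treated as "known non-fatal" in this sketch.
NON_FATAL = (
    OptimizationComplete,
    DataRequiredError,
    MaxParallelismReachedException,
    AxGenerationException,
    OptimizationConfigRequired,
)


def gen_handle_exceptions(
    gen: Callable[[int], List[int]], n: int
) -> Tuple[List[int], Optional[str]]:
    """Call gen(n); return (results, None) on success.

    Known non-fatal exceptions become ([], reason) instead of propagating.
    Any other exception is re-raised unchanged.
    """
    try:
        return gen(n), None
    except NON_FATAL as err:
        # Hypothetical reason format: "<ExceptionName>: <message>".
        return [], f"{type(err).__name__}: {err}"
```

With this shape, a caller can branch on the reason string rather than maintaining its own `try`/`except` ladder, which is the kind of simplification the summary attributes to the Orchestrator.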

Differential Revision: D89754292

Matthew Grange and others added 9 commits February 19, 2026 13:00
…tch_utils

Summary: Renames the `max_parallelism` parameter to `max_concurrency` across GenerationStep, GenerationNode, and the generation strategy dispatch utilities. Adds backward-compatible deprecated `max_parallelism` parameters with deprecation warnings where the public API is affected (`choose_generation_strategy`). Internal variable names (`sobol_parallelism`, `bo_parallelism`) are renamed to `sobol_concurrency`, `bo_concurrency` for consistency.

Differential Revision: D92457714
Summary: Renames the `parallelism` parameter to `concurrency` in `Client.run_trials()` and adds backward-compatible deprecated `max_parallelism` parameters in `AxClient.create_experiment()` and `AxClient.get_max_parallelism()` → `get_max_concurrency()`. Both include deprecation warnings guiding callers to use the new parameter names, with validation that old and new parameters are not specified simultaneously.
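
The backward-compatibility pattern these rename commits describe (accept the old name, warn, and reject calls that pass both) can be sketched as follows. The helper name `resolve_concurrency`, the use of `DeprecationWarning`, and the error message are assumptions for illustration, not code from the diff.

```python
import warnings
from typing import Optional


def resolve_concurrency(
    max_concurrency: Optional[int] = None,
    max_parallelism: Optional[int] = None,  # deprecated alias
) -> Optional[int]:
    """Return the effective limit, honoring the deprecated name with a warning."""
    if max_parallelism is not None:
        if max_concurrency is not None:
            # Old and new names must not be specified simultaneously.
            raise ValueError(
                "Specify only one of `max_concurrency` and `max_parallelism`."
            )
        warnings.warn(
            "`max_parallelism` is deprecated; use `max_concurrency` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return max_parallelism
    return max_concurrency
```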

Differential Revision: D93771849
…Settings

Summary: Renames `num_parallel_jobs` to `num_concurrent_jobs` in `BenchmarkExecutionSettings` and all nightly benchmark configurations. Also updates the docstring in `BenchmarkMethod` to reference "pending trials" instead of "parallelism". This is a mechanical rename with no behavioral change.

Differential Revision: D93771883
…ants, and telemetry

Summary: Updates remaining references from "parallelism" to "concurrency" across orchestration, telemetry, early stopping, and other modules. This covers docstrings, comments, constant names (`MAX_PENDING_TRIALS` → `MAX_CONCURRENT_TRIALS`, `DUMMY_MAX_PENDING_TRIALS` → `DUMMY_MAX_CONCURRENT_TRIALS`), telemetry field names, and variable names in test files. No behavioral changes — purely a terminology alignment.

Differential Revision: D93771906
…tDesign.concurrency_limit`

Summary: As titled, adding a simple `ExperimentDesign` object. Putting it into properties for serialization for now, so as to not do duplicate work ahead of the storage refactor implementation (and also in case we change things while working on this stack).

Differential Revision: D89770462
Summary: Migrates all references from `experiment._properties[Keys.EXPERIMENT_TOTAL_CONCURRENT_ARMS]` to `experiment.design.concurrency_limit`, completing the transition to the `ExperimentDesign` dataclass introduced in the prior diff. This affects generation node input constructors (including `ALL_N` and `REPEAT_N`), the Axolotl updater, and associated tests. Also cleans up the `no-commit` code in `generation_node_input_constructors.py` to use the new `concurrency_limit` field with a fallback to a default of 10.
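
As a rough illustration of the `concurrency_limit` field and its fallback, here is a minimal sketch. Only the names `ExperimentDesign` and `concurrency_limit` and the default of 10 come from the summaries above; the dataclass layout and the helper `effective_concurrency_limit` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Fallback value mentioned in the summary.
DEFAULT_CONCURRENCY_LIMIT = 10


@dataclass
class ExperimentDesign:
    # None means "not configured"; assumed representation for this sketch.
    concurrency_limit: Optional[int] = None


def effective_concurrency_limit(design: ExperimentDesign) -> int:
    """Return the configured limit, or the default when none is set."""
    if design.concurrency_limit is not None:
        return design.concurrency_limit
    return DEFAULT_CONCURRENCY_LIMIT
```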

Differential Revision: D89772029
Summary:
## Changes

Consolidates `generate_candidates` and `_prepare_trials` into a unified API:

- Renames `generate_candidates` → `generate_candidate_trials` and changes its return type to a 3-tuple `(existing_candidates, new_trials, error)`, incorporating the existing-candidate-trial logic that was previously in `_prepare_trials`.
- Extracts the capacity/limit calculation from `_prepare_trials` into a new `compute_n_to_generate` method, which the Orchestrator's main loop now calls before `generate_candidate_trials`.
- Renames `should_generate_candidates_for_pts` → `should_generate_candidate_trials_for_pts` and adds a "not enough data" check that validates metrics have at least 1 day of data before allowing generation.
- Adds two new test methods for the "not enough data" and "missing metrics + not enough data" scenarios.
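
The split between capacity calculation and generation might look roughly like the sketch below. The parameter names, the capacity formula, and the use of strings to represent trials are assumptions; only the method names and the `(existing_candidates, new_trials, error)` return shape are taken from the list above.

```python
from typing import Callable, List, Optional, Tuple


def compute_n_to_generate(
    concurrency_limit: int, n_running: int, max_trials: int, n_created: int
) -> int:
    """Hypothetical capacity check: free slots, capped by the remaining budget."""
    free_slots = concurrency_limit - n_running
    remaining_budget = max_trials - n_created
    return max(0, min(free_slots, remaining_budget))


def generate_candidate_trials(
    n: int,
    existing_candidates: List[str],
    make_trial: Callable[[], str],
) -> Tuple[List[str], List[str], Optional[Exception]]:
    """Reuse existing candidates first, then create new trials up to n."""
    reused = existing_candidates[:n]
    new_trials: List[str] = []
    try:
        for _ in range(n - len(reused)):
            new_trials.append(make_trial())
    except Exception as err:
        # Return whatever was produced so far, plus the error.
        return reused, new_trials, err
    return reused, new_trials, None
```

In this sketch the main loop would call `compute_n_to_generate` first and pass the result as `n`, matching the call order described in the second bullet.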

## Devmate session

How doing this with Devmate went:

1. First we ask Devmate to analyse the difference between the methods; it does remarkably well: {F1984363089} {F1984363089} {F1984363089}

2. Next a tangent: I renamed `generate_candidates` with a more precise name (`generate_candidate_trials`), since that is the method we will keep between the two, and it might as well have a better name. Asked Devmate to apply the changes throughout fbcode.
 {F1984363157} {F1984363170}

3. Now for the hard part: get `generate_candidate_trials` to match the behavior of `_prepare_trials`, without me writing any of the code: {F1984363323} {F1984363333}
^ Pretty good for starters! I give corrections, see above; it applies them well: {F1984363346}
Then with one more small correction, we have a very solid plan:  {F1984363398}, which Devmate implements:  {F1984363406}  {F1984363458}. I think it did really well!

Differential Revision: D89750211
Summary:
Adds an `on_generation_error` callback parameter to `generate_candidate_trials` and a corresponding `on_generation_error` function in Axolotl utils. This allows callers like Axolotl to format error messages (including paste upload with full traceback) without the Orchestrator needing to know about paste infrastructure.

The `generate_candidate_trials` return type changes from a 3-tuple to a 4-tuple, adding a `cannot_generate_reason` string that the callback populates when generation fails. The existing candidate trial accounting is also moved from `generate_candidate_trials` into `compute_n_to_generate`, so the `n` parameter now represents exactly the number of new trials to generate.
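
A minimal sketch of the callback shape follows. The position of `cannot_generate_reason` in the 4-tuple, the callback signature, and the example formatter `format_error` are assumptions; the real Axolotl callback uploads a paste with the traceback, which is not modeled here.

```python
from typing import Callable, List, Optional, Tuple

ErrorCallback = Callable[[Exception], str]


def format_error(err: Exception) -> str:
    """Hypothetical caller-side formatter for a generation failure."""
    return f"Generation failed: {type(err).__name__}"


def generate_candidate_trials(
    n: int,
    existing_candidates: List[str],
    make_trial: Callable[[], str],
    on_generation_error: Optional[ErrorCallback] = None,
) -> Tuple[List[str], List[str], Optional[Exception], Optional[str]]:
    """Create exactly n new trials; on failure, let the callback build the reason."""
    new_trials: List[str] = []
    try:
        for _ in range(n):
            new_trials.append(make_trial())
    except Exception as err:
        # The callback owns message formatting; fall back to str(err).
        reason = on_generation_error(err) if on_generation_error else str(err)
        return existing_candidates, new_trials, err, reason
    return existing_candidates, new_trials, None, None
```

Because existing candidates are already accounted for upstream, `n` here counts only new trials, and the function itself stays unaware of how the reason string is produced.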

Differential Revision: D89751541
Summary:
Moves generation exception handling and trial creation from the Orchestrator into the GenerationStrategy, consolidating the flow and reducing the Orchestrator's complexity.

Key changes:
- Adds `GenerationStrategy.gen_handle_exceptions`, which wraps `gen()` and catches known non-fatal exceptions (`OptimizationComplete`, `DataRequiredError`, `MaxParallelismReachedException`, `AxGenerationException`, `OptimizationConfigRequired`), returning them as structured reason strings instead of raising.
- Inlines the trial creation logic (previously in `_get_next_trials` and `_gen_new_trials_from_generation_strategy`) directly into `generate_candidate_trials`, removing both helper methods entirely (~130 lines removed).
- Removes a now-unnecessary test in `test_ax_sweep_orchestrator` that tested the deleted `_gen_new_trials_from_generation_strategy` method.

Differential Revision: D89754292
@meta-cla meta-cla bot added the "CLA Signed" label (Do not delete this pull request or issue due to inactivity.) Feb 20, 2026
@meta-codesync
meta-codesync bot commented Feb 20, 2026

@mgrange1998 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D89754292.

Labels

CLA Signed (Do not delete this pull request or issue due to inactivity.), fb-exported, meta-exported

2 participants