feat: Improve generation failure reporting for schema and timeout failures #416

Merged
eric-tramel merged 6 commits into main from feature/generation-error-reporting-v1
Mar 13, 2026

Conversation

@eric-tramel
Contributor

@eric-tramel eric-tramel commented Mar 13, 2026

Summary

  • preserve concise parser/schema validation details when ModelFacade exhausts correction attempts
  • include column name, failure category, and primary cause in the dropped-record warning emitted by ColumnWiseDatasetBuilder
  • keep timeout failures visible in the same reporting path without dumping raw model responses

What changed

  • GenerationValidationFailureError now carries normalized validation detail from the underlying ParserException
  • handle_llm_exceptions() includes that preserved detail in ModelGenerationValidationFailureError
  • failure_kind now propagates through ModelGenerationValidationFailureError and is consumed directly by the dataset-builder warning path
  • fan-out work items now pass column_name into the worker error callback for both thread and async execution paths
  • dropped-record warnings are reformatted to show record index, column, failure kind, and a concise Detail: line
  • tests cover schema-validation warnings, timeout warnings, preserved parser detail, and worker context propagation for both thread and async fan-out paths
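
A minimal sketch of how an exception can carry the structured fields listed above. The class name mirrors the one in this PR, but the constructor signature, the `normalize_detail` helper, and the 200-character cap are assumptions made for illustration, not the merged code.

```python
from __future__ import annotations


def normalize_detail(text: str | None, limit: int = 200) -> str | None:
    """Collapse whitespace and cap length so the detail fits on one log line.

    Hypothetical helper; the real normalization rules may differ.
    """
    if not text:
        return None
    collapsed = " ".join(text.split())
    if len(collapsed) <= limit:
        return collapsed
    return collapsed[: limit - 3] + "..."


class GenerationValidationFailureError(Exception):
    """Raised once correction attempts are exhausted (signature is an assumption)."""

    def __init__(
        self,
        summary: str,
        *,
        detail: str | None = None,
        failure_kind: str | None = None,
    ) -> None:
        super().__init__(summary)
        self.detail = normalize_detail(detail)
        self.failure_kind = failure_kind


try:
    raise GenerationValidationFailureError(
        "Unsuccessful generation attempt.",
        detail="Response doesn't match requested <response_schema>\n'name' is a required property.",
        failure_kind="schema_validation",
    )
except GenerationValidationFailureError as exc:
    caught = exc

print(caught.failure_kind, "|", caught.detail)
```

Keeping `detail` and `failure_kind` as attributes lets downstream code read them directly instead of re-parsing the message text.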

Classification notes

  • Pydantic validation failures and JSON Schema validation failures are both reported as schema validation
  • missing fenced structured output (for example, no ```json block) and missing fenced code output are reported as parse error
  • this PR does distinguish validation failures from fence-extraction/parsing failures
  • this PR does not yet distinguish Pydantic validation from JSON Schema validation; both remain grouped under schema validation
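
The two-bucket classification described in these notes can be sketched as follows. `SchemaMismatch` and `MissingFence` are hypothetical stand-ins for the real validation and fence-extraction errors, and walking the `__cause__` chain is an assumption about how the underlying failure might be located, not a description of the merged helper.

```python
from __future__ import annotations


class SchemaMismatch(Exception):
    """Stand-in for a Pydantic or JSON Schema validation failure (hypothetical)."""


class MissingFence(Exception):
    """Stand-in for output with no parsable fenced block (hypothetical)."""


def classify_failure_kind(exc: BaseException) -> str:
    """Return 'schema_validation' if a schema mismatch appears in the cause chain."""
    current: BaseException | None = exc
    while current is not None:
        if isinstance(current, SchemaMismatch):
            return "schema_validation"
        current = current.__cause__
    # Everything else, including missing fences, falls into the parsing bucket.
    return "parse_error"


try:
    try:
        raise SchemaMismatch("'name' is a required property")
    except SchemaMismatch as inner:
        raise ValueError("parser failed") from inner
except ValueError as outer:
    kind_for_schema = classify_failure_kind(outer)

kind_for_fence = classify_failure_kind(MissingFence("No parsable JSON structure"))
print(kind_for_schema, kind_for_fence)
```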

Reporting examples

Before:

⚠️ Generation for record at index 248 failed. Will omit this record from the dataset.

After (schema validation):

⚠️ Generation for record at index 248 failed in column 'test_column' (schema validation). Will omit this record from the dataset. Detail: The model output from 'test-model' could not be parsed into the requested format while running generation for column 'test_column'. Validation detail: Response doesn't match requested <response_schema> 'name' is a required property.

After (parse error: missing fenced JSON/code output):

⚠️ Generation for record at index 42 failed in column 'test_column' (parse error). Will omit this record from the dataset. Detail: No parsable JSON structure within ```json markdown fence.

After (timeout):

⚠️ Generation for record at index 17 failed in column 'test_column' (timeout). Will omit this record from the dataset. Detail: The request to model 'test-model' timed out while running generation for column 'test_column'.
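
For illustration, the "After" messages above could be assembled by a small formatter like the one below. The function name and parameters are hypothetical; only the message layout is taken from the examples in this description.

```python
from __future__ import annotations


def format_worker_failure_warning(
    index: int | str,
    column_name: str,
    failure_kind: str | None,
    detail: str | None,
) -> str:
    """Build a single-line dropped-record warning (hypothetical helper)."""
    # "schema_validation" becomes "schema validation"; a missing kind degrades to "error".
    label = failure_kind.replace("_", " ") if failure_kind else "error"
    message = (
        f"⚠️ Generation for record at index {index} failed in column '{column_name}' ({label}). "
        "Will omit this record from the dataset."
    )
    return f"{message} Detail: {detail}" if detail else message


timeout_warning = format_worker_failure_warning(
    17,
    "test_column",
    "timeout",
    "The request to model 'test-model' timed out while running generation for column 'test_column'.",
)
print(timeout_warning)
```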

Testing

UV_CACHE_DIR=/tmp/uv-cache uv run --no-sync --group dev pytest packages/data-designer-engine/tests/engine/models packages/data-designer-engine/tests/engine/dataset_builders
UV_CACHE_DIR=/tmp/uv-cache uv run --no-sync --group dev ruff check packages/data-designer-engine/src/data_designer/engine/models/errors.py packages/data-designer-engine/src/data_designer/engine/models/facade.py packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py packages/data-designer-engine/tests/engine/models/test_model_errors.py packages/data-designer-engine/tests/engine/models/test_facade.py packages/data-designer-engine/tests/engine/dataset_builders/test_column_wise_builder.py

@eric-tramel eric-tramel requested a review from a team as a code owner March 13, 2026 21:00
@greptile-apps
Contributor

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR significantly improves the observability of generation failures by threading structured detail and failure_kind attributes through the exception chain — from ParserException caught in ModelFacade, through GenerationValidationFailureError, all the way to ModelGenerationValidationFailureError consumed by ColumnWiseDatasetBuilder. Dropped-record warnings now include the column name, a categorized failure kind, and a concise single-line cause rather than the previous bare "Will omit this record" message.

Key changes:

  • GenerationValidationFailureError and ModelGenerationValidationFailureError are expanded with detail and failure_kind structured attributes, eliminating duplicate text-based re-classification in the builder.
  • _build_generation_validation_error() and _classify_generation_failure_kind() centralise parser-error classification at the ModelFacade level.
  • Both _fan_out_with_threads and _fan_out_with_async now pass column_name in the worker context dict.
  • _worker_error_callback now raises RuntimeError on missing context rather than silently skipping the failed record.
  • handle_llm_exceptions strips trailing periods from detail before composing the cause string, preventing double punctuation.
  • New tests cover schema-validation warnings, timeout warnings, preserved parser detail, and column_name propagation for both concurrency paths.
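
The trailing-period handling mentioned above can be sketched as below. `compose_cause` is a hypothetical name, and the "Validation detail:" layout is inferred from the reporting examples in the PR description rather than copied from `handle_llm_exceptions`.

```python
from __future__ import annotations


def compose_cause(summary: str, detail: str | None) -> str:
    """Join a summary and an optional detail with exactly one period after each part."""
    base = summary.rstrip(".")
    if not detail:
        return f"{base}."
    # rstrip(".") removes every trailing period, so "property." and "property" compose identically.
    return f"{base}. Validation detail: {detail.rstrip('.')}."


cause = compose_cause(
    "The model output from 'test-model' could not be parsed into the requested format.",
    "Response doesn't match requested <response_schema> 'name' is a required property.",
)
print(cause)
```

One side effect of this approach is that a detail ending in an ellipsis loses all of its trailing dots, which may or may not matter for a given log format.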

Two issues remain worth attention:

  • In _make_error_callback, progress_tracker.record_failure() is never reached when _worker_error_callback raises RuntimeError (the context-missing path), leaving the progress bar one count short for that slot.
  • The "schema" in detail fallback keyword in _classify_worker_failure is broad enough that unrelated errors whose formatted cause mentions "schema" will be mis-labelled "schema validation" in the warning log.

Confidence Score: 3/5

  • Safe to merge with minor fixes — core reporting logic is correct, but two small behavioral gaps remain in the error-callback path.
  • The primary goal of the PR (structured failure reporting with detail and failure_kind propagation) is implemented correctly and is well-tested end-to-end. The two remaining issues — progress_tracker.record_failure() not being called on the RuntimeError path, and the overly broad "schema" in detail fallback — are real logic gaps. The RuntimeError-path issue only manifests for a context=None case that should never occur in production given the updated fan-out code, but the absence of a test covering _make_error_callback means the gap is invisible in CI. The broad keyword match could produce misleading classification labels in operator logs for non-schema errors. Neither issue blocks functionality today, but both should be addressed before the pattern is extended further.
  • column_wise_builder.py — specifically _classify_worker_failure (lines 527–528) and the _make_error_callback / _worker_error_callback interaction (lines 418–423, 545–550).

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py | Adds column_name to fan-out context, new _classify_worker_failure / _extract_failure_detail / _format_worker_failure_warning helpers, and a fail-loud RuntimeError in _worker_error_callback. Two issues: progress_tracker.record_failure() is skipped when RuntimeError is raised, and the "schema" in detail fallback check is too broad. |
| packages/data-designer-engine/src/data_designer/engine/models/errors.py | Introduces _normalize_error_detail, expands GenerationValidationFailureError and ModelGenerationValidationFailureError with structured detail and failure_kind attributes, and fixes the trailing-period double-punctuation bug. Changes look correct and well-tested. |
| packages/data-designer-engine/src/data_designer/engine/models/facade.py | Adds _classify_generation_failure_kind and _build_generation_validation_error helpers; all four ParserException raise sites in generate and agenerate now use the helper to carry structured detail and failure_kind through to GenerationValidationFailureError. Logic is symmetric and well-tested. |
| packages/data-designer-engine/tests/engine/dataset_builders/test_column_wise_builder.py | Adds tests for schema-validation detail, timeout detail, missing-context RuntimeError, and column_name propagation in both thread and async fan-out paths. Coverage is solid for the happy and fail-loud paths, but no test covers _make_error_callback to verify progress_tracker.record_failure() is (or is not) called when RuntimeError propagates. |
| packages/data-designer-engine/tests/engine/models/test_facade.py | New tests verify that both generate and agenerate surface detail and failure_kind on the raised ModelGenerationValidationFailureError. Tests are clear and symmetric. |
| packages/data-designer-engine/tests/engine/models/test_model_errors.py | Updated parametrized fixture and two new standalone tests confirm failure_kind propagation and double-period stripping in handle_llm_exceptions. All look correct. |

Sequence Diagram

sequenceDiagram
    participant MF as ModelFacade.generate()
    participant BVE as _build_generation_validation_error()
    participant GVFE as GenerationValidationFailureError
    participant HLE as handle_llm_exceptions()
    participant MGVFE as ModelGenerationValidationFailureError
    participant WEC as _worker_error_callback()
    participant CWF as _classify_worker_failure()
    participant EFD as _extract_failure_detail()
    participant LOG as logger.warning()

    MF->>MF: parser(response) → raises ParserException
    MF->>BVE: _build_generation_validation_error(summary, exc)
    BVE->>BVE: _classify_generation_failure_kind(exc)
    Note right of BVE: "schema_validation" or "parse_error"
    BVE-->>MF: GenerationValidationFailureError(summary, detail, failure_kind)
    MF->>HLE: catch_llm_exceptions → handle_llm_exceptions(exc)
    HLE->>HLE: detail_text = exc.detail.rstrip(".")
    HLE-->>MGVFE: raise ModelGenerationValidationFailureError(msg, detail, failure_kind)
    MGVFE->>WEC: _worker_error_callback(exc, context={index, column_name})
    WEC->>WEC: _format_worker_failure_warning(exc, context)
    WEC->>CWF: _classify_worker_failure(exc)
    CWF->>CWF: getattr(exc, "failure_kind") → "schema_validation"
    CWF-->>WEC: "schema validation"
    WEC->>EFD: _extract_failure_detail(exc)
    EFD->>EFD: getattr(exc, "detail") → structured detail string
    EFD-->>WEC: detail string
    WEC->>LOG: "⚠️ Generation for record at index N failed in column 'X' (schema validation). Detail: ..."
    WEC->>WEC: _records_to_drop.add(context["index"])
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py
Line: 545-550

Comment:
**`progress_tracker.record_failure()` skipped when `RuntimeError` is raised**

`_make_error_callback` calls `progress_tracker.record_failure()` only if `_worker_error_callback` returns normally. When `context` is `None` or missing `"index"`, `_worker_error_callback` raises `RuntimeError` before returning, so `record_failure()` is never reached. This means the progress tracker will undercount failures for that execution slot.

In the thread executor path the `RuntimeError` is caught by Python's done-callback machinery and swallowed (as already noted). In the async path it is caught by the explicit `try/except Exception` in `_run_task`. In both cases `record_failure()` is never called, leaving the progress bar one count short.

A minimal fix is to call `record_failure()` before the guard raises:

```python
def _worker_error_callback(self, exc: Exception, *, context: dict | None = None) -> None:
    """If a worker fails, we can handle the exception here."""
    logger.warning(self._format_worker_failure_warning(exc, context=context))
    if context is None or "index" not in context:
        raise RuntimeError("Worker error callback called without a valid context index.")
    self._records_to_drop.add(context["index"])
```

Alternatively, move `progress_tracker.record_failure()` into a `finally` block inside the wrapper in `_make_error_callback` so it is always called regardless of whether `_worker_error_callback` raises.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py
Line: 527-528

Comment:
**Overly broad `"schema" in detail` fallback may misclassify unrelated errors**

The condition `"schema" in detail` runs for any exception that does not carry a structured `failure_kind` attribute (e.g., `ModelStructuredOutputError`, `ModelBadRequestError`, or future custom errors). If such an error's formatted cause string happens to contain the word "schema" — for instance, a bad-request error referencing an "API schema" or a structured-output error mentioning "output schema" — it will be labelled `"schema validation"` in the warning, which can mislead operators diagnosing the root cause.

The first sub-condition `"response_schema" in detail` is already highly specific. Replacing the broader fallback with an equally precise keyword (or removing it) would prevent false-positive classification:

```python
if "response_schema" in detail or "model_validate" in detail or "doesn't match requested" in detail:
    return "schema validation"
```

Alternatively, demote the catch-all `"schema"` check to after the `"validation"` check so that less-specific matches are only reached if nothing else fits.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: f0f12fe

@eric-tramel
Contributor Author

Follow-up update pushed in b06c4734 to address the Greptile notes:

  • failure_kind now propagates through ModelGenerationValidationFailureError and is consumed directly by the dataset-builder warning path.
  • Added test_fan_out_with_async_passes_column_name_in_context for async parity with the thread path.
  • Removed the redundant pre-normalization in facade.py.

Re-ran:
UV_CACHE_DIR=/tmp/uv-cache uv run --no-sync --group dev pytest packages/data-designer-engine/tests/engine/models packages/data-designer-engine/tests/engine/dataset_builders

@eric-tramel eric-tramel changed the title from "Improve generation failure reporting for schema and timeout failures" to "feat: Improve generation failure reporting for schema and timeout failures" Mar 13, 2026
@eric-tramel eric-tramel self-assigned this Mar 13, 2026
@eric-tramel
Contributor Author

I'd like to get more useful logging in place to help diagnose structured-output parsing failures.

@eric-tramel
Contributor Author

Follow-up update pushed in df9cfe11 to address the newest Greptile notes:

  • _worker_error_callback() now fails loudly if the context index is missing, rather than silently keeping the failed record.
  • Added async parity coverage for agenerate() to verify validation detail and failure_kind propagation.
  • Removed the double-punctuation risk in handle_llm_exceptions() when validation detail already ends with a period.

Re-ran:
UV_CACHE_DIR=/tmp/uv-cache uv run --no-sync --group dev pytest packages/data-designer-engine/tests/engine/models packages/data-designer-engine/tests/engine/dataset_builders

Comment on lines 545 to 550
     def _worker_error_callback(self, exc: Exception, *, context: dict | None = None) -> None:
         """If a worker fails, we can handle the exception here."""
-        logger.warning(
-            f"⚠️ Generation for record at index {context['index']} failed. "
-            f"Will omit this record from the dataset.\n{exc}"
-        )
+        logger.warning(self._format_worker_failure_warning(exc, context=context))
+        if context is None or "index" not in context:
+            raise RuntimeError("Worker error callback called without a valid context index.")
         self._records_to_drop.add(context["index"])

RuntimeError silently swallowed by both executor paths

The RuntimeError raised here is intended to provide fail-loud behaviour when context is missing, but both executor implementations prevent it from ever reaching the main thread:

  • Thread path (ConcurrentThreadExecutor): _error_callback is called at line 188 of concurrency.py, outside any try/except inside _handle_future. _handle_future is registered as a done_callback via future.add_done_callback(_handle_future). Python's concurrent.futures catches and only logs (via LOGGER.exception) any exception raised from a done-callback — it does not propagate to the caller of _fan_out_with_threads.
  • Async path (AsyncConcurrentExecutor): The callback is already wrapped in an explicit try/except Exception at lines 215–219 of async_concurrency.py, which catches the RuntimeError and logs "error_callback raised an exception".

In both production paths the RuntimeError is silently swallowed. The test test_worker_error_callback_requires_context_index exercises the direct method call — not the integrated path — so this gap is not caught by the test suite.

Because the _records_to_drop.add(context["index"]) line is never reached when context is missing, the failed record is also not dropped from the dataset (the exact silent-data-corruption risk flagged in a previous review comment).

Consider moving the context validation to the outermost fan-out site (before the executor loop) where the exception can propagate freely, or guard the record-drop with a logged error and a bare return so the callback's side-effect is clearly skipped without the false-safe of an unreachable raise.
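
The swallowing behaviour on the thread path can be reproduced with the standard library alone. The sketch below does not use the project's ConcurrentThreadExecutor; `failing_task`, `on_done`, and `records_to_drop` are hypothetical stand-ins that mimic the guard under review.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

# concurrent.futures logs callback exceptions on this logger; silence it to keep the demo quiet.
logging.getLogger("concurrent.futures").setLevel(logging.CRITICAL)

records_to_drop = set()


def failing_task():
    raise ValueError("generation failed")


def on_done(future, context=None):
    """Done-callback that mimics the guard: raise when no context index is available."""
    if future.exception() is None:
        return
    if context is None or "index" not in context:
        raise RuntimeError("callback called without a valid context index")
    records_to_drop.add(context["index"])


propagated = False
try:
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(failing_task)
        # add_done_callback passes only the future, so context stays None here.
        future.add_done_callback(on_done)
except RuntimeError:
    propagated = True

print("RuntimeError reached the caller:", propagated)
print("records marked for dropping:", records_to_drop)
```

The caller never sees the `RuntimeError`, and nothing is added to `records_to_drop`, which is the gap this comment describes.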

Contributor

@nabinchha nabinchha left a comment

Approved — see review comment below.

@nabinchha
Contributor

Findings

Critical — Must fix before merge

No critical issues found.

Warnings — Strongly recommend fixing

column_wise_builder.py:548-549: RuntimeError is unreachable in production

  • What: The RuntimeError("Worker error callback called without a valid context index.") raised at line 549 will never propagate to the main thread in either executor path. In the thread executor, _error_callback is invoked inside a done_callback registered via future.add_done_callback — Python's concurrent.futures silently swallows exceptions from done callbacks. In the async executor, the callback is wrapped in an explicit try/except Exception (line 216-219 of async_concurrency.py) that catches and logs the RuntimeError.
  • Why: The raise creates a false sense of safety. The warning message claims "Will omit this record from the dataset" but _records_to_drop.add(...) is never reached, so the corrupted record silently remains. The existing test (test_worker_error_callback_requires_context_index) exercises the method directly, not through the executor, so it doesn't catch this gap.
  • Suggestion: Replace the raise with a logged error and explicit return. The defensive check is still valuable — it just shouldn't rely on an exception that can't propagate:
def _worker_error_callback(self, exc: Exception, *, context: dict | None = None) -> None:
    """If a worker fails, we can handle the exception here."""
    logger.warning(self._format_worker_failure_warning(exc, context=context))
    if context is None or "index" not in context:
        logger.error("Worker error callback called without a valid context index; record cannot be dropped.")
        return
    self._records_to_drop.add(context["index"])

This was flagged in a prior review; the current commit addressed it by adding the RuntimeError, but the underlying swallowing issue in both executors remains.


column_wise_builder.py:540-542 — Warning message is misleading when index is unknown

  • What: _format_worker_failure_warning always says "Will omit this record from the dataset" even when record_index is "unknown" — i.e., when the record cannot actually be omitted because there's no valid index to add to _records_to_drop.
  • Why: Misleading log messages make incident investigation harder. An operator reading "Will omit this record" would assume the corrupted record was dropped.
  • Suggestion: Adjust the message when the index is unknown:
omit_msg = (
    "Will omit this record from the dataset."
    if record_index != "unknown"
    else "Record index is unknown; this record may not be omitted correctly."
)

This was also flagged in a prior review and remains unaddressed.


column_wise_builder.py:508-531 — Fallback text classification in _classify_worker_failure can diverge from structured failure_kind

  • What: When failure_kind is not set on the exception (e.g., for non-ModelGenerationValidationFailureError errors), _classify_worker_failure falls back to string matching on the detail and exception class name. Several of these heuristics are fragile — for example, "schema" in detail would match any error message that happens to contain the word "schema" even if it's not a schema validation failure. Similarly, "validation" in exc_name would match any custom exception class with "validation" in its name.
  • Why: The structured failure_kind path (line 510-512) is the reliable path. The fallback heuristics are a best-effort classification that could misclassify errors, leading to confusing warning messages.
  • Suggestion: Consider adding a comment documenting that the text-matching branch is best-effort and may misclassify. Also consider making the fallback return value more generic (e.g., just "error" instead of specific categories) to avoid false confidence in the classification.
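
A rough sketch of the structured-first classification with a deliberately generic fallback, as suggested above. The function and the `FakeValidationFailure` class are hypothetical; only the `failure_kind` attribute name and the "schema_validation" to "schema validation" mapping come from this PR.

```python
def classify_worker_failure(exc: Exception) -> str:
    """Prefer the structured failure_kind; otherwise avoid guessing from message text."""
    kind = getattr(exc, "failure_kind", None)
    if isinstance(kind, str) and kind:
        # Structured path: "schema_validation" -> "schema validation".
        return kind.replace("_", " ")
    # Best-effort fallback: a generic label cannot mislabel an unrelated error.
    return "error"


class FakeValidationFailure(Exception):
    """Hypothetical exception that carries a structured failure_kind."""

    def __init__(self, message: str, failure_kind: str) -> None:
        super().__init__(message)
        self.failure_kind = failure_kind


structured_label = classify_worker_failure(FakeValidationFailure("bad output", "schema_validation"))
fallback_label = classify_worker_failure(ValueError("payload rejected by API schema"))
print(structured_label, "|", fallback_label)
```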

Suggestions — Consider improving

test_column_wise_builder.py:213-231 — Test for RuntimeError exercises the wrong path

  • What: test_worker_error_callback_requires_context_index calls _worker_error_callback directly, which correctly triggers the RuntimeError. However, in production, this method is only called from within executor callbacks where the RuntimeError is swallowed (as noted in the warning above).
  • Why: The test provides coverage of the method in isolation but doesn't verify the actual production behavior. If the RuntimeError is replaced with a return (per the suggestion above), this test should be updated to verify that _records_to_drop remains empty and the error is logged.
  • Suggestion: If you keep the RuntimeError, add an integration-level test that exercises the error callback through the actual executor to verify the exception's effect (or lack thereof) in the real execution context.

What Looks Good

  • Clean error propagation chain: The failure_kind and detail attributes flow naturally from ParserException → GenerationValidationFailureError → ModelGenerationValidationFailureError → _classify_worker_failure. The structured attribute is checked first before falling back to text matching, which is the right design.

  • Thorough test coverage: New tests cover schema validation warnings, timeout warnings, preserved parser detail, double-period stripping, missing context handling, and worker context propagation for both thread and async fan-out paths. The parametric facade tests for both sync and async paths are well-structured.

  • Good separation of concerns: _extract_failure_detail, _classify_worker_failure, and _format_worker_failure_warning are cleanly separated as static/class methods, making them independently testable and easy to reason about.

Verdict

Ship it (with nits) — The core error reporting improvement is well-designed and thoroughly tested. The RuntimeError swallowing issue (carried over from a prior review) is the most substantive concern but is a defensive-path issue that only matters if a future code change accidentally omits context — which is unlikely given the current fan-out code always provides it. The misleading warning message for unknown indices is a minor UX issue. Neither blocks merge.

Next steps

  • Consider addressing the RuntimeError → logger.error + return change as a quick follow-up, since it was flagged in the prior review round and the current fix doesn't fully resolve it.
  • The warning message wording for unknown indices could be tightened in the same follow-up.

@eric-tramel eric-tramel merged commit 5783435 into main Mar 13, 2026
47 checks passed
Comment on lines 545 to 550
     def _worker_error_callback(self, exc: Exception, *, context: dict | None = None) -> None:
         """If a worker fails, we can handle the exception here."""
-        logger.warning(
-            f"⚠️ Generation for record at index {context['index']} failed. "
-            f"Will omit this record from the dataset.\n{exc}"
-        )
+        logger.warning(self._format_worker_failure_warning(exc, context=context))
+        if context is None or "index" not in context:
+            raise RuntimeError("Worker error callback called without a valid context index.")
         self._records_to_drop.add(context["index"])

progress_tracker.record_failure() skipped when RuntimeError is raised

_make_error_callback calls progress_tracker.record_failure() only if _worker_error_callback returns normally. When context is None or missing "index", _worker_error_callback raises RuntimeError before returning, so record_failure() is never reached. This means the progress tracker will undercount failures for that execution slot.

In the thread executor path the RuntimeError is caught by Python's done-callback machinery and swallowed (as already noted). In the async path it is caught by the explicit try/except Exception in _run_task. In both cases record_failure() is never called, leaving the progress bar one count short.

A minimal fix is to call record_failure() before the guard raises:

def _worker_error_callback(self, exc: Exception, *, context: dict | None = None) -> None:
    """If a worker fails, we can handle the exception here."""
    logger.warning(self._format_worker_failure_warning(exc, context=context))
    if context is None or "index" not in context:
        raise RuntimeError("Worker error callback called without a valid context index.")
    self._records_to_drop.add(context["index"])

Alternatively, move progress_tracker.record_failure() into a finally block inside the wrapper in _make_error_callback so it is always called regardless of whether _worker_error_callback raises.
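
The `finally` alternative could look roughly like the sketch below. The real signature of `_make_error_callback` is not shown in this thread, so `make_error_callback`, its parameters, and `FakeProgressTracker` are hypothetical stand-ins.

```python
class FakeProgressTracker:
    """Hypothetical tracker that only counts failures."""

    def __init__(self) -> None:
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1


def make_error_callback(worker_error_callback, progress_tracker):
    """Wrap the worker callback so the failure is counted even if the callback raises."""

    def callback(exc, *, context=None):
        try:
            worker_error_callback(exc, context=context)
        finally:
            # Runs on both the normal path and the raising path.
            progress_tracker.record_failure()

    return callback


def raising_worker_callback(exc, *, context=None):
    if context is None or "index" not in context:
        raise RuntimeError("Worker error callback called without a valid context index.")


tracker = FakeProgressTracker()
callback = make_error_callback(raising_worker_callback, tracker)

raised = False
try:
    callback(ValueError("generation failed"), context=None)
except RuntimeError:
    raised = True

print("failures recorded:", tracker.failures, "| RuntimeError raised:", raised)
```

The exception still propagates out of the wrapper, so this only fixes the undercount; whether the executor swallows the `RuntimeError` afterwards is a separate question.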

Comment on lines +527 to +528
+        if "response_schema" in detail or "schema" in detail:
+            return "schema validation"

Overly broad "schema" in detail fallback may misclassify unrelated errors

The condition "schema" in detail runs for any exception that does not carry a structured failure_kind attribute (e.g., ModelStructuredOutputError, ModelBadRequestError, or future custom errors). If such an error's formatted cause string happens to contain the word "schema" — for instance, a bad-request error referencing an "API schema" or a structured-output error mentioning "output schema" — it will be labelled "schema validation" in the warning, which can mislead operators diagnosing the root cause.

The first sub-condition "response_schema" in detail is already highly specific. Replacing the broader fallback with an equally precise keyword (or removing it) would prevent false-positive classification:

if "response_schema" in detail or "model_validate" in detail or "doesn't match requested" in detail:
    return "schema validation"

Alternatively, demote the catch-all "schema" check to after the "validation" check so that less-specific matches are only reached if nothing else fits.
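
The false-positive risk can be shown with a small comparison of the two keyword sets. The sample messages are invented for illustration, and `matches` is a hypothetical helper rather than code from the builder.

```python
# Keyword sets: the one under review versus the narrower one proposed above.
BROAD_KEYWORDS = ("response_schema", "schema")
NARROW_KEYWORDS = ("response_schema", "model_validate", "doesn't match requested")


def matches(detail: str, keywords) -> bool:
    """Return True if any keyword occurs as a substring of the detail text."""
    return any(keyword in detail for keyword in keywords)


# Invented examples: one unrelated bad-request message, one genuine schema failure.
unrelated = "Bad request: payload rejected by the provider's API schema."
genuine = "Response doesn't match requested <response_schema> 'name' is a required property."

for label, text in (("unrelated", unrelated), ("genuine", genuine)):
    print(label, "| broad:", matches(text, BROAD_KEYWORDS), "| narrow:", matches(text, NARROW_KEYWORDS))
```

The broad set labels both messages as schema validation, while the narrow set matches only the genuine one.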
