Add a prep recipe engine for inspectable data cleaning workflows #47

@daniel-thom

Summary

Add a prep recipe engine for inspectable, reversible data-cleaning and reshaping operations that AI can suggest but not execute opaquely.

Problem

If datasight adds AI-assisted cleaning, it needs a constrained execution model. Free-form AI-generated SQL or code is hard to trust, explain, preview, and export.

Proposed scope

  • Define a versioned prep recipe format.
  • Support an initial set of operations such as:
    • unpivot / melt
    • cast / rename
    • split composite columns
    • fill missing timestamps
    • resample grain
    • interpolate numeric values
    • forward-fill selected fields
  • Provide preview and apply flows.
  • Allow AI to propose recipes or recipe fragments, with the system validating and executing them.
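
One possible shape for the recipe model and its validation step is sketched below. Nothing here is decided: the operation names, parameter names, `Step`, `Recipe`, and `validate` are all hypothetical and exist only to illustrate the idea that an AI proposal is plain data that the system checks against a closed list of operations before anything runs.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical schema: the issue does not define one yet.
SUPPORTED_VERSIONS = {1}

# Required parameter names per operation (illustrative, not a spec).
OPERATIONS: dict[str, set[str]] = {
    "unpivot": {"id_columns", "value_columns"},
    "cast": {"column", "dtype"},
    "rename": {"mapping"},
    "split_column": {"column", "separator", "into"},
    "fill_missing_timestamps": {"column", "frequency"},
    "resample": {"column", "frequency", "aggregate"},
    "interpolate": {"columns"},
    "forward_fill": {"columns"},
}


@dataclass(frozen=True)
class Step:
    op: str
    params: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Recipe:
    version: int
    steps: tuple[Step, ...]


def validate(raw: dict[str, Any]) -> Recipe:
    """Turn an untrusted mapping (for example AI output) into a Recipe, or raise ValueError."""
    version = raw.get("version")
    if not isinstance(version, int) or version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported recipe version: {version!r}")

    steps = []
    for index, item in enumerate(raw.get("steps", [])):
        if not isinstance(item, dict):
            raise ValueError(f"step {index}: expected a mapping")
        op = item.get("op")
        if op not in OPERATIONS:
            # Anything outside the closed list is rejected, including raw SQL or code.
            raise ValueError(f"step {index}: unknown operation {op!r}")
        params = {key: value for key, value in item.items() if key != "op"}
        missing = OPERATIONS[op] - set(params)
        if missing:
            raise ValueError(f"step {index}: missing parameters {sorted(missing)}")
        steps.append(Step(op=op, params=params))

    return Recipe(version=version, steps=tuple(steps))
```

Under this sketch, a proposed step such as `{"op": "run_sql", ...}` fails validation instead of being executed, which is the constraint the Problem section asks for.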

CLI sketch

datasight prep suggest
datasight prep apply recipe.yaml
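
For discussion, the two commands above could be wired up roughly as follows. This is a sketch only: it uses the standard library's `argparse` as a stand-in, since this issue does not say which CLI framework datasight uses, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for `datasight prep suggest` and `datasight prep apply <recipe>`."""
    parser = argparse.ArgumentParser(prog="datasight")
    commands = parser.add_subparsers(dest="command", required=True)

    prep = commands.add_parser("prep", help="data preparation recipes")
    actions = prep.add_subparsers(dest="action", required=True)

    # suggest: propose a recipe; nothing is executed.
    actions.add_parser("suggest", help="propose a recipe without executing it")

    # apply: takes the recipe file as its only positional argument.
    apply_cmd = actions.add_parser("apply", help="validate and run a recipe file")
    apply_cmd.add_argument("recipe", help="path to a recipe file, e.g. recipe.yaml")
    return parser


args = build_parser().parse_args(["prep", "apply", "recipe.yaml"])
print(args.command, args.action, args.recipe)  # prints: prep apply recipe.yaml
```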

Acceptance criteria

  • There is a concrete recipe schema with versioning.
  • Recipes can be previewed before apply.
  • The first supported operations cover reshaping of untidy data and time-series gap handling.
  • AI-generated prep suggestions compile to the same recipe model used by deterministic workflows.
  • Tests cover recipe validation and execution for core operations.
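
To make the preview and shared-model criteria concrete, here is a minimal sketch in plain Python. It assumes rows are dictionaries and steps are mappings with an `op` key; `run`, `preview`, `HANDLERS`, and the two operation functions are hypothetical names, and a real implementation would likely sit on top of a dataframe or SQL engine instead.

```python
from copy import deepcopy
from typing import Any

Row = dict[str, Any]


def unpivot(rows: list[Row], id_columns: list[str], value_columns: list[str]) -> list[Row]:
    """Wide to long: one output row per (input row, value column)."""
    return [
        {**{col: row[col] for col in id_columns}, "variable": name, "value": row[name]}
        for row in rows
        for name in value_columns
    ]


def forward_fill(rows: list[Row], columns: list[str]) -> list[Row]:
    """Replace None in the given columns with the last non-None value seen."""
    last: dict[str, Any] = {}
    out = []
    for row in rows:
        new = dict(row)
        for col in columns:
            if new.get(col) is None:
                new[col] = last.get(col)
            else:
                last[col] = new[col]
        out.append(new)
    return out


# Closed registry: a step can only name an operation listed here.
HANDLERS = {"unpivot": unpivot, "forward_fill": forward_fill}


def run(rows: list[Row], steps: list[dict[str, Any]]) -> list[Row]:
    """Execute steps in order on a copy; the input rows are never mutated."""
    current = deepcopy(rows)
    for step in steps:
        params = {key: value for key, value in step.items() if key != "op"}
        current = HANDLERS[step["op"]](current, **params)
    return current


def preview(rows: list[Row], steps: list[dict[str, Any]], limit: int = 5) -> list[Row]:
    """Dry run: the same executor as apply, restricted to the first `limit` input rows."""
    return run(rows[:limit], steps)


rows = [{"sensor": "s1", "t0": 1.0, "t1": None, "t2": 3.0}]
steps = [
    {"op": "unpivot", "id_columns": ["sensor"], "value_columns": ["t0", "t1", "t2"]},
    {"op": "forward_fill", "columns": ["value"]},
]
print([row["value"] for row in preview(rows, steps)])  # prints: [1.0, 1.0, 3.0]
```

Because `preview` and a full apply both go through `run`, an AI-suggested recipe and a hand-written one take the same code path, and the last few lines double as the kind of execution test the final criterion describes.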

Notes

  • This is a foundational issue for guided data cleaning and untidy-data remediation.
  • Keeping this layer inspectable is more important than making the first version broad.

Labels: enhancement