Add "diffing" functionality to ensure source data and metadata info is up to date #144

@jh-RLI

Description of the issue

A common failure mode in data pipelines is when the actual database schema drifts from the documented metadata (e.g., someone adds a feedin_v2 column to the DB but forgets to document it).

Ideas of solution

If the DB has columns that are missing in the OEMetadata, or if the metadata documents columns that no longer exist in the DB, omi should throw a clear SchemaDriftWarning or fail the build.
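The check described above could be sketched roughly as follows. This is only an illustration, not omi's actual API: `SchemaDriftWarning`, `diff_columns`, and `check_schema_drift` are hypothetical names, and the sketch assumes the metadata is a plain dict in which column names live under `resources[].schema.fields[].name`. The list of database columns is passed in as plain strings, so how they are fetched from the DB is left open.

```python
import warnings


class SchemaDriftWarning(UserWarning):
    """Hypothetical warning category, defined here only for illustration."""


def diff_columns(db_columns, metadata):
    """Compare DB column names with the column names documented in the metadata."""
    # Assumption: fields are stored under resources[].schema.fields[].name.
    documented = {
        field["name"]
        for resource in metadata.get("resources", [])
        for field in resource.get("schema", {}).get("fields", [])
    }
    actual = set(db_columns)
    return {
        "undocumented": sorted(actual - documented),  # in the DB, not in the metadata
        "stale": sorted(documented - actual),  # in the metadata, not in the DB
    }


def check_schema_drift(db_columns, metadata, strict=False):
    """Warn on drift, or raise if strict=True (e.g. to fail a build)."""
    drift = diff_columns(db_columns, metadata)
    if drift["undocumented"] or drift["stale"]:
        message = (
            f"Schema drift detected: undocumented columns {drift['undocumented']}, "
            f"stale metadata fields {drift['stale']}"
        )
        if strict:
            raise ValueError(message)
        warnings.warn(message, SchemaDriftWarning, stacklevel=2)
    return drift
```

Returning the diff as a dict rather than only warning would let a caller decide whether to report the drift, fail, or feed it into an update step.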

There should also be an update mechanism. Assuming you manage your metadata locally with omi and use the OEP to manage your data, you should be able to either push or pull changes.

NOTE: Since changing something in the database is not easy, pulling changes from the database into the metadata is the feasible option here. Pushing changes should possibly only be done with the OEP API.
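A minimal sketch of the "pull" direction, under stated assumptions: `pull_fields` is a hypothetical helper, the database columns are given as a mapping from column name to type string, and a metadata resource is a dict with its fields under `schema.fields`. Existing field entries are kept so that hand-written descriptions survive; new columns get an empty stub that still has to be documented; fields whose column no longer exists are dropped.

```python
import copy


def pull_fields(db_columns, resource):
    """Return a copy of one metadata resource whose fields mirror the DB columns.

    db_columns: mapping of column name -> type string (assumed input format).
    resource: one metadata resource dict with fields under schema.fields.
    """
    updated = copy.deepcopy(resource)  # never mutate the caller's metadata
    schema = updated.setdefault("schema", {})
    existing = {field["name"]: field for field in schema.get("fields", [])}
    schema["fields"] = [
        # Keep the documented entry if present, otherwise create a stub.
        existing.get(name, {"name": name, "type": col_type, "description": ""})
        for name, col_type in db_columns.items()
    ]
    return updated
```

Silently dropping stale fields is a design choice of this sketch; a real implementation might instead report them and ask for confirmation before removing documentation.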
