[BUG] CdmChangeFeedSource attempts to stream data from multiple locations #98

@george-zubrienko

Description

It seems that the export location of a Dynamics data lake (Synapse) export can sometimes change. However, because we search for the table's blobs using this code:

var tableBlobs = this.source.blobStorage.ListBlobsAsEnumerable(this.tablesPath).Where(blob =>
    blob.Name.Split("/")[^1].StartsWith($"{this.source.entityName.ToUpper()}_") &&
    blob.Name.EndsWith(".csv")).ToList();

and the file names do not change, the listing can match two different paths in this case: the old one with schemaA and the new one with schemaB. The stream then runs normally and aborts with a data type mismatch error once all rows from schemaA are exhausted.

Steps to reproduce the issue

  1. Create two different paths leading to the table with the same name, but different schemas
  2. Run the source
  3. Observe failure after rows from first path are exhausted

Describe the results you expected

Either a hard failure with an ERROR level message, or a warning and automatic selection of files from a newer path (preferred)
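A minimal sketch of the preferred behaviour (warn, then read only from the newer path), not the project's actual implementation. It assumes "newer" can be judged by the most recent modification time among the matching files in each parent directory. `BlobEntry`, `TableBlobSelector`, `SelectFromNewestLocation` and the `warn` callback are hypothetical names introduced for illustration; the real blob type returned by `ListBlobsAsEnumerable` may expose different properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the blob metadata returned by the storage client.
public record BlobEntry(string Name, DateTimeOffset LastModified);

public static class TableBlobSelector
{
    // Returns the blob path without its final segment (the file name).
    private static string ParentPath(string blobName)
    {
        var idx = blobName.LastIndexOf('/');
        return idx < 0 ? string.Empty : blobName.Substring(0, idx);
    }

    // Applies the same name filter as the issue's snippet, then groups matches
    // by parent directory and keeps only the most recently modified directory.
    public static List<BlobEntry> SelectFromNewestLocation(
        IEnumerable<BlobEntry> blobs, string entityName, Action<string> warn)
    {
        var prefix = $"{entityName.ToUpper()}_";
        var byLocation = blobs
            .Where(b => b.Name.Split('/')[^1].StartsWith(prefix, StringComparison.Ordinal)
                        && b.Name.EndsWith(".csv", StringComparison.Ordinal))
            .GroupBy(b => ParentPath(b.Name))
            .OrderByDescending(g => g.Max(b => b.LastModified))
            .ToList();

        if (byLocation.Count == 0)
        {
            return new List<BlobEntry>();
        }

        if (byLocation.Count > 1)
        {
            // Assumption: the location with the latest write is the current export path.
            warn($"Entity {entityName} found in {byLocation.Count} locations; using '{byLocation[0].Key}'");
        }

        return byLocation[0].ToList();
    }
}
```

For the hard-failure variant, the `warn` call could be replaced with throwing an exception after logging at ERROR level.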

System information

No response

Metadata

Assignees

No one assigned

Labels

code/bug (Something isn't working)
