DataFrame Merge method throws exception when one of the DataFrames is empty. #7572

@sevenzees

System Information:

  • OS & Version: All
  • ML.NET Version: 5.0.0
  • .NET Version: 8

Describe the bug
When calling the DataFrame Merge method, if either DataFrame is empty (has no rows), the call throws a System.ArgumentOutOfRangeException.

To Reproduce
This code will reproduce the behavior:
using Microsoft.Data.Analysis;

// Left side: columns are defined, but there are no rows.
DataFrame left = new DataFrame(
    new Int32DataFrameColumn("Index"),
    new Int32DataFrameColumn("L1"),
    new Int32DataFrameColumn("L2"),
    new StringDataFrameColumn("L3")
);
// Right side: three rows.
DataFrame right = new DataFrame(
    new Int32DataFrameColumn("Index", new[] { 0, 1, 2 }),
    new Int32DataFrameColumn("R1", new[] { 0, 1, 1 }),
    new Int32DataFrameColumn("R2", new[] { 1, 1, 2 }),
    new StringDataFrameColumn("R3", new[] { "Z", "Y", "B" })
);
// Throws System.ArgumentOutOfRangeException because left has no rows.
DataFrame merged = left.Merge(right, ["L1"], ["R1"], joinAlgorithm: JoinAlgorithm.Left);

Expected behavior
Expected an empty DataFrame like this one instead of an ArgumentOutOfRangeException:
DataFrame expectedResult = new DataFrame(
    new Int32DataFrameColumn("Index_left"),
    new Int32DataFrameColumn("L1"),
    new Int32DataFrameColumn("L2"),
    new StringDataFrameColumn("L3"),
    new Int32DataFrameColumn("Index_right"),
    new Int32DataFrameColumn("R1"),
    new Int32DataFrameColumn("R2"),
    new StringDataFrameColumn("R3")
);

Screenshots, Code, Sample Projects
N/A

Additional context
I have code ready to fix this and will create a PR.
