@mehiel mehiel commented Dec 15, 2025

The default on_conflict: :nothing behavior uses ON DUPLICATE KEY UPDATE x = x
which has the following characteristics:

  • Reports 1 row affected even when a row was skipped due to a duplicate key
    conflict, because the UPDATE clause still matches the existing row
  • Returns {1, nil} instead of {0, nil} for ignored duplicates
  • The UPDATE clause could trigger unnecessary row-level operations

For users who need accurate row counts (0 when ignored, 1 when inserted),
INSERT IGNORE can be explicitly enabled via:

Repo.insert_all(Post, posts,
  on_conflict: :nothing,
  conflict_target: {:unsafe_fragment, "insert_ignore"})

With INSERT IGNORE:

  • Properly ignores duplicate key conflicts without affecting existing rows
  • Returns accurate affected row count (0 for duplicates)
  • The adapter handles num_rows: 0 by returning {:ok, []} since 0 rows is
    expected behavior when all rows are duplicates

This approach was chosen to avoid modifying Ecto core with a new on_conflict
type like :ignore for MySQL-specific intricacies. By leveraging the existing
{:unsafe_fragment, _} mechanism, MySQL users can opt into INSERT IGNORE
semantics when needed while maintaining backward compatibility.

Note that INSERT IGNORE has broader semantics in MySQL - it ignores certain
type conversion errors in addition to duplicate key conflicts - which is why
it's not the default behavior.
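
For illustration, the two statement shapes compared above can be rendered by a small helper. This is only a sketch: `InsertSketch` and its `sql/3` function are hypothetical names, not part of ecto_sql or MyXQL, and the SQL the adapter actually generates may differ in quoting and column choice.

```elixir
defmodule InsertSketch do
  # Hypothetical helper, not part of ecto_sql: renders the two MySQL
  # statement shapes discussed above for a single-row insert.
  def sql(table, columns, mode) do
    cols = Enum.map_join(columns, ", ", fn col -> "`#{col}`" end)
    params = Enum.map_join(columns, ", ", fn _ -> "?" end)

    case mode do
      :noop_update ->
        # Assumption: the first column is used for the no-op `x = x` update.
        first = hd(columns)

        "INSERT INTO `#{table}` (#{cols}) VALUES (#{params}) " <>
          "ON DUPLICATE KEY UPDATE `#{first}` = `#{first}`"

      :insert_ignore ->
        "INSERT IGNORE INTO `#{table}` (#{cols}) VALUES (#{params})"
    end
  end
end

IO.puts(InsertSketch.sql("posts", [:id, :title], :noop_update))
IO.puts(InsertSketch.sql("posts", [:id, :title], :insert_ignore))
```

The first form leaves every non-duplicate error intact, while the second also downgrades other errors, which is the trade-off the note above describes.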

ORIGINAL POST

The previous implementation used ON DUPLICATE KEY UPDATE col = col which had incorrect semantics:

  1. It reported 1 row affected even when a row was skipped due to a duplicate key conflict, because the UPDATE clause still matched the existing row
  2. This caused insert_all to return {1, nil} instead of {0, nil} for ignored duplicates, misrepresenting the actual number of inserted records
  3. The UPDATE clause could trigger unnecessary row-level operations

This change:

  • Uses INSERT IGNORE INTO which properly ignores duplicate key conflicts without affecting existing rows or incrementing the affected row count
  • Handles num_rows: 0 in the adapter by returning {:ok, []} for on_conflict: :nothing, since 0 rows is expected behavior when all rows are duplicates
  • Updates tests to verify correct row counts: {0, nil} for all-duplicates, {N, nil} for N successfully inserted non-duplicate rows
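
The row counts these tests check can be illustrated with a toy model. No database is involved here: `RowCountSketch` is a hypothetical module that only mirrors the counts as described in this PR (one affected row per duplicate for the no-op update, zero for INSERT IGNORE). What MySQL itself reports can depend on connection settings.

```elixir
defmodule RowCountSketch do
  # Toy model only. `existing` is a MapSet of unique keys already in the
  # table; `keys` are the unique keys of the rows being inserted.
  def reported(existing, keys, strategy) do
    {count, _seen} =
      Enum.reduce(keys, {0, existing}, fn key, {count, seen} ->
        duplicate? = MapSet.member?(seen, key)

        increment =
          case {strategy, duplicate?} do
            # A fresh row counts as one affected row under both strategies.
            {_, false} -> 1
            # Per this PR, the no-op UPDATE still matches the existing row.
            {:noop_update, true} -> 1
            # INSERT IGNORE discards the duplicate.
            {:insert_ignore, true} -> 0
          end

        {count + increment, MapSet.put(seen, key)}
      end)

    # Same shape as the insert_all return value discussed above.
    {count, nil}
  end
end

existing = MapSet.new([1, 2])
IO.inspect(RowCountSketch.reported(existing, [1, 2], :noop_update))
IO.inspect(RowCountSketch.reported(existing, [1, 2], :insert_ignore))
IO.inspect(RowCountSketch.reported(existing, [2, 3, 4], :insert_ignore))
```

In this model an all-duplicates batch yields `{2, nil}` under the no-op update but `{0, nil}` under INSERT IGNORE, and a batch with one duplicate and two new rows yields `{2, nil}` under INSERT IGNORE.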

@mehiel
Author

mehiel commented Dec 15, 2025

I guess this means at least a PATCH version. This should be considered a breaking change. Or we can somehow put it behind a configuration with the default being the existing behavior.

@greg-rychlewski
Member

I don't believe it is as simple as this. Please take a read here: https://dev.mysql.com/doc/refman/8.4/en/sql-mode.html#ignore-effect-on-execution

For example this part:

> INSERT: With IGNORE, rows that duplicate an existing row on a unique key value are discarded. Rows set to values that would cause data conversion errors are set to the closest valid values instead.
>
> For partitioned tables where no partition matching a given value is found, IGNORE causes the insert operation to fail silently for rows containing the unmatched value.

The issue seems to be that MySQL doesn't have any equivalent to conflict_target in Postgres. So now you have to choose between two suboptimal methods. The current one will at least still cause an error if an error unrelated to the conflict target happens.

@mehiel
Author

mehiel commented Dec 16, 2025

Yes, I read the docs, but for some reason the data conversion part didn't register with me. There are 3 parts in this section:

  1. ✔️ rows that duplicate an existing row on a unique key value are discarded - main reason to use it here
  2. rows set to values that would cause data conversion errors are set to the closest valid values instead - this is sneaky
  3. ✔️-ish for partitioned tables where no partition matching a given value is found, IGNORE causes the insert operation to fail silently - acceptable with on_conflict: :nothing

So if we have to choose our poison here, we're indeed stuck between 2 suboptimal solutions when we provide on_conflict: :nothing:

  1. accept wrong num_rows - we even have a unit test adding the same post twice and getting back num_rows 1 on the second insert
  2. accept automatic type conversions - this happens when we set on_conflict: :nothing, there is no conflict on a unique key, and a bad query finds its way through, e.g. inserting 'abc' into an integer column stores the value 0 (Ecto may prevent some of these, but still)

I think preventing 2 is more important for a core piece like ecto_sql. I would love to have a way to configure it, even at the query level, because an invalid num_rows is also a huge pain and is usually related to the use cases that need on_conflict: :nothing in the first place.

Thanks for pointing that out @greg-rychlewski.

The default on_conflict: :nothing behavior now uses the ON DUPLICATE KEY
UPDATE x = x workaround (restoring original behavior), which always reports
1 affected row regardless of whether the row was inserted or ignored.

For users who need accurate row counts (0 when ignored, 1 when inserted),
INSERT IGNORE can be explicitly enabled via:

    Repo.insert_all(Post, posts,
      on_conflict: :nothing,
      conflict_target: {:unsafe_fragment, "insert_ignore"})

This approach was chosen to avoid modifying Ecto core with a new on_conflict
type like :ignore for MySQL-specific intricacies. By leveraging the existing
{:unsafe_fragment, _} mechanism, MySQL users can opt into INSERT IGNORE
semantics when needed while maintaining backward compatibility.

Note that INSERT IGNORE has broader semantics in MySQL - it ignores certain
type conversion errors in addition to duplicate key conflicts - which is why
it's not the default behavior.
@mehiel mehiel changed the title from "MyXQL: Use INSERT IGNORE for on_conflict: :nothing" to "MyXQL: explicit option to use INSERT IGNORE for on_conflict: :nothing, confict_target: {:unsafe_fragment, "insert_ignore"}}" Dec 16, 2025
@mehiel mehiel changed the title from "MyXQL: explicit option to use INSERT IGNORE for on_conflict: :nothing, confict_target: {:unsafe_fragment, "insert_ignore"}}" to "MyXQL: allow use of INSERT IGNORE for on_conflict: :nothing" Dec 16, 2025
@mehiel
Author

mehiel commented Dec 16, 2025

@greg-rychlewski I made one last attempt to make this opt-in under a specific conflict_target value. This is the only option that allowed me to do this without changing Ecto internals.

I understand it looks like a magic value but check this out and let me know what you think.
It's completely fine to not like it 😄

@josevalim
Member

Yeah, :unsafe_fragment is meant to literally be a string that we insert, so I am a bit skeptical about special casing it. Perhaps the best way is to have a separate option that only MySQL handles on its insert, something like: insert_mode: :ignore_errors and then we document that it also affects the behaviour of ON CONFLICT NOTHING.

@mehiel
Author

mehiel commented Dec 16, 2025

> Perhaps the best way is to have a separate option that only MySQL handles on its insert, something like: insert_mode: :ignore_errors

Yeah, I tried that, but if I read this correctly it requires a change to the Connection behaviour, since its insert callback doesn't take opts. I'll give it another try.

@josevalim
Member

You can probably add a default argument, opts \\ [], to the MyXQL implementation only and invoke it.
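
A minimal sketch of that idea, under stated assumptions: `ConnectionSketch` and `MySQLSketch` are hypothetical stand-ins with a simplified two-argument callback, not the real Ecto connection behaviour or the MyXQL adapter. The `insert_mode: :ignore_errors` option is the name proposed earlier in this thread, not an existing API.

```elixir
defmodule ConnectionSketch do
  # Hypothetical stand-in for a shared connection behaviour whose
  # insert callback takes no options.
  @callback insert(table :: String.t(), columns :: [atom]) :: String.t()
end

defmodule MySQLSketch do
  @behaviour ConnectionSketch

  # `opts \\ []` makes the compiler define both insert/2 and insert/3, so
  # the two-argument callback is still implemented while MySQL-only
  # callers can pass extra options.
  def insert(table, columns, opts \\ []) do
    verb =
      case Keyword.get(opts, :insert_mode) do
        :ignore_errors -> "INSERT IGNORE INTO"
        _ -> "INSERT INTO"
      end

    cols = Enum.map_join(columns, ", ", &Atom.to_string/1)
    params = Enum.map_join(columns, ", ", fn _ -> "?" end)
    "#{verb} #{table} (#{cols}) VALUES (#{params})"
  end
end

IO.puts(MySQLSketch.insert("posts", [:id, :title]))
IO.puts(MySQLSketch.insert("posts", [:id, :title], insert_mode: :ignore_errors))
```

Because the extra argument has a default, code that only knows the shared callback keeps calling the shorter arity unchanged, and only the MySQL adapter needs to know about the option.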
