Add an overwrite boolean when modifying repo content#7551
Add an overwrite boolean when modifying repo content#7551daviddavis wants to merge 1 commit intopulp:mainfrom
Conversation
Add a `overwrite` boolean parameter to endpoints which modify repository content. When set to false, the update checks whether any content being added would overwrite existing content (based on repo_key_fields) and returns a 409 Conflict response with details instead of silently replacing it. fixes pulp#7550
gerrod3
left a comment
There was a problem hiding this comment.
I just added the same feature (in a slightly different manner) to pulp-python to prevent the same problem. pulp/pulp_python#1163
One of things I was concerned about was the performance of trying to validate the modify call, at request time, will not produce an overwrite against the potential thousands of units a user can call with. Have you tested this against your own instance? How large are the modify calls you typically use?
| continue | ||
|
|
||
| pulp_type = type_obj.get_pulp_type() | ||
| incoming_qs = type_obj.objects.filter(pk__in=add_content.filter(pulp_type=pulp_type)) |
There was a problem hiding this comment.
I know we have this pattern in our repo-version utils file, but I wonder if it is more performant than a normal type_obj.objects.filter(pk__in=add_content). Theoretically it should filter down the number of content to check to only those of the correct type, but does it actually increase the speed?
| } | ||
| incoming_by_key = { | ||
| tuple(row[f] for f in repo_key_fields): row["pk"] | ||
| for row in incoming_qs.filter(find_dup_qs).values("pk", *repo_key_fields) |
There was a problem hiding this comment.
We can remove this query if we add pk to the batch_qs(incoming_qs.values(*repo_key_fields) at the top, then create this dictionary during the assembling of the find_dup_qs.
|
@gerrod3 very interesting. I haven't tested this in our instance yet--we actually validate requests and check them for conflicts/substitutions before passing them to Pulp. We do this using the Pulp API so as you can imagine it's slow and not a great solution. Performance has been an issue with our current solution but luckily most of our users are only adding one or two packages to a repo at a time. I think adding 30 packages is the biggest request size we see. Your solution looks really clean. I wonder if it could be moved into pulpcore? The one issue though that would prevent us from being able to adopt it is that our repos are shared by multiple teams/users. So if two users are adding packages to a repo around the same timeframe, one user might want to overwrite packages while the other might not. Perhaps one solution might be to also have a allow_package_substitution parameter for repo modify and package upload that override the repo field? |
Add a
overwriteboolean parameter to endpoints which modify repository content. When set to false, the update checks whether any content being added would overwrite existing content (based on repo_key_fields) and returns a 409 Conflict response with details instead of silently replacing it.fixes #7550