Skip to content

Replace MinIO with LocalStack as an S3 storage option #12394

Open
srmanda-cs wants to merge 12 commits into
IQSS:developfrom
uncch-rdmc:12036-remove-minio-as-s3-storage-option
Open

Replace MinIO with LocalStack as an S3 storage option #12394
srmanda-cs wants to merge 12 commits into
IQSS:developfrom
uncch-rdmc:12036-remove-minio-as-s3-storage-option

Conversation

@srmanda-cs
Copy link
Copy Markdown

@srmanda-cs srmanda-cs commented May 13, 2026

What this PR does / why we need it:
Removes MinIO and replaces MinIO tests with LocalStack.

Previously since LocalStack did not offer good pre-signed URL capabilities, the Dataverse team was forced to go with MinIO. However, since that decision was made LocalStack has improved significantly in capabilities. So, this PR replaces the previous MinIO instantiation with a localstack_noredirect instantiation which performs the same no redirect downloads with LocalStack instead. Changes were made in dataverse-ansible to make sure localstack_noredirect is present in the testing environment. After this PR is merged, changes have to be made again in dataverse-ansible to remove MinIO and stay true to Dataverse's open-source roots. The dev_minio containers were removed from both the docker-compose-dev.yml files to reflect these changes into containers as well.

Which issue(s) this PR closes:

Special notes for your reviewer:
Could be a breaking change for people running on containers.

Suggestions on how to test this:

  1. Containers should be launching and working as usual along with containerized testing working as usual.
  2. The S3AccessIT API test needs to be able to pass on Jenkins

If these two are satisfied then, the PR should be good to go. However, Phil, Oliver, and others might have a lot of things to say about the messy way I go about things, so everything they have to add too.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
N/A

Is there a release notes update needed for this change?:
Yes. Probably. Oliver would know better than me.

Additional documentation:
Absolutely needed. Again, Oliver would know better than me.

Copilot AI review requested due to automatic review settings May 13, 2026 15:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes the MinIO-based local S3 option from the development docker-compose setup and updates the S3 API integration test to rely solely on LocalStack, aligning with the goal of discouraging MinIO usage.

Changes:

  • Updated S3AccessIT to remove MinIO client usage and run S3 assertions against LocalStack only.
  • Removed MinIO storage-driver JVM properties and the MinIO service from the dev docker-compose configurations.
  • Kept LocalStack as the sole local S3-compatible backend in the provided dev environments.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/test/java/edu/harvard/iq/dataverse/api/S3AccessIT.java Drops MinIO usage and rewires S3 integration assertions to LocalStack.
docker-compose-dev.yml Removes MinIO driver configuration and the MinIO container from the dev stack.
conf/keycloak/docker-compose-dev.yml Removes MinIO driver configuration and the MinIO container from the Keycloak dev stack.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/test/java/edu/harvard/iq/dataverse/api/S3AccessIT.java
Comment thread docker-compose-dev.yml
Comment on lines 33 to 46
JVM_ARGS: -Ddataverse.files.storage-driver-id=file1
-Ddataverse.files.file1.type=file
-Ddataverse.files.file1.label=Filesystem
-Ddataverse.files.file1.directory=${STORAGE_DIR}/store
-Ddataverse.files.localstack1.type=s3
-Ddataverse.files.localstack1.label=LocalStack
-Ddataverse.files.localstack1.custom-endpoint-url=http://localstack:4566
-Ddataverse.files.localstack1.custom-endpoint-region=us-east-2
-Ddataverse.files.localstack1.bucket-name=mybucket
-Ddataverse.files.localstack1.path-style-access=true
-Ddataverse.files.localstack1.upload-redirect=true
-Ddataverse.files.localstack1.download-redirect=true
-Ddataverse.files.localstack1.access-key=default
-Ddataverse.files.localstack1.secret-key=default
Copy link
Copy Markdown
Author

@srmanda-cs srmanda-cs May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oliver will need to guide me here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srmanda-cs those earlier release notes are historic, and may be ignored. I do see references to MinIO in doc/sphinx-guides/source/installation/config.rst and doc/sphinx-guides/source/installation/big-data-support.rst — if you like I can submit a PR to your branch to update the docs?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@donsizemore I appreciate the help, I'll update the docs now, could you let me know if they look good? Thank you!

@srmanda-cs srmanda-cs marked this pull request as draft May 13, 2026 15:53
…load

Switch testNonDirectUpload from localstack1 (upload-redirect=true,
download-redirect=true) to the new localstack_noredirect driver
(both redirects disabled), so the test genuinely exercises the
non-redirect proxy-through-Dataverse code path.

Also replace the plain downloadFile call with downloadFileNoRedirect
and assert statusCode(200). This makes the assertion self-documenting:
a 303 response would now cause an explicit test failure instead of
being silently followed by RestAssured.
@srmanda-cs srmanda-cs marked this pull request as ready for review May 13, 2026 16:08
@srmanda-cs
Copy link
Copy Markdown
Author

srmanda-cs commented May 13, 2026

Failing tests addressed in: gdcc/dataverse-ansible#429 and in a commit that was made to develop immediately afterwards.

…alstack_noredirect

Using the same bucket name as localstack1 would cause a collision in
the test environment when tasks/localstack_create_bucket.yml runs
aws s3 mb on each bucket entry. Use mybucket-noredirect to avoid
this. Update driver configs in both docker-compose files and switch
S3AccessIT.testNonDirectUpload to use the new BUCKET_NAME_NOREDIRECT
constant.
@srmanda-cs
Copy link
Copy Markdown
Author

@pdurbin pdurbin moved this to Ready for Triage in IQSS Dataverse Project May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Ready for Triage

Development

Successfully merging this pull request may close these issues.

Remove/Discourage using MinIO as a S3 Storage Option

4 participants