@Harshit28j Harshit28j commented Dec 23, 2025

Fixes #3513

Summary

Skip uploading files when content hasn't changed by comparing CRC32C hashes.

Changes

  • Added _calculate_crc32c() helper function
  • Modified upload_single() to check existing blob hash before uploading

How it works

  1. Get existing blob from GCS
  2. Calculate local file CRC32C hash
  3. Compare with remote hash
  4. Skip upload if hashes match
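
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the body of `_calculate_crc32c()` is not shown in this conversation, and the names `crc32c`, `file_crc32c_b64`, and `should_skip_upload` are hypothetical. The PR imports `google_crc32c`; a small pure-Python CRC32C stands in for it here so the sketch runs with only the standard library. The remote value is assumed to be what `google-cloud-storage` exposes as `blob.crc32c`: the base64 encoding of the 4-byte big-endian checksum.

```python
import base64
import struct
from typing import Optional

# Lookup table for CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
# Stand-in for the google_crc32c package imported by the PR.
_TABLE = []
for _i in range(256):
    _crc = _i
    for _ in range(8):
        _crc = (_crc >> 1) ^ 0x82F63B78 if _crc & 1 else _crc >> 1
    _TABLE.append(_crc)


def crc32c(data: bytes, crc: int = 0) -> int:
    """Return the CRC32C of data, optionally continuing from a previous value."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def file_crc32c_b64(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a local file in chunks and encode the result the way GCS reports it."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = crc32c(chunk, crc)
    # Assumption: GCS reports base64 of the big-endian 4-byte checksum.
    return base64.b64encode(struct.pack(">I", crc)).decode("ascii")


def should_skip_upload(path: str, remote_crc32c: Optional[str]) -> bool:
    """True when a remote hash exists and matches the local file's hash."""
    if remote_crc32c is None:  # no existing blob, so the upload must happen
        return False
    return file_crc32c_b64(path) == remote_crc32c
```

In an uploader, the remote hash would presumably come from something like `bucket.get_blob(name)`, which returns `None` when the blob does not exist, and the upload call would be skipped whenever `should_skip_upload` returns `True`.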

Author

Harshit28j commented Dec 23, 2025

@another-rex Would really appreciate a review on this PR when you get a chance! 🙂

Contributor

@another-rex another-rex left a comment

LGTM! We have mostly moved to a Go version of the exporter, but this looks fine to merge as well.

import zipfile
from typing import List

import google_crc32c
Contributor

Is this already part of the dependencies? I think this should be added to pyproject.toml as a dependency.

Development

Successfully merging this pull request may close these issues.

Exporter should check the crc hash before uploading

2 participants