PS-11001: Enable purge_binlogs on the s3 storage backend#136
PS-11001: Enable purge_binlogs on the s3 storage backend#136kamil-holubicki wants to merge 3 commits into
Conversation
https://perconadev.atlassian.net/browse/PS-11001 Added a new 'binlog_server list <json_config_file>' subcommand that enumerates every binlog file currently present in the configured storage and prints them, in chronological (storage) order, as a JSON document on stdout. The per-record schema matches the existing 'search_by_timestamp' / 'search_by_gtid_set' responses (name, size, uri, previous_gtids, added_gtids, min_timestamp, max_timestamp) and the 'binsrv::models::search_response' class is reused as-is. Unlike the search subcommands, 'list' returns 'status:success' with an empty 'result' array on an empty storage instead of raising an error. This makes 'list' the primary diagnostic command operators can use to distinguish a freshly initialized (empty) storage from a corrupted one. Co-authored-by: Cursor <cursoragent@cursor.com>
| } | ||
| if (::fsync(file_descriptor) != 0) { | ||
| const auto saved_errno = errno; | ||
| ::close(file_descriptor); |
There was a problem hiding this comment.
May be boost::scoped_exit to avoid calling ::close() from 2 places
| util::exception_location().raise<std::runtime_error>( | ||
| "cannot close underlying tmp object file"); | ||
| } | ||
|
|
There was a problem hiding this comment.
May be we could also call fsync() for file data as well here?
| void remove_object(std::string_view name); | ||
| // Batch remove: drops each named object, then runs a single | ||
| // durability barrier for the whole batch. | ||
| void remove_objects(std::span<const std::string> names); |
There was a problem hiding this comment.
I don't like the asymmetry here (remove_object() accepts std::string_view, whereas remove_objects() expects std::strings)
Ideally, it would be great to have something like
void remove_objects(std::span<const std::string_view> names);but I understand that it will be very inconvenient to use (even for our use case)
| for (const auto &name : names) { | ||
| try { | ||
| do_remove_object(name); | ||
| } catch (...) { // NOLINT(bugprone-empty-catch) | ||
| // best-effort: per-object failures are intentionally swallowed | ||
| // (see 'remove_objects' contract in the header) | ||
| } | ||
| } | ||
| do_fsync(); |
There was a problem hiding this comment.
May be do our best trying to remove individual objects but still throw if at least one of the deletes failed.
| for (const auto &name : names) { | |
| try { | |
| do_remove_object(name); | |
| } catch (...) { // NOLINT(bugprone-empty-catch) | |
| // best-effort: per-object failures are intentionally swallowed | |
| // (see 'remove_objects' contract in the header) | |
| } | |
| } | |
| do_fsync(); | |
| bool error_occurred{false}; | |
| for (const auto &name : names) { | |
| try { | |
| do_remove_object(name); | |
| } catch (...) { | |
| // best-effort: per-object failures are intentionally swallowed | |
| // (see 'remove_objects' contract in the header) | |
| error_occurred = true; | |
| } | |
| } | |
| do_fsync(); | |
| if (error_occurred) { | |
| throw ...; | |
| } |
There was a problem hiding this comment.
I have changed it so that the first error bubbles up to the caller and is included in the JSON response as a warning. If needed in the future, we can collect files + failures and bubble them up, so the JSON response could contain all failed files and the exact reasons.
|
|
||
| void s3_storage_backend::aws_context::delete_object( | ||
| const qualified_object_path &target) const { | ||
| Aws::S3Crt::Model::DeleteObjectRequest delete_object_request; |
There was a problem hiding this comment.
Just add a // TODO: comment here
Consider deleting several objects in one request by using `DeleteObjects()` AWS SDK C++ API call
Add 'binlog_server purge_binlogs <config> <binlog_name>' which removes every binlog file with sequence number less than or equal to <binlog_name> (a contiguous prefix). The current tail is refused: emptying the storage would lose the resume position and force the next 'fetch' / 'pull' to re-stream from the beginning of the source's retained binlog history. Phase 1 only: 'file' backend; v2 will extend to 's3' once 'remove_object' is wired to 'DeleteObject'. Self-healing recovery from a purge crashed between the index rewrite and the payload/metadata deletion is left for later phases - existing constructor validators refuse to open such a storage and surface it for manual cleanup. Co-authored-by: Cursor <cursoragent@cursor.com>
0cc5afa to
c10248e
Compare
c10248e to
0990dd3
Compare
No description provided.