[refactor](profile) Decouple profile node reports from display tree#63299
foxtail463 wants to merge 2 commits into
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
/review
I found blocking rolling-upgrade compatibility issues in the profile-reporting protocol.
Critical checkpoint conclusions:
- Goal and tests: The profile-node typing change is partially covered by a new FE unit test, but mixed-version FE/BE compatibility is not covered and currently regresses profile reporting during rolling upgrades.
- Scope: The core profile struct change is focused, but compatibility handling for the old fragment_id_to_profile field and old FE exec-status behavior is missing.
- Concurrency: The reviewed changes do not introduce new shared mutable state requiring new lock ordering; existing profile maps remain protected by their existing mutexes/locks.
- Lifecycle/static initialization: No new static/global lifecycle hazards were found in the changed code.
- Configuration: No new configuration items were added.
- Compatibility: Blocking issues found for both old-BE/new-FE and new-BE/old-FE profile reporting during rolling upgrades.
- Parallel paths: Both the legacy fragment_id_to_profile and the new fragment_id_to_profile_node_reports profile transport paths need to be handled while the old thrift field still exists.
- Tests: Missing mixed-version compatibility coverage for profile reporting.
- Observability: Existing logs are sufficient for investigation, but the failures would surface only as dropped/rejected profile updates or status-less exec report handling.
- Transaction/persistence/data correctness: No direct transaction or storage visibility changes found.
- Performance: No blocking performance regression found in the changed paths.
- User focus: No additional user-provided review focus was specified.
Because these issues can break profile reporting or exec-status handling in rolling upgrades, I am requesting changes.
-    if (!profile.isSetFragmentIdToProfile()) {
-        LOG.warn("{} FragmentIdToProfile is not set", DebugUtil.printId(profile.getQueryId()));
-        return new Status(TStatusCode.INVALID_ARGUMENT, "FragmentIdToProfile is not set");
+    if (!profile.isSetFragmentIdToProfileNodeReports()) {
This makes a new FE reject profile reports from old BEs during a rolling upgrade. The thrift struct still has the legacy fragment_id_to_profile field, and old BEs will continue to send only that field until they are upgraded; with this guard, updateProfile() returns INVALID_ARGUMENT and drops those profiles instead of translating the old TDetailedReportParams format. Please keep a fallback for fragment_id_to_profile until mixed-version reporting is no longer supported.
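The requested fallback could look roughly like the sketch below. It is only an illustration under stated assumptions: `NodeReport`, `Report`, and `normalize` are hypothetical stand-ins rather than the thrift-generated classes or the actual FE code, a null map models an unset thrift field, and assigning pipeline ids from list positions is an assumption of the sketch. The only behavior taken from this PR is that, in the legacy list, the first element is the fragment-level profile and the rest are pipeline profiles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ProfileFallbackSketch {
    enum NodeType { FRAGMENT, PIPELINE }

    /** Hypothetical stand-in for one structured profile node report. */
    static final class NodeReport {
        final NodeType type;
        final int pipelineId; // -1 for the fragment-level node (a convention of this sketch)
        final String profile;

        NodeReport(NodeType type, int pipelineId, String profile) {
            this.type = type;
            this.pipelineId = pipelineId;
            this.profile = profile;
        }
    }

    /** Hypothetical stand-in for the report struct; a null map models an unset thrift field. */
    static final class Report {
        final Map<Integer, List<String>> legacyProfiles;  // models fragment_id_to_profile
        final Map<Integer, List<NodeReport>> nodeReports; // models fragment_id_to_profile_node_reports

        Report(Map<Integer, List<String>> legacyProfiles,
               Map<Integer, List<NodeReport>> nodeReports) {
            this.legacyProfiles = legacyProfiles;
            this.nodeReports = nodeReports;
        }
    }

    /** Prefers the structured field; otherwise translates the legacy ordered list. */
    static Map<Integer, List<NodeReport>> normalize(Report report) {
        if (report.nodeReports != null) {
            return report.nodeReports;
        }
        if (report.legacyProfiles == null) {
            // Only reject when neither the new nor the legacy field is present.
            throw new IllegalArgumentException("neither profile field is set");
        }
        Map<Integer, List<NodeReport>> translated = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> entry : report.legacyProfiles.entrySet()) {
            List<String> ordered = entry.getValue();
            List<NodeReport> nodes = new ArrayList<>();
            for (int i = 0; i < ordered.size(); i++) {
                if (i == 0) {
                    // Legacy convention: the first element is the fragment-level profile.
                    nodes.add(new NodeReport(NodeType.FRAGMENT, -1, ordered.get(i)));
                } else {
                    // Assumption of this sketch: the pipeline id is the position among pipelines.
                    nodes.add(new NodeReport(NodeType.PIPELINE, i - 1, ordered.get(i)));
                }
            }
            translated.put(entry.getKey(), nodes);
        }
        return translated;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> legacy = new TreeMap<>();
        legacy.put(0, Arrays.asList("fragment", "pipeline-a", "pipeline-b"));
        for (NodeReport node : normalize(new Report(legacy, null)).get(0)) {
            System.out.println(node.type + " " + node.pipelineId + " " + node.profile);
        }
    }
}
```

With a shim of this shape, the INVALID_ARGUMENT path would be reached only when a report carries neither field, so profiles from not-yet-upgraded BEs would still be displayed.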
The impact is limited to profile visibility during a rolling
upgrade. Query execution status, load progress, and correctness are not affected.
Since this report path is used only for profile collection, and the new structured
field is the intended format after decoupling profile node reports from the display
tree, I would prefer not to add a legacy fallback.
run buildall
Problem Summary:
BE previously reported pipeline profiles as an ordered list, and FE inferred the
meaning of each profile node from its position in that list. The first element
was treated as the fragment-level profile, while the remaining elements were
treated as pipeline profiles.
This couples the profile transport protocol to the display tree layout and makes
the logic fragile when the profile tree structure changes. It also forces FE to
recover semantic information from implicit ordering instead of using explicit
metadata.
This change adds structured profile node reports with explicit node type and
pipeline id. BE now reports fragment-level and pipeline-level profiles with
their semantic type, and FE builds the RuntimeProfile display tree from that
structured data. The aggregated multi-BE pipeline profile only consumes pipeline
reports.
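To make the idea concrete, here is a minimal sketch of consuming structured node reports. All names (`NodeReport`, `buildDisplayTree`, `pipelineReportsOnly`) are hypothetical and do not mirror the actual Doris classes; the display tree is reduced to a list of indented strings, and sorting pipelines by id is an assumption of the sketch. The point it illustrates is that the result depends on the explicit type and pipeline id, not on the order in which reports arrive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class StructuredProfileSketch {
    enum NodeType { FRAGMENT, PIPELINE }

    /** Hypothetical stand-in for a structured profile node report. */
    static final class NodeReport {
        final NodeType type;
        final int pipelineId; // meaningful only for PIPELINE nodes
        final String name;

        NodeReport(NodeType type, int pipelineId, String name) {
            this.type = type;
            this.pipelineId = pipelineId;
            this.name = name;
        }
    }

    /** Selects pipeline-level reports, as the aggregated multi-BE profile would. */
    static List<NodeReport> pipelineReportsOnly(List<NodeReport> reports) {
        List<NodeReport> pipelines = new ArrayList<>();
        for (NodeReport report : reports) {
            if (report.type == NodeType.PIPELINE) {
                pipelines.add(report);
            }
        }
        // Order by the explicit id instead of relying on arrival order.
        pipelines.sort(Comparator.comparingInt(r -> r.pipelineId));
        return pipelines;
    }

    /** Renders a toy display tree: fragment node(s) first, then pipelines by id. */
    static List<String> buildDisplayTree(List<NodeReport> reports) {
        List<String> lines = new ArrayList<>();
        for (NodeReport report : reports) {
            if (report.type == NodeType.FRAGMENT) {
                lines.add("Fragment: " + report.name);
            }
        }
        for (NodeReport report : pipelineReportsOnly(reports)) {
            lines.add("  Pipeline " + report.pipelineId + ": " + report.name);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Reports arrive in arbitrary order; the tree is still built correctly.
        List<NodeReport> reports = Arrays.asList(
                new NodeReport(NodeType.PIPELINE, 1, "p1"),
                new NodeReport(NodeType.FRAGMENT, -1, "F"),
                new NodeReport(NodeType.PIPELINE, 0, "p0"));
        for (String line : buildDisplayTree(reports)) {
            System.out.println(line);
        }
    }
}
```

Because the fragment node is identified by its type rather than by being first in the list, changing the display layout in such a design would not require changing the transport format.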