
Fix C++ -Werror regressions in llama runner (#19326)

Merged
psiddh merged 1 commit into pytorch:main from psiddh:export-D103991803
May 6, 2026

Conversation

@psiddh
Contributor

@psiddh psiddh commented May 6, 2026

Summary:

Fixes 3 -Werror diagnostics that broke the qualcomm llama runner build on
cfg:android-arm64-clang19-no-san and disabled the following test infra
targets:

  • xplat/executorch/examples/qualcomm/oss_scripts/llama:runner_lib
  • xplat/executorch/examples/qualcomm/oss_scripts/llama:runner_lib_static
  • xplat/executorch/examples/qualcomm/oss_scripts/llama:qnn_llama_runner
  • xplat/executorch/examples/qualcomm/oss_scripts/llama:qnn_llama_runner_static

Three diagnostics fixed:

  1. -Wreorder-ctor in runner.cpp: attention_sink_rope_module_ is
    declared as the 2nd field of Runner<T> (right after module_) but the
    constructor initializer list appended it last, after tokenizer_. Moved
    it to the correct position in the init list to match declaration order.
    Recent regression introduced in the attention-sink diff (Qualcomm AI Engine Direct - Support attention sink for long context usecase #16574).

  2. -Woverloaded-virtual in lhd_token_generator.h and
    multimodal_lhd_token_generator.h: the derived classes define a
    prepare_io(std::vector<uint64_t>, std::vector<int32_t>) overload that
    hides the base class virtual prepare_io(uint64_t, int64_t). Added a
    using TokenGenerator<T>::prepare_io; (and equivalent for the
    multimodal hierarchy) declaration so the base virtual stays in scope and
    the warning is silenced without changing behavior. Latent bug surfaced
    by the clang19 toolchain bump.

  3. -Wdelete-non-abstract-non-virtual-dtor in prompt_processor.h:
    PromptProcessor<T> has virtual member functions but no virtual
    destructor, so deleting via std::unique_ptr<PromptProcessor<T>> in
    Runner was undefined behavior under strict warnings. Added
    virtual ~PromptProcessor() = default; mirroring the pattern already
    used in TokenGenerator (token_generator.h). Also transitively
    fixes MultimodalPromptProcessor<T>.

Reviewed By: rascani

Differential Revision: D103991803

@psiddh psiddh requested a review from abhinaykukkadapu as a code owner May 6, 2026 07:23
Copilot AI review requested due to automatic review settings May 6, 2026 07:23
@pytorch-bot

pytorch-bot Bot commented May 6, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19326

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job, 6 Pending, 3 Unrelated Failures

As of commit 2f66064 with merge base 1debeb6:

NEW FAILURE - The following job has failed:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 6, 2026
@meta-codesync
Contributor

meta-codesync Bot commented May 6, 2026

@psiddh has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103991803.

@github-actions

github-actions Bot commented May 6, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@psiddh psiddh requested a review from JacobSzwejbka May 6, 2026 07:25
Contributor

Copilot AI left a comment


Pull request overview

This PR addresses clang19 -Werror build regressions in the Qualcomm llama runner by fixing constructor member initialization order, correcting virtual function hiding warnings, and ensuring safe polymorphic deletion via a virtual destructor.

Changes:

  • Reordered Runner<T> constructor initializer list to match member declaration order (-Wreorder-ctor).
  • Added using ...::prepare_io declarations in LHD token generator subclasses to avoid hiding base virtuals (-Woverloaded-virtual).
  • Added a virtual destructor to PromptProcessor<T> to avoid UB when deleting through base pointers (-Wdelete-non-abstract-non-virtual-dtor).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
examples/qualcomm/oss_scripts/llama/runner/runner.cpp Fixes member initializer order to satisfy -Wreorder-ctor.
examples/qualcomm/oss_scripts/llama/runner/prompt_processor.h Adds virtual destructor to make polymorphic deletion well-defined.
examples/qualcomm/oss_scripts/llama/runner/lhd_token_generator.h Brings base prepare_io into scope to avoid -Woverloaded-virtual.
examples/qualcomm/oss_scripts/llama/runner/multimodal_runner/multimodal_lhd_token_generator.h Attempts to bring base prepare_io into scope for multimodal LHD (but see review comment re: private access).


@meta-codesync meta-codesync Bot changed the title Fix C++ -Werror regressions in llama runner Fix C++ -Werror regressions in llama runner (#19326) May 6, 2026
psiddh added a commit to psiddh/executorch that referenced this pull request May 6, 2026
@psiddh psiddh force-pushed the export-D103991803 branch from be95216 to 024ccb7 Compare May 6, 2026 16:10
@psiddh psiddh force-pushed the export-D103991803 branch from 024ccb7 to 2f66064 Compare May 6, 2026 16:27
@psiddh psiddh requested a review from GregoryComer May 6, 2026 17:20
@psiddh psiddh merged commit cdcc915 into pytorch:main May 6, 2026
180 of 188 checks passed

Labels

CLA Signed (managed by the Facebook bot; authors must sign the CLA before a PR can be reviewed), fb-exported, meta-exported
