
MINIFICPP-2719 - Add multimodal capability to llama.cpp processor #2107

Open
adamdebreceni wants to merge 9 commits into apache:main from adamdebreceni:multimodal_llama

Conversation

@adamdebreceni
Contributor

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file?
  • If applicable, have you updated the NOTICE file?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.

@adamdebreceni adamdebreceni marked this pull request as ready for review May 4, 2026 12:18
@lordgamez lordgamez self-requested a review May 5, 2026 11:32
@martinzink martinzink requested review from Copilot and martinzink May 5, 2026 12:27

Copilot AI left a comment


Pull request overview

This PR updates the MiNiFi C++ llama.cpp extension to support multimodal (mtmd) inference, including wiring FlowFile content as “files” into the llama.cpp mtmd pipeline and optionally writing model output to a FlowFile attribute instead of overwriting content.

Changes:

  • Bump the vendored llama.cpp to b8944 and apply a new patch to build mtmd support and fix missing includes.
  • Extend RunLlamaCppInference with multimodal model configuration plus an optional "output to attribute" behavior.
  • Update the LlamaContext interface and DefaultLlamaContext implementation to accept file buffers and perform mtmd tokenization/eval (see the interface sketch below).
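
For orientation, the extended interface presumably looks something like the sketch below. This is an illustrative reconstruction from the change summary above, not the PR's actual declaration; the parameter names and types (files, token_handler) are assumptions.

    // Illustrative sketch of the extended LlamaContext interface; the real
    // declaration in extensions/llamacpp/processors/LlamaContext.h may differ.
    #include <cstddef>
    #include <functional>
    #include <optional>
    #include <string>
    #include <string_view>
    #include <vector>

    class LlamaContext {
     public:
      virtual ~LlamaContext() = default;
      // "files" carries raw FlowFile bytes (e.g. image or audio buffers) that
      // the mtmd pipeline can tokenize alongside the text prompt.
      virtual std::optional<std::string> generate(
          const std::string& prompt,
          const std::vector<std::vector<std::byte>>& files,
          const std::function<void(std::string_view token)>& token_handler) = 0;
    };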

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.

Summary per file:

  • thirdparty/llamacpp/mtmd-fix.patch: Adds the mtmd subdirectory to the llama.cpp build, fixes an include, and removes some mtmd tool executables.
  • thirdparty/llamacpp/lu8_macro_fix.patch: Removes an older llama.cpp patch that is no longer applied after the version bump.
  • thirdparty/llamacpp/cpp-23-fixes.patch: Removes an older llama.cpp patch that is no longer applied after the version bump.
  • cmake/LlamaCpp.cmake: Bumps the llama.cpp tag, enables LLAMA_BUILD_COMMON, applies the mtmd patch, and extends include dirs for common/tools/vendor headers.
  • extensions/llamacpp/CMakeLists.txt: Links the extension against mtmd and llama-common in addition to llama.
  • extensions/llamacpp/processors/LlamaContext.h: Extends generate() to accept a list of binary "files" (e.g., images or audio).
  • extensions/llamacpp/processors/DefaultLlamaContext.h: Adds mtmd/chat-template state and updates the constructor and generate() signature for multimodal support.
  • extensions/llamacpp/processors/DefaultLlamaContext.cpp: Implements mtmd initialization, multimodal tokenization/eval, and the updated decode loop.
  • extensions/llamacpp/processors/RunLlamaCppInference.h: Adds MultiModal Model Path and Output Attribute Name properties and stores them in member state.
  • extensions/llamacpp/processors/RunLlamaCppInference.cpp: Passes FlowFile bytes as multimodal files, inserts the mtmd marker, and optionally writes output to an attribute.
  • extensions/llamacpp/tests/RunLlamaCppInferenceTests.cpp: Updates the mock context to match the new generate() signature.


Comment on lines +85 to +87

    if (multimodal_model_path_) {
      input_data_and_prompt.append(mtmd_default_marker());
      files.push_back(std::move(read_result));
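
For reviewers unfamiliar with the mtmd flow: mtmd_default_marker() returns a placeholder string that is later substituted with the media embedding when the prompt is tokenized. Below is a rough sketch of that downstream step, assuming the mtmd API shape of recent llama.cpp; the helper names and the hypothetical file_bytes buffer should be checked against the vendored headers at this tag.

    // Rough sketch, not the PR's code: tokenizing a marker-bearing prompt
    // together with one media buffer. file_bytes is a hypothetical name for
    // the raw FlowFile content moved into "files" above.
    mtmd_input_text text{};
    text.text = input_data_and_prompt.c_str();  // contains mtmd_default_marker()
    text.add_special = true;
    text.parse_special = true;
    mtmd_bitmap* bmp = mtmd_helper_bitmap_init_from_buf(
        multimodal_ctx_,
        reinterpret_cast<const unsigned char*>(file_bytes.data()), file_bytes.size());
    const mtmd_bitmap* bitmaps[] = {bmp};
    mtmd_input_chunks* chunks = mtmd_input_chunks_init();
    int32_t res = mtmd_tokenize(multimodal_ctx_, chunks, &text, bitmaps, 1);
    // ... eval the chunks, then mtmd_bitmap_free(bmp) and mtmd_input_chunks_free(chunks)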
Comment on lines +144 to +146

    if (output_attribute_) {
      session.setAttribute(flow_file, output_attribute_.value(), text);
    } else {

    unique_llama_batch batch;
    int32_t decode_status = 0;
    if (multimodal_ctx_) {
      gsl_Assert(!files.empty());
      auto status = mtmd_helper_eval_chunks(multimodal_ctx_, llama_ctx_, chunks.get(), 0, 0, 1, true, &n_past);
      if (status != 0) {
        throw Exception(PROCESSOR_EXCEPTION, fmt::format("Failed to eval multimodal chunks, error: {}", status));
      }
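
As a reading aid for the positional arguments in that call (a hedged reading of the mtmd helper API as commonly declared; verify against the vendored mtmd headers at this tag):

    // Hedged annotation of the call above; argument names are assumptions.
    auto status = mtmd_helper_eval_chunks(
        multimodal_ctx_,       // mtmd context holding the projector state
        llama_ctx_,            // llama context to evaluate into
        chunks.get(),          // interleaved text/media chunks from mtmd_tokenize
        /*n_past=*/0,          // starting position in the KV cache
        /*seq_id=*/0,          // target sequence id
        /*n_batch=*/1,         // batch size for the text portions
        /*logits_last=*/true,  // request logits for the final token
        &n_past);              // out: position after the evaluated chunks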
Comment on lines +242 to +246

    batch.reset(llama_batch_init(1, 0, 1));
    batch->n_tokens = 1;
    batch->token[0] = new_token_id;
    batch->pos[0] = n_past;
    batch->n_seq_id[0] = 1;
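
The single-token batch above is the usual shape for feeding each sampled token back through the model after the multimodal prefill. A minimal sketch of the surrounding loop follows, assuming sampler_ and vocab_ members and hypothetical n_generated/max_tokens counters; this is illustrative, not the PR's exact decode loop.

    // Illustrative decode loop, not the PR's exact code.
    while (n_generated < max_tokens) {
      decode_status = llama_decode(llama_ctx_, *batch);
      if (decode_status != 0) {
        throw Exception(PROCESSOR_EXCEPTION, "llama_decode failed");
      }
      ++n_past;
      new_token_id = llama_sampler_sample(sampler_, llama_ctx_, -1);
      if (llama_vocab_is_eog(vocab_, new_token_id)) {
        break;  // model emitted an end-of-generation token
      }
      // ...detokenize new_token_id and append the piece to the output text...
      batch->token[0] = new_token_id;
      batch->pos[0] = n_past;
      ++n_generated;
    }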
Comment on lines 84 to +88

    input_data_and_prompt.append("Input data (or flow file content):\n");
    input_data_and_prompt.append({reinterpret_cast<const char*>(read_result.data()), read_result.size()});
    if (multimodal_model_path_) {
      input_data_and_prompt.append(mtmd_default_marker());
      files.push_back(std::move(read_result));
    } else {
Comment on lines +85 to +88

    if (multimodal_model_path_) {
      input_data_and_prompt.append(mtmd_default_marker());
      files.push_back(std::move(read_result));
    } else {
Comment on lines +144 to +148

    if (output_attribute_) {
      session.setAttribute(flow_file, output_attribute_.value(), text);
    } else {
      session.writeBuffer(flow_file, text);
    }
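
Per the file summary above, RunLlamaCppInference.h gains the two properties driving this branch. A hypothetical sketch of their declarations using MiNiFi C++'s PropertyDefinitionBuilder is below; the actual names, descriptions, and builder options in the PR may differ.

    // Hypothetical property declarations; check RunLlamaCppInference.h for
    // the real definitions.
    EXTENSIONAPI static constexpr auto MultiModalModelPath = core::PropertyDefinitionBuilder<>::createProperty("MultiModal Model Path")
        .withDescription("Optional path to a multimodal projector model; when set, FlowFile content is passed as a media file")
        .build();
    EXTENSIONAPI static constexpr auto OutputAttributeName = core::PropertyDefinitionBuilder<>::createProperty("Output Attribute Name")
        .withDescription("When set, the inference result is written to this FlowFile attribute instead of replacing the content")
        .build();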