ardoco · dfuchss · Feb 19, 2026 · Nov 28, 2025 · Nov 28, 2025 · Nov 28, 2025
@@ -33,3 +33,33 @@ Runs the pipeline in transitive mode and evaluates it. This is useful for multi-
 java -jar ./ratlr.jar transitive -c ./configs/d2m.json ./configs/m2c.json -e ./configs/eval.json
 ```
 
+## Prompt Optimization
+
+Optimizes prompts used in trace link classification to improve performance.
+This command runs the prompt optimization pipeline and optionally evaluates the optimized prompts against evaluation configurations.
+
+The optimization process:
+1. Runs baseline evaluation (if evaluation configs are provided)
+2. Executes the prompt optimizer with the specified optimization configuration
+3. Re-runs evaluation with the optimized prompt to measure improvement
+
+As only the optimized prompt is transferred from the optimization results to the evaluation, other configuration parameters (e.g., model, dataset) do not have to match between optimization and evaluation configurations.
+
+### Examples
+
+```bash
+# Run optimization with a single config
+java -jar ./ratlr.jar optimize -c ./example-configs/optimizer-config.json
+
+# Run optimization and evaluate the results
+java -jar ./ratlr.jar optimize -c ./example-configs/optimizer-config.json -e ./example-configs/simple-config.json
+
+# Run optimization with directories
+java -jar ./ratlr.jar optimize -c ./configs/optimization -e ./configs/evaluation
+```
+
+### Options
+
+- `-c, --configs`: **(Required)** One or more optimization configuration file paths. If a path points to a directory, all files within that directory will be processed.
+- `-e, --eval`: **(Optional)** One or more evaluation configuration file paths. Each evaluation configuration will be used with each optimization config to measure performance before and after optimization.
+
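The run order described in this README section can be sketched as follows. This is a minimal illustration, not code from the repository: `RunPlanSketch` and `plan` are hypothetical names, the config paths are placeholders, and the sketch ignores the case where an optimization fails and its follow-up evaluations are skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of ratlr): lists pipeline runs in the documented order.
final class RunPlanSketch {

    static List<String> plan(List<String> optimizationConfigs, List<String> evaluationConfigs) {
        List<String> steps = new ArrayList<>();
        // Step 1: every evaluation config runs once with its original prompt (baseline).
        for (String eval : evaluationConfigs) {
            steps.add("baseline-eval " + eval);
        }
        // Steps 2 and 3: each optimization is followed by one re-evaluation per evaluation config.
        for (String opt : optimizationConfigs) {
            steps.add("optimize " + opt);
            for (String eval : evaluationConfigs) {
                steps.add("eval " + eval + " with prompt from " + opt);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        plan(List.of("opt-a.json", "opt-b.json"), List.of("eval-1.json", "eval-2.json"))
                .forEach(System.out::println);
    }
}
```

Under these assumptions, `m` optimization configs and `n` evaluation configs lead to `n + m * (1 + n)` pipeline runs, which is worth keeping in mind when passing whole directories to `-c` and `-e`.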
@@ -0,0 +1,107 @@
+# Prompt Optimization
+
+## Overview
+
+Prompt optimization in LiSSA-RATLR enables the automatic, systematic refinement of prompts used for traceability link recovery.
+By leveraging various optimization strategies and evaluation metrics, the effectiveness of prompts may be increased, leading to improved classification accuracy and overall performance.
+This also enables us to quantify the importance of well-designed prompts in the context of traceability link recovery.
+
+## Core Components
+
+### Prompt Metrics (`promptmetric` package)
+
+A [`Metric`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/Metric.java) is a numeric measure used to evaluate the quality of prompts during the optimization process.
+Metrics guide the optimization by providing feedback on how well a prompt performs in generating accurate traceability links.
+Currently, there are two types of metrics.
+Global metrics evaluate the prompt's performance across the entire test dataset.
+Pointwise metrics score the performance of prompts on individual data points and reduce the results to a single numeric performance value.
+If a pointwise metric is used, different scoring and reduction strategies can be configured and combined as desired.
+
+Custom metrics can be added either by extending the [`Global Metrics`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/GlobalMetric.java) abstract class or by implementing new scoring and reduction strategies for pointwise metrics.
+
+#### Available Metrics
+
+- **[`Global Metrics`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/GlobalMetric.java)**:
+  - **F_Beta-Score** (`fBeta` or `f1`)
+- **[`Pointwise Metrics`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/PointwiseMetric.java)** (`pointwise`):
+  - Scoring Strategies:
+    - Binary Scorer (Correct Classification / Incorrect Classification)
+  - Reduction Strategies:
+    - Mean
+- **[`Mock Metric`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/promptmetric/MockMetric.java)** (`mock`): Returns dummy values for testing purposes
+
+### Optimizers (`promptoptimizer` package)
+
+The [`Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/PromptOptimizer.java) module handles prompt optimization requests.
+Different optimization strategies are implemented to improve prompts using various means.
+Optimization approaches are usually iterative.
+Prompts are refined over multiple iterations based on the feedback provided by the selected prompt metric.
+Optimizers are highly configurable through the optimization configuration file.
+
+Prompt optimizers also use the usual stages of the evaluation pipeline.
+They rely on LiSSA's caching mechanism to provide consistent and reproducible results across different runs.
+
+Custom optimizers can be added by implementing the [`Prompt Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/PromptOptimizer.java) interface.
+
+#### Available Optimizers
+
+- **[`Naive Iterative Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/IterativeOptimizer.java)** (`iterative` or `simple`):
+  The most basic optimizer that makes changes to the prompt in each iteration.
+  It simply queries the large language model to improve the current prompt using an optimization prompt.
+  The new prompt is naively carried over to the next iteration without any further checks.
+  - `simple`: Defaults to one (1) iteration
+  - `iterative`: Defaults to five (5) iterations
+- **[`Feedback-Based Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/IterativeFeedbackOptimizer.java)** (`feedback`):
+  The iterative feedback optimizer improves prompts by leveraging feedback from the large language model.
+  In each iteration, it queries the model with an additional feedback text on the current prompt.
+  The optimizer carries the optimized prompt to the next iteration naively.
+  Trace links that were incorrectly classified in previous iterations are highlighted in the feedback text to guide the model towards better performance.
+- **[`Mock Optimizer`](../src/main/java/edu/kit/kastel/sdq/lissa/ratlr/promptoptimizer/MockOptimizer.java)** (`mock`): Returns dummy optimized prompts for testing purposes
+
+## Configuration
+
+### Optimization Configuration Structure
+
+Modules of the evaluation configuration file will also need to be configured in the optimization configuration file.
+This excerpt shows the additional configuration options specific to prompt optimization.
+
+```json
+
+{
+  [...]
+  "metric" : {
+    "name" : "mock",
+    "args" : {}
+  },
+  "prompt_optimizer": {
+    "name" : "simple_openai",
+    "args" : {
+      "prompt": "Question: Here are two parts of software development artifacts.\n\n            {source_type}: '''{source_content}'''\n\n            {target_type}: '''{target_content}'''\n            Are they related?\n\n            Answer with 'yes' or 'no'.",
+      "model": "gpt-4o-mini-2024-07-18"
+    }
+  }
+}
+
+```
+
+To see the detailed configurable fields of any module, refer to a prompt optimization result file.
+After executing a minimal configuration, the resulting file contains the full configuration with all default values filled in.
+
+## Usage
+
+Refer to the [CLI Documentation](cli.md#prompt-optimization) for instructions on how to run prompt optimization using the command line interface.
+
+### Optimization Process
+
+The optimization process generally follows these steps:
+
+1. **Baseline Evaluation (Optional)**: If evaluation configurations are provided, the baseline performance of the original prompt is measured.
+2. **Prompt Optimization**: The prompt optimizer is executed using the specified optimization configuration. The prompt is refined iteratively based on the selected metric.
+3. **Post-Optimization Evaluation (Optional)**: If evaluation configurations are provided, the optimized prompt is evaluated to measure differences over the baseline.
+
+## Output and Results
+
+### Result Files
+
+The prompt optimization results are stored as `results-prompt-optimization-<config_filename>.md`, just like regular evaluation results.
+They include the full configuration used for optimization as well as the optimized prompt.
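To make the distinction between global and pointwise metrics from this document concrete, here is a minimal sketch. It does not use the actual `GlobalMetric` or `PointwiseMetric` classes; `MetricSketch` and all method names are hypothetical. It shows the standard F-beta formula computed over a whole dataset, and a pointwise metric built from a binary scorer combined with a mean reduction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the real metric classes in the promptmetric package may differ.
final class MetricSketch {

    /** Global metric: F-beta over the whole dataset, (1 + b^2) * P * R / (b^2 * P + R). */
    static double fBeta(int truePositives, int falsePositives, int falseNegatives, double beta) {
        if (truePositives == 0) {
            return 0.0; // precision and recall are both zero or undefined
        }
        double precision = (double) truePositives / (truePositives + falsePositives);
        double recall = (double) truePositives / (truePositives + falseNegatives);
        double b2 = beta * beta;
        return (1 + b2) * precision * recall / (b2 * precision + recall);
    }

    /** Binary scorer: 1.0 for a correct classification, 0.0 for an incorrect one. */
    static double binaryScore(boolean predicted, boolean expected) {
        return predicted == expected ? 1.0 : 0.0;
    }

    /** Mean reduction: collapses the per-item scores into a single value. */
    static double mean(List<Double> scores) {
        return scores.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    /** Pointwise metric: score each data point, then reduce. */
    static double pointwise(List<Boolean> predicted, List<Boolean> expected) {
        if (predicted.size() != expected.size()) {
            throw new IllegalArgumentException("prediction and gold standard sizes differ");
        }
        List<Double> scores = new ArrayList<>();
        for (int i = 0; i < predicted.size(); i++) {
            scores.add(binaryScore(predicted.get(i), expected.get(i)));
        }
        return mean(scores);
    }

    public static void main(String[] args) {
        System.out.println(fBeta(2, 1, 1, 1.0));
        System.out.println(pointwise(List.of(true, false, true, true), List.of(true, true, true, false)));
    }
}
```

Note that the binary scorer with mean reduction is simply accuracy over the candidate pairs, whereas F-beta only counts positive links, so the two can rank prompts differently on imbalanced datasets.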
@@ -0,0 +1,73 @@
+
+{
+  "cache_dir": "./cache/WARC",
+
+  "gold_standard_configuration": {
+    "path": "./datasets/req2req/WARC/answer.csv",
+    "hasHeader": "true"
+  },
+
+  "source_artifact_provider" : {
+    "name" : "text",
+    "args" : {
+      "artifact_type" : "requirement",
+      "path" : "./datasets/req2req/WARC/high"
+    }
+  },
+  "target_artifact_provider" : {
+    "name" : "text",
+    "args" : {
+      "artifact_type" : "requirement",
+      "path" : "./datasets/req2req/WARC/low"
+    }
+  },
+  "source_preprocessor" : {
+    "name" : "artifact",
+    "args" : {}
+  },
+  "target_preprocessor" : {
+    "name" : "artifact",
+    "args" : {}
+  },
+  "embedding_creator" : {
+    "name" : "openai",
+    "args" : {
+      "model": "text-embedding-3-large"
+    }
+  },
+  "source_store" : {
+    "name" : "custom",
+    "args" : {}
+  },
+  "target_store" : {
+    "name" : "cosine_similarity",
+    "args" : {
+      "max_results" : "4"
+    }
+  },
+  "metric" : {
+    "name" : "mock",
+    "args" : {}
+  },
+  "prompt_optimizer": {
+    "name" : "simple_openai",
+    "args" : {
+      "prompt": "Question: Here are two parts of software development artifacts.\n\n            {source_type}: '''{source_content}'''\n\n            {target_type}: '''{target_content}'''\n            Are they related?\n\n            Answer with 'yes' or 'no'.",
+      "model": "gpt-4o-mini-2024-07-18"
+    }
+  },
+  "classifier" : {
+    "name" : "simple_openai",
+    "args" : {
+      "model": "gpt-4o-mini-2024-07-18"
+    }
+  },
+  "result_aggregator" : {
+    "name" : "any_connection",
+    "args" : {}
+  },
+  "tracelinkid_postprocessor" : {
+    "name" : "identity",
+    "args" : {}
+  }
+}
@@ -21,7 +21,7 @@
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
     <picocli.version>4.7.7</picocli.version>
     <record-builder.version>52</record-builder.version>
-    <metrics.version>0.2.0</metrics.version>
+    <metrics.version>0.2.1</metrics.version>
   </properties>
 
   <dependencyManagement>

@@ -1,9 +1,10 @@
-/* Licensed under MIT 2025. */
+/* Licensed under MIT 2025-2026. */
 package edu.kit.kastel.sdq.lissa.cli;
 
 import java.nio.file.Path;
 
 import edu.kit.kastel.sdq.lissa.cli.command.EvaluateCommand;
+import edu.kit.kastel.sdq.lissa.cli.command.OptimizeCommand;
 import edu.kit.kastel.sdq.lissa.cli.command.TransitiveTraceCommand;
 
 import picocli.CommandLine;
@@ -15,12 +16,13 @@
  * <ul>
  *     <li>{@link EvaluateCommand} - Evaluates trace link analysis configurations</li>
  *     <li>{@link TransitiveTraceCommand} - Performs transitive trace link analysis</li>
+ *     <li>{@link OptimizeCommand} - Optimizes a single prompt for better trace link analysis classification results</li>
  * </ul>
  *
  * The CLI supports various command-line options and provides help information
  * through the standard help options (--help, -h).
  */
-@CommandLine.Command(subcommands = {EvaluateCommand.class, TransitiveTraceCommand.class})
+@CommandLine.Command(subcommands = {EvaluateCommand.class, TransitiveTraceCommand.class, OptimizeCommand.class})
 public final class MainCLI {
 
     /**

@@ -0,0 +1,126 @@
+/* Licensed under MIT 2025-2026. */
+package edu.kit.kastel.sdq.lissa.cli.command;
+
+import static edu.kit.kastel.sdq.lissa.cli.command.EvaluateCommand.loadConfigs;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import edu.kit.kastel.sdq.lissa.ratlr.Evaluation;
+import edu.kit.kastel.sdq.lissa.ratlr.Optimization;
+
+import picocli.CommandLine;
+
+/**
+ * Command implementation for optimizing prompts used in trace link analysis configurations.
+ * This command processes one or more optimization configuration files to run the prompt
+ * optimization pipeline, and optionally evaluates the optimized prompts using specified
+ * evaluation configuration files.
+ */
+@CommandLine.Command(
+        name = "optimize",
+        mixinStandardHelpOptions = true,
+        description = "Optimizes a prompt for usage in the pipeline")
+public class OptimizeCommand implements Runnable {
+
+    private static final Logger logger = LoggerFactory.getLogger(OptimizeCommand.class);
+
+    /**
+     * Array of optimization configuration file paths to be processed.
+     * If a path points to a directory, all files within that directory will be processed.
+     * This option is required to run the optimization command.
+     */
+    @CommandLine.Option(
+            names = {"-c", "--configs"},
+            arity = "1..*",
+            required = true,
+            description =
+                    "Specifies one or more config paths to be invoked by the pipeline iteratively. If the path points "
+                            + "to a directory, all files inside are chosen to get invoked.")
+    private Path[] optimizationConfigs;
+
+    /**
+     * Array of evaluation configuration file paths to be processed.
+     * If a path points to a directory, all files within that directory will be processed.
+     * This option is optional; if not provided, no evaluation will be performed after optimization.
+     */
+    @CommandLine.Option(
+            names = {"-e", "--eval"},
+            arity = "0..*",
+            description = "Specifies optional evaluation config paths to be invoked by the pipeline iteratively. "
+                    + "Each evaluation configuration will be used with each optimization config. "
+                    + "If the path points to a directory, all files inside are chosen to get invoked.")
+    private Path[] evaluationConfigs;
+
+    /**
+     * Runs the optimization and evaluation pipelines based on the provided configuration files.
+     * It first loads the optimization and evaluation configurations, then executes the evaluation
+     * pipeline for each evaluation configuration. This is the unoptimized baseline evaluation. <br>
+     * After that, it runs the optimization pipeline for
+     * each optimization configuration, and subsequently evaluates the optimized prompt using each
+     * evaluation configuration once more with the optimized prompt instead of the original one.
+     */
+    @Override
+    public void run() {
+        List<Path> configsToOptimize = loadConfigs(optimizationConfigs);
+        List<Path> configsToEvaluate = loadConfigs(evaluationConfigs);
+        logger.info(
+                "Found {} optimization config files and {} evaluation config files to invoke",
+                configsToOptimize.size(),
+                configsToEvaluate.size());
+
+        for (Path evaluationConfig : configsToEvaluate) {
+            runEvaluation(evaluationConfig, "");
+        }
+
+        for (Path optimizationConfig : configsToOptimize) {
+            String optimizedPrompt = runOptimization(optimizationConfig);
+            if (optimizedPrompt.isEmpty()) {
+                logger.warn(
+                        "Skipping evaluation for optimization config '{}' as no optimized prompt was generated.",
+                        optimizationConfig);
+                continue;
+            }
+            for (Path evaluationConfig : configsToEvaluate) {
+                runEvaluation(evaluationConfig, optimizedPrompt);
+            }
+        }
+    }
+
+    /**
+     * Runs the optimization pipeline using the specified configuration file.
+     *
+     * @param optimizationConfig The path to the optimization configuration file
+     * @return The optimized prompt generated by the optimization pipeline
+     */
+    private static String runOptimization(Path optimizationConfig) {
+        logger.info("Invoking the optimization pipeline with '{}'", optimizationConfig);
+        String optimizedPrompt = "";
+        try {
+            var optimization = new Optimization(optimizationConfig);
+            optimizedPrompt = optimization.run();
+        } catch (IOException e) {
+            logger.warn(
+                    "Optimization configuration '{}' threw an exception: {} \n Maybe the file does not exist?",
+                    optimizationConfig,
+                    e.getMessage());
+        }
+        return optimizedPrompt;
+    }
+
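+    /**
+     * Runs the evaluation pipeline using the specified configuration file.
+     *
+     * @param evaluationConfig The path to the evaluation configuration file
+     * @param optimizedPrompt The prompt to evaluate; the caller passes an empty string for the baseline run
+     */
+    private static void runEvaluation(Path evaluationConfig, String optimizedPrompt) {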
+        logger.info("Invoking the evaluation pipeline with '{}'", evaluationConfig);
+        try {
+            var evaluation = new Evaluation(evaluationConfig, optimizedPrompt);
+            evaluation.run();
+        } catch (IOException e) {
+            logger.warn(
+                    "Evaluation configuration '{}' threw an exception: {} \n Maybe the file does not exist?",
+                    evaluationConfig,
+                    e.getMessage());
+        }
+    }
+}
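For orientation, `optimization.run()` above returns the final prompt as a plain string. How the naive iterative optimizer described in the documentation might arrive at that string can be sketched as below. This is an assumption-laden illustration, not the code of `IterativeOptimizer`: `NaiveIterativeSketch` is a hypothetical class, and the `UnaryOperator<String>` stands in for the large language model call.

```java
import java.util.function.UnaryOperator;

// Hypothetical illustration; the real IterativeOptimizer is not shown in this diff.
final class NaiveIterativeSketch {

    /**
     * Asks the model stand-in to improve the prompt a fixed number of times.
     * Each result is carried over to the next iteration without any further checks.
     */
    static String optimize(String initialPrompt, int iterations, UnaryOperator<String> improve) {
        String current = initialPrompt;
        for (int i = 0; i < iterations; i++) {
            current = improve.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in for the LLM: appends a fixed hint instead of querying a model.
        UnaryOperator<String> fakeModel = prompt -> prompt + " Answer concisely.";
        System.out.println(optimize("Are they related?", 1, fakeModel)); // like `simple`
        System.out.println(optimize("Are they related?", 5, fakeModel)); // like `iterative`
    }
}
```

Because nothing checks whether an iteration actually improved the prompt, a later iteration can make the result worse, which is one reason to compare against the baseline evaluation.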