Skip to content

Comments

codeflash-omni-java#1199

Draft
misrasaurabh1 wants to merge 324 commits intomainfrom
omni-java
Draft

codeflash-omni-java#1199
misrasaurabh1 wants to merge 324 commits intomainfrom
omni-java

Conversation

@misrasaurabh1
Copy link
Contributor

No description provided.

Comment on lines +231 to +235
project_root = Path.cwd()

# Check for existing codeflash config in pom.xml or a separate config file
codeflash_config_path = project_root / "codeflash.toml"
if codeflash_config_path.exists():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 70% (0.70x) speedup for should_modify_java_config in codeflash/cli_cmds/init_java.py

⏱️ Runtime : 714 microseconds 421 microseconds (best of 60 runs)

📝 Explanation and details

The optimized code achieves a 69% speedup (714μs → 421μs) by replacing pathlib.Path operations with equivalent os module functions, which have significantly lower overhead.

Key optimizations:

  1. os.getcwd() instead of Path.cwd(): The line profiler shows Path.cwd() took 689,637ns (34.1% of total time) vs os.getcwd() taking only 68,036ns (7.4%). This is a ~10x improvement because Path.cwd() instantiates a Path object and performs additional normalization, while os.getcwd() returns a raw string from a system call.

  2. os.path.join() instead of Path division operator: Constructing the config path via project_root / "codeflash.toml" took 386,582ns (19.1%) vs os.path.join() taking 190,345ns (20.6%). Though the percentage appears similar, the absolute time is ~50% faster because the / operator creates a new Path object with its associated overhead.

  3. os.path.exists() instead of Path.exists(): The existence check dropped from 476,490ns (23.6%) to 223,477ns (24.2%) - roughly 2x faster. The os.path.exists() function directly calls the stat syscall, while Path.exists() goes through Path's object model.

Why this works:
Path objects provide a cleaner API but add object instantiation, method dispatch, and normalization overhead. For simple filesystem checks in initialization code that runs frequently, using lower-level os functions eliminates this overhead while maintaining identical functionality.

Test results:
All test cases show 68-111% speedup across scenarios including:

  • Empty directories (fastest: 82-87% improvement)
  • Large directories with 500 files (68-111% improvement)
  • Edge cases like symlinks and directory-as-file (75-82% improvement)

The optimization is particularly beneficial for CLI initialization code that may run on every command invocation, where sub-millisecond improvements in frequently-called functions compound into noticeable user experience gains.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 23 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
from __future__ import annotations

# imports
import os
from pathlib import Path
from typing import Any

import pytest  # used for our unit tests
from codeflash.cli_cmds.init_java import should_modify_java_config

def test_no_config_file_does_not_prompt_and_returns_true(monkeypatch, tmp_path):
    # Arrange: ensure working directory has no codeflash.toml
    monkeypatch.chdir(tmp_path)  # set cwd to a clean temporary directory

    # Replace Confirm.ask with a function that fails the test if called.
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when no config file exists")

    # Patch the exact attribute that the function imports at runtime.
    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)

    # Act: call function under test
    codeflash_output = should_modify_java_config(); result = codeflash_output # 28.9μs -> 15.9μs (82.0% faster)

def test_config_file_exists_prompts_and_respects_true_choice(monkeypatch, tmp_path):
    # Arrange: create a codeflash.toml file so the function will detect it
    monkeypatch.chdir(tmp_path)
    config_file = tmp_path / "codeflash.toml"
    config_file.write_text("existing = true")  # create the file

    # Capture the arguments passed to Confirm.ask and return True to simulate user acceptance
    called = {}

    def fake_ask(prompt, default, show_default):
        # Record inputs for later assertions
        called["prompt"] = prompt
        called["default"] = default
        called["show_default"] = show_default
        return True

    # Patch Confirm.ask used inside the function
    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 25.6μs -> 13.7μs (86.9% faster)

def test_config_file_exists_prompts_and_respects_false_choice(monkeypatch, tmp_path):
    # Arrange: create the config file
    monkeypatch.chdir(tmp_path)
    (tmp_path / "codeflash.toml").write_text("existing = true")

    # Simulate user declining re-configuration
    def fake_ask_decline(prompt, default, show_default):
        return False

    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask_decline, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 24.7μs -> 13.3μs (86.3% faster)

def test_presence_of_pom_xml_does_not_trigger_prompt(monkeypatch, tmp_path):
    # Arrange: create a pom.xml but NOT codeflash.toml
    monkeypatch.chdir(tmp_path)
    (tmp_path / "pom.xml").write_text("<project></project>")

    # If Confirm.ask is called, fail the test because only codeflash.toml should trigger it in current implementation
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when only pom.xml exists (implementation checks codeflash.toml)")

    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 28.3μs -> 16.6μs (69.9% faster)

def test_codeflash_config_is_directory_triggers_prompt(monkeypatch, tmp_path):
    # Arrange: create a directory named codeflash.toml (Path.exists will be True)
    monkeypatch.chdir(tmp_path)
    (tmp_path / "codeflash.toml").mkdir()

    # Simulate user selecting True
    monkeypatch.setattr("rich.prompt.Confirm.ask", lambda *a, **k: True, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 23.6μs -> 12.9μs (82.2% faster)

def test_codeflash_config_symlink_triggers_prompt_if_supported(monkeypatch, tmp_path):
    # Arrange: attempt to create a symlink to a real file; skip if symlink not supported
    if not hasattr(os, "symlink"):
        pytest.skip("Platform does not support os.symlink; skipping symlink test")

    real = tmp_path / "real_config"
    real.write_text("x = 1")
    link = tmp_path / "codeflash.toml"

    try:
        os.symlink(real, link)  # may fail on Windows without privileges
    except (OSError, NotImplementedError) as e:
        pytest.skip(f"Could not create symlink on this platform/environment: {e}")

    monkeypatch.chdir(tmp_path)

    # Simulate user declining re-configuration
    monkeypatch.setattr("rich.prompt.Confirm.ask", lambda *a, **k: False, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 24.9μs -> 14.2μs (75.7% faster)

def test_large_directory_without_config_is_fast_and_does_not_prompt(monkeypatch, tmp_path):
    # Large scale scenario: create many files (but under 1000) to simulate busy project directory.
    monkeypatch.chdir(tmp_path)
    num_files = 500  # under the 1000 element guideline
    for i in range(num_files):
        # Create many innocuous files; should not affect the function's behavior
        (tmp_path / f"file_{i}.txt").write_text(str(i))

    # Ensure Confirm.ask is not called
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when codeflash.toml is absent even in large directories")

    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 36.3μs -> 21.6μs (68.0% faster)

def test_large_directory_with_config_prompts_once(monkeypatch, tmp_path):
    # Large scale scenario with config present: many files plus codeflash.toml
    monkeypatch.chdir(tmp_path)
    num_files = 500
    for i in range(num_files):
        (tmp_path / f"file_{i}.txt").write_text(str(i))

    # Create the config file that should trigger prompting
    (tmp_path / "codeflash.toml").write_text("reconfigure = maybe")

    # Track how many times Confirm.ask is invoked to ensure single prompt
    counter = {"calls": 0}

    def fake_ask(prompt, default, show_default):
        counter["calls"] += 1
        return True

    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask, raising=True)

    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 30.8μs -> 14.6μs (111% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import os
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

# imports
import pytest
from codeflash.cli_cmds.init_java import should_modify_java_config

class TestShouldModifyJavaConfigBasic:
    """Basic test cases for should_modify_java_config function."""

    def test_no_config_file_exists_returns_true(self):
        """
        Scenario: Project has no existing codeflash.toml file
        Expected: Function returns (True, None) without prompting user
        """
        # Create a temporary directory without codeflash.toml
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_exists_user_confirms(self):
        """
        Scenario: Project has existing codeflash.toml and user confirms re-configuration
        Expected: Function prompts user and returns (True, None) if user confirms
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.touch()

                # Mock the Confirm.ask to return True (user confirms)
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_exists_user_declines(self):
        """
        Scenario: Project has existing codeflash.toml and user declines re-configuration
        Expected: Function prompts user and returns (False, None) if user declines
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.touch()

                # Mock the Confirm.ask to return False (user declines)
                with patch('rich.prompt.Confirm.ask', return_value=False):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_return_tuple_structure(self):
        """
        Scenario: Verify the function always returns a tuple with specific structure
        Expected: Return value is a tuple of (bool, None)
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

class TestShouldModifyJavaConfigEdgeCases:
    """Edge case test cases for should_modify_java_config function."""

    def test_config_file_exists_but_empty(self):
        """
        Scenario: codeflash.toml file exists but is empty
        Expected: File is still considered as existing, prompts user
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create an empty codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.write_text("")

                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_with_content(self):
        """
        Scenario: codeflash.toml file exists with actual TOML content
        Expected: Prompts user regardless of file content
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file with content
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.write_text("[codeflash]\nversion = 1\n")

                with patch('rich.prompt.Confirm.ask', return_value=False):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_case_sensitive(self):
        """
        Scenario: Directory has 'Codeflash.toml' or 'CODEFLASH.TOML' instead of lowercase
        Expected: Function only recognizes 'codeflash.toml' (case-sensitive on Unix)
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a file with different casing
                config_file = Path(tmpdir) / "Codeflash.toml"
                config_file.touch()

                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_is_directory_not_file(self):
        """
        Scenario: codeflash.toml exists as a directory instead of a file
        Expected: Path.exists() still returns True, prompts user
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create codeflash.toml as a directory
                config_dir = Path(tmpdir) / "codeflash.toml"
                config_dir.mkdir()

                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-01T21.20.00

Suggested change
project_root = Path.cwd()
# Check for existing codeflash config in pom.xml or a separate config file
codeflash_config_path = project_root / "codeflash.toml"
if codeflash_config_path.exists():
project_root = os.getcwd()
# Check for existing codeflash config in pom.xml or a separate config file
codeflash_config_path = os.path.join(project_root, "codeflash.toml")
if os.path.exists(codeflash_config_path):

Comment on lines 268 to 270
if os.path.exists("mvnw"):
return "./mvnw"
if os.path.exists("mvnw.cmd"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 32% (0.32x) speedup for find_maven_executable in codeflash/languages/java/build_tools.py

⏱️ Runtime : 584 microseconds 441 microseconds (best of 81 runs)

📝 Explanation and details

The optimization achieves a 32% runtime improvement (from 584μs to 441μs) by replacing os.path.exists() with os.access() for file existence checks. This change delivers measurable performance gains across all test scenarios.

Key Optimization:
The code replaces os.path.exists("mvnw") with os.access("mvnw", os.F_OK). While both functions check for file existence, os.access() with the os.F_OK flag is more efficient because:

  • It performs a direct system call (access()) that's optimized for permission/existence checks
  • os.path.exists() internally does additional path normalization and exception handling that adds overhead
  • For simple existence checks, os.access() avoids Python-level abstraction layers

Performance Impact by Scenario:
The line profiler shows that the wrapper checks (lines checking for "mvnw" and "mvnw.cmd") improved from ~576ns + 139ns to ~317ns + 76ns - nearly 2x faster for these critical paths. Test results confirm consistent improvements:

  • Wrapper present cases: 68-84% faster (5.78μs → 3.32μs)
  • No wrapper, system Maven cases: 31-52% faster
  • Edge cases (directories, symlinks): 56-77% faster

Why This Matters:
Based on the function references, find_maven_executable() is called from test infrastructure and build tool detection code. While not in an obvious hot loop, build tool detection typically occurs at project initialization and in test setup/teardown - contexts where this function may be called repeatedly. The optimization is particularly valuable when:

  • Running large test suites that reinitialize build contexts frequently
  • Working in CI/CD environments with repeated project setup
  • Dealing with directories containing many files (test shows 77% improvement with 500 files present)

The optimization maintains identical semantics - both os.path.exists() and os.access(..., os.F_OK) return True for files, directories, and symlinks, ensuring backward compatibility while delivering consistent double-digit runtime improvements.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 34 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 1 Passed
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
import os
import pathlib
import shutil

import pytest  # used for our unit tests
from codeflash.languages.java.build_tools import find_maven_executable

def test_prefers_mvnw_wrapper_when_present(tmp_path, monkeypatch):
    # Create an isolated temporary directory and switch to it
    # so os.path.exists checks only our test files.
    monkeypatch.chdir(tmp_path)

    # Create a file named "mvnw" to simulate the Maven wrapper being present.
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")

    # Call the real function under test and assert it returns the wrapper path.
    # According to implementation, when "mvnw" exists it should return "./mvnw".
    codeflash_output = find_maven_executable() # 5.78μs -> 3.32μs (74.3% faster)

def test_returns_mvnw_cmd_when_only_windows_wrapper_exists(tmp_path, monkeypatch):
    # Switch to a fresh temporary directory for isolation.
    monkeypatch.chdir(tmp_path)

    # Create only "mvnw.cmd" and ensure no plain "mvnw" exists.
    (tmp_path / "mvnw.cmd").write_text("@echo off\necho mvnw.cmd\n")

    # The function should detect "mvnw.cmd" and return that exact string.
    codeflash_output = find_maven_executable() # 13.2μs -> 7.16μs (84.0% faster)

def test_prefers_mvnw_over_mvnw_cmd_when_both_present(tmp_path, monkeypatch):
    # Ensure both wrapper files exist; "mvnw" should be preferred because it's checked first.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    (tmp_path / "mvnw.cmd").write_text("@echo off\necho mvnw.cmd\n")

    # Confirm that "./mvnw" is returned, demonstrating the precedence.
    codeflash_output = find_maven_executable() # 5.58μs -> 3.32μs (68.3% faster)

def test_returns_system_mvn_when_no_wrappers(monkeypatch, tmp_path):
    # Make sure current directory has no wrapper files.
    monkeypatch.chdir(tmp_path)

    # Monkeypatch shutil.which to simulate an installed mvn on PATH.
    monkeypatch.setattr(shutil, "which", lambda name: "/usr/bin/mvn" if name == "mvn" else None)

    # The function should return whatever shutil.which returns when no wrappers present.
    codeflash_output = find_maven_executable() # 14.0μs -> 9.18μs (52.3% faster)

def test_returns_none_when_nothing_found(monkeypatch, tmp_path):
    # No wrapper files in cwd.
    monkeypatch.chdir(tmp_path)

    # Simulate no mvn on PATH by returning None (or falsy string).
    monkeypatch.setattr(shutil, "which", lambda name: None)

    # Expect None when neither wrapper nor system Maven is found.
    codeflash_output = find_maven_executable() # 13.6μs -> 8.93μs (52.2% faster)

def test_ignores_empty_string_from_which(monkeypatch, tmp_path):
    # If shutil.which returns an empty string (falsy), function should treat it as not found.
    monkeypatch.chdir(tmp_path)
    monkeypatch.setattr(shutil, "which", lambda name: "")

    # Expect None because empty string is falsy and treated like "not found".
    codeflash_output = find_maven_executable() # 13.3μs -> 8.87μs (49.5% faster)

def test_directory_named_mvnw_counts_as_exists(tmp_path, monkeypatch):
    # Create a directory named "mvnw" (os.path.exists returns True for directories).
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").mkdir()

    # The function checks os.path.exists only, so it should return "./mvnw" even if it's a directory.
    codeflash_output = find_maven_executable() # 5.50μs -> 3.11μs (77.1% faster)

def test_symlink_wrapper_to_existing_target(tmp_path, monkeypatch):
    # Create a real target file and a symlink named "mvnw" pointing to it.
    monkeypatch.chdir(tmp_path)
    target = tmp_path / "real_mvnw"
    target.write_text("#!/bin/sh\necho real\n")
    symlink = tmp_path / "mvnw"
    # Create a symlink; ensure platform supports it (on Windows this may require admin, so skip if not possible).
    try:
        symlink.symlink_to(target)
    except (OSError, NotImplementedError):
        pytest.skip("Symlinks not supported in this environment")
    # The symlink points to an existing file, so os.path.exists should be True and wrapper detected.
    codeflash_output = find_maven_executable() # 7.11μs -> 4.56μs (56.1% faster)

def test_wrapper_has_precedence_over_system_mvn(monkeypatch, tmp_path):
    # Even if shutil.which finds a system mvn, a wrapper present in cwd must take precedence.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    monkeypatch.setattr(shutil, "which", lambda name: "/usr/local/bin/mvn")

    # Confirm wrapper is returned, not the system path.
    codeflash_output = find_maven_executable() # 5.59μs -> 3.33μs (68.1% faster)

def test_large_number_of_files_with_wrapper_present(tmp_path, monkeypatch):
    # Create many files to simulate a crowded project directory.
    monkeypatch.chdir(tmp_path)

    # Create 500 dummy files (well under the 1000-element limit).
    for i in range(500):
        (tmp_path / f"file_{i}.txt").write_text(f"dummy {i}")

    # Place the wrapper among many files and confirm detection remains correct.
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")

    # The function should still return the wrapper path quickly and correctly.
    codeflash_output = find_maven_executable() # 6.15μs -> 3.47μs (77.4% faster)

def test_large_number_of_files_without_wrapper_uses_system_mvn(monkeypatch, tmp_path):
    # With many files but no wrapper, the function should fall back to shutil.which.
    monkeypatch.chdir(tmp_path)

    for i in range(250):
        (tmp_path / f"other_{i}.data").write_text("x" * 10)

    # Simulate a system Maven found on PATH.
    monkeypatch.setattr(shutil, "which", lambda name: r"C:\Program Files\Apache\Maven\bin\mvn.bat" if name == "mvn" else None)

    # Return should be the system path provided by shutil.which.
    codeflash_output = find_maven_executable() # 22.0μs -> 16.7μs (31.6% faster)

def test_multiple_invocations_return_same_result(tmp_path, monkeypatch):
    # Ensure stable behavior across multiple calls with same environment.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    codeflash_output = find_maven_executable(); first = codeflash_output # 5.66μs -> 3.30μs (71.7% faster)
    codeflash_output = find_maven_executable(); second = codeflash_output # 2.88μs -> 1.66μs (73.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import os
import shutil
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

import pytest
from codeflash.languages.java.build_tools import find_maven_executable

def test_finds_mvnw_in_current_directory():
    """Test that find_maven_executable returns ./mvnw when mvnw exists in current directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw file
            mvnw_path = os.path.join(tmpdir, "mvnw")
            Path(mvnw_path).touch()
            
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_mvnw_cmd_in_current_directory():
    """Test that find_maven_executable returns mvnw.cmd when mvnw.cmd exists and mvnw does not."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd file
            mvnw_cmd_path = os.path.join(tmpdir, "mvnw.cmd")
            Path(mvnw_cmd_path).touch()
            
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_prefers_mvnw_over_mvnw_cmd():
    """Test that find_maven_executable prefers ./mvnw over mvnw.cmd when both exist."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create both mvnw and mvnw.cmd files
            Path(os.path.join(tmpdir, "mvnw")).touch()
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_system_maven_when_wrappers_not_present():
    """Test that find_maven_executable finds system Maven when wrappers are not present."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_called_once_with("mvn")
        finally:
            os.chdir(original_dir)

def test_returns_none_when_no_maven_found():
    """Test that find_maven_executable returns None when no Maven executable is found."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_mvnw_wrapper_takes_priority_over_system_maven():
    """Test that ./mvnw is returned even when system Maven is available."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw file
            Path(os.path.join(tmpdir, "mvnw")).touch()
            
            # Mock shutil.which to return a system maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_not_called()
        finally:
            os.chdir(original_dir)

def test_mvnw_cmd_takes_priority_over_system_maven():
    """Test that mvnw.cmd is returned even when system Maven is available."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd file
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            
            # Mock shutil.which to return a system maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_not_called()
        finally:
            os.chdir(original_dir)

def test_handles_system_maven_with_absolute_path():
    """Test that find_maven_executable correctly returns absolute path for system Maven."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return an absolute path
            with patch('shutil.which') as mock_which:
                absolute_path = "/opt/maven/bin/mvn"
                mock_which.return_value = absolute_path
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_handles_system_maven_with_relative_path():
    """Test that find_maven_executable correctly returns relative path for system Maven."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a relative path
            with patch('shutil.which') as mock_which:
                relative_path = "./bin/mvn"
                mock_which.return_value = relative_path
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_mvnw_exists_as_directory_not_file():
    """Test behavior when 'mvnw' exists but is a directory, not a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw as a directory
            os.makedirs(os.path.join(tmpdir, "mvnw"))
            
            # Mock shutil.which to return None (so it falls through to system check)
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_mvnw_cmd_exists_as_directory_not_file():
    """Test behavior when 'mvnw.cmd' exists but is a directory, not a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd as a directory
            os.makedirs(os.path.join(tmpdir, "mvnw.cmd"))
            
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_empty_string_from_system_maven():
    """Test handling when shutil.which returns an empty string."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return an empty string
            with patch('shutil.which') as mock_which:
                mock_which.return_value = ""
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_whitespace_string_from_system_maven():
    """Test handling when shutil.which returns a whitespace string."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a whitespace string
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "   "
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_maven_in_directory_with_many_files():
    """Test that find_maven_executable works correctly in a directory with many files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files in the directory
            for i in range(100):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            
            # Create mvnw
            Path(os.path.join(tmpdir, "mvnw")).touch()
            
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_mvnw_cmd_in_directory_with_many_files():
    """Test that find_maven_executable finds mvnw.cmd in a directory with many files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files in the directory
            for i in range(100):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            
            # Create mvnw.cmd
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_performance_with_no_maven_in_large_directory():
    """Test that find_maven_executable performs well when returning None in a large directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files to simulate a large project directory
            for i in range(500):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_multiple_calls_return_consistent_results():
    """Test that multiple calls to find_maven_executable return consistent results."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw
            Path(os.path.join(tmpdir, "mvnw")).touch()
            
            # Call find_maven_executable multiple times
            results = [find_maven_executable() for _ in range(50)]
        finally:
            os.chdir(original_dir)

def test_switching_directories_finds_correct_maven():
    """Test that find_maven_executable correctly finds Maven when switching directories."""
    with tempfile.TemporaryDirectory() as tmpdir1:
        with tempfile.TemporaryDirectory() as tmpdir2:
            original_dir = os.getcwd()
            try:
                # First directory with mvnw
                os.chdir(tmpdir1)
                Path(os.path.join(tmpdir1, "mvnw")).touch()
                codeflash_output = find_maven_executable(); result1 = codeflash_output
                
                # Second directory without mvnw
                os.chdir(tmpdir2)
                with patch('shutil.which') as mock_which:
                    mock_which.return_value = "/usr/bin/mvn"
                    codeflash_output = find_maven_executable(); result2 = codeflash_output
            finally:
                os.chdir(original_dir)

def test_finds_system_maven_with_long_path():
    """Test that find_maven_executable handles system Maven with a very long path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create a very long path for Maven
            long_path = "/very/long/path/" + "subdirectory/" * 50 + "mvn"
            
            with patch('shutil.which') as mock_which:
                mock_which.return_value = long_path
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_system_maven_with_special_characters_in_path():
    """Test that find_maven_executable handles system Maven with special characters in path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create a path with special characters
            special_path = "/opt/maven-3.8.1/bin/mvn"
            
            with patch('shutil.which') as mock_which:
                mock_which.return_value = special_path
                
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from codeflash.languages.java.build_tools import find_maven_executable

def test_find_maven_executable():
    find_maven_executable()
🔎 Click to see Concolic Coverage Tests
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
codeflash_concolic_34v0t72u/tmp1x2llvvp/test_concolic_coverage.py::test_find_maven_executable 81.3μs 78.4μs 3.65%✅

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-01T23.07.44

Suggested change
if os.path.exists("mvnw"):
return "./mvnw"
if os.path.exists("mvnw.cmd"):
if os.access("mvnw", os.F_OK):
return "./mvnw"
if os.access("mvnw.cmd", os.F_OK):

Comment on lines +830 to +852
while pos < len(content):
next_open = content.find(open_tag, pos)
next_open_short = content.find(open_tag_short, pos)
next_close = content.find(close_tag, pos)

if next_close == -1:
return -1

# Find the earliest opening tag (if any)
candidates = [x for x in [next_open, next_open_short] if x != -1 and x < next_close]
next_open_any = min(candidates) if candidates else len(content) + 1

if next_open_any < next_close:
# Found opening tag first - nested tag
depth += 1
pos = next_open_any + 1
else:
# Found closing tag first
depth -= 1
if depth == 0:
return next_close
pos = next_close + len(close_tag)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 84% (0.84x) speedup for _find_closing_tag in codeflash/languages/java/build_tools.py

⏱️ Runtime : 1.01 milliseconds 548 microseconds (best of 233 runs)

📝 Explanation and details

The optimized code achieves an 83% speedup (from 1.01ms to 548μs) by fundamentally changing the search strategy from multiple independent substring searches to a single progressive scan.

Key Optimization:

The original code performs three separate content.find() calls per iteration to locate <tag>, <tag , and </tag> patterns, then constructs a candidate list to determine which appears first. This results in redundant scanning of the same content regions multiple times.

The optimized version instead:

  1. Finds the next < character once with content.find("<", pos)
  2. Uses content.startswith() at that position to check if it's a relevant opening or closing tag
  3. Eliminates the candidate list construction and min() operation

Why This Is Faster:

  • Reduced string searches: One find("<") call instead of three find() calls searching for longer patterns
  • Earlier bailout: When no < is found, we immediately return -1 without further checks
  • Eliminated allocations: No list comprehension creating the candidates list on each iteration
  • Better locality: startswith() checks are O(k) where k is the tag length, performed only once at the found position

Performance Characteristics:

The test results show the optimization excels with:

  • Nested same-name tags: test_large_nested_tags_scalability shows 680% speedup (713μs → 91.5μs) for 200 nested levels
  • Simple structures: Most simple cases show 50-100% speedup (e.g., test_basic_single_pair 55.9% faster)
  • Missing closing tags: test_performance_with_large_string_no_match shows 745% speedup (13.7μs → 1.62μs)

The optimization performs slightly worse on content with many different tag types at the same level (e.g., test_large_content_simple 90% slower) because it must scan through more < characters that aren't relevant to the target tag. However, the overall runtime improvement in typical XML parsing scenarios (nested same-name tags, sequential scanning) makes this an excellent trade-off.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 53 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 3 Passed
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
from __future__ import annotations

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.build_tools import _find_closing_tag

def test_basic_single_pair():
    # Basic: single matching pair should return the index of the closing tag
    content = "<root>hello</root>"
    start = content.find("<root")  # position of the opening tag
    expected_close = content.find("</root>")  # expected position of closing tag
    # The function should find the closing tag start index
    codeflash_output = _find_closing_tag(content, start, "root") # 2.65μs -> 1.70μs (55.9% faster)

def test_nested_same_tag_simple():
    # Nested tags of same name: outer must match its own closing tag, not inner
    content = "<a><a>inner</a>outer</a>"
    start_outer = content.find("<a>")  # first opening tag
    # expected closing for outermost is the last occurrence of "</a>"
    expected_outer_close = content.rfind("</a>")
    codeflash_output = _find_closing_tag(content, start_outer, "a") # 5.10μs -> 2.63μs (93.5% faster)

def test_with_attributes_and_spaces():
    # Opening tags with attributes (using "<tag " form) must be recognized as openings
    content = "<tag attr='1'>text<tag attr2='2'>inner</tag></tag>"
    start = content.find("<tag")  # first opening (with attributes)
    expected_close = content.rfind("</tag>")
    codeflash_output = _find_closing_tag(content, start, "tag") # 5.09μs -> 2.60μs (96.1% faster)

def test_missing_closing_returns_minus_one():
    # When a closing tag is missing entirely, the function should return -1
    content = "<x>no close here"
    start = content.find("<x")
    codeflash_output = _find_closing_tag(content, start, "x") # 1.75μs -> 1.36μs (28.7% faster)

def test_similar_tag_names_not_confused():
    # Ensure tags with similar names (e.g., <a> vs <ab>) do not confuse matching
    content = "<a><ab></ab></a>"
    start = content.find("<a")
    expected_close = content.find("</a>")
    # The function should match the </a> closing tag, not get fooled by <ab>
    codeflash_output = _find_closing_tag(content, start, "a") # 2.58μs -> 2.50μs (3.61% faster)

def test_self_closing_tag_returns_minus_one():
    # Self-closing tags like <a/> have no corresponding </a>, so result should be -1
    content = "<a/>"
    start = content.find("<a")
    # Even though start points to the tag, there is no closing tag, so expect -1
    codeflash_output = _find_closing_tag(content, start, "a") # 1.55μs -> 1.27μs (22.1% faster)

def test_start_pos_not_zero_and_multiple_instances():
    # When there are multiple sibling tags, ensure we can target the second one by start_pos
    content = "pre<a>one</a><a>two</a>post"
    # locate the second <a> by searching after the first one
    first = content.find("<a>")
    second = content.find("<a>", first + 1)
    expected_close_second = content.find("</a>", second)
    # The function should find the closing tag corresponding to the second opening
    codeflash_output = _find_closing_tag(content, second, "a") # 2.35μs -> 1.43μs (64.3% faster)

def test_open_tag_with_space_only_and_plain_variant_later():
    # If only an open_tag_short appears (i.e., "<tag " with attributes) before a closing,
    # the algorithm must still count it as an opening.
    content = "<b attr=1><b>inner</b></b>"
    start = content.find("<b")
    # ensure that the outer closing is matched
    expected_close_outer = content.rfind("</b>")
    codeflash_output = _find_closing_tag(content, start, "b") # 4.91μs -> 2.40μs (105% faster)

def test_partial_start_pos_inside_opening_still_finds_closing():
    # If start_pos is slightly offset (caller error), the code still attempts to find a closing.
    # This ensures the function is somewhat robust to non-zero offsets inside the opening tag.
    content = "<a>text</a>"
    actual_open = content.find("<a>")
    # pick a start_pos one character after the '<' (inside the opening)
    start_offset = actual_open + 1
    # Even if start_pos is not exactly the '<', the function should still locate the closing tag
    expected_close = content.find("</a>")
    codeflash_output = _find_closing_tag(content, start_offset, "a") # 2.36μs -> 1.44μs (63.8% faster)

def test_multiple_opening_variants_only_open_tag_short_exists():
    # Only "<tag " variant exists (no plain "<tag>") - ensure detection of nested openings works
    content = "<div class='x'><div id='y'></div></div>"
    start = content.find("<div")
    expected_close = content.rfind("</div>")
    codeflash_output = _find_closing_tag(content, start, "div") # 4.86μs -> 2.60μs (86.5% faster)

def test_large_nested_tags_scalability():
    # Large-scale nested tags to test stack/depth handling but keep under 1000 elements.
    # Create 200 nested tags: <t><t>...x...</t></t>...
    depth = 200
    open_tags = "<t>" * depth
    close_tags = "</t>" * depth
    content = open_tags + "X" + close_tags
    # start position of the outermost opening tag
    start = content.find("<t")
    # The closing index for the outermost is the last </t>
    expected_outer_close = content.rfind("</t>")
    # The function should handle many nested levels and return the outermost closing index
    codeflash_output = _find_closing_tag(content, start, "t") # 713μs -> 91.5μs (680% faster)

def test_interleaved_other_tags_do_not_affect_depth():
    # Tags of other names between nested tags should not affect counting for the target tag_name.
    content = "<x><a><b></b><a><b></b></a></a></x>"
    # There are nested <a> tags with other tags interleaved; find the outermost <a>
    start = content.find("<a")
    # expected closing is the last </a> corresponding to the outermost
    expected_close = content.rfind("</a>")
    codeflash_output = _find_closing_tag(content, start, "a") # 5.06μs -> 3.96μs (27.8% faster)

def test_no_opening_tag_at_start_pos_returns_minus_one_or_misleading():
    # If start_pos points past any opening tag (e.g., at end of content), the function should return -1
    content = "<z></z>"
    # choose a start_pos beyond content length to simulate incorrect caller input
    start = len(content) + 5
    # Since pos will be >= len(content), the while loop will not execute and -1 is returned
    codeflash_output = _find_closing_tag(content, start, "z") # 1.12μs -> 1.28μs (12.5% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.build_tools import _find_closing_tag

def test_simple_single_tag():
    """Test finding closing tag for a simple tag with no nesting."""
    content = "<root>content</root>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.75μs -> 1.78μs (54.0% faster)

def test_simple_tag_with_content():
    """Test finding closing tag for a tag containing text content."""
    content = "<div>Hello World</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.67μs -> 1.81μs (47.5% faster)

def test_tag_with_whitespace_content():
    """Test finding closing tag when content contains whitespace."""
    content = "<span>   </span>"
    codeflash_output = _find_closing_tag(content, 0, "span"); result = codeflash_output # 2.67μs -> 1.73μs (53.8% faster)

def test_empty_tag():
    """Test finding closing tag for an empty tag."""
    content = "<empty></empty>"
    codeflash_output = _find_closing_tag(content, 0, "empty"); result = codeflash_output # 2.58μs -> 1.63μs (57.6% faster)

def test_tag_with_attributes():
    """Test finding closing tag for a tag with attributes."""
    content = '<element class="test">content</element>'
    codeflash_output = _find_closing_tag(content, 0, "element"); result = codeflash_output # 2.58μs -> 1.68μs (53.6% faster)

def test_tag_with_multiple_attributes():
    """Test finding closing tag for a tag with multiple attributes."""
    content = '<div id="main" class="container">text</div>'
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.70μs -> 1.79μs (50.3% faster)

def test_no_closing_tag():
    """Test when closing tag is missing - should return -1."""
    content = "<root>content"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 1.79μs -> 1.42μs (26.2% faster)

def test_nested_tags_one_level():
    """Test finding closing tag with one level of nesting."""
    content = "<parent><child></child></parent>"
    codeflash_output = _find_closing_tag(content, 0, "parent"); result = codeflash_output # 2.67μs -> 2.67μs (0.000% faster)

def test_nested_tags_multiple_levels():
    """Test finding closing tag with multiple levels of nesting."""
    content = "<a><b><c></c></b></a>"
    codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 2.75μs -> 3.41μs (19.4% slower)

def test_nested_tags_same_name():
    """Test finding closing tag when nested tags have the same name."""
    content = "<div>outer<div>inner</div>text</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 5.21μs -> 2.62μs (98.5% faster)

def test_nested_tags_same_name_multiple():
    """Test multiple nested tags of the same name."""
    content = "<tag>level1<tag>level2</tag>level1</tag>"
    codeflash_output = _find_closing_tag(content, 0, "tag"); result = codeflash_output # 4.81μs -> 2.50μs (92.1% faster)

def test_closing_tag_at_end():
    """Test when closing tag is at the very end of content."""
    content = "<root>text</root>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.62μs -> 1.68μs (55.9% faster)

def test_tag_name_is_single_character():
    """Test with single character tag name."""
    content = "<a>content</a>"
    codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 2.57μs -> 1.74μs (47.7% faster)

def test_tag_name_is_long():
    """Test with long tag name."""
    content = "<verylongtagnamethatiscomplex>content</verylongtagnamethatiscomplex>"
    codeflash_output = _find_closing_tag(content, 0, "verylongtagnamethatiscomplex"); result = codeflash_output # 2.73μs -> 1.78μs (52.8% faster)

def test_tag_with_numbers():
    """Test tag name containing numbers."""
    content = "<div2>text</div2>"
    codeflash_output = _find_closing_tag(content, 0, "div2"); result = codeflash_output # 2.53μs -> 1.64μs (54.2% faster)

def test_tag_with_hyphens():
    """Test tag name containing hyphens."""
    content = "<my-tag>content</my-tag>"
    codeflash_output = _find_closing_tag(content, 0, "my-tag"); result = codeflash_output # 2.56μs -> 1.71μs (49.6% faster)

def test_nested_different_tags():
    """Test nested tags with different names."""
    content = "<outer><inner>text</inner></outer>"
    codeflash_output = _find_closing_tag(content, 0, "outer"); result = codeflash_output # 2.62μs -> 2.79μs (6.08% slower)

def test_multiple_nested_with_attributes():
    """Test nested tags where some have attributes."""
    content = '<root id="1"><child class="x">content</child></root>'
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.63μs -> 2.58μs (1.93% faster)

def test_tag_with_attribute_containing_tag_like_string():
    """Test tag with attribute value containing tag-like content."""
    content = '<div data="<test>">content</div>'
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.65μs -> 2.28μs (16.2% faster)

def test_start_pos_not_zero():
    """Test when start_pos is not at the beginning."""
    content = "text<root>content</root>more"
    codeflash_output = _find_closing_tag(content, 4, "root"); result = codeflash_output # 2.50μs -> 1.70μs (46.4% faster)

def test_deeply_nested_same_tags():
    """Test deeply nested tags with the same name."""
    content = "<x><x><x></x></x></x>"
    codeflash_output = _find_closing_tag(content, 0, "x"); result = codeflash_output # 6.69μs -> 3.00μs (123% faster)

def test_tag_with_newlines():
    """Test tag with newline characters in content."""
    content = "<div>\nline1\nline2\n</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.62μs -> 1.72μs (52.4% faster)

def test_tag_with_tabs():
    """Test tag with tab characters in content."""
    content = "<div>\ttab\tcontent\t</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.52μs -> 1.71μs (47.4% faster)

def test_consecutive_opening_tags():
    """Test multiple consecutive opening tags of the same name."""
    content = "<span><span>text</span></span>"
    codeflash_output = _find_closing_tag(content, 0, "span"); result = codeflash_output # 4.99μs -> 2.56μs (94.5% faster)

def test_tag_after_first_but_before_close():
    """Test when there's another tag between opening and closing."""
    content = "<root><other>text</other></root>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.67μs -> 2.69μs (1.11% slower)

def test_closing_tag_without_corresponding_opening():
    """Test when there's a closing tag but it doesn't match our opening."""
    content = "<root>text</other>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 1.75μs -> 2.02μs (13.3% slower)

def test_tag_name_with_underscore():
    """Test tag name with underscore characters."""
    content = "<my_tag>content</my_tag>"
    codeflash_output = _find_closing_tag(content, 0, "my_tag"); result = codeflash_output # 2.63μs -> 1.68μs (56.6% faster)

def test_very_short_content():
    """Test with minimal content - just opening tag."""
    content = "<x>"
    codeflash_output = _find_closing_tag(content, 0, "x"); result = codeflash_output # 1.68μs -> 1.40μs (20.0% faster)

def test_tag_with_self_closing_like_syntax():
    """Test tag that might look self-closing but isn't."""
    content = "<br />content</br>"
    codeflash_output = _find_closing_tag(content, 5, "br"); result = codeflash_output # 2.64μs -> 1.72μs (53.5% faster)

def test_large_content_simple():
    """Test with large content size but simple structure."""
    # Create content with many nested levels (up to 100 levels)
    opening = "".join(f"<tag{i}>" for i in range(100))
    closing = "".join(f"</tag{i}>" for i in range(99, -1, -1))
    content = opening + "CONTENT" + closing
    
    # Find the closing tag for the first tag
    codeflash_output = _find_closing_tag(content, 0, "tag0"); result = codeflash_output # 6.07μs -> 62.7μs (90.3% slower)

def test_large_content_wide_structure():
    """Test with many tags at the same level."""
    # Create content with many sibling tags
    content = "<root>"
    for i in range(100):
        content += f"<item{i}>content</item{i}>"
    content += "</root>"
    
    # Find the closing tag for root
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 6.57μs -> 63.2μs (89.6% slower)

def test_large_nested_tags_finding_correct_close():
    """Test that with many nested tags, we find the correct closing tag."""
    # Create deeply nested structure: <a><b><c>...<z></z>...</c></b></a>
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    opening = "".join(f"<{char}>" for char in alphabet)
    closing = "".join(f"</{char}>" for char in reversed(alphabet))
    content = opening + "CORE" + closing
    
    # Find the closing tag for 'a' (the outermost)
    codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 3.12μs -> 16.8μs (81.4% slower)

def test_large_content_with_many_attributes():
    """Test with large content containing tags with many attributes."""
    # Create a tag with many attributes
    attributes = ' '.join(f'attr{i}="value{i}"' for i in range(50))
    content = f'<root {attributes}>content</root>'
    
    # Find the closing tag
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 4.56μs -> 1.88μs (142% faster)

def test_large_content_mixed_nesting():
    """Test with large content containing mixed nesting patterns."""
    # Create content with alternating levels of nesting
    content = "<root>"
    for i in range(50):
        content += f"<level1{i}><level2{i}>content</level2{i}></level1{i}>"
    content += "</root>"
    
    # Find the closing tag for root
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 6.81μs -> 62.9μs (89.2% slower)

def test_large_content_same_name_nesting():
    """Test with many nested tags of the same name."""
    # Create content with 50 levels of the same tag nested
    content = ""
    for i in range(50):
        content += "<div>"
    content += "CONTENT"
    for i in range(50):
        content += "</div>"
    
    # Find the closing tag for the first div
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 102μs -> 24.2μs (325% faster)

def test_large_content_finding_middle_tag():
    """Test finding a closing tag for a tag in the middle of large content."""
    # Create content with multiple root-level tags
    content = "<root1>content</root1>"
    content += "<root2><nested>content</nested></root2>"
    for i in range(50):
        content += f"<item{i}>content</item{i}>"
    
    # Find the closing tag for root2 which has nesting
    start_pos = content.find("<root2>")
    codeflash_output = _find_closing_tag(content, start_pos, "root2"); result = codeflash_output # 3.87μs -> 2.58μs (49.6% faster)

def test_performance_with_large_string_no_match():
    """Test performance when there's no closing tag in large content."""
    # Create large content without closing tag
    content = "<root>" + "x" * 10000
    
    # Should return -1 efficiently
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 13.7μs -> 1.62μs (745% faster)

def test_large_content_multiple_tag_searches():
    """Test finding closing tags for multiple tags in large content."""
    # Create content with nested different tag types
    content = "<wrapper>"
    for i in range(100):
        content += f"<container{i}><item>data</item></container{i}>"
    content += "</wrapper>"
    
    # Find the closing tag for wrapper
    codeflash_output = _find_closing_tag(content, 0, "wrapper"); result = codeflash_output # 7.97μs -> 123μs (93.5% slower)

def test_large_content_with_special_characters():
    """Test large content with special characters in values."""
    # Create content with special characters
    special_chars = "!@#$%^&*()_+-=[]{}|;:',.<>?/~`"
    content = f"<root data=\"{special_chars * 10}\">content</root>"
    
    # Find the closing tag
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 3.24μs -> 5.34μs (39.4% slower)

def test_large_content_with_xml_entities():
    """Test large content with XML entities."""
    # Create content with XML entities
    content = "<root>Text with &lt; &gt; &amp; entities</root>"
    
    # Find the closing tag
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.69μs -> 1.73μs (54.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from codeflash.languages.java.build_tools import _find_closing_tag

def test__find_closing_tag():
    _find_closing_tag('<></>', -1, '')

def test__find_closing_tag_2():
    _find_closing_tag('', -2, '')

def test__find_closing_tag_3():
    _find_closing_tag('</>', -1, '')
🔎 Click to see Concolic Coverage Tests
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag 4.23μs 2.50μs 69.5%✅
codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag_2 1.79μs 1.44μs 24.3%✅
codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag_3 2.48μs 1.67μs 47.9%✅

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-01T23.32.35

Click to see suggested changes
Suggested change
while pos < len(content):
next_open = content.find(open_tag, pos)
next_open_short = content.find(open_tag_short, pos)
next_close = content.find(close_tag, pos)
if next_close == -1:
return -1
# Find the earliest opening tag (if any)
candidates = [x for x in [next_open, next_open_short] if x != -1 and x < next_close]
next_open_any = min(candidates) if candidates else len(content) + 1
if next_open_any < next_close:
# Found opening tag first - nested tag
depth += 1
pos = next_open_any + 1
else:
# Found closing tag first
depth -= 1
if depth == 0:
return next_close
pos = next_close + len(close_tag)
len_close = len(close_tag)
# Scan for the next '<' and then determine whether it's an open/close of interest.
while True:
next_lt = content.find("<", pos)
if next_lt == -1:
return -1
# Check for the relevant closing tag first
if content.startswith(close_tag, next_lt):
# Found closing tag first
depth -= 1
if depth == 0:
return next_lt
pos = next_lt + len_close
continue
# Check for nested opening tags of the exact forms we consider
if content.startswith(open_tag, next_lt) or content.startswith(open_tag_short, next_lt):
depth += 1
pos = next_lt + 1
continue
# Not an open/close we're tracking; move on
pos = next_lt + 1

Comment on lines +369 to +372
part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
parts.append(part_text)

return " ".join(parts).strip()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 33% (0.33x) speedup for _extract_type_declaration in codeflash/languages/java/context.py

⏱️ Runtime : 133 microseconds 100 microseconds (best of 15 runs)

📝 Explanation and details

The optimized code achieves a 33% runtime improvement (from 133μs to 100μs) by deferring UTF-8 decoding until after joining all byte slices together, rather than decoding each part individually.

Key Optimization:

The original code decoded each child node's byte slice immediately:

part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
parts.append(part_text)
return " ".join(parts).strip()

The optimized code collects raw byte slices first, then performs a single decode operation:

parts.append(source_bytes[child.start_byte : child.end_byte])
return b" ".join(parts).decode("utf8").strip()

Why This is Faster:

  1. Reduced decode operations: Instead of calling decode("utf8") once per child node (~527 times in profiled runs), the optimization calls it just once on the final joined bytes
  2. Byte-level joining: b" ".join() on bytes is faster than " ".join() on strings, as it operates on raw bytes without character encoding overhead
  3. Better memory efficiency: Avoids creating intermediate string objects for each part

Performance Impact by Test Case:

The optimization shows particularly strong gains on tests with many tokens:

  • 37.6% faster on large-scale test with 500 tokens
  • 15-16% faster on typical multi-token declarations (interface, enum, unknown types)
  • Neutral/slight regression on trivial cases (empty children) where the overhead is negligible

Line Profiler Evidence:

The bottleneck shifted from line 27 in the original (34.3% of time spent on decode + slice) to line 26 in the optimized version (44.2% on append only, but with 23% less total time overall). The single decode at return now takes 3.1% vs the original's 23.2% spent on multiple appends of decoded strings.

This optimization is particularly valuable for parsing Java files with complex type declarations containing many modifiers, annotations, and generic type parameters.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 8 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
from __future__ import annotations

from types import \
    SimpleNamespace  # used to create lightweight node-like objects

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.context import _extract_type_declaration
from tree_sitter import Node

# Helper utilities for tests ---------------------------------------------------

def _make_children_from_tokens_and_body(source: bytes, token_texts: list[str], body_index: int | None, body_type_name: str):
    """
    Construct a list of SimpleNamespace children where each token corresponds to a
    slice in `source`. Tokens are expected to appear in `source` separated by a single
    space. `body_index` indicates the index in token_texts at which a body node should
    be inserted; if None, no body node is inserted.
    Each produced child has attributes: type, start_byte, end_byte.
    """
    children = []
    # locate tokens sequentially in source to compute byte offsets
    offset = 0
    # Copy token_texts to avoid mutating caller's list
    for idx, token in enumerate(token_texts):
        # find token starting at or after offset
        token_bytes = token.encode("utf8")
        pos = source.find(token_bytes, offset)
        if pos == -1:
            raise ValueError(f"Token {token!r} not found in source (from offset {offset}).")
        start = pos
        end = pos + len(token_bytes)
        children.append(SimpleNamespace(type="token", start_byte=start, end_byte=end))
        offset = end + 1  # assume tokens separated by at least one byte (space)
    # Insert body node if requested. Body will cover from the start of the token at body_index to end of source
    if body_index is not None:
        # Determine where the body token starts; it should be the token at body_index
        if not (0 <= body_index < len(children)):
            # if body_index points past tokens, place body at the end
            body_start = len(source)
        else:
            body_start = children[body_index].start_byte
        body_child = SimpleNamespace(type=body_type_name, start_byte=body_start, end_byte=len(source))
        # place body child at the end of the children list (function only checks type and breaks)
        children.append(body_child)
    return children

def test_interface_declaration_stops_before_interface_body():
    # Interface should use 'interface_body' as the body node name and stop before it.
    source_str = "public interface MyInterface extends BaseInterface { void foo(); }"
    source = source_str.encode("utf8")
    tokens = ["public", "interface", "MyInterface", "extends", "BaseInterface"]
    # body_index points to the token position where we consider the body starts (token count)
    children = _make_children_from_tokens_and_body(source, tokens, body_index=5, body_type_name="interface_body")
    node = SimpleNamespace(children=children)

    codeflash_output = _extract_type_declaration(node, source, "interface"); decl = codeflash_output # 3.67μs -> 3.18μs (15.4% faster)

def test_enum_without_body_returns_all_parts():
    # If no enum_body node exists among children, function should not break early and should include all parts.
    source_str = "public enum Color RED GREEN BLUE"
    source = source_str.encode("utf8")
    tokens = ["public", "enum", "Color"]
    # Do not insert a body node. The function should return everything from the supplied children.
    children = _make_children_from_tokens_and_body(source, tokens, body_index=None, body_type_name="enum_body")
    node = SimpleNamespace(children=children)

    codeflash_output = _extract_type_declaration(node, source, "enum"); decl = codeflash_output # 2.81μs -> 2.54μs (10.2% faster)

def test_empty_children_returns_empty_string():
    # Edge case: type_node has no children -> return empty string (after join & strip)
    node = SimpleNamespace(children=[])
    source = b""
    codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 1.32μs -> 1.34μs (1.49% slower)

def test_unknown_type_kind_defaults_to_class_body():
    # If type_kind is unknown, body_type defaults to 'class_body'
    source_str = "myModifier customType Foo extends Bar { body }"
    source = source_str.encode("utf8")
    tokens = ["myModifier", "customType", "Foo", "extends", "Bar"]
    # Insert a 'class_body' child so unknown maps to class_body and the function stops before it
    children = _make_children_from_tokens_and_body(source, tokens, body_index=5, body_type_name="class_body")
    node = SimpleNamespace(children=children)

    codeflash_output = _extract_type_declaration(node, source, "unknown_kind"); decl = codeflash_output # 3.76μs -> 3.23μs (16.5% faster)

def test_child_with_empty_slice_produces_empty_segment():
    # If a child has start_byte == end_byte, that yields an empty decoded string.
    # The function will include it as an element; the final join will contain extra space for it.
    # Construct source and children manually where one child corresponds to an empty slice.
    source_str = "public class MyClass"
    source = source_str.encode("utf8")
    # Create two real children for 'public' and 'class' and a third child that's empty (start=end)
    # The third child will contribute an empty string and show up as an additional space once joined.
    # We then append the name child and a body to stop before.
    public_pos = source.find(b"public")
    class_pos = source.find(b"class")
    name_pos = source.find(b"MyClass")
    # children as SimpleNamespace objects
    children = [
        SimpleNamespace(type="token", start_byte=public_pos, end_byte=public_pos + len(b"public")),
        SimpleNamespace(type="token", start_byte=class_pos, end_byte=class_pos + len(b"class")),
        SimpleNamespace(type="token", start_byte=10, end_byte=10),  # empty slice in the middle
        SimpleNamespace(type="token", start_byte=name_pos, end_byte=name_pos + len(b"MyClass")),
        SimpleNamespace(type="class_body", start_byte=name_pos + len(b"MyClass") + 1, end_byte=len(source)),
    ]
    node = SimpleNamespace(children=children)

    codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 3.32μs -> 2.87μs (15.7% faster)

def test_large_number_of_tokens_stops_at_body_and_scales_correctly():
    # Large scale test with many tokens (but under 1000).
    # Ensure the function correctly concatenates many parts and stops at the body node.
    n = 500  # number of tokens to include before body
    tokens = [f"T{i}" for i in range(n)]
    # Build source: tokens separated by spaces, then a body starting with '{'
    source_str = " ".join(tokens) + " {" + " body" + " }"
    source = source_str.encode("utf8")

    # Construct children corresponding to tokens and then the body node
    children = _make_children_from_tokens_and_body(source, tokens, body_index=n, body_type_name="class_body")
    node = SimpleNamespace(children=children)

    codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 113μs -> 82.4μs (37.6% faster)
    # The declaration should be exactly the tokens joined by single spaces
    expected = " ".join(tokens)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.context import _extract_type_declaration
from tree_sitter import Language, Node, Parser

# Helper function to create a tree-sitter node for testing
def _get_parser():
    """Create and return a tree-sitter parser for Java."""
    JAVA_LANGUAGE = Language("build/my-languages.so", "java")
    parser = Parser()
    parser.set_language(JAVA_LANGUAGE)
    return parser

def _parse_java_code(code: str) -> Node:
    """Parse Java code and return the root node."""
    parser = _get_parser()
    tree = parser.parse(code.encode("utf8"))
    return tree.root_node

def _find_type_node(root: Node, type_kind: str) -> Node:
    """Find the first type declaration node of the given kind."""
    def traverse(node: Node) -> Node | None:
        if node.type == type_kind:
            return node
        for child in node.children:
            result = traverse(child)
            if result:
                return result
        return None
    return traverse(root)

def test_empty_class_name():
    """Test that function handles class nodes properly (tree-sitter should parse valid Java)."""
    code = "public class {} "

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-02T00.37.05

Suggested change
part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
parts.append(part_text)
return " ".join(parts).strip()
parts.append(source_bytes[child.start_byte : child.end_byte])
return b" ".join(parts).decode("utf8").strip()

@CLAassistant
Copy link

CLAassistant commented Feb 2, 2026

CLA assistant check
All committers have signed the CLA.

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 103% (1.03x) speedup for _add_global_declarations_for_language in codeflash/code_utils/code_replacer.py

⏱️ Runtime : 409 milliseconds 202 milliseconds (best of 20 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 13% (0.13x) speedup for _get_parent_type_name in codeflash/languages/java/context.py

⏱️ Runtime : 45.0 microseconds 39.8 microseconds (best of 30 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 11% (0.11x) speedup for get_optimized_code_for_module in codeflash/code_utils/code_replacer.py

⏱️ Runtime : 1.07 milliseconds 963 microseconds (best of 15 runs)

A new Optimization Review has been created.

🔗 Review here

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 149% (1.49x) speedup for normalize_java_code in codeflash/languages/java/formatter.py

⏱️ Runtime : 3.66 milliseconds 1.47 milliseconds (best of 170 runs)

A new Optimization Review has been created.

🔗 Review here

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 18% (0.18x) speedup for _find_type_node in codeflash/languages/java/context.py

⏱️ Runtime : 25.3 microseconds 21.4 microseconds (best of 34 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 12% (0.12x) speedup for discover_functions_from_source in codeflash/languages/java/discovery.py

⏱️ Runtime : 21.3 milliseconds 19.0 milliseconds (best of 19 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

The eager import of JavaSupport in languages/__init__.py created a
circular import when merged with main's time_utils.py (which imports
from critic.py). Moved all language support imports to __getattr__ and
added Java to the lazy registration in registry.py.
@claude
Copy link
Contributor

claude bot commented Feb 20, 2026

PR Review Summary

Prek Checks

All prek checks passed — ruff check and ruff format both clean. No fixes needed.

Mypy

⚠️ 410 mypy errors across 32 files — This is a massive PR adding Java language support (205 files changed, ~24K lines). Most mypy errors are in the newly added codeflash/languages/java/ module. These are primarily union-attr (missing None checks on XML parsing), assignment (type mismatches with shutil.which), and arg-type issues. Fixing these would require logic changes and is out of scope per review policy.

Code Review

Critical Issues (still open from previous review):

  1. 🔴 CRITICAL: Argument mismatch in instrument_existing_tests.py:730
    The call to Java's instrument_existing_test() passes arguments in the wrong positional order:

    # Current (BROKEN):
    return instrument_existing_test(test_path, call_positions, function_to_optimize, tests_project_root, mode.value)
    
    # Expected signature: (test_string, function_to_optimize, mode, test_path=None, test_class_name=None)

    This maps test_pathtest_string, call_positionsfunction_to_optimize, etc. Will cause a runtime crash when instrumenting Java tests.

  2. 🟡 LOW: Java missing from registry.py fallback_ensure_languages_registered() only imports Python and JavaScript. Java IS registered via __init__.py:44 import, so this works in practice. Recommend adding Java import to the fallback for robustness.

No new critical issues found in the latest commits (f06acba..e2c3e98). Latest changes are clean: batch mode flags for Maven, macOS Java detection improvements, JUnit 4/TestNG support, and cache clearing fixes.

Test Results

  • 3221 passed, 35 failed, 58 skipped
  • Failures breakdown: 1 in test_comparator (Java test results table), 8 in test_tracer (Tracer API changes), and others related to Java/Maven integration (expected without Java/Maven installed in CI)

Test Coverage

File (NEW) Stmts Miss Cover Status
java/parser.py 338 5 99%
java/line_profiler.py 144 14 90%
java/context.py 465 52 89%
java/concurrency_analyzer.py 112 13 88%
java/discovery.py 104 13 88%
java/import_resolver.py 130 16 88%
java/remove_asserts.py 447 55 88%
java/config.py 235 35 85%
java/instrumentation.py 527 95 82%
java/formatter.py 143 51 64% ⚠️ Below 75%
java/support.py 179 67 63% ⚠️ Below 75%
java/replacement.py 330 132 60% ⚠️ Below 75%
java/build_tools.py 423 177 58% ⚠️ Below 75%
java/comparator.py 159 84 47% ⚠️ Below 75%
init_java.py 302 251 17% 🔴 Below 75%
File (MODIFIED) Stmts Miss Cover
languages/base.py 146 2 99%
languages/current.py 21 0 100%
python/context/code_context_extractor.py 637 48 92%
setup/config_schema.py 80 9 89%
code_utils/config_consts.py 59 7 88%
verification/equivalence.py 63 8 87%
models/models.py 627 134 79%
languages/registry.py 145 31 79%
python/static_analysis/code_replacer.py 421 96 77%
setup/detector.py 457 126 72%
javascript/instrument.py 632 179 72%
javascript/support.py 1006 289 71%
github/PrComment.py 34 10 71%
result/critic.py 90 27 70%
setup/config_writer.py 146 51 65%
code_utils/config_parser.py 137 54 61%

Overall coverage for changed files: 58%

  • 9 of 15 new Java files meet the 75% threshold ✅
  • 6 new files are below 75% — comparator.py (47%), build_tools.py (58%), replacement.py (60%), support.py (63%), formatter.py (64%), init_java.py (17%)

Last updated: 2026-02-20T20:35Z

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 217% (2.17x) speedup for get_java_formatter_cmd in codeflash/cli_cmds/init_java.py

⏱️ Runtime : 6.81 milliseconds 2.15 milliseconds (best of 112 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 22% (0.22x) speedup for check_formatter_installed in codeflash/code_utils/env_utils.py

⏱️ Runtime : 70.1 milliseconds 57.7 milliseconds (best of 59 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 28% (0.28x) speedup for PrComment.to_json in codeflash/github/PrComment.py

⏱️ Runtime : 5.96 milliseconds 4.64 milliseconds (best of 54 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

The optimized code achieves a **7426% speedup** (77.5ms → 1.03ms) by eliminating expensive exception handling for non-existent files. 

**Key Optimization:**

The line profiler reveals that 96.5% of the original runtime (193ms out of 200ms) was spent in `logger.warning()` calls within exception handlers. The code was attempting to read 165 non-existent helper files, catching `FileNotFoundError` exceptions, and then logging each failure.

The optimization adds an early `file_path.exists()` check before attempting to read files:

```python
# New guard clause
if not file_path.exists():
    continue
```

This prevents:
1. **Exception handling overhead**: No `try-except` block execution for missing files
2. **Expensive logging**: The `logger.warning()` call consumed 193ms across 165 failures
3. **File I/O attempts**: No need to even attempt opening non-existent files

The same defensive check is added to `_find_same_class_helpers` to prevent attempts to read from non-existent function file paths.

**Why This Matters:**

Based on the function references, `find_helper_functions` is called during:
- Test discovery and code context extraction (`test_integration.py`)
- Helper function analysis workflows (`test_context.py`)

Since the function processes helper files in a loop (181 iterations in the test), avoiding 165 expensive exception-handling cycles per invocation makes this optimization particularly impactful. The test results show this works best when dealing with:
- Many non-existent helper file paths (common in real projects where imports resolve to external dependencies)
- Deep dependency chains with missing files
- Scalability scenarios with 50-100+ helper files where some don't exist

The optimization maintains correctness—all test cases pass with identical output—while dramatically improving performance for the common case of encountering non-existent dependency files during Java code analysis.
@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 7,426% (74.26x) speedup for find_helper_functions in codeflash/languages/java/context.py

⏱️ Runtime : 77.5 milliseconds 1.03 milliseconds (best of 25 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

github-actions bot and others added 2 commits February 20, 2026 22:42
…2026-02-20T22.33.34

⚡️ Speed up function `find_helper_functions` by 7,426% in PR #1199 (`omni-java`)
@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 20, 2026

This PR is now faster! 🚀 @misrasaurabh1 accepted my optimizations from:

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 83% (0.83x) speedup for _extract_test_method_name in codeflash/languages/java/instrumentation.py

⏱️ Runtime : 10.0 milliseconds 5.48 milliseconds (best of 78 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

def _byte_to_line_index(byte_offset: int, line_byte_starts: list[int]) -> int:
"""Map a byte offset in body_text to a body_lines index."""
idx = bisect.bisect_right(line_byte_starts, byte_offset) - 1
return max(idx, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 34% (0.34x) speedup for _byte_to_line_index in codeflash/languages/java/instrumentation.py

⏱️ Runtime : 1.07 milliseconds 800 microseconds (best of 228 runs)

📝 Explanation and details

This optimization achieves a 33% runtime improvement by replacing the max(idx, 0) function call with an inline conditional expression 0 if idx < 0 else idx.

Key Performance Improvements:

  1. Function Call Elimination: The max() built-in function, while optimized, still incurs function call overhead including argument passing, stack frame setup, and Python's internal dispatch mechanism. The conditional expression performs the same logic inline without any function call.

  2. Line Profiler Evidence: The return statement drops from 787μs to 426μs (46% reduction) - nearly half the execution time - while the bisect operation remains virtually unchanged. This clearly demonstrates the benefit comes from removing the max() call overhead.

  3. Branch Prediction Friendly: In typical usage, idx is negative only when byte_offset precedes all line starts (edge case). The conditional expression 0 if idx < 0 else idx has a predictable branch pattern that modern CPUs can optimize effectively.

Test Case Performance:

  • All test cases show consistent 30-67% speedups on the return statement
  • Edge cases with negative offsets still benefit (35-52% faster) despite taking the "unusual" branch
  • Large-scale tests with 1000+ lines show 31-50% improvements, confirming the optimization scales well

Why This Works:
Python's max() is implemented in C and highly optimized, but even that C-level function call has overhead compared to a direct conditional expression compiled into bytecode. For a hot-path function that maps byte offsets to line indices (likely called thousands of times during code instrumentation/analysis), eliminating even small per-call overheads compounds into significant total runtime savings.

The optimization maintains identical behavior and correctness across all test scenarios while delivering substantial performance gains through a simple, readable change.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 2595 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
import pytest  # used for our unit tests
from codeflash.languages.java.instrumentation import _byte_to_line_index

def test_basic_positions_at_and_between_starts():
    # A typical list of line start byte offsets (sorted, zero-based)
    line_byte_starts = [0, 10, 20, 30]
    # Offsets exactly at the first byte of a line should map to that line's index
    codeflash_output = _byte_to_line_index(0, line_byte_starts) # 1.13μs -> 751ns (50.9% faster)
    codeflash_output = _byte_to_line_index(10, line_byte_starts) # 550ns -> 350ns (57.1% faster)
    codeflash_output = _byte_to_line_index(20, line_byte_starts) # 391ns -> 271ns (44.3% faster)
    codeflash_output = _byte_to_line_index(30, line_byte_starts) # 370ns -> 270ns (37.0% faster)
    # Offsets in-between starts map to the preceding line index
    codeflash_output = _byte_to_line_index(5, line_byte_starts) # 441ns -> 291ns (51.5% faster)
    codeflash_output = _byte_to_line_index(19, line_byte_starts) # 350ns -> 260ns (34.6% faster)
    codeflash_output = _byte_to_line_index(29, line_byte_starts) # 351ns -> 240ns (46.2% faster)
    # Offsets beyond the last start map to the last line index
    codeflash_output = _byte_to_line_index(100, line_byte_starts) # 330ns -> 240ns (37.5% faster)

def test_empty_line_byte_starts_returns_zero_for_any_offset():
    # An empty list of starts should always produce 0 (clamped by max(idx, 0))
    empty = []
    for offset in (-1000, -1, 0, 1, 999999):
        codeflash_output = _byte_to_line_index(offset, empty) # 2.30μs -> 1.51μs (52.3% faster)

def test_single_element_list_behaviour_always_zero():
    # With a single entry in line_byte_starts the function is defined to return at least 0.
    # For [5], bisect_right will at best return 1 so idx becomes 0; for smaller values idx becomes -1 -> clamped to 0
    single = [5]
    for offset in (-10, 0, 4, 5, 6, 100):
        codeflash_output = _byte_to_line_index(offset, single) # 2.91μs -> 1.99μs (46.0% faster)

def test_negative_offsets_map_to_zero_when_before_first_start():
    # Offsets before the first start must map to index 0 (cannot go negative)
    starts = [0, 10, 20]
    codeflash_output = _byte_to_line_index(-1, starts) # 1.05μs -> 691ns (52.2% faster)
    codeflash_output = _byte_to_line_index(-9999, starts) # 450ns -> 270ns (66.7% faster)

def test_float_offset_is_handled_like_numeric_comparison():
    # Although signature is int, a float offset should be compared numerically and produce the expected line index
    starts = [0, 10, 20]
    # 10.5 lies between 10 and 20 -> should map to index 1
    codeflash_output = _byte_to_line_index(10.5, starts) # 1.16μs -> 862ns (34.9% faster)

def test_bool_works_as_subclass_of_int():
    # bool is a subclass of int in Python: True == 1
    starts = [0, 1, 2]
    codeflash_output = _byte_to_line_index(True, starts) # 1.07μs -> 681ns (57.4% faster)

def test_unsorted_line_byte_starts_behaves_like_bisect_right_on_given_sequence():
    # The function relies on bisect; passing an unsorted list is incorrect usage but deterministic:
    # we assert the behavior is consistent with bisect_right-based logic (observed result).
    unsorted = [10, 0, 20]
    # For offset 5, bisect_right on this sequence yields insertion position 2 -> idx = 1
    # So the function should return 1 (this documents current behavior and will catch regressions)
    codeflash_output = _byte_to_line_index(5, unsorted) # 1.01μs -> 702ns (44.2% faster)

def test_none_inputs_raise_type_error():
    # Passing None for byte_offset should raise a TypeError when bisect tries to compare
    with pytest.raises(TypeError):
        _byte_to_line_index(None, [0, 10, 20]) # 3.00μs -> 2.94μs (2.04% faster)
    # Passing None for line_byte_starts is invalid (must be a sequence) and should raise TypeError
    with pytest.raises(TypeError):
        _byte_to_line_index(5, None) # 1.45μs -> 1.46μs (0.684% slower)

def test_incomparable_elements_in_list_raise_type_error():
    # If list contains elements that cannot be compared to the offset (e.g., strings vs ints),
    # bisect will raise a TypeError; we assert that this error is raised to lock in behavior.
    bad_list = [0, "a", 20]
    with pytest.raises(TypeError):
        _byte_to_line_index(1, bad_list) # 2.52μs -> 2.48μs (1.65% faster)

def test_large_scale_sequential_starts_midpoints():
    # Construct 1000 line starts, each 10 bytes apart: 0, 10, 20, ..., 9990
    n = 1000
    stride = 10
    starts = [i * stride for i in range(n)]
    # For each i, choose an offset halfway to the next start (i*10 + 5) and expect index i
    for i in range(n):
        offset = i * stride + (stride // 2)  # deterministic midpoint
        codeflash_output = _byte_to_line_index(offset, starts) # 408μs -> 306μs (33.1% faster)

def test_large_scale_boundaries_and_beyond():
    # Same large list, test exact boundaries and offsets beyond the last start
    n = 1000
    stride = 10
    starts = [i * stride for i in range(n)]
    # Exact boundaries should map to their corresponding indices
    for i in range(n):
        offset = i * stride
        codeflash_output = _byte_to_line_index(offset, starts) # 412μs -> 314μs (31.2% faster)
    # Offset beyond the last start maps to the last index
    beyond = starts[-1] + 12345
    codeflash_output = _byte_to_line_index(beyond, starts) # 420ns -> 320ns (31.2% faster)
    # Offset well before the first start maps to 0
    codeflash_output = _byte_to_line_index(-999999, starts) # 571ns -> 451ns (26.6% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import bisect

# imports
import pytest
from codeflash.languages.java.instrumentation import _byte_to_line_index

def test_basic_single_line():
    """Test with a single line starting at byte 0."""
    # When line_byte_starts has one element [0], any offset maps to line 0
    codeflash_output = _byte_to_line_index(5, [0]); result = codeflash_output # 1.24μs -> 851ns (45.9% faster)

def test_basic_multiple_lines_first_line():
    """Test byte offset in the first line."""
    # With line starts at [0, 10, 20], offset 5 should map to line 0
    codeflash_output = _byte_to_line_index(5, [0, 10, 20]); result = codeflash_output # 1.14μs -> 751ns (52.1% faster)

def test_basic_multiple_lines_second_line():
    """Test byte offset in the second line."""
    # With line starts at [0, 10, 20], offset 15 should map to line 1
    codeflash_output = _byte_to_line_index(15, [0, 10, 20]); result = codeflash_output # 1.03μs -> 711ns (45.1% faster)

def test_basic_multiple_lines_third_line():
    """Test byte offset in the third line."""
    # With line starts at [0, 10, 20], offset 25 should map to line 2
    codeflash_output = _byte_to_line_index(25, [0, 10, 20]); result = codeflash_output # 1.02μs -> 701ns (45.8% faster)

def test_basic_exact_line_boundary():
    """Test byte offset exactly at a line boundary."""
    # With line starts at [0, 10, 20], offset 10 should map to line 1
    codeflash_output = _byte_to_line_index(10, [0, 10, 20]); result = codeflash_output # 1.10μs -> 671ns (64.2% faster)

def test_basic_offset_zero():
    """Test with byte offset of 0."""
    # Offset 0 should always map to line 0
    codeflash_output = _byte_to_line_index(0, [0, 10, 20]); result = codeflash_output # 1.11μs -> 721ns (54.2% faster)

def test_basic_large_offset():
    """Test with byte offset beyond all line starts."""
    # With line starts at [0, 10, 20], offset 100 should map to last line (2)
    codeflash_output = _byte_to_line_index(100, [0, 10, 20]); result = codeflash_output # 1.03μs -> 651ns (58.5% faster)

def test_edge_empty_line_starts():
    """Test with empty line_byte_starts list."""
    # Empty list should return 0 due to max(idx, 0) where idx = -1
    codeflash_output = _byte_to_line_index(5, []); result = codeflash_output # 921ns -> 601ns (53.2% faster)

def test_edge_empty_line_starts_zero_offset():
    """Test with empty line_byte_starts and zero offset."""
    # Empty list with offset 0 should still return 0
    codeflash_output = _byte_to_line_index(0, []); result = codeflash_output # 891ns -> 611ns (45.8% faster)

def test_edge_negative_byte_offset():
    """Test with negative byte offset."""
    # Negative offset should return 0 due to max(idx, 0)
    codeflash_output = _byte_to_line_index(-5, [0, 10, 20]); result = codeflash_output # 991ns -> 731ns (35.6% faster)

def test_edge_large_negative_offset():
    """Test with large negative byte offset."""
    # Large negative offset should still return 0
    codeflash_output = _byte_to_line_index(-1000, [0, 10, 20]); result = codeflash_output # 1.00μs -> 742ns (35.0% faster)

def test_edge_line_starts_single_nonzero():
    """Test with line_byte_starts containing only one non-zero value."""
    # With [5], offset 3 should map to line 0, offset 5 should map to line 1
    codeflash_output = _byte_to_line_index(3, [5]); result1 = codeflash_output # 901ns -> 671ns (34.3% faster)
    codeflash_output = _byte_to_line_index(5, [5]); result2 = codeflash_output # 581ns -> 351ns (65.5% faster)

def test_edge_line_starts_not_starting_at_zero():
    """Test with line_byte_starts that doesn't start at 0."""
    # With [5, 15, 25], offset 3 should map to line 0
    codeflash_output = _byte_to_line_index(3, [5, 15, 25]); result = codeflash_output # 992ns -> 731ns (35.7% faster)

def test_edge_consecutive_line_boundaries():
    """Test offsets exactly at consecutive line boundaries."""
    line_starts = [0, 10, 20, 30, 40]
    # Test each boundary
    codeflash_output = _byte_to_line_index(0, line_starts) # 1.12μs -> 762ns (47.2% faster)
    codeflash_output = _byte_to_line_index(10, line_starts) # 581ns -> 361ns (60.9% faster)
    codeflash_output = _byte_to_line_index(20, line_starts) # 421ns -> 311ns (35.4% faster)
    codeflash_output = _byte_to_line_index(30, line_starts) # 411ns -> 261ns (57.5% faster)
    codeflash_output = _byte_to_line_index(40, line_starts) # 361ns -> 240ns (50.4% faster)

def test_edge_very_large_byte_offset():
    """Test with very large byte offset."""
    # Very large offset should map to the last line
    codeflash_output = _byte_to_line_index(1000000, [0, 10, 20]); result = codeflash_output # 1.03μs -> 711ns (45.0% faster)

def test_edge_identical_line_starts():
    """Test with duplicate values in line_byte_starts."""
    # With [0, 10, 10, 20], offset 10 should map to the appropriate line
    codeflash_output = _byte_to_line_index(10, [0, 10, 10, 20]); result = codeflash_output # 1.02μs -> 742ns (37.7% faster)

def test_edge_unsorted_line_starts_behavior():
    """Test function behavior with unsorted line_byte_starts."""
    # Note: function assumes sorted input, but test what happens
    # bisect_right on [0, 20, 10] with offset 15 will behave unexpectedly
    # but function still returns max(idx, 0)
    codeflash_output = _byte_to_line_index(15, [0, 20, 10]); result = codeflash_output # 1.07μs -> 722ns (48.5% faster)

def test_edge_offset_between_lines():
    """Test byte offset exactly between two line starts."""
    # With [0, 10, 20], offset 9 is between 0 and 10
    codeflash_output = _byte_to_line_index(9, [0, 10, 20]); result = codeflash_output # 1.03μs -> 681ns (51.5% faster)

def test_edge_offset_just_before_boundary():
    """Test byte offset just before a line boundary."""
    # With [0, 10, 20], offset 9 is just before line 1
    codeflash_output = _byte_to_line_index(9, [0, 10, 20]); result = codeflash_output # 982ns -> 731ns (34.3% faster)

def test_edge_offset_just_after_boundary():
    """Test byte offset just after a line boundary."""
    # With [0, 10, 20], offset 11 is just after line 1 starts
    codeflash_output = _byte_to_line_index(11, [0, 10, 20]); result = codeflash_output # 1.00μs -> 731ns (37.1% faster)

def test_edge_two_element_list():
    """Test with minimal multi-element list."""
    # With [0, 10], offset 5 should map to line 0, offset 15 to line 1
    codeflash_output = _byte_to_line_index(5, [0, 10]) # 1.05μs -> 711ns (48.0% faster)
    codeflash_output = _byte_to_line_index(15, [0, 10]) # 551ns -> 330ns (67.0% faster)

def test_large_scale_many_lines():
    """Test with a large number of lines (1000 lines)."""
    # Create line_byte_starts for 1000 lines, each 100 bytes apart
    line_byte_starts = [i * 100 for i in range(1000)]
    
    # Test offset in the first line
    codeflash_output = _byte_to_line_index(50, line_byte_starts) # 1.26μs -> 902ns (39.9% faster)
    
    # Test offset in the middle
    codeflash_output = _byte_to_line_index(50000, line_byte_starts) # 741ns -> 491ns (50.9% faster)
    
    # Test offset in the last line
    codeflash_output = _byte_to_line_index(99950, line_byte_starts) # 501ns -> 380ns (31.8% faster)

def test_large_scale_very_large_offsets():
    """Test with very large byte offsets across many lines."""
    # Create line_byte_starts with 100 lines
    line_byte_starts = [i * 10000 for i in range(100)]
    
    # Test offset beyond all line starts
    codeflash_output = _byte_to_line_index(1000000, line_byte_starts); result = codeflash_output # 1.08μs -> 771ns (40.3% faster)

def test_large_scale_dense_line_starts():
    """Test with densely packed line starts."""
    # Create 1000 line starts with small gaps
    line_byte_starts = list(range(1000))
    
    # Test various offsets
    codeflash_output = _byte_to_line_index(0, line_byte_starts) # 1.19μs -> 882ns (35.1% faster)
    codeflash_output = _byte_to_line_index(500, line_byte_starts) # 702ns -> 521ns (34.7% faster)
    codeflash_output = _byte_to_line_index(999, line_byte_starts) # 500ns -> 381ns (31.2% faster)

def test_large_scale_sparse_line_starts():
    """Test with sparsely packed line starts."""
    # Create 100 line starts with large gaps
    line_byte_starts = [i * 1000000 for i in range(100)]
    
    # Test various offsets
    codeflash_output = _byte_to_line_index(50000, line_byte_starts) # 1.17μs -> 772ns (51.8% faster)
    codeflash_output = _byte_to_line_index(500000, line_byte_starts) # 551ns -> 341ns (61.6% faster)
    codeflash_output = _byte_to_line_index(1000000, line_byte_starts) # 471ns -> 340ns (38.5% faster)
    codeflash_output = _byte_to_line_index(50000000, line_byte_starts) # 421ns -> 301ns (39.9% faster)

def test_large_scale_mixed_gap_sizes():
    """Test with mixed gap sizes between line starts."""
    # Create line starts with varying gaps
    line_byte_starts = [0, 1, 100, 200, 1000, 5000, 10000]
    
    # Test various offsets throughout
    codeflash_output = _byte_to_line_index(0, line_byte_starts) # 1.08μs -> 792ns (36.6% faster)
    codeflash_output = _byte_to_line_index(50, line_byte_starts) # 561ns -> 381ns (47.2% faster)
    codeflash_output = _byte_to_line_index(150, line_byte_starts) # 390ns -> 280ns (39.3% faster)
    codeflash_output = _byte_to_line_index(2000, line_byte_starts) # 351ns -> 240ns (46.2% faster)
    codeflash_output = _byte_to_line_index(7000, line_byte_starts) # 350ns -> 250ns (40.0% faster)

def test_large_scale_sequential_lookups():
    """Test sequential byte lookups across a large range."""
    # Create line_byte_starts for a realistic scenario
    line_byte_starts = [i * 50 for i in range(100)]
    
    # Perform sequential lookups and verify consistency
    for offset in range(0, 5000, 10):
        codeflash_output = _byte_to_line_index(offset, line_byte_starts); result = codeflash_output # 177μs -> 128μs (38.8% faster)

def test_large_scale_boundary_lookups():
    """Test lookups exactly at many boundaries."""
    # Create line_byte_starts
    line_byte_starts = [i * 100 for i in range(100)]
    
    # Test at every 10th boundary
    for i in range(0, 100, 10):
        codeflash_output = _byte_to_line_index(line_byte_starts[i], line_byte_starts); result = codeflash_output # 5.13μs -> 3.69μs (38.8% faster)

def test_large_scale_random_large_offsets():
    """Test with randomly distributed large offsets."""
    # Create line_byte_starts for 200 lines
    line_byte_starts = [i * 5000 for i in range(200)]
    
    # Test various random-like offsets
    test_offsets = [0, 100, 5000, 50000, 500000, 999999, 1000000]
    for offset in test_offsets:
        codeflash_output = _byte_to_line_index(offset, line_byte_starts); result = codeflash_output # 3.80μs -> 2.81μs (35.4% faster)

def test_large_scale_performance_consistency():
    """Test that function returns consistent results with large inputs."""
    # Create a large line_byte_starts list
    line_byte_starts = [i * 1000 for i in range(500)]
    
    # Call function multiple times with same input
    offset = 250000
    codeflash_output = _byte_to_line_index(offset, line_byte_starts); result1 = codeflash_output # 1.15μs -> 791ns (45.8% faster)
    codeflash_output = _byte_to_line_index(offset, line_byte_starts); result2 = codeflash_output # 510ns -> 330ns (54.5% faster)
    codeflash_output = _byte_to_line_index(offset, line_byte_starts); result3 = codeflash_output # 391ns -> 280ns (39.6% faster)

def test_large_scale_return_type_consistency():
    """Test that return type is always int across large inputs."""
    # Create various large line_byte_starts lists
    test_cases = [
        ([0, 1000, 2000], 500),
        ([i * 100 for i in range(100)], 5000),
        ([i * 10000 for i in range(50)], 250000),
    ]
    
    for line_starts, offset in test_cases:
        codeflash_output = _byte_to_line_index(offset, line_starts); result = codeflash_output # 2.05μs -> 1.41μs (45.6% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-20T23.28.24

Suggested change
return max(idx, 0)
return 0 if idx < 0 else idx

Static Badge

Comment on lines +350 to +361
if hasattr(func, "qualified_name"):
return str(func.qualified_name)
# Build qualified name from function_name and parents
if hasattr(func, "function_name"):
parts = []
if hasattr(func, "parents") and func.parents:
for parent in func.parents:
if hasattr(parent, "name"):
parts.append(parent.name)
parts.append(func.function_name)
return ".".join(parts)
return str(func)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 24% (0.24x) speedup for _get_qualified_name in codeflash/languages/java/instrumentation.py

⏱️ Runtime : 47.1 milliseconds 38.0 milliseconds (best of 108 runs)

📝 Explanation and details

The optimized code achieves a 24% runtime improvement by replacing defensive hasattr() checks with faster try/except blocks in the _get_qualified_name function.

Key optimization:
The original code uses hasattr() extensively to check for attributes before accessing them. Each hasattr() call internally performs a getattr() and catches exceptions, making it essentially two attribute lookups. The optimized version eliminates this overhead by using EAFP (Easier to Ask for Forgiveness than Permission) - directly attempting attribute access and catching AttributeError only when it occurs.

Specific changes:

  1. Replaced hasattr(func, "qualified_name") check with direct access func.qualified_name in a try/except block
  2. Replaced hasattr(func, "function_name") check with direct access in a nested try/except
  3. Replaced hasattr(func, "parents") check with getattr(func, "parents", None) - a single efficient lookup
  4. Replaced hasattr(parent, "name") inside loop with try/except around parent.name access

Why it's faster:

  • In the common case where attributes exist, we perform one attribute lookup instead of two (hasattr + actual access)
  • Line profiler shows dramatic improvements: the parent loop iterations dropped from ~238ms total time to ~143ms (40% faster for that section)
  • The hasattr() calls alone took 115ms+123ms=238ms in the original; this overhead is eliminated

Test case performance:

  • Best speedups occur with objects that have the expected attributes (17-27% faster), where the optimization avoids redundant lookups
  • Cases with large parent chains see the most benefit (26-31% faster for 500-1000 parents) as the loop optimization compounds
  • Edge cases with missing attributes are slower (20-77% slower) because exception handling has overhead, but these are uncommon paths in real usage

The __str__ method also changed from a list comprehension to a generator expression, but this has minimal impact compared to the try/except optimization.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 2743 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
from types import \
    ModuleType  # real, built-in class we can instantiate and attach attributes to

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.instrumentation import _get_qualified_name

def test_uses_qualified_name_when_present():
    # Create a real ModuleType instance and set a qualified_name attribute (non-string allowed)
    mod = ModuleType("mod_with_qualified")
    mod.qualified_name = "com.example.MyFunc"  # common expected use: a fully-qualified name string
    # The function should return the string form of the qualified_name attribute directly
    codeflash_output = _get_qualified_name(mod) # 601ns -> 511ns (17.6% faster)

def test_function_name_with_multiple_parents_joined_by_dots():
    # Create parent modules (real ModuleType instances) and give them 'name' attributes
    p1 = ModuleType("parent1")
    p1.name = "ParentA"
    p2 = ModuleType("parent2")
    p2.name = "ParentB"
    # Create the function-like object with function_name and parents attributes
    func_like = ModuleType("func_like")
    func_like.function_name = "doWork"
    func_like.parents = [p1, p2]  # ordered parents should appear in same order
    # Expect "ParentA.ParentB.doWork"
    codeflash_output = _get_qualified_name(func_like) # 4.06μs -> 4.48μs (9.40% slower)

def test_function_name_without_parents_or_empty_parents():
    # Case 1: parents attribute absent
    no_parents = ModuleType("no_parents")
    no_parents.function_name = "solo"
    # Since there is no 'qualified_name' and no 'parents', expect just the function_name
    codeflash_output = _get_qualified_name(no_parents) # 3.92μs -> 4.94μs (20.7% slower)

    # Case 2: parents attribute present but empty list
    empty_parents = ModuleType("empty_parents")
    empty_parents.function_name = "solo"
    empty_parents.parents = []
    # Empty parents should not add any prefix; still just the function_name
    codeflash_output = _get_qualified_name(empty_parents) # 1.74μs -> 2.02μs (13.9% slower)

def test_qualified_name_takes_precedence_over_function_name():
    # When both qualified_name and function_name exist, qualified_name should be used
    both = ModuleType("both")
    both.qualified_name = "priority.Name"
    both.function_name = "ignored"
    # Confirm precedence
    codeflash_output = _get_qualified_name(both) # 601ns -> 481ns (24.9% faster)

def test_qualified_name_converted_to_string_when_not_string_type():
    # If qualified_name is a non-string (e.g., integer), the function should cast it to str
    mod = ModuleType("mod_non_string_qn")
    mod.qualified_name = 12345  # integer value
    codeflash_output = _get_qualified_name(mod) # 691ns -> 611ns (13.1% faster)

def test_parents_with_missing_name_attributes_are_skipped():
    # Parents list may contain some objects without a 'name' attribute; those should be skipped
    p_with_name = ModuleType("p_with_name")
    p_with_name.name = "HasName"
    p_without_name = ModuleType("p_without_name")
    # Do NOT set p_without_name.name so hasattr(...) is False for that parent
    func_like = ModuleType("func_mixed_parents")
    func_like.function_name = "fn"
    func_like.parents = [p_with_name, p_without_name]
    # Only the parent with a 'name' attribute should be included
    codeflash_output = _get_qualified_name(func_like) # 4.75μs -> 6.41μs (25.9% slower)

def test_parents_with_various_types_including_none_and_int_are_handled():
    # Parents may contain heterogeneous items; only items with attribute 'name' should be considered
    p = ModuleType("p")
    p.name = "ValidParent"
    # Add None and integer values which lack 'name'
    func_like = ModuleType("func_mixed_types")
    func_like.function_name = "fn"
    func_like.parents = [None, 42, p]
    # Only "ValidParent" should be included
    codeflash_output = _get_qualified_name(func_like) # 3.44μs -> 5.81μs (40.9% slower)

def test_empty_function_name_results_in_trailing_dot_when_parents_present():
    # If function_name is an empty string, it should be appended (resulting in a trailing dot)
    p = ModuleType("p")
    p.name = "ParentOnly"
    func_like = ModuleType("func_empty_name")
    func_like.function_name = ""  # empty function name
    func_like.parents = [p]
    # Joining ["ParentOnly", ""] results in "ParentOnly."
    codeflash_output = _get_qualified_name(func_like) # 3.27μs -> 3.78μs (13.5% slower)

def test_fallback_to_str_for_objects_without_expected_attributes():
    # Passing an integer (no attributes) should return str(integer)
    codeflash_output = _get_qualified_name(7) # 611ns -> 2.60μs (76.5% slower)

    # A module with no function_name/qualified_name should fall back to str(module)
    mod = ModuleType("plain_module")
    # ModuleType.__str__ typically returns "<module 'name'>", so we check that that's the returned string
    codeflash_output = _get_qualified_name(mod) # 8.17μs -> 8.46μs (3.43% slower)

def test_large_number_of_parents_construction_and_output_correctness():
    # Construct 1000 parent ModuleType instances, each with a unique name attribute
    num_parents = 1000
    parents = []
    for i in range(num_parents):
        # Create a real ModuleType instance for each parent and assign a deterministic name
        p = ModuleType(f"parent_module_{i}")
        p.name = f"p{i}"
        parents.append(p)

    # Create the function-like object referencing all those parents
    big = ModuleType("big_func")
    big.function_name = "theFunction"
    big.parents = parents

    # Build the expected qualified name by joining all parent names and appending the function name
    expected = ".".join([f"p{i}" for i in range(num_parents)] + ["theFunction"])

    # Ensure the function returns exactly the expected long dotted name
    codeflash_output = _get_qualified_name(big) # 95.2μs -> 75.5μs (26.1% faster)

def test_repeated_calls_with_large_parents_are_deterministic_and_performant():
    # Reuse the large parents list from previous test but keep size moderate to ensure test remains quick
    num_parents = 500
    parents = []
    for i in range(num_parents):
        p = ModuleType(f"parent_mod_{i}")
        p.name = f"node{i}"
        parents.append(p)

    func_like = ModuleType("repeated_calls")
    func_like.function_name = "fn"
    func_like.parents = parents

    expected = ".".join([f"node{i}" for i in range(num_parents)] + ["fn"])

    # Call _get_qualified_name in a loop to ensure stability across multiple invocations
    for _ in range(1000):  # 1000 iterations as requested in the problem description
        codeflash_output = _get_qualified_name(func_like) # 46.3ms -> 36.2ms (27.8% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from typing import Any

# imports
import pytest
from codeflash.languages.java.instrumentation import _get_qualified_name

# Test data classes - using real classes, not mocks
class Parent:
    """A real parent class for testing qualified name construction."""
    def __init__(self, name: str):
        self.name = name

class FunctionToOptimizeWithQualifiedName:
    """Real class with qualified_name attribute."""
    def __init__(self, qualified_name: str):
        self.qualified_name = qualified_name

class FunctionToOptimizeWithFunctionName:
    """Real class with function_name attribute."""
    def __init__(self, function_name: str, parents: list = None):
        self.function_name = function_name
        self.parents = parents if parents is not None else []

class FunctionToOptimizeComplete:
    """Real class with all attributes for full testing."""
    def __init__(self, qualified_name: str, function_name: str, parents: list = None):
        self.qualified_name = qualified_name
        self.function_name = function_name
        self.parents = parents if parents is not None else []

class TestGetQualifiedNameBasic:
    """Basic tests for normal usage patterns."""

    def test_qualified_name_attribute_present(self):
        """Test that qualified_name attribute takes precedence when present."""
        # Create an object with qualified_name attribute
        obj = FunctionToOptimizeWithQualifiedName("com.example.MyClass.method")
        # Call the function and verify it returns the qualified_name as string
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 601ns -> 601ns (0.000% faster)

    def test_qualified_name_as_integer(self):
        """Test that qualified_name is converted to string even if numeric."""
        # Create an object with numeric qualified_name
        obj = FunctionToOptimizeWithQualifiedName(12345)
        # Call the function and verify it returns string representation
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 661ns -> 641ns (3.12% faster)

    def test_function_name_only_no_parents(self):
        """Test function_name is used when no parents exist."""
        # Create an object with function_name but no parents
        obj = FunctionToOptimizeWithFunctionName("methodName", parents=[])
        # Call the function and verify it returns just the function name
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.06μs -> 2.25μs (52.9% slower)

    def test_function_name_with_single_parent(self):
        """Test qualified name construction with one parent."""
        # Create a parent object
        parent = Parent("ParentClass")
        # Create a function object with one parent
        obj = FunctionToOptimizeWithFunctionName("method", parents=[parent])
        # Call the function and verify it constructs qualified name correctly
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.52μs -> 2.53μs (39.9% slower)

    def test_function_name_with_multiple_parents(self):
        """Test qualified name construction with multiple nested parents."""
        # Create a chain of parents
        parent1 = Parent("OuterClass")
        parent2 = Parent("InnerClass")
        # Create a function object with multiple parents
        obj = FunctionToOptimizeWithFunctionName("method", parents=[parent1, parent2])
        # Call the function and verify all parent names are joined
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.64μs -> 2.52μs (34.7% slower)

    def test_fallback_to_string_representation(self):
        """Test fallback to str() when no qualified_name or function_name."""
        # Create a simple object without special attributes
        obj = "simple_string"
        # Call the function and verify it returns string representation
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 561ns -> 2.42μs (76.9% slower)

    def test_fallback_with_custom_str(self):
        """Test fallback uses custom __str__ method when available."""
        # Create a custom class with __str__ method
        class CustomObject:
            def __str__(self):
                return "CustomStringRepresentation"
        
        obj = CustomObject()
        # Call the function and verify it uses the custom __str__
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.14μs -> 3.25μs (64.9% slower)

    def test_qualified_name_takes_precedence_over_function_name(self):
        """Test that qualified_name is used even if function_name also exists."""
        # Create object with both attributes
        obj = FunctionToOptimizeComplete(
            qualified_name="full.qualified.Name",
            function_name="ignored_method",
            parents=[]
        )
        # Call the function and verify qualified_name is used
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 551ns -> 491ns (12.2% faster)

class TestGetQualifiedNameEdgeCases:
    """Edge case tests for unusual or boundary conditions."""

    def test_empty_qualified_name(self):
        """Test with empty string as qualified_name."""
        # Create an object with empty qualified_name
        obj = FunctionToOptimizeWithQualifiedName("")
        # Call the function and verify it returns empty string
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 501ns -> 441ns (13.6% faster)

    def test_empty_function_name(self):
        """Test with empty string as function_name."""
        # Create an object with empty function_name
        obj = FunctionToOptimizeWithFunctionName("")
        # Call the function and verify it returns empty string
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.11μs -> 2.23μs (50.2% slower)

    def test_empty_parents_list(self):
        """Test when parents attribute is an empty list."""
        # Create an object with explicitly empty parents list
        obj = FunctionToOptimizeWithFunctionName("method", parents=[])
        # Call the function and verify it returns just the function name
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.00μs -> 1.96μs (49.0% slower)

    def test_none_parents_attribute(self):
        """Test when parents attribute is None (falsy)."""
        # Create an object where parents is None
        obj = FunctionToOptimizeWithFunctionName("method", parents=None)
        # Call the function and verify it returns just the function name
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 881ns -> 1.96μs (55.1% slower)

    def test_parent_without_name_attribute(self):
        """Test parent that lacks a name attribute."""
        # Create a parent object without name attribute
        class ParentWithoutName:
            pass
        
        parent = ParentWithoutName()
        # Create a function object with this parent
        obj = FunctionToOptimizeWithFunctionName("method", parents=[parent])
        # Call the function and verify parent is skipped
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.47μs -> 4.05μs (63.6% slower)

    def test_mixed_parents_some_with_name_some_without(self):
        """Test with multiple parents where some have name and some don't."""
        # Create parents with and without name attribute
        parent_with_name = Parent("HasName")
        class ParentWithoutName:
            pass
        parent_without_name = ParentWithoutName()
        
        # Create a function with mixed parents
        obj = FunctionToOptimizeWithFunctionName(
            "method",
            parents=[parent_with_name, parent_without_name]
        )
        # Call the function and verify only parents with names are included
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.82μs -> 4.21μs (56.7% slower)

    def test_qualified_name_with_special_characters(self):
        """Test qualified_name containing special characters."""
        # Create an object with special characters in qualified_name
        obj = FunctionToOptimizeWithQualifiedName("com.example$Inner#method@v1")
        # Call the function and verify special characters are preserved
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 501ns -> 491ns (2.04% faster)

    def test_function_name_with_unicode(self):
        """Test function_name with unicode characters."""
        # Create an object with unicode in function_name
        obj = FunctionToOptimizeWithFunctionName("méthod_名前", parents=[])
        # Call the function and verify unicode is preserved
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.01μs -> 2.08μs (51.4% slower)

    def test_parent_name_with_special_characters(self):
        """Test parent names containing special characters and numbers."""
        # Create a parent with special characters in name
        parent = Parent("Class$1#Inner_2")
        # Create a function object
        obj = FunctionToOptimizeWithFunctionName("method", parents=[parent])
        # Call the function and verify special characters are preserved
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.48μs -> 2.48μs (40.1% slower)

    def test_integer_as_input(self):
        """Test with integer passed directly."""
        # Pass an integer to the function
        codeflash_output = _get_qualified_name(42); result = codeflash_output # 631ns -> 2.54μs (75.2% slower)

    def test_none_as_input(self):
        """Test with None passed directly."""
        # Pass None to the function
        codeflash_output = _get_qualified_name(None); result = codeflash_output # 832ns -> 2.45μs (66.1% slower)

    def test_list_as_input(self):
        """Test with list passed directly."""
        # Pass a list to the function
        test_list = [1, 2, 3]
        codeflash_output = _get_qualified_name(test_list); result = codeflash_output # 1.72μs -> 3.44μs (49.9% slower)

    def test_dict_as_input(self):
        """Test with dict passed directly."""
        # Pass a dict to the function
        test_dict = {"key": "value"}
        codeflash_output = _get_qualified_name(test_dict); result = codeflash_output # 1.84μs -> 3.49μs (47.1% slower)

    def test_boolean_qualified_name(self):
        """Test with boolean value as qualified_name."""
        # Create an object with boolean as qualified_name
        obj = FunctionToOptimizeWithQualifiedName(True)
        # Call the function and verify it converts to string
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 551ns -> 510ns (8.04% faster)

    def test_float_qualified_name(self):
        """Test with float value as qualified_name."""
        # Create an object with float as qualified_name
        obj = FunctionToOptimizeWithQualifiedName(3.14)
        # Call the function and verify it converts to string
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.47μs -> 1.51μs (2.64% slower)

    def test_qualified_name_very_long_string(self):
        """Test with very long qualified_name string."""
        # Create a very long qualified name
        long_name = "com." + ".".join(["package"] * 100) + ".ClassName.method"
        obj = FunctionToOptimizeWithQualifiedName(long_name)
        # Call the function and verify it handles long strings
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 501ns -> 501ns (0.000% faster)

    def test_function_name_with_dots(self):
        """Test function_name that already contains dots."""
        # Create a function name with dots in it
        obj = FunctionToOptimizeWithFunctionName("method.inner.nested", parents=[])
        # Call the function and verify dots are preserved
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.12μs -> 2.20μs (49.1% slower)

    def test_many_parents_chain(self):
        """Test with a long chain of parent classes."""
        # Create a chain of 10 parents
        parents = [Parent(f"Class{i}") for i in range(10)]
        obj = FunctionToOptimizeWithFunctionName("method", parents=parents)
        # Call the function and verify all parents are joined
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 2.39μs -> 3.50μs (31.5% slower)
        expected = "Class0.Class1.Class2.Class3.Class4.Class5.Class6.Class7.Class8.Class9.method"

class TestGetQualifiedNameLargeScale:
    """Large-scale tests for performance and scalability."""

    def test_many_parents_100(self):
        """Test with 100 parent classes in the hierarchy."""
        # Create 100 parent objects
        parents = [Parent(f"ParentClass{i:03d}") for i in range(100)]
        obj = FunctionToOptimizeWithFunctionName("finalMethod", parents=parents)
        # Call the function and verify it handles large parent chains
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 9.56μs -> 9.18μs (4.14% faster)

    def test_deeply_nested_qualified_name(self):
        """Test with deeply nested qualified name (1000+ segments)."""
        # Create a qualified name with 1000 segments
        segments = ["segment"] * 1000
        deep_name = ".".join(segments)
        obj = FunctionToOptimizeWithQualifiedName(deep_name)
        # Call the function and verify it handles deep nesting
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 501ns -> 491ns (2.04% faster)

    def test_processing_100_different_objects(self):
        """Test processing 100 different function objects."""
        # Create 100 different function objects
        results = []
        for i in range(100):
            parent = Parent(f"Parent{i}")
            obj = FunctionToOptimizeWithFunctionName(f"method{i}", parents=[parent])
            # Process each object
            codeflash_output = _get_qualified_name(obj); result = codeflash_output # 49.6μs -> 94.0μs (47.2% slower)
            results.append(result)

    def test_large_parent_name_strings(self):
        """Test with very long parent names."""
        # Create a parent with a very long name
        long_parent_name = "VeryLongClassName" * 100  # ~1700 characters
        parent = Parent(long_parent_name)
        obj = FunctionToOptimizeWithFunctionName("method", parents=[parent])
        # Call the function and verify it handles long names
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 1.67μs -> 2.60μs (35.8% slower)

    def test_many_qualified_names_processing(self):
        """Test processing 500 objects with different qualified names."""
        # Create 500 objects with different qualified names
        results = []
        for i in range(500):
            qualified_name = f"package.subpackage{i}.Class{i % 10}.method{i}"
            obj = FunctionToOptimizeWithQualifiedName(qualified_name)
            # Process each object
            codeflash_output = _get_qualified_name(obj); result = codeflash_output # 92.9μs -> 80.7μs (15.2% faster)
            results.append(result)

    def test_mixed_fallback_strings_1000_items(self):
        """Test fallback string conversion on 1000 objects."""
        # Create 1000 objects without special attributes
        results = []
        for i in range(1000):
            # Use different types that fall back to str()
            obj = f"string_object_{i}"
            codeflash_output = _get_qualified_name(obj); result = codeflash_output # 191μs -> 1.08ms (82.3% slower)
            results.append(result)

    def test_very_wide_parent_chain(self):
        """Test with 500 parents in a single chain."""
        # Create 500 parents
        parents = [Parent(f"P{i}") for i in range(500)]
        obj = FunctionToOptimizeWithFunctionName("m", parents=parents)
        # Call the function and verify it handles wide chains
        codeflash_output = _get_qualified_name(obj); result = codeflash_output # 33.2μs -> 25.3μs (31.2% faster)

    def test_concatenation_performance_consistency(self):
        """Test that string concatenation is consistent across large inputs."""
        # Create multiple objects with identical structure to test consistency
        results = []
        for _ in range(100):
            parents = [Parent(f"Class{i}") for i in range(50)]
            obj = FunctionToOptimizeWithFunctionName("method", parents=parents)
            codeflash_output = _get_qualified_name(obj); result = codeflash_output # 359μs -> 318μs (13.0% faster)
            results.append(result)

    def test_attribute_lookup_on_many_objects(self):
        """Test hasattr checks on 200 objects of different types."""
        # Create objects of different types
        objects = []
        # Add 50 with qualified_name
        for i in range(50):
            objects.append(FunctionToOptimizeWithQualifiedName(f"qn{i}"))
        # Add 50 with function_name
        for i in range(50):
            objects.append(FunctionToOptimizeWithFunctionName(f"fn{i}"))
        # Add 100 without special attributes
        for i in range(100):
            objects.append(f"string{i}")
        
        # Process all objects
        results = [_get_qualified_name(obj) for obj in objects]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-20T23.33.32

Click to see suggested changes
Suggested change
if hasattr(func, "qualified_name"):
return str(func.qualified_name)
# Build qualified name from function_name and parents
if hasattr(func, "function_name"):
parts = []
if hasattr(func, "parents") and func.parents:
for parent in func.parents:
if hasattr(parent, "name"):
parts.append(parent.name)
parts.append(func.function_name)
return ".".join(parts)
return str(func)
try:
q = func.qualified_name
except AttributeError:
# Build qualified name from function_name and parents
try:
fn = func.function_name
except AttributeError:
return str(func)
parts = []
parents = getattr(func, "parents", None)
if parents:
for parent in parents:
try:
name = parent.name
except AttributeError:
continue
parts.append(name)
parts.append(fn)
return ".".join(parts)
else:
return str(q)

Static Badge

Comment on lines +407 to +408
func_name = _get_function_name(target_function)
logger.debug("Java benchmarking for %s - using Maven Surefire timing", func_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 96% (0.96x) speedup for instrument_for_benchmarking in codeflash/languages/java/instrumentation.py

⏱️ Runtime : 537 microseconds 275 microseconds (best of 236 runs)

📝 Explanation and details

The optimized code achieves a 95% runtime reduction (537μs → 275μs) by eliminating unnecessary work when debug logging is disabled.

Key Optimization

The critical change is wrapping the function name extraction and debug logging in a logger guard:

if logger.isEnabledFor(logging.DEBUG):
    func_name = _get_function_name(target_function)
    logger.debug("Java benchmarking for %s - using Maven Surefire timing", func_name)

Why This Produces the Speedup

  1. Avoids expensive hasattr() checks: The line profiler shows _get_function_name() consumed 2.85ms (68.5% of total time) in the original version. This function performs multiple hasattr() checks and string conversions that are completely wasted when debug logging is off.

  2. Eliminates string formatting overhead: The logger.debug() call with string interpolation takes 1.20ms (28.8%) even when the message is discarded due to the logging level.

  3. In production, debug logging is typically disabled: The logger guard short-circuits immediately when DEBUG level is not enabled, avoiding both the function call and string formatting entirely.

The line profiler confirms this: in the optimized version, instrument_for_benchmarking spends 87.9% of its time in the logger guard check (877μs) and only 12.1% returning the test source (120μs). The _get_function_name function is barely called (9 hits vs 1084 in the original), indicating the guard is successfully preventing unnecessary work in most test iterations.

Impact on Workloads

Based on the function reference, instrument_for_benchmarking is called from test instrumentation code paths. While not in an innermost loop, it's called once per test case being benchmarked. With the optimization:

  • Production/CI environments (where debug logging is off): Near-instant execution, no overhead from function name extraction
  • Development with debug logging enabled: Same behavior as before, with negligible overhead from the guard check

The annotated tests show consistent 75-96% speedups across various scenarios (empty sources, large sources, many targets), demonstrating the optimization is effective regardless of input characteristics.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 1065 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
import pytest  # used for our unit tests
# import the real functions and classes from the package under test
from codeflash.languages.java.instrumentation import (
    _get_function_name, instrument_for_benchmarking)
from codeflash.languages.java.parser import JavaAnalyzer

def test_basic_returns_same_source_when_function_has_function_name():
    # Create a simple Java test source string (typical usage)
    src = "public class Test { @Test public void testSomething() {} }"
    # Use a real JavaAnalyzer instance as the target "function" object.
    # We set a function_name attribute on a real object rather than using a fake class.
    target = JavaAnalyzer()
    target.function_name = "testSomething"  # the function name expected by the internals

    # Call instrument_for_benchmarking which should return the test_source unchanged.
    codeflash_output = instrument_for_benchmarking(src, target); result = codeflash_output # 1.37μs -> 741ns (85.3% faster)

def test_basic_returns_same_source_when_function_has_name_and_analyzer_passed():
    # Empty test source is a valid edge case in normal usage
    src = ""
    target = JavaAnalyzer()
    # Use the alternative attribute 'name' (no function_name provided)
    target.name = "someFunction"
    # Provide an analyzer instance explicitly to cover the optional parameter path
    analyzer = JavaAnalyzer()

    codeflash_output = instrument_for_benchmarking(src, target, analyzer=analyzer); result = codeflash_output # 1.82μs -> 1.04μs (75.0% faster)

def test_get_function_name_prefers_function_name_over_name():
    # If both attributes exist, function_name must be preferred.
    obj = JavaAnalyzer()
    obj.function_name = "preferred"
    obj.name = "other"

    # Directly test the helper to ensure deterministic behavior.
    got = _get_function_name(obj)

def test_get_function_name_uses_name_when_function_name_missing():
    # When function_name isn't present but name is, the helper should use name.
    obj = JavaAnalyzer()
    # don't set function_name
    obj.name = "only_name"

    got = _get_function_name(obj)

def test_get_function_name_converts_non_string_values_to_str():
    # If the attribute is not a string (e.g., int or None), it should be converted to str.
    obj_int = JavaAnalyzer()
    obj_int.function_name = 12345

    obj_none = JavaAnalyzer()
    obj_none.function_name = None

def test_get_function_name_raises_clear_error_when_no_name_attributes():
    # If neither attribute is present, an AttributeError must be raised with an informative message.
    obj = JavaAnalyzer()
    # Ensure no 'function_name' or 'name' attributes exist on the instance.
    if hasattr(obj, "function_name"):
        delattr(obj, "function_name")
    if hasattr(obj, "name"):
        delattr(obj, "name")

    with pytest.raises(AttributeError) as excinfo:
        _get_function_name(obj)

    # The message should reference the type of the object so debugging is easier.
    msg = str(excinfo.value)

def test_instrument_returns_unmodified_for_large_function_name_value():
    # Very long function_name should be handled (converted to string) and not affect returned source.
    src = "class X {}"
    obj = JavaAnalyzer()
    # create a long name (edge case with special characters, numbers, length)
    long_name = "f" * 500 + "_Ω_测试_#@"  # includes unicode and long repetition
    obj.function_name = long_name

    codeflash_output = instrument_for_benchmarking(src, obj); result = codeflash_output # 1.54μs -> 792ns (94.8% faster)

def test_instrument_with_empty_source_and_various_target_attributes():
    # Confirm empty source is left unchanged for multiple target attribute permutations.
    src = ""
    for attr_name in ("function_name", "name"):
        obj = JavaAnalyzer()
        setattr(obj, attr_name, "fname")
        # Should not raise and should return the empty string unchanged
        codeflash_output = instrument_for_benchmarking(src, obj) # 2.12μs -> 1.05μs (101% faster)

def test_large_scale_source_returned_unmodified():
    # Build a large source by repeating a typical line many times (1000 lines)
    lines = ["// line %d" % i for i in range(1000)]
    large_src = "\n".join(lines)

    target = JavaAnalyzer()
    target.name = "largeTest"

    # Call the function in a loop to simulate repeated instrumentation calls
    # This still must be deterministic and quick; we assert correctness each iteration.
    for _ in range(50):  # 50 iterations to exercise repeated-call behavior
        codeflash_output = instrument_for_benchmarking(large_src, target); res = codeflash_output # 23.4μs -> 12.7μs (84.8% faster)

def test_many_targets_with_various_attribute_types():
    # Create many target objects (1000) and ensure function works for each without mutation.
    src = "public class C{}"
    targets = []
    for i in range(1000):
        t = JavaAnalyzer()
        # Alternate between integer, string and None for function_name to exercise conversion logic
        if i % 3 == 0:
            t.function_name = i
        elif i % 3 == 1:
            t.name = f"n_{i}"
        else:
            t.function_name = None
        targets.append(t)

    # Ensure instrument_for_benchmarking returns expected source for all targets
    for t in targets:
        codeflash_output = instrument_for_benchmarking(src, t) # 475μs -> 242μs (96.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from unittest.mock import Mock

# imports
import pytest
from codeflash.discovery.functions_to_optimize import FunctionToOptimize
from codeflash.languages.java.instrumentation import (
    _get_function_name, instrument_for_benchmarking)
from codeflash.languages.java.parser import JavaAnalyzer

def test_get_function_name_from_function_name_attribute():
    """Test _get_function_name when function_name attribute exists."""
    mock_func = Mock()
    mock_func.function_name = "myFunction"
    
    result = _get_function_name(mock_func)

def test_get_function_name_from_name_attribute():
    """Test _get_function_name falls back to name attribute."""
    mock_func = Mock(spec=[])  # No function_name attribute
    mock_func.name = "myName"
    
    result = _get_function_name(mock_func)

def test_get_function_name_raises_on_missing_attributes():
    """Test _get_function_name raises AttributeError when no valid attribute exists."""
    mock_func = Mock(spec=[])  # No function_name or name attributes
    
    with pytest.raises(AttributeError) as exc_info:
        _get_function_name(mock_func)

def test_get_function_name_with_numeric_name():
    """Test _get_function_name converts numeric-like names to strings."""
    mock_func = Mock()
    mock_func.function_name = 12345
    
    result = _get_function_name(mock_func)

def test_get_function_name_with_none_value():
    """Test _get_function_name handles None in name attribute."""
    mock_func = Mock()
    mock_func.function_name = None
    
    result = _get_function_name(mock_func)

def test_instrument_for_benchmarking_function_name_priority():
    """Test that function_name is preferred over name attribute."""
    mock_func = Mock()
    mock_func.function_name = "primary"
    mock_func.name = "secondary"
    
    result = _get_function_name(mock_func)

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-20T23.44.09

Suggested change
func_name = _get_function_name(target_function)
logger.debug("Java benchmarking for %s - using Maven Surefire timing", func_name)
if logger.isEnabledFor(logging.DEBUG):
func_name = _get_function_name(target_function)
logger.debug("Java benchmarking for %s - using Maven Surefire timing", func_name)

Static Badge

Comment on lines +701 to +707
# Building the cumulative mapping is done once per distinct source and is faster than
# repeatedly decoding prefixes for many nodes.
# A local variable for append and encode reduces attribute lookups.
append = cum.append
for ch in decoded:
append(cum[-1] + len(ch.encode("utf8")))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 35% (0.35x) speedup for JavaAnalyzer._ensure_decoded in codeflash/languages/java/parser.py

⏱️ Runtime : 8.18 milliseconds 6.07 milliseconds (best of 161 runs)

📝 Explanation and details

The optimized code achieves a 34% runtime improvement by eliminating expensive per-character UTF-8 encoding operations in favor of direct byte-level parsing of UTF-8 sequences.

Key Optimization

Original approach: For each character in the decoded string, encode it back to UTF-8 to count bytes:

for ch in decoded:
    append(cum[-1] + len(ch.encode("utf8")))  # 77.7% of total time

Optimized approach: Parse UTF-8 byte sequences directly by examining the leading byte to determine character length:

while byte_idx < source_len:
    byte = source[byte_idx]
    if byte < 0x80:      # 1-byte (ASCII)
        char_bytes = 1
    elif byte < 0xE0:    # 2-byte
        char_bytes = 2
    elif byte < 0xF0:    # 3-byte  
        char_bytes = 3
    else:                # 4-byte
        char_bytes = 4
    byte_idx += char_bytes
    cum.append(byte_idx)

Why It's Faster

  1. Eliminates redundant encoding: The original code decodes UTF-8 to a string, then re-encodes each character back to UTF-8. The optimization works directly with the already-encoded bytes, avoiding this roundtrip entirely.

  2. UTF-8 structure exploitation: UTF-8 encoding embeds character length in the first byte's bit pattern. By reading this pattern directly (via simple integer comparisons), we determine how many bytes each character occupies without any encoding operations.

  3. Reduced Python overhead: String iteration and character encoding involve significant Python interpreter overhead. Direct byte array indexing with integer comparisons is much lighter weight.

Performance Characteristics

The test results show consistent speedups across all workload types:

  • ASCII-heavy code (37-42% faster): Most code is ASCII, and the byte < 0x80 check is extremely fast
  • Mixed content (30-40% faster): Java files with UTF-8 comments still benefit significantly
  • Large files (31-39% faster): The optimization scales linearly, making it particularly valuable for large Java source files (1000+ lines)
  • Multi-byte characters (36-41% faster): Even emoji-heavy content benefits, as parsing byte patterns is faster than encoding

The optimization maintains correctness - both approaches produce identical cumulative byte mappings. The cache hit path (early return when source is self._cached_source_bytes) is preserved and unaffected.

This optimization is particularly impactful for Java code analysis workflows that process many or large source files, where _ensure_decoded is called repeatedly during parsing operations.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 1172 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
import pytest  # used for our unit tests
from codeflash.languages.java.parser import JavaAnalyzer

def test_basic_ascii_and_idempotent_on_same_bytes_object():
    # Create a JavaAnalyzer instance using the real constructor
    analyzer = JavaAnalyzer()

    # Prepare a simple ASCII bytes source. Use a bytes object we will pass twice (same identity).
    source = b"public class A { void m() {} }"

    # First call: should decode and populate caches
    analyzer._ensure_decoded(source) # 4.82μs -> 3.50μs (37.8% faster)

    # The cumulative bytes list length should be number_of_chars + 1
    decoded = analyzer._cached_source_str

    # For ASCII every character is 1 byte, so cum should be [0,1,2,...,len(decoded)]
    expected_cum = list(range(0, len(decoded) + 1))

    # Save identity of the cumulative list
    cum_id_before = id(analyzer._cached_cum_bytes)

    # Second call with the exact same bytes object should return early and not rebuild caches.
    analyzer._ensure_decoded(source) # 310ns -> 311ns (0.322% slower)

def test_rebuild_on_different_bytes_object_with_same_content():
    analyzer = JavaAnalyzer()

    # Original source bytes
    source1 = b"abc123"

    # Call once to populate caches
    analyzer._ensure_decoded(source1) # 2.32μs -> 1.78μs (30.3% faster)
    cum_first = analyzer._cached_cum_bytes
    cached_bytes_first = analyzer._cached_source_bytes

    # Create a new bytes object with identical content but a different identity.
    # Using concatenation ensures a new bytes object is created.
    source2 = source1 + b""

    # Call with the new bytes object — because identity differs, it should rebuild caches.
    analyzer._ensure_decoded(source2) # 271ns -> 271ns (0.000% faster)

def test_empty_bytes_produces_zero_length_cumulative_mapping():
    analyzer = JavaAnalyzer()

    # Empty bytes
    source = b""

    # Ensure decoding and mapping works for empty input
    analyzer._ensure_decoded(source) # 1.05μs -> 792ns (32.8% faster)

def test_multibyte_unicode_cumulative_mapping_is_correct():
    analyzer = JavaAnalyzer()

    # Create a string with multi-byte characters: ASCII, accented, and emoji
    decoded_string = "aé😊"  # 'a' -> 1 byte, 'é' -> 2 bytes, '😊' -> 4 bytes in UTF-8
    source = decoded_string.encode("utf8")

    # Ensure decoding and cumulative mapping
    analyzer._ensure_decoded(source) # 3.04μs -> 2.32μs (30.6% faster)

    # Manually compute expected cumulative mapping by summing bytes per character
    expected_cum = [0]
    total = 0
    for ch in decoded_string:
        total += len(ch.encode("utf8"))
        expected_cum.append(total)

def test_non_bytes_input_raises_attribute_error():
    analyzer = JavaAnalyzer()

    # Passing a string (not bytes) should raise an AttributeError because .decode is not available on str
    with pytest.raises(AttributeError):
        analyzer._ensure_decoded("this is a str, not bytes")  # type: ignore[arg-type]

    # Passing None should also raise an AttributeError when attempting to call .decode
    with pytest.raises(AttributeError):
        analyzer._ensure_decoded(None)  # type: ignore[arg-type]

def test_large_scale_mixed_characters_and_repeated_calls():
    analyzer = JavaAnalyzer()

    # Build a long decoded string (1000 characters) mixing ASCII and multi-byte characters
    parts = []
    for i in range(1000):
        # Cycle characters: ASCII, 2-byte accented, 4-byte emoji
        if i % 3 == 0:
            parts.append("x")       # 1 byte
        elif i % 3 == 1:
            parts.append("é")       # typically 2 bytes in UTF-8
        else:
            parts.append("💡")      # emoji (often 4 bytes in UTF-8)
    decoded = "".join(parts)
    source = decoded.encode("utf8")

    # Ensure initial decode builds the mapping correctly
    analyzer._ensure_decoded(source) # 109μs -> 76.7μs (42.3% faster)

    # Repeatedly call _ensure_decoded with the same bytes object up to 1000 iterations.
    # This should be cheap (early return) and should not change the cached cumulative mapping object identity.
    cum_before = analyzer._cached_cum_bytes
    for _ in range(1000):
        analyzer._ensure_decoded(source) # 162μs -> 159μs (2.45% faster)

    # Now create a different bytes object with identical content to force a rebuild and verify new identity
    source_new = source + b""  # ensures a new bytes object

    analyzer._ensure_decoded(source_new) # 160ns -> 160ns (0.000% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.parser import JavaAnalyzer

def test_basic_ascii_decode():
    """Test that basic ASCII bytes are decoded correctly and cached."""
    analyzer = JavaAnalyzer()
    source = b"public class Hello {}"
    
    # Call _ensure_decoded
    analyzer._ensure_decoded(source) # 3.98μs -> 3.07μs (29.7% faster)

def test_utf8_characters_decode():
    """Test that UTF-8 multi-byte characters are decoded and mapped correctly."""
    analyzer = JavaAnalyzer()
    # Japanese characters: each "あ" is 3 bytes in UTF-8
    source = "あいう".encode("utf8")
    
    analyzer._ensure_decoded(source) # 2.85μs -> 2.30μs (23.5% faster)

def test_empty_source():
    """Test decoding empty bytes."""
    analyzer = JavaAnalyzer()
    source = b""
    
    analyzer._ensure_decoded(source) # 1.02μs -> 782ns (30.7% faster)

def test_single_character():
    """Test decoding a single ASCII character."""
    analyzer = JavaAnalyzer()
    source = b"a"
    
    analyzer._ensure_decoded(source) # 1.57μs -> 1.20μs (30.9% faster)

def test_caching_same_source_returns_early():
    """Test that calling _ensure_decoded with the same source object returns early."""
    analyzer = JavaAnalyzer()
    source = b"test code"
    
    # First call
    analyzer._ensure_decoded(source) # 2.46μs -> 2.01μs (21.9% faster)
    first_cached_str = analyzer._cached_source_str
    first_cached_cum = analyzer._cached_cum_bytes
    
    # Second call with identical object
    analyzer._ensure_decoded(source) # 260ns -> 261ns (0.383% slower)

def test_different_source_replaces_cache():
    """Test that a different source bytes object replaces the cache."""
    analyzer = JavaAnalyzer()
    source1 = b"first"
    source2 = b"second"
    
    analyzer._ensure_decoded(source1) # 2.00μs -> 1.61μs (24.2% faster)
    cached_str_1 = analyzer._cached_source_str
    
    analyzer._ensure_decoded(source2) # 1.37μs -> 1.06μs (29.3% faster)

def test_cumulative_bytes_increases_monotonically():
    """Test that cumulative byte counts are monotonically increasing."""
    analyzer = JavaAnalyzer()
    source = "a\u00e9\u4e00".encode("utf8")  # 'a' (1 byte), 'é' (2 bytes), '中' (3 bytes)
    
    analyzer._ensure_decoded(source) # 2.83μs -> 2.16μs (31.0% faster)
    
    cum = analyzer._cached_cum_bytes
    # Check monotonic increase
    for i in range(len(cum) - 1):
        pass

def test_newlines_and_whitespace():
    """Test that newlines and whitespace are handled correctly."""
    analyzer = JavaAnalyzer()
    source = b"line1\nline2\ttab"
    
    analyzer._ensure_decoded(source) # 2.97μs -> 2.38μs (24.9% faster)

def test_special_java_symbols():
    """Test that special Java symbols are decoded correctly."""
    analyzer = JavaAnalyzer()
    source = b"public static void main(String[] args) { }"
    
    analyzer._ensure_decoded(source) # 5.84μs -> 4.25μs (37.5% faster)
    
    decoded = analyzer._cached_source_str

def test_very_long_ascii_string():
    """Test that very long ASCII strings are decoded and mapped correctly."""
    analyzer = JavaAnalyzer()
    # Create a long string of 1000 ASCII characters
    long_string = "a" * 1000
    source = long_string.encode("utf8")
    
    analyzer._ensure_decoded(source) # 86.2μs -> 65.1μs (32.5% faster)

def test_mixed_ascii_and_multibyte_utf8():
    """Test a string with mixed ASCII and multi-byte UTF-8 characters."""
    analyzer = JavaAnalyzer()
    # Mix of ASCII and emoji (4 bytes each in UTF-8)
    mixed = "Hello😊World🎉End".encode("utf8")
    
    analyzer._ensure_decoded(mixed) # 4.44μs -> 3.35μs (32.6% faster)
    
    decoded = analyzer._cached_source_str
    
    # Verify cumulative bytes increase correctly
    cum = analyzer._cached_cum_bytes

def test_zero_width_and_combining_characters():
    """Test Unicode combining characters and zero-width spaces."""
    analyzer = JavaAnalyzer()
    # Combining diacritical mark (zero-width) + zero-width space
    source = "e\u0301\u200b".encode("utf8")  # é (combining), zero-width space
    
    analyzer._ensure_decoded(source) # 2.65μs -> 2.09μs (26.3% faster)
    
    cum = analyzer._cached_cum_bytes

def test_source_with_null_bytes():
    """Test that source containing null bytes is handled."""
    analyzer = JavaAnalyzer()
    source = b"code\x00more"
    
    analyzer._ensure_decoded(source) # 2.47μs -> 1.87μs (32.1% faster)

def test_consecutive_calls_with_different_sources():
    """Test that multiple consecutive calls with different sources update cache correctly."""
    analyzer = JavaAnalyzer()
    sources = [b"first", b"second", b"third"]
    
    for source in sources:
        analyzer._ensure_decoded(source) # 4.47μs -> 3.48μs (28.5% faster)

def test_source_with_only_multibyte_characters():
    """Test source containing only multi-byte UTF-8 characters."""
    analyzer = JavaAnalyzer()
    # Russian characters (2 bytes each in UTF-8)
    source = "привет".encode("utf8")
    
    analyzer._ensure_decoded(source) # 3.10μs -> 2.29μs (34.9% faster)
    cum = analyzer._cached_cum_bytes

def test_byte_equality_vs_reference_equality():
    """Test that the function uses identity ('is') not equality ('==') for cache check."""
    analyzer = JavaAnalyzer()
    source1 = b"test"
    source2 = b"test"  # Equal but different object
    
    analyzer._ensure_decoded(source1) # 1.92μs -> 1.50μs (28.0% faster)
    cached_after_first = analyzer._cached_source_str
    
    # Should not use cache since source2 is a different object
    analyzer._ensure_decoded(source2) # 301ns -> 310ns (2.90% slower)

def test_surrogate_pair_characters():
    """Test that surrogate pair characters (emoji) are handled correctly."""
    analyzer = JavaAnalyzer()
    # Emoji with variation selector
    source = "🔥🌟✨".encode("utf8")
    
    analyzer._ensure_decoded(source) # 2.96μs -> 2.30μs (28.3% faster)
    
    decoded = analyzer._cached_source_str
    
    cum = analyzer._cached_cum_bytes

def test_right_to_left_text():
    """Test that right-to-left text (Hebrew, Arabic) is decoded correctly."""
    analyzer = JavaAnalyzer()
    source = "שלום".encode("utf8")  # Hebrew "hello"
    
    analyzer._ensure_decoded(source) # 2.75μs -> 2.06μs (33.0% faster)

def test_cumulative_bytes_at_each_character_boundary():
    """Test that cumulative bytes correctly marks byte position at each character boundary."""
    analyzer = JavaAnalyzer()
    # Controlled mix: 'a' (1 byte), 'ñ' (2 bytes), '中' (3 bytes)
    source = "añ中".encode("utf8")
    
    analyzer._ensure_decoded(source) # 2.78μs -> 2.12μs (30.7% faster)
    
    cum = analyzer._cached_cum_bytes

def test_large_java_file_simulation():
    """Test decoding a large simulated Java file (1000+ lines)."""
    analyzer = JavaAnalyzer()
    
    # Simulate a large Java file with 1000 lines
    lines = []
    for i in range(1000):
        lines.append(f"    System.out.println(\"Line {i}\");")
    
    large_source = "\n".join(lines).encode("utf8")
    
    analyzer._ensure_decoded(large_source) # 3.14ms -> 2.39ms (31.0% faster)
    
    decoded = analyzer._cached_source_str
    
    cum = analyzer._cached_cum_bytes

def test_large_string_with_multibyte_characters():
    """Test decoding a large string with many multi-byte UTF-8 characters."""
    analyzer = JavaAnalyzer()
    
    # Create a large string with repeated multi-byte characters
    # Emoji repeated 500 times = 2000 bytes
    large_source = ("😊" * 500).encode("utf8")
    
    analyzer._ensure_decoded(large_source) # 60.5μs -> 44.3μs (36.3% faster)
    
    decoded = analyzer._cached_source_str
    
    cum = analyzer._cached_cum_bytes

def test_alternating_single_and_multibyte_at_scale():
    """Test large-scale alternating single-byte and multi-byte characters."""
    analyzer = JavaAnalyzer()
    
    # Alternate ASCII 'a' with 3-byte character '中'
    pattern = "a中" * 500  # 500 repetitions = 1000 characters
    large_source = pattern.encode("utf8")
    
    analyzer._ensure_decoded(large_source) # 105μs -> 75.0μs (40.7% faster)
    
    decoded = analyzer._cached_source_str
    
    cum = analyzer._cached_cum_bytes
    
    # Check cumulative pattern
    for i in range(1, len(cum)):
        pass

def test_cache_efficiency_with_repeated_source():
    """Test that repeated calls with same source don't rebuild cache."""
    analyzer = JavaAnalyzer()
    source = b"repeated" * 100
    
    # Call multiple times
    for _ in range(100):
        analyzer._ensure_decoded(source) # 85.7μs -> 67.2μs (27.5% faster)

def test_large_source_with_java_syntax():
    """Test large source with realistic Java syntax and UTF-8 comments."""
    analyzer = JavaAnalyzer()
    
    # Build a large Java-like source with comments
    source_lines = [
        "public class LargeClass {",
        "    // UTF-8 comment: こんにちは世界",
    ]
    
    for i in range(500):
        source_lines.append(f"    public void method{i}() {{")
        source_lines.append(f"        // Method {i}: Σ symbol test")
        source_lines.append("        int x = 42;")
        source_lines.append("    }")
    
    source_lines.append("}")
    
    large_source = "\n".join(source_lines).encode("utf8")
    
    analyzer._ensure_decoded(large_source) # 4.23ms -> 3.04ms (39.4% faster)
    
    decoded = analyzer._cached_source_str
    
    cum = analyzer._cached_cum_bytes

def test_performance_linear_with_source_size():
    """Test that cache building completes in reasonable time for large sources."""
    analyzer = JavaAnalyzer()
    
    # Test with progressively larger sources
    for size in [100, 500, 1000]:
        source = (b"a" * size)
        analyzer._ensure_decoded(source) # 138μs -> 104μs (32.3% faster)
        
        cum = analyzer._cached_cum_bytes
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-20T23.57.56

Click to see suggested changes
Suggested change
# Building the cumulative mapping is done once per distinct source and is faster than
# repeatedly decoding prefixes for many nodes.
# A local variable for append and encode reduces attribute lookups.
append = cum.append
for ch in decoded:
append(cum[-1] + len(ch.encode("utf8")))
byte_idx = 0
source_len = len(source)
while byte_idx < source_len:
byte = source[byte_idx]
# Determine UTF-8 character byte length from the first byte:
# 0xxxxxxx -> 1 byte (ASCII)
# 110xxxxx -> 2 bytes
# 1110xxxx -> 3 bytes
# 11110xxx -> 4 bytes
if byte < 0x80:
char_bytes = 1
elif byte < 0xE0:
char_bytes = 2
elif byte < 0xF0:
char_bytes = 3
else:
char_bytes = 4
byte_idx += char_bytes
cum.append(byte_idx)

Static Badge

Comment on lines +705 to +707
for ch in decoded:
append(cum[-1] + len(ch.encode("utf8")))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚡️Codeflash found 32% (0.32x) speedup for JavaAnalyzer.byte_to_char_index in codeflash/languages/java/parser.py

⏱️ Runtime : 5.60 milliseconds 4.25 milliseconds (best of 28 runs)

📝 Explanation and details

The optimized code achieves a 31% runtime improvement by eliminating repeated per-character UTF-8 encoding operations during the cumulative byte mapping construction in _ensure_decoded.

Key optimization:
Instead of encoding each character individually with ch.encode("utf8") (which allocates a new bytes object for every character), the optimized version directly scans the original UTF-8 byte array and determines each code point's byte length by inspecting the leading byte's bit pattern:

  • 0xxxxxxx → 1 byte (ASCII)
  • 110xxxxx → 2 bytes
  • 1110xxxx → 3 bytes
  • 11110xxx → 4 bytes

Why this is faster:

  1. Reduces allocations: The original code creates ~53,000 small bytes objects during encoding (visible in line profiler: 53,273 calls to ch.encode("utf8") taking 74.9% of _ensure_decoded time). The optimized version performs zero allocations in the loop.
  2. Lower Python overhead: Direct byte indexing and bitwise operations (s[i], b & 0xE0) are faster than method calls and string encoding.
  3. Better cache locality: Sequential byte array access is more cache-friendly than creating and immediately discarding many small objects.

Performance characteristics:

  • The optimization is most effective for ASCII-heavy workloads (35.9% speedup for 1000-character ASCII string) since the hot path (b < 0x80) executes with minimal branching
  • Still provides solid gains for mixed multibyte content (10-20% typical speedup across tests)
  • Particularly impactful when _ensure_decoded is called repeatedly with new sources, as the expensive per-character encoding no longer dominates

Impact on calling contexts:
Since byte_to_char_index is used for mapping tree-sitter parse node byte offsets to character positions, any code performing syntax analysis or AST traversal will benefit. The cache-miss penalty (when a new source is decoded) is significantly reduced, making parser-heavy operations more efficient.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 1831 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
from bisect import \
    bisect_right  # used to compute expected results independently

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.parser import JavaAnalyzer

# function to test

def expected_byte_to_char_index(byte_offset: int, source: bytes) -> int:
    """
    Independent helper to compute the expected character index for a given byte offset and source.
    This mirrors the logical intent of JavaAnalyzer.byte_to_char_index but constructs its own
    cumulative mapping from scratch to avoid relying on any internal caching.
    """
    # Decode the bytes to a string (this must succeed for valid UTF-8 bytes)
    decoded = source.decode("utf8")
    # Build cumulative byte counts per character: cum[0] == 0, cum[i] == bytes for first i chars
    cum = [0]
    for ch in decoded:
        cum.append(cum[-1] + len(ch.encode("utf8")))
    # Use bisect_right and subtract 1 to compute the character index (same semantics as the implementation)
    return bisect_right(cum, byte_offset) - 1

def test_basic_ascii_offsets():
    # Create an analyzer instance (real object, no mocks)
    analyzer = JavaAnalyzer()
    # ASCII source; each character is one byte
    source = b"hello"
    # Check a variety of byte offsets from 0 to total bytes (inclusive)
    for byte_offset in range(0, len(source) + 1):  # offsets 0..5
        # Call the method under test
        codeflash_output = analyzer.byte_to_char_index(byte_offset, source); result = codeflash_output # 5.14μs -> 4.65μs (10.6% faster)
        # Compute expected result independently
        expected = expected_byte_to_char_index(byte_offset, source)

def test_multibyte_characters_inside_and_on_boundaries():
    analyzer = JavaAnalyzer()
    # Build a string with a mix of 1-byte and multi-byte characters:
    # 'a' (1 byte), '€' (3 bytes), '日' (3 bytes), 'b' (1 byte)
    s = "a€日b"
    source = s.encode("utf8")
    # Precompute the cumulative byte boundaries to reason about expected values
    # Check offsets around each boundary and inside multi-byte characters.
    total_bytes = len(source)
    # We'll test every possible offset from 0..total_bytes inclusive to ensure correct mapping
    for byte_offset in range(0, total_bytes + 1):
        codeflash_output = analyzer.byte_to_char_index(byte_offset, source); got = codeflash_output # 6.81μs -> 6.17μs (10.4% faster)
        expected = expected_byte_to_char_index(byte_offset, source)

def test_empty_source_and_out_of_range_offsets():
    analyzer = JavaAnalyzer()
    empty = b""
    # For empty source, index should be 0 for offset 0
    codeflash_output = analyzer.byte_to_char_index(0, empty) # 1.60μs -> 1.54μs (3.89% faster)
    # For offsets greater than total bytes (which is 0), the implementation maps to 0 (no characters)
    codeflash_output = analyzer.byte_to_char_index(1, empty) # 521ns -> 481ns (8.32% faster)
    codeflash_output = analyzer.byte_to_char_index(999, empty) # 311ns -> 290ns (7.24% faster)

def test_negative_offset_returns_minus_one_and_invalid_types_raise():
    analyzer = JavaAnalyzer()
    source = b"abc"
    # Negative offsets: bisect_right semantics produce -1 after subtracting 1
    codeflash_output = analyzer.byte_to_char_index(-1, source) # 2.63μs -> 2.35μs (12.4% faster)
    # Passing None for byte_offset should raise a TypeError (comparison between ints and None)
    with pytest.raises(TypeError):
        analyzer.byte_to_char_index(None, source) # 3.12μs -> 3.06μs (1.96% faster)
    # Passing a non-bytes source (e.g., a Python str) should raise an AttributeError when .decode is attempted
    with pytest.raises(AttributeError):
        analyzer.byte_to_char_index(0, "not-bytes") # 2.33μs -> 2.33μs (0.000% faster)

def test_large_scale_mixed_characters_repeated_calls_and_stability():
    analyzer = JavaAnalyzer()
    # Create a deterministic repeating pattern of characters, including multi-byte characters
    pattern = "a€日😀"  # lengths: 1,3,3,4 bytes respectively in UTF-8
    # Repeat until at least 1000 characters, then truncate to exactly 1000 characters
    repeats = (1000 // len(pattern)) + 1
    big_string = (pattern * repeats)[:1000]  # exactly 1000 characters
    source = big_string.encode("utf8")
    total_bytes = len(source)

    # Precompute expected mapping once to compare many times (exercise the analyzer's caching)
    expected_map = [expected_byte_to_char_index(i, source) for i in range(total_bytes + 1)]

    # Call the method many times (1000 iterations) to both exercise caching and verify stability
    for iteration in range(1000):
        # Choose a deterministic offset for this iteration (wrap-around using modulo)
        offset = iteration % (total_bytes + 1)
        codeflash_output = analyzer.byte_to_char_index(offset, source); got = codeflash_output # 489μs -> 471μs (3.95% faster)
        expected = expected_map[offset]

    # Additionally, spot-check a selection of offsets near the end of the byte sequence
    for offset in range(max(0, total_bytes - 10), total_bytes + 1):
        codeflash_output = analyzer.byte_to_char_index(offset, source) # 4.57μs -> 4.38μs (4.43% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.parser import JavaAnalyzer

def test_basic_ascii_single_byte_character():
    """Test conversion of byte offset for ASCII single-byte characters."""
    analyzer = JavaAnalyzer()
    source = b"hello"  # All ASCII: 1 byte per character
    # Byte offset 0 should map to character index 0
    codeflash_output = analyzer.byte_to_char_index(0, source) # 3.06μs -> 2.69μs (13.8% faster)
    # Byte offset 1 should map to character index 1
    codeflash_output = analyzer.byte_to_char_index(1, source) # 631ns -> 601ns (4.99% faster)
    # Byte offset 4 should map to character index 4
    codeflash_output = analyzer.byte_to_char_index(4, source) # 411ns -> 340ns (20.9% faster)

def test_basic_ascii_sequential_offsets():
    """Test sequential byte offsets in ASCII text."""
    analyzer = JavaAnalyzer()
    source = b"abc"  # 3 ASCII characters, 3 bytes total
    # Each character is at its corresponding index for ASCII
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.65μs -> 2.33μs (13.7% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 591ns -> 511ns (15.7% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 400ns -> 350ns (14.3% faster)

def test_zero_byte_offset_returns_zero():
    """Test that byte offset 0 always maps to character index 0."""
    analyzer = JavaAnalyzer()
    source = b"test"
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.69μs -> 2.29μs (17.0% faster)

def test_multibyte_utf8_character():
    """Test conversion with multibyte UTF-8 characters."""
    analyzer = JavaAnalyzer()
    # "é" is 2 bytes in UTF-8, "a" is 1 byte
    source = "éa".encode("utf8")  # b'\xc3\xa9a' (3 bytes)
    # Byte offset 0 maps to character 0 (start of "é")
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.96μs -> 2.73μs (8.44% faster)
    # Byte offset 1 maps to character 0 (still in "é" which spans bytes 0-1)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 562ns -> 490ns (14.7% faster)
    # Byte offset 2 maps to character 1 (start of "a")
    codeflash_output = analyzer.byte_to_char_index(2, source) # 381ns -> 360ns (5.83% faster)

def test_multiple_multibyte_characters():
    """Test with multiple multibyte characters."""
    analyzer = JavaAnalyzer()
    # "café" = 'c'(1) + 'a'(1) + 'f'(1) + 'é'(2) = 5 bytes, 4 characters
    source = "café".encode("utf8")
    codeflash_output = analyzer.byte_to_char_index(0, source) # 3.26μs -> 2.83μs (14.9% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 581ns -> 541ns (7.39% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 421ns -> 361ns (16.6% faster)
    codeflash_output = analyzer.byte_to_char_index(3, source) # 391ns -> 340ns (15.0% faster)
    codeflash_output = analyzer.byte_to_char_index(4, source) # 351ns -> 310ns (13.2% faster)

def test_chinese_characters():
    """Test with multibyte characters from non-Latin scripts."""
    analyzer = JavaAnalyzer()
    # "你好" = two 3-byte UTF-8 characters
    source = "你好".encode("utf8")  # 6 bytes, 2 characters
    codeflash_output = analyzer.byte_to_char_index(0, source) # 3.16μs -> 2.73μs (15.4% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 571ns -> 501ns (14.0% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 331ns -> 300ns (10.3% faster)
    codeflash_output = analyzer.byte_to_char_index(3, source) # 380ns -> 351ns (8.26% faster)
    codeflash_output = analyzer.byte_to_char_index(4, source) # 350ns -> 321ns (9.03% faster)
    codeflash_output = analyzer.byte_to_char_index(5, source) # 300ns -> 280ns (7.14% faster)

def test_empty_source():
    """Test with empty source bytes."""
    analyzer = JavaAnalyzer()
    source = b""
    # Byte offset 0 in empty source should map to character index 0
    codeflash_output = analyzer.byte_to_char_index(0, source) # 1.49μs -> 1.44μs (3.47% faster)

def test_single_ascii_character():
    """Test with source containing only one ASCII character."""
    analyzer = JavaAnalyzer()
    source = b"x"
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.04μs -> 1.84μs (10.8% faster)

def test_single_multibyte_character():
    """Test with source containing only one multibyte character."""
    analyzer = JavaAnalyzer()
    source = "ñ".encode("utf8")  # 2 bytes, 1 character
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.65μs -> 2.44μs (8.62% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 551ns -> 511ns (7.83% faster)

def test_offset_at_end_boundary():
    """Test byte offset at the end of source."""
    analyzer = JavaAnalyzer()
    source = b"ab"  # 2 bytes
    # Offset 1 (end - 1) should map to character 1
    codeflash_output = analyzer.byte_to_char_index(1, source) # 2.37μs -> 2.07μs (14.5% faster)

def test_offset_beyond_source_length():
    """Test byte offset beyond source length."""
    analyzer = JavaAnalyzer()
    source = b"hi"  # 2 bytes
    # Offset 3 is beyond the source; should still return a valid character index
    # bisect_right will return len(cum) - 1, which is the last character index
    codeflash_output = analyzer.byte_to_char_index(3, source); result = codeflash_output # 2.22μs -> 1.99μs (11.6% faster)

def test_mixed_single_and_multibyte_characters():
    """Test string with mix of single and multibyte characters."""
    analyzer = JavaAnalyzer()
    # "a€b" = 'a'(1) + '€'(3) + 'b'(1) = 5 bytes, 3 characters
    source = "a€b".encode("utf8")
    codeflash_output = analyzer.byte_to_char_index(0, source) # 3.52μs -> 2.96μs (19.0% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 590ns -> 501ns (17.8% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 401ns -> 341ns (17.6% faster)
    codeflash_output = analyzer.byte_to_char_index(3, source) # 311ns -> 291ns (6.87% faster)
    codeflash_output = analyzer.byte_to_char_index(4, source) # 361ns -> 340ns (6.18% faster)

def test_emoji_multibyte_character():
    """Test with emoji which uses 4 bytes in UTF-8."""
    analyzer = JavaAnalyzer()
    # "😀" is a 4-byte UTF-8 character
    source = "😀".encode("utf8")
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.81μs -> 2.52μs (11.1% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 572ns -> 490ns (16.7% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 321ns -> 291ns (10.3% faster)
    codeflash_output = analyzer.byte_to_char_index(3, source) # 301ns -> 271ns (11.1% faster)

def test_newline_character():
    """Test with newline character."""
    analyzer = JavaAnalyzer()
    source = b"line1\nline2"  # Newline is 1 byte in ASCII
    # Offset 5 should be at the newline
    codeflash_output = analyzer.byte_to_char_index(5, source) # 3.38μs -> 2.79μs (21.2% faster)
    # Offset 6 should be at 'l' of line2
    codeflash_output = analyzer.byte_to_char_index(6, source) # 601ns -> 561ns (7.13% faster)

def test_tab_character():
    """Test with tab character."""
    analyzer = JavaAnalyzer()
    source = b"a\tb"  # Tab is 1 byte
    codeflash_output = analyzer.byte_to_char_index(0, source) # 2.51μs -> 2.25μs (11.5% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 571ns -> 491ns (16.3% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 391ns -> 350ns (11.7% faster)

def test_cache_reuse_same_source():
    """Test that cache is reused for identical source."""
    analyzer = JavaAnalyzer()
    source = b"test"
    # First call
    codeflash_output = analyzer.byte_to_char_index(2, source); result1 = codeflash_output # 2.56μs -> 2.28μs (11.9% faster)
    # Second call with same source should use cache
    codeflash_output = analyzer.byte_to_char_index(2, source); result2 = codeflash_output # 500ns -> 460ns (8.70% faster)

def test_cache_invalidation_different_source():
    """Test that cache is invalidated when source changes."""
    analyzer = JavaAnalyzer()
    source1 = b"hello"
    source2 = b"world"
    # Call with first source
    codeflash_output = analyzer.byte_to_char_index(1, source1); result1 = codeflash_output # 2.85μs -> 2.25μs (26.2% faster)
    # Call with different source should update cache
    codeflash_output = analyzer.byte_to_char_index(1, source2); result2 = codeflash_output # 1.81μs -> 1.37μs (32.2% faster)

def test_zero_offset_multibyte_source():
    """Test zero offset with multibyte characters."""
    analyzer = JavaAnalyzer()
    source = "日本".encode("utf8")  # Two 3-byte characters
    codeflash_output = analyzer.byte_to_char_index(0, source) # 3.29μs -> 2.85μs (15.1% faster)

def test_offset_in_middle_of_multibyte_sequence():
    """Test offset in the middle of a multibyte character sequence."""
    analyzer = JavaAnalyzer()
    # "中" is 3 bytes: E4 B8 AD
    source = "中".encode("utf8")
    # Byte offset 1 is in the middle of the character
    codeflash_output = analyzer.byte_to_char_index(1, source) # 2.74μs -> 2.39μs (14.3% faster)

def test_large_ascii_string():
    """Test with large ASCII string (1000 characters)."""
    analyzer = JavaAnalyzer()
    source = b"a" * 1000  # 1000 ASCII characters
    # Test various offsets
    codeflash_output = analyzer.byte_to_char_index(0, source) # 90.1μs -> 66.3μs (35.9% faster)
    codeflash_output = analyzer.byte_to_char_index(500, source) # 821ns -> 781ns (5.12% faster)
    codeflash_output = analyzer.byte_to_char_index(999, source) # 511ns -> 492ns (3.86% faster)

def test_large_multibyte_string():
    """Test with large string of multibyte characters."""
    analyzer = JavaAnalyzer()
    # Create a string with 100 repetitions of "é" (2 bytes each)
    source = ("é" * 100).encode("utf8")  # 200 bytes, 100 characters
    codeflash_output = analyzer.byte_to_char_index(0, source) # 13.9μs -> 11.8μs (17.2% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 591ns -> 591ns (0.000% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 461ns -> 441ns (4.54% faster)
    codeflash_output = analyzer.byte_to_char_index(199, source) # 411ns -> 391ns (5.12% faster)

def test_large_mixed_multibyte_characters():
    """Test with large mixed multibyte character string."""
    analyzer = JavaAnalyzer()
    # Create a pattern: "你好" (6 bytes, 2 chars) repeated 50 times
    base = "你好"
    source = (base * 50).encode("utf8")  # 300 bytes, 100 characters
    # Test boundaries
    codeflash_output = analyzer.byte_to_char_index(0, source) # 15.9μs -> 14.1μs (13.1% faster)
    codeflash_output = analyzer.byte_to_char_index(6, source) # 671ns -> 561ns (19.6% faster)
    codeflash_output = analyzer.byte_to_char_index(12, source) # 460ns -> 381ns (20.7% faster)
    codeflash_output = analyzer.byte_to_char_index(299, source) # 401ns -> 381ns (5.25% faster)

def test_repeated_calls_many_offsets():
    """Test many repeated calls with different offsets."""
    analyzer = JavaAnalyzer()
    source = "abc" * 100  # 300 ASCII characters
    # Call for various offsets (tests cache efficiency)
    for i in range(0, 300, 10):
        codeflash_output = analyzer.byte_to_char_index(i, source); result = codeflash_output

def test_repeated_calls_alternating_sources():
    """Test many calls alternating between two sources."""
    analyzer = JavaAnalyzer()
    source1 = b"hello" * 100  # 500 bytes
    source2 = b"world" * 100  # 500 bytes
    # Alternate between sources
    for i in range(50):
        codeflash_output = analyzer.byte_to_char_index(i, source1); result1 = codeflash_output # 2.29ms -> 1.66ms (38.1% faster)
        codeflash_output = analyzer.byte_to_char_index(i, source2); result2 = codeflash_output # 2.29ms -> 1.65ms (38.5% faster)

def test_dense_multibyte_coverage():
    """Test comprehensive coverage of byte-to-character mapping."""
    analyzer = JavaAnalyzer()
    # Create string with predictable byte structure
    # "café" pattern repeated: each pattern is c(1) a(1) f(1) é(2) = 5 bytes
    source = ("café" * 50).encode("utf8")
    # Verify multiple offsets map correctly
    # First pattern: bytes 0-4 map to chars 0-3
    codeflash_output = analyzer.byte_to_char_index(0, source) # 24.0μs -> 18.1μs (32.4% faster)
    codeflash_output = analyzer.byte_to_char_index(1, source) # 681ns -> 611ns (11.5% faster)
    codeflash_output = analyzer.byte_to_char_index(2, source) # 461ns -> 440ns (4.77% faster)
    codeflash_output = analyzer.byte_to_char_index(3, source) # 411ns -> 370ns (11.1% faster)
    codeflash_output = analyzer.byte_to_char_index(4, source) # 390ns -> 371ns (5.12% faster)
    # Second pattern starts at byte 5: chars 0-3 offset by 4 = 4-7
    codeflash_output = analyzer.byte_to_char_index(5, source) # 381ns -> 380ns (0.263% faster)
    codeflash_output = analyzer.byte_to_char_index(9, source) # 371ns -> 381ns (2.62% slower)

def test_performance_many_sequential_calls():
    """Test performance with many sequential binary search calls."""
    analyzer = JavaAnalyzer()
    source = ("你好" * 100).encode("utf8")  # 600 bytes, 200 characters
    # Make 1000 calls (tests binary search efficiency)
    for byte_offset in range(0, 600, 1):
        codeflash_output = analyzer.byte_to_char_index(byte_offset, source); result = codeflash_output # 234μs -> 227μs (2.94% faster)

def test_maximum_character_index_in_large_source():
    """Test that character indices are bounded correctly in large source."""
    analyzer = JavaAnalyzer()
    source = "a" * 500 + "€" * 100  # Mixed sizes
    # Offset at end of source
    codeflash_output = analyzer.byte_to_char_index(len(source.encode("utf8")) - 1, source.encode("utf8")); result = codeflash_output # 62.8μs -> 46.5μs (35.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-21T00.03.51

Click to see suggested changes
Suggested change
for ch in decoded:
append(cum[-1] + len(ch.encode("utf8")))
# Instead of encoding each character separately (which allocates many small byte
# objects), scan the original UTF-8 bytes and determine each code point's byte
# length from the leading byte. This mirrors how UTF-8 encodes code points:
# 0xxxxxxx -> 1 byte, 110xxxxx -> 2 bytes, 1110xxxx -> 3 bytes, 11110xxx -> 4 bytes.
s = source
n = len(s)
i = 0
while i < n:
b = s[i]
# 1-byte (ASCII)
if b < 0x80:
i += 1
# 2-byte sequence
elif (b & 0xE0) == 0xC0:
i += 2
# 3-byte sequence
elif (b & 0xF0) == 0xE0:
i += 3
# 4-byte sequence
elif (b & 0xF8) == 0xF0:
i += 4
else:
# Fallback for unexpected byte values; keep progress to avoid infinite loops.
i += 1
append(i)

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 21, 2026

⚡️ Codeflash found optimizations for this PR

📄 11% (0.11x) speedup for JavaAssertTransformer._find_top_level_arg_node in codeflash/languages/java/remove_asserts.py

⏱️ Runtime : 1.52 milliseconds 1.37 milliseconds (best of 39 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 21, 2026

⚡️ Codeflash found optimizations for this PR

📄 41% (0.41x) speedup for JavaAssertTransformer._find_balanced_parens in codeflash/languages/java/remove_asserts.py

⏱️ Runtime : 2.56 milliseconds 1.81 milliseconds (best of 250 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 21, 2026

⚡️ Codeflash found optimizations for this PR

📄 326% (3.26x) speedup for JavaAssertTransformer._find_balanced_braces in codeflash/languages/java/remove_asserts.py

⏱️ Runtime : 13.8 milliseconds 3.24 milliseconds (best of 182 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 21, 2026

⚡️ Codeflash found optimizations for this PR

📄 10% (0.10x) speedup for insert_method in codeflash/languages/java/replacement.py

⏱️ Runtime : 6.16 milliseconds 5.60 milliseconds (best of 180 runs)

A new Optimization Review has been created.

🔗 Review here

Static Badge

@codeflash-ai
Copy link
Contributor

codeflash-ai bot commented Feb 21, 2026

⚡️ Codeflash found optimizations for this PR

📄 16% (0.16x) speedup for JavaSupport.add_global_declarations in codeflash/languages/java/support.py

⏱️ Runtime : 194 microseconds 168 microseconds (best of 250 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch omni-java).

Static Badge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

workflow-modified This PR modifies GitHub Actions workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants