
[Bug] Segmentation fault when importing torchmetrics after creating a TVM CUDA target (LLVM initialization / COFF OptTable) #18655

@tinywisdom

Description


Summary

Creating a TVM CUDA target and then importing an unrelated Python package (torchmetrics) causes an immediate segmentation fault. There is no model compilation or runtime execution involved: simply constructing a target triggers the problem.

The crash occurs inside LLVM initialization, specifically in COFF directive parser global constructors (e.g., llvm::opt::OptTable::buildPrefixChars() and _GLOBAL__sub_I_COFFDirectiveParser.cpp), during dynamic library loading (dlopen).

The behavior suggests a dynamic-linking conflict, most likely between two LLVM copies loaded into the same process (TVM's bundled LLVM and one pulled in by the torchmetrics import chain).
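If this is indeed a duplicate-LLVM problem, the shared objects mapped into the process can be inspected via /proc/self/maps. The following is a Linux-only diagnostic sketch (not part of the original report):

```python
# Diagnostic sketch (Linux only): list mapped files whose path matches a
# pattern, to see whether more than one LLVM shared object is loaded.
def mapped_libraries(pattern="LLVM"):
    paths = set()
    with open("/proc/self/maps") as maps:
        for line in maps:
            parts = line.split()
            # A mapping line ends with an absolute path when it is file-backed.
            if len(parts) >= 6 and parts[-1].startswith("/"):
                if pattern.lower() in parts[-1].lower():
                    paths.add(parts[-1])
    return sorted(paths)

if __name__ == "__main__":
    # Run once after `import tvm` and again just before the torchmetrics
    # import (in a session that survives long enough) to compare the sets.
    for path in mapped_libraries():
        print(path)
```

Seeing two distinct LLVM libraries in the output would support the multiple-LLVM hypothesis.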


Minimal Reproduction

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import tvm
from tvm import target

def main():
    print("Creating CUDA target...")
    tgt = target.Target("cuda -arch=sm_86")
    print("Target created:", tgt)

    # Segmentation fault occurs here in my environment, during the import.
    from torchmetrics import Accuracy

    metric = Accuracy(task="multiclass", num_classes=10)
    print("Metric created:", metric)

if __name__ == "__main__":
    main()

Actual Behavior

On my machine, the process prints "Creating CUDA target…" and then crashes with a segmentation fault during the import of torchmetrics. The beginning of the backtrace is:

!!!!!!! Segfault encountered !!!!!!!
  File "<unknown>", in llvm::opt::OptTable::buildPrefixChars()
  File "<unknown>", in COFFOptTable::COFFOptTable()
  File "<unknown>", in _GLOBAL__sub_I_COFFDirectiveParser.cpp
  File "./elf/dl-init.c", in call_init
  File "./elf/dl-open.c", in dl_open_worker
  ...
Segmentation fault (core dumped)

Full trace is long but mostly dlopen / dl-init frames followed by LLVM initialization frames.
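The banner above appears to come from TVM's own crash handler, which only shows C-level frames. To also capture which Python import was in progress when the signal arrived, the stdlib faulthandler can be enabled at the top of the repro script (a diagnostic sketch, not part of the original report):

```python
import faulthandler
import sys

# Dump the Python-level stack to stderr if the process receives SIGSEGV,
# SIGFPE, SIGABRT, or SIGBUS. Complementary to the native backtrace above:
# it shows the Python frame (e.g. the torchmetrics import) at crash time.
faulthandler.enable(file=sys.stderr, all_threads=True)
```

Placing this before the tvm import ensures it is active when the dlopen chain runs.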


Expected Behavior

Importing torchmetrics after TVM target construction should not crash, especially before any compilation or runtime invocation occurs. The two libraries are unrelated and no model is passed to TVM.


Notes on Repro Properties

  • The issue does not require PyTorch, transformers, or CUDA execution.
  • The critical step is tgt = target.Target("cuda -arch=sm_86"), followed by importing a package that triggers its own dynamic-library / symbol-loading chain.
  • The failure happens even if Accuracy is never called.
  • Removing the TVM target creation avoids the crash.
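When two libraries bundle different LLVM versions, a common trigger is global symbol interposition at dlopen time. One thing worth trying is changing the interpreter's dlopen flags before the second import so that the newly loaded library prefers its own symbols. This is a hedged workaround sketch (RTLD_DEEPBIND is glibc-specific, and whether it helps depends on how the torchmetrics dependency chain loads LLVM), not a confirmed fix:

```python
import os
import sys

# Workaround sketch: make subsequent dlopen() calls resolve symbols within
# the newly loaded library first, reducing cross-library interposition.
old_flags = sys.getdlopenflags()
sys.setdlopenflags(old_flags | os.RTLD_DEEPBIND)
try:
    pass  # e.g. `from torchmetrics import Accuracy` would go here
finally:
    sys.setdlopenflags(old_flags)  # restore the defaults for later imports
```

If this changes the behavior, it would be further evidence of a symbol clash between two LLVM copies.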

Environment

OS: Linux x86_64 (glibc-based)
Python: 3.10.16 (conda-forge)
NumPy: 2.2.6
PyTorch: 2.9.0+cu128
Torchmetrics: <version depends on pip/conda>   # fill here if needed
TVM: 0.22.0
  LLVM: 17.0.6 (from tvm.support.libinfo())
  GIT_COMMIT_HASH: 9dbf3f22ff6f44962472f9af310fda368ca85ef2
GPU: sm_86 (Ampere)
TVM target: cuda -keys=cuda,gpu -arch=sm_86 -max_num_threads=1024 -thread_warp_size=32
CUDA toolkit: likely 12.8 (based on PyTorch +cu128 build)
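Most of the fields above can be collected with a small stdlib-only helper (a convenience sketch; the TVM- and torch-specific fields still require the respective packages to be importable):

```python
import platform

def env_report():
    # Interpreter / OS fields used in the Environment section.
    info = {
        "OS": f"{platform.system()} {platform.machine()}",
        "Python": platform.python_version(),
        "libc": " ".join(platform.libc_ver()),
    }
    # Package versions, guarded so the helper also runs where they are absent.
    for mod in ("numpy", "torch", "torchmetrics", "tvm"):
        try:
            info[mod] = __import__(mod).__version__
        except Exception:
            info[mod] = "<not importable>"
    return info

if __name__ == "__main__":
    for key, value in env_report().items():
        print(f"{key}: {value}")
```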

Triage

Please refer to the list of label tags here to find the relevant tags and add them below in a bullet format (example below).

  • needs-triage
