[Question] Missing prepare_ovdg_dataset.py and low performance on novel classes (DIOR) #17


Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.20 | packaged by conda-forge | (default, Sep 30 2024, 17:52:49) [GCC 13.3.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5: NVIDIA A800 80GB PCIe
CUDA_HOME: /usr/local/cuda-11.7/bin/nvcc
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0
PyTorch: 2.1.0+cu118
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.1-Product Build 20220311 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 11.8
  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.9.5
    • Built with CuDNN 8.7
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.16.0+cu118
OpenCV: 4.13.0
MMEngine: 0.10.7
MMRotate: 1.0.0rc1+

Reproduces the problem - code sample

import os
import json
import glob
import random
from tqdm import tqdm

def generate_nwpu_unlabeled_json_final():
    # ================= Configuration Area (Updated) =================
    
    # 1. Dataset root directory (the directory containing category folders like airplane, airport, etc.)
    data_root = "NWPU-RESISC45"
    
    # 2. Output filename
    output_json = "annotations/nwpu45_unlabeled_2.json"
    
    # 3. Number of images to sample per class
    shots_per_class = 50 
    
    # 4. Random seed
    seed = 42
    
    # ================================================================

    # 45 class names
    classes = sorted([
        'airplane', 'airport', 'baseball_diamond', 'basketball_court', 'beach', 
        'bridge', 'chaparral', 'church', 'circular_farmland', 'cloud', 
        'commercial_area', 'dense_residential', 'desert', 'forest', 'freeway', 
        'golf_course', 'ground_track_field', 'harbor', 'industrial_area', 
        'intersection', 'island', 'lake', 'meadow', 'medium_residential', 
        'mobile_home_park', 'mountain', 'overpass', 'palace', 'parking_lot', 
        'railway', 'railway_station', 'rectangular_farmland', 'river', 
        'roundabout', 'runway', 'sea_ice', 'ship', 'snowberg', 
        'sparse_residential', 'stadium', 'storage_tank', 'tennis_court', 
        'terrace', 'thermal_power_plant', 'wetland'
    ])

    # Build Class ID mapping
    cat2id = {name: i + 1 for i, name in enumerate(classes)}

    random.seed(seed)
    
    coco_output = {
        "images": [],
        "annotations": [], # Keep empty
        "categories": []
    }

    # Populate Categories
    for name, cid in cat2id.items():
        coco_output["categories"].append({"id": cid, "name": name})

    print(f"Data Source: {data_root}")
    print(f"Output to: {output_json}")
    print(f"Processing: {shots_per_class} images per class...")
    
    img_id_counter = 1
    
    # Check if root directory exists
    if not os.path.exists(data_root):
        raise FileNotFoundError(f"Data directory not found: {data_root}")

    for cat_name in tqdm(classes, desc="Scanning classes"):
        cat_dir = os.path.join(data_root, cat_name)
        
        if not os.path.exists(cat_dir):
            print(f"Warning: Category directory {cat_dir} not found, skipping.")
            continue

        img_files = glob.glob(os.path.join(cat_dir, "*.jpg"))
        random.shuffle(img_files)
        
        selected_imgs = img_files[:shots_per_class]
        current_cat_id = cat2id[cat_name]

        for img_path in selected_imgs:
            # Calculate relative path, generated file_name looks like "airplane/airplane_001.jpg"
            rel_path = os.path.relpath(img_path, data_root)
            
            image_info = {
                "id": img_id_counter,
                "file_name": rel_path,
                "width": 256,
                "height": 256,
                # Key step: Must include category_id here to work with merge_ovdg_preds.py
                "category_id": current_cat_id 
            }
            coco_output["images"].append(image_info)
            img_id_counter += 1

    # Create output directory
    os.makedirs(os.path.dirname(output_json), exist_ok=True)

    # Save
    with open(output_json, 'w') as f:
        json.dump(coco_output, f)

    print(f"\nSuccessfully generated: {output_json}")
    print(f"Total images included: {len(coco_output['images'])}")

if __name__ == "__main__":
    generate_nwpu_unlabeled_json_final()
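
For reference, the file this script produces looks like the following (one image entry shown, categories truncated to the first two; the whole layout is my own guess at what prepare_ovdg_dataset.py would emit, since the file is missing):

{
  "images": [
    {
      "id": 1,
      "file_name": "airplane/airplane_001.jpg",
      "width": 256,
      "height": 256,
      "category_id": 1
    }
  ],
  "annotations": [],
  "categories": [
    {"id": 1, "name": "airplane"},
    {"id": 2, "name": "airport"}
  ]
}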

Reproduces the problem - command or script

DEVICES_ID=3

exp1="glip_atss_r50_a_fpn_dyhead_visdronezsd_base"

# Step1: train base-detector
CUDA_VISIBLE_DEVICES=$DEVICES_ID python tools/train.py \
    projects/GLIP/configs/$exp1.py

# Step2.1: pseudo-labeling
exp2="glip_atss_r50_a_fpn_dyhead_visdronezsd_base_nwpu45_pseudo_labeling"
CUDA_VISIBLE_DEVICES=$DEVICES_ID python tools/test.py \
    projects/GLIP/configs/$exp2.py \
    work_dirs/$exp1/iter_20000.pth

# Step2.2: merge predictions
python projects/GroundingDINO/tools/merge_ovdg_preds.py \
    --ann_path data/NWPU-RESISC45/annotations/nwpu45_unlabeled_2.json \
    --pred_path work_dirs/$exp2/nwpu45_pseudo_labeling_2.bbox.json \
    --save_path work_dirs/$exp2/nwpu45_unlabeled_with_glip_pseudos_2.json

cp work_dirs/$exp2/nwpu45_unlabeled_with_glip_pseudos_2.json data/NWPU-RESISC45/annotations/nwpu45_unlabeled_with_glip_pseudos_2.json

# Step3: self-training
exp3="glip_atss_r50_a_fpn_dyhead_visdronezsd_base_nwpu"
CUDA_VISIBLE_DEVICES=$DEVICES_ID python tools/train.py \
    projects/GLIP/configs/$exp3.py

# Step4: test
CUDA_VISIBLE_DEVICES=$DEVICES_ID python tools/test.py \
    projects/GLIP/configs/$exp3.py \
    work_dirs/$exp3/iter_10000.pth \
    --work-dir work_dirs/$exp3/dior_test
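
A quick way to see whether Step 2.2 produced anything usable is to count pseudo boxes per category in the merged file before starting self-training, since a class with no pseudo labels cannot be learned in Step 3. A minimal sketch, assuming the merged file keeps the COCO-style "annotations"/"categories" layout:

import json
from collections import Counter

# Merged annotation file from Step 2.2 (path as copied above).
merged = "data/NWPU-RESISC45/annotations/nwpu45_unlabeled_with_glip_pseudos_2.json"

with open(merged) as f:
    data = json.load(f)

id2name = {c["id"]: c["name"] for c in data["categories"]}
counts = Counter(a["category_id"] for a in data.get("annotations", []))

# Categories that print 0 here never contribute to self-training,
# which would explain 0.000 AP on those classes after Step 3.
for cid in sorted(id2name):
    print(f"{id2name[cid]}: {counts.get(cid, 0)} pseudo boxes")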

Reproduces the problem - error message

+-------------------------+-----+-------+--------+-------+
| class                   | gts | dets  | recall | ap    |
+-------------------------+-----+-------+--------+-------+
| airplane                | 292 | 4115  | 0.928  | 0.814 |
| baseballfield           | 373 | 4478  | 0.922  | 0.846 |
| bridge                  | 381 | 13307 | 0.593  | 0.252 |
| chimney                 | 243 | 2412  | 0.831  | 0.791 |
| dam                     | 155 | 12101 | 0.761  | 0.480 |
| expressway-service-area | 274 | 1000  | 0.785  | 0.686 |
| expressway-toll-station | 188 | 710   | 0.691  | 0.611 |
| golffield               | 176 | 3174  | 0.847  | 0.768 |
| harbor                  | 252 | 6457  | 0.611  | 0.312 |
| overpass                | 307 | 3793  | 0.612  | 0.472 |
| ship                    | 346 | 6537  | 0.766  | 0.538 |
| stadium                 | 184 | 4078  | 0.946  | 0.856 |
| storagetank             | 237 | 1966  | 0.890  | 0.792 |
| tenniscourt             | 293 | 5826  | 0.867  | 0.606 |
| trainstation            | 143 | 2932  | 0.552  | 0.398 |
| vehicle                 | 910 | 12082 | 0.731  | 0.437 |
| airport                 | 160 | 0     | 0.000  | 0.000 |
| basketballcourt         | 232 | 0     | 0.000  | 0.000 |
| groundtrackfield        | 359 | 0     | 0.000  | 0.000 |
| windmill                | 293 | 0     | 0.000  | 0.000 |
+-------------------------+-----+-------+--------+-------+
| mAP                     |     |       |        | 0.483 |
+-------------------------+-----+-------+--------+-------+

Additional information

First of all, thank you for your impressive work on this project.

I am trying to reproduce the OVD results on the DIOR dataset, but I noticed that the file projects/GroundingDINO/tools/prepare_ovdg_dataset.py is missing from the repository.

My Attempt: To proceed, I implemented a custom script (generate_nwpu_unlabeled_json_final, included in the code sample above) to generate the unlabeled JSON file based on my understanding, and then followed the rest of the pipeline as described.

The Issue: While detection on the base classes works as expected, the model completely fails on the four novel classes (airport, basketballcourt, groundtrackfield, windmill): as the table above shows, they receive zero detections and 0.000 AP.

My Questions:

Could you please share the original prepare_ovdg_dataset.py file, or describe the exact data format it requires? I suspect my custom JSON generation may be causing a misalignment between the visual and text features (a quick id-consistency check I ran is included after these questions).

Is the NWPU dataset strictly necessary for the model to detect novel classes on DIOR, or is it used primarily for data augmentation?
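
For context on the first question, here is the id-consistency check I mentioned. It compares the category ids in my unlabeled JSON (1-based, from the script above) with the ids referenced by the GLIP prediction file from Step 2.1; that the prediction file is a COCO-style list of detection dicts is an assumption on my part:

import json

ann_path = "data/NWPU-RESISC45/annotations/nwpu45_unlabeled_2.json"
pred_path = ("work_dirs/glip_atss_r50_a_fpn_dyhead_visdronezsd_base"
             "_nwpu45_pseudo_labeling/nwpu45_pseudo_labeling_2.bbox.json")

with open(ann_path) as f:
    ann = json.load(f)
with open(pred_path) as f:
    preds = json.load(f)  # assumed: list of {"image_id", "category_id", "bbox", "score"}

ann_ids = {c["id"] for c in ann["categories"]}
pred_ids = {p["category_id"] for p in preds}

# Any mismatch here would shift every pseudo label onto the wrong class name.
print("ids only in annotations:", sorted(ann_ids - pred_ids))
print("ids only in predictions:", sorted(pred_ids - ann_ids))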

Any guidance would be greatly appreciated. Thanks!
