This repository contains:
- a reproduction of the CheXNet DenseNet-121 model on the NIH ChestX-ray14 dataset, and
- an extension (DACNet) designed to improve performance under class imbalance.
It also includes a Vision Transformer baseline and a Streamlit demo for inference.
This work reproduces key components of:
CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning
Rajpurkar et al., 2017 (arXiv:1711.05225)
Reproduced components:
- DenseNet-121 architecture
- Multi-label classification (14 thoracic diseases)
- Patient-level dataset splitting
- Evaluation using AUC-ROC and F1 score
Differences from the original study:
- No access to the original expert-labeled test set
- Labels are based on the publicly available NIH dataset
- Some implementation details are inferred because no official code is available
Repository structure:
DacNet/
├── scripts/
│   ├── replicate_chexnet.py
│   ├── Dacnet.py
│   └── vit_transformer.py
├── XRay_app/
├── reproducibility/
│   └── test_run.sh
├── project_EDA.ipynb
└── requirements.txt
This project uses the NIH ChestX-ray14 dataset.
- ~112,000 frontal chest X-rays
- 30,805 patients
- 14 disease labels
Download from: https://www.kaggle.com/datasets/nih-chest-xrays/data
Your directory must match what the scripts expect.
Example structure:
data/
├── Data_Entry_2017.csv
├── images_001/
├── images_002/
├── ...
└── images_012/
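Before training, the layout above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `find_layout_problems` is hypothetical, and it assumes the elided folders run consecutively from images_001 to images_012.

```python
from pathlib import Path

# Names taken from the example structure above. The consecutive numbering
# images_001 .. images_012 is an assumption about the elided entries.
EXPECTED_CSV = "Data_Entry_2017.csv"
EXPECTED_IMAGE_DIRS = [f"images_{i:03d}" for i in range(1, 13)]


def find_layout_problems(data_dir):
    """Return a list of problems found; an empty list means the layout matches."""
    root = Path(data_dir)
    problems = []
    if not (root / EXPECTED_CSV).is_file():
        problems.append(f"missing file: {EXPECTED_CSV}")
    for name in EXPECTED_IMAGE_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing folder: {name}/")
    return problems


# Example usage (hypothetical path):
# for problem in find_layout_problems("data"):
#     print(problem)
```

An empty result only confirms that the expected names exist; it does not verify that every image referenced in the CSV is present.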
If your dataset is stored elsewhere, update the path in the scripts:
CONFIG["data_dir"] = "/path/to/your/data"
We provide a Docker environment to improve reproducibility.
docker build -t dacnet-env .
docker run --gpus all -it --rm -v $(pwd):/workspace/DacNet dacnet-env
Notes:
- Requires a CUDA-compatible GPU
- ≥8GB VRAM recommended
- If no GPU is available, remove --gpus all from the docker run command
Before running full training, verify the setup:
bash reproducibility/test_run.sh
This checks:
- dataset loading
- model initialization
- training loop execution
If this step fails, verify dataset paths and file structure before proceeding to full training.
Once inside the Docker container, you can execute the training scripts.
python scripts/replicate_chexnet.py
python scripts/Dacnet.py
python scripts/vit_transformer.py
If you encounter CUDA memory errors, reduce the batch size in the scripts.
Evaluation results (AUC and F1) are printed to the console. If Weights & Biases (wandb) is configured, runs are logged automatically; otherwise, the container defaults to offline mode. The best model checkpoint is saved in a models/<run_id> folder.
replicate_chexnet.py: DenseNet-121 reproduction of CheXNet.
- Multi-label classification (14 diseases)
- BCEWithLogitsLoss
- Patient-level split
- Evaluated using AUC and F1
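A patient-level split keeps every image from a given patient in a single subset, so the model is never evaluated on a patient it saw during training. The following is a minimal, framework-free sketch of the idea. The function name, the 70/10/20 fractions, and the seed are illustrative assumptions, not values taken from the repository's scripts.

```python
import random
from collections import defaultdict


def patient_level_split(patient_ids, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split row indices into train/val/test so that no patient spans two subsets.

    patient_ids: one patient identifier per image row.
    fractions:   (train, val, test) shares of *patients*; illustrative values only.
    """
    # Group row indices by patient.
    rows_by_patient = defaultdict(list)
    for row_idx, pid in enumerate(patient_ids):
        rows_by_patient[pid].append(row_idx)

    # Shuffle patients (not images) deterministically.
    patients = sorted(rows_by_patient)
    random.Random(seed).shuffle(patients)

    n_train = int(fractions[0] * len(patients))
    n_val = int(fractions[1] * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }
    # Expand each patient group back into its image row indices.
    return {
        name: sorted(i for pid in pids for i in rows_by_patient[pid])
        for name, pids in groups.items()
    }
```

With the NIH metadata, `patient_ids` would be the `Patient ID` column of Data_Entry_2017.csv. Because the split is made over patients, the image-level proportions will only approximate the requested fractions.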
Dacnet.py: Improved CNN designed for class imbalance.
- DenseNet-121 backbone
- Focal Loss
- Per-class threshold tuning
- Improved F1 performance (0.386 average, compared with 0.076 for the CheXNet reproduction)
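Focal loss down-weights examples the model already classifies confidently, so rare positive labels contribute more to the gradient. The sketch below writes out the standard binary focal loss formula in plain Python to make the arithmetic visible. The `gamma` and `alpha` defaults are common illustrative choices, not values confirmed by Dacnet.py, and a training script would express the same formula on tensors.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_focal_loss(logit, target, gamma=2.0, alpha=0.25):
    """Focal loss for one (logit, label) pair: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults (assumptions), not repository values.
    """
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p              # probability of the true label
    alpha_t = alpha if target == 1 else 1.0 - alpha  # class-balancing weight
    p_t = max(p_t, 1e-12)                            # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def multilabel_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Mean focal loss over every (sample, class) pair of a multi-label batch."""
    losses = [
        binary_focal_loss(z, y, gamma, alpha)
        for row_z, row_y in zip(logits, targets)
        for z, y in zip(row_z, row_y)
    ]
    return sum(losses) / len(losses)
```

Setting `gamma=0` removes the focusing term and leaves an alpha-weighted binary cross-entropy, which is a quick way to sanity-check an implementation against BCEWithLogitsLoss.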
vit_transformer.py: Transformer-based baseline.
- ViT-Base architecture
- Multi-label classification
- Compared against CNN approaches
Performance (Test AUC per Disease)
| Pathology | Original CheXNet | Dacnet.py | vit_transformer.py | replicate_chexnet.py |
|---|---|---|---|---|
| Atelectasis | 0.8094 | 0.817 | 0.774 | 0.762 |
| Cardiomegaly | 0.9248 | 0.932 | 0.890 | 0.922 |
| Consolidation | 0.7901 | 0.783 | 0.789 | 0.746 |
| Edema | 0.8878 | 0.896 | 0.876 | 0.864 |
| Effusion | 0.8638 | 0.905 | 0.857 | 0.883 |
| Emphysema | 0.9371 | 0.963 | 0.828 | 0.850 |
| Fibrosis | 0.8047 | 0.814 | 0.772 | 0.766 |
| Hernia | 0.9164 | 0.997 | 0.872 | 0.925 |
| Infiltration | 0.7345 | 0.708 | 0.700 | 0.673 |
| Mass | 0.8676 | 0.919 | 0.783 | 0.824 |
| Nodule | 0.7802 | 0.789 | 0.673 | 0.646 |
| Pleural Thickening | 0.8062 | 0.801 | 0.766 | 0.756 |
| Pneumonia | 0.7680 | 0.740 | 0.713 | 0.656 |
| Pneumothorax | 0.8887 | 0.875 | 0.821 | 0.827 |
Performance (Overall Test Metrics)
| Metric | DacNet | ViT Transformer | Replicate CheXNet |
|---|---|---|---|
| Loss | 0.0416 | 0.1589 | 0.1661 |
| AUC | 0.8527 | 0.7940 | 0.7928 |
| F1 | 0.3861 | 0.1114 | 0.0763 |
Performance (Test F1 per Disease)
| Disease | DacNet | ViT Transformer | Replicate CheXNet |
|---|---|---|---|
| AVERAGE | 0.386 | 0.111 | 0.076 |
| Atelectasis | 0.421 | 0.127 | 0.026 |
| Cardiomegaly | 0.532 | 0.264 | 0.423 |
| Consolidation | 0.226 | 0.000 | 0.000 |
| Edema | 0.286 | 0.004 | 0.000 |
| Effusion | 0.623 | 0.427 | 0.459 |
| Emphysema | 0.516 | 0.079 | 0.000 |
| Fibrosis | 0.127 | 0.000 | 0.000 |
| Hernia | 0.750 | 0.000 | 0.000 |
| Infiltration | 0.395 | 0.193 | 0.061 |
| Mass | 0.477 | 0.213 | 0.079 |
| Nodule | 0.352 | 0.041 | 0.000 |
| Pleural Thickening | 0.258 | 0.000 | 0.000 |
| Pneumonia | 0.082 | 0.000 | 0.000 |
| Pneumothorax | 0.360 | 0.211 | 0.021 |
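F1 depends on the cutoff used to turn predicted probabilities into binary labels, which is why per-class threshold tuning matters for rare diseases. The sketch below shows one simple way to do it: a grid search per class that keeps the cutoff with the highest F1. The function names and the 0.01 to 0.99 grid are assumptions for illustration; the search procedure used in Dacnet.py may differ.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, scores, candidates=None):
    """Return (threshold, f1) for the candidate cutoff with the highest F1."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # illustrative grid
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:  # ties keep the lowest threshold
            best_t, best_f1 = t, f1
    return best_t, best_f1


def tune_thresholds(y_true_matrix, score_matrix):
    """One threshold per class; rows are samples, columns are classes."""
    n_classes = len(y_true_matrix[0])
    return [
        best_threshold(
            [row[c] for row in y_true_matrix],
            [row[c] for row in score_matrix],
        )[0]
        for c in range(n_classes)
    ]
```

Thresholds should be chosen on the validation split and then applied unchanged to the test split; tuning them on test data would inflate the reported F1.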
XRay_app/: Folder of labeled chest X-ray PNG files that users can download and test on the Hugging Face Streamlit app.
project_EDA.ipynb: Jupyter notebook with exploratory data analysis, including which diseases are present in the dataset and the proportion of each.
- Data Preprocessing: Resize, normalize, and augment X-ray images
- Model Selection: CNN and Transformer variants
- Training & Validation: Patient-level splits, loss monitoring
- Evaluation: Per-class AUC-ROC and F1 metrics
- Comparison: Benchmarks against original CheXNet results
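The per-class AUC-ROC used in the evaluation step has a simple interpretation: the probability that a randomly chosen positive image is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition. It is an illustration with hypothetical function names, not the repository's evaluation code, and its pairwise loop is too slow for the full dataset; a library routine such as scikit-learn's `roc_auc_score` is the practical choice.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when a class has no
    positives or no negatives, since AUC is undefined in that case.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def macro_auc(y_true_matrix, score_matrix):
    """Unweighted mean of per-class AUCs, skipping classes where AUC is undefined."""
    n_classes = len(y_true_matrix[0])
    aucs = []
    for c in range(n_classes):
        auc = roc_auc(
            [row[c] for row in y_true_matrix],
            [row[c] for row in score_matrix],
        )
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs) if aucs else None
```

Unlike F1, AUC does not depend on a decision threshold, so a model can have a reasonable AUC for a class while its F1 at a fixed cutoff is zero.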
Try the model here:
https://huggingface.co/spaces/cfgpp/DACNet
Features:
- image upload
- multi-label prediction
- Grad-CAM visualization
Limitations:
- Results rely on NIH labels (not expert annotations)
- Some implementation details are inferred
- Performance may vary across environments
- Dataset path must be manually configured
If you use this repository, please cite:
An Open-Source Reproduction and Enhancement of CheXNet for Chest X-ray Disease Classification
https://arxiv.org/abs/2505.06646
Original paper:
Rajpurkar et al., 2017. CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning
https://arxiv.org/abs/1711.05225