VLM-CPL

Overall Framework

There are three steps to train the classification network with selected high-quality pseudo labels.

Data preparation

Download the HPH dataset here
Download the LC25K dataset here
Download the CRC100K dataset here
Download the DigestPath dataset here

Split each dataset 4:1 into training and test sets.
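
The 4:1 split can be sketched as follows. This is a minimal illustration, not the script used in this repository: the helper split_train_test, the fixed seed, and the placeholder file names are all assumptions, and the README does not say whether the original split is stratified by class.

```python
import random


def split_train_test(items, train_parts=4, test_parts=1, seed=0):
    """Shuffle items reproducibly and split them train_parts:test_parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the images of one dataset.
image_paths = [f"img_{i:04d}.png" for i in range(100)]
train_paths, test_paths = split_train_test(image_paths)
print(len(train_paths), len(test_paths))
```

Fixing the seed keeps the split identical across runs, so the same test images are held out every time.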

Training process

First, use the off-the-shelf VLM for zero-shot inference with our proposed method to filter out noisy samples on the training set.
The vlm_cpl_LC25K.py file contains two main functions: MVC and Prompt_feature_consensus.
You can combine MVC and Prompt_feature_consensus or use either one alone, and you can change the order in which the two filters are applied.

python vlm_cpl_LC25K.py --gpu 0
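
The idea of using the two filters together, alone, or in a different order can be sketched as a simple chain. This is only an illustration of the composition pattern: filter_a and filter_b are hypothetical stand-ins with made-up rules, not the functions in vlm_cpl_LC25K.py, whose actual signatures are not shown here.

```python
from typing import Callable, List, Sequence

# A filter takes candidate sample indices and returns the subset it keeps.
Filter = Callable[[Sequence[int]], List[int]]


def apply_filters(indices: Sequence[int], filters: Sequence[Filter]) -> List[int]:
    """Run the filters one after another; each sees only what survived so far."""
    kept = list(indices)
    for keep_fn in filters:
        kept = keep_fn(kept)
    return kept


# Hypothetical stand-in filters with arbitrary rules, for illustration only.
def filter_a(indices: Sequence[int]) -> List[int]:
    return [i for i in indices if i % 2 == 0]


def filter_b(indices: Sequence[int]) -> List[int]:
    return [i for i in indices if i % 3 == 0]


samples = range(12)
both = apply_filters(samples, [filter_a, filter_b])      # a, then b
swapped = apply_filters(samples, [filter_b, filter_a])   # b, then a
only_a = apply_filters(samples, [filter_a])              # a alone
print(both, swapped, only_a)
```

These toy filters judge each sample independently, so swapping them gives the same result. A filter whose decision depends on which samples remain may behave differently depending on its position in the chain.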

Second, after obtaining high-quality pseudo-labels, you can train a classification network.

python train_pseudo.py --gpu 0 --pseudo_csv <your_csv>