Research methodology for systematic evaluation of ternary quantization on CNNs.
Standard ImageNet stems (7×7 stride-2 + maxpool) destroy spatial information on 32×32 images.
Solution: CIFAR-adapted stem (3×3 stride-1, no maxpool) preserves 32×32 → 32×32 resolution.
Validation: Recovers +6-17 percentage points on CIFAR-10/100, matching published baselines.
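The resolution claim can be checked with the standard convolution output-size formula. The padding values below (3 for the 7×7 conv, 1 for the maxpool and the 3×3 conv) are the usual ResNet defaults and an assumption here, not stated in the source:

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a conv/pool layer (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1

# ImageNet stem: 7x7 stride-2 conv (pad 3), then 3x3 stride-2 maxpool (pad 1)
imagenet = conv_out(conv_out(32, 7, 2, 3), 3, 2, 1)

# CIFAR-adapted stem: 3x3 stride-1 conv (pad 1), no maxpool
cifar = conv_out(32, 3, 1, 1)

print(imagenet, cifar)  # prints: 8 32
```

A 32×32 input shrinks to 8×8 before the first residual block with the ImageNet stem, but keeps its full 32×32 resolution with the CIFAR-adapted stem.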
Establish proper FP32 baselines with CIFAR-adapted stems.
- 2 models × 3 datasets × 3 seeds
- Recipe: 300 epochs, SGD, cosine schedule, warmup 5 epochs
- Augmentation: mixup/smoothing for CIFAR-10/Tiny-ImageNet only
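A minimal sketch of the warmup-plus-cosine schedule from the recipe above. The base learning rate of 0.1 and the epoch-level granularity are illustrative assumptions, not values from the source:

```python
import math

def lr_at(epoch: int, total: int = 300, warmup: int = 5,
          base_lr: float = 0.1) -> float:
    """Cosine learning-rate schedule with linear warmup (epoch-level)."""
    if epoch < warmup:
        # linear ramp up to base_lr over the warmup epochs
        return base_lr * (epoch + 1) / warmup
    # cosine decay from base_lr down to ~0 over the remaining epochs
    progress = (epoch - warmup) / (total - warmup)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```

In practice the same curve is obtained from `torch.optim.lr_scheduler` primitives; the standalone function just makes the shape explicit.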
Isolate the knowledge-distillation (KD) benefit from the quantization penalty (a critical baseline for reviewers).
Establish ternary quantization gaps with strong training recipe.
Full recipe: FP32 conv1 + ternary elsewhere (no KD, after discovering its failure mode).
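The source does not specify the ternarization rule; as one common scheme, a TWN-style threshold quantizer (Δ = 0.7·mean|W|, a heuristic constant) might look like the following sketch. The numpy implementation and the 0.7 factor are illustrative assumptions:

```python
import numpy as np

def ternarize(w: np.ndarray, delta_factor: float = 0.7) -> np.ndarray:
    """Threshold ternarization: map weights to {-alpha, 0, +alpha}.

    delta_factor = 0.7 follows the TWN heuristic; the project's exact
    quantizer may differ (illustrative sketch only).
    """
    delta = delta_factor * np.abs(w).mean()        # per-tensor threshold
    mask = np.abs(w) > delta                       # weights that survive
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0  # shared scale
    return alpha * np.sign(w) * mask
```

Applied per layer to every conv except conv1, which stays in FP32 per the recipe above.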
Increase seeds from n=3 to n=10 to support near-parity claims on CIFAR-100 and Tiny-ImageNet.
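Why more seeds help near-parity claims: with a t-based 95% confidence interval on the mean accuracy, moving from n=3 to n=10 shrinks the half-width by roughly 3.5× at equal spread. A small sketch (hypothetical helper; t critical values hardcoded for the two sample sizes used here):

```python
from math import sqrt
from statistics import stdev

def ci95_halfwidth(accs: list) -> float:
    """95% CI half-width for the mean over seeds (t-distribution).

    Only n=3 (df=2, t=4.303) and n=10 (df=9, t=2.262) are supported,
    matching the two seed counts in the methodology.
    """
    t_crit = {3: 4.303, 10: 2.262}
    n = len(accs)
    return t_crit[n] * stdev(accs) / sqrt(n)
```

At equal per-seed standard deviation, the half-width ratio is (4.303/√3) / (2.262/√10) ≈ 3.5, which is what turns an ambiguous gap into a defensible near-parity claim.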
Compare against Trained Ternary Quantization under matched conditions.
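For contrast with a statistically scaled quantizer, Trained Ternary Quantization (TTQ) learns separate positive and negative scale factors per layer. A forward-pass sketch, with the scales shown as given scalars (in TTQ they are trained by SGD) and the threshold fraction t = 0.05 as commonly reported; both are assumptions about the comparison setup:

```python
import numpy as np

def ttq_forward(w: np.ndarray, w_p: float, w_n: float,
                t: float = 0.05) -> np.ndarray:
    """TTQ-style forward quantization with learned per-layer scales.

    w_p, w_n are learned scalars; delta is a fixed fraction t of the
    maximum absolute weight (illustrative sketch).
    """
    delta = t * np.abs(w).max()
    q = np.zeros_like(w)
    q[w > delta] = w_p      # positive weights get the learned +scale
    q[w < -delta] = -w_n    # negative weights get the learned -scale
    return q
```

The key difference from threshold-statistics schemes is that w_p and w_n receive gradients, so the ternary levels adapt during training.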
- Conv1 dominates: recovers 30-74% of the lost accuracy despite holding only 0.08% of parameters
- KD failure: distillation degrades ternary networks (-0.9% to -3.1%) yet benefits FP32 (+0.9% to +1.6%)
- Recipe effectiveness: keeping conv1 in FP32 yields a 1.0% gap on CIFAR-10 without KD
# Aggregate 153 experiments → CSV
uv run python -m analysis.aggregate_results
# Generate paper tables (LaTeX)
uv run python -m analysis.generate_tables
# Generate paper figures (PDF)
uv run python -m analysis.generate_figures
# Compile paper
cd paper && make

All tables and figures are programmatically generated from results/processed/aggregated.csv.