cv-multitask-loss
Description
CV Multi-Task Loss Combination Strategy Design
Research Question
Design a novel multi-task loss combination strategy for jointly training fine-grained (100-class) and coarse (20-superclass) classification on CIFAR-100 that maximizes fine-class test accuracy.
Background
CIFAR-100 contains 100 fine classes organized into 20 coarse superclasses. Training a model with two classification heads (fine + coarse) provides a natural multi-task learning setup where the coarse task acts as an auxiliary signal. The key challenge is how to combine the two losses effectively.
Classic approaches include:
- Equal weighting: Simply sum the losses (baseline default)
- Uncertainty weighting (Kendall et al., 2018): Learn task-specific uncertainty as log-variance parameters
- Dynamic Weight Average (Liu et al., 2019): Weight tasks by their relative loss change rate
- PCGrad (Yu et al., NeurIPS 2020): Project conflicting task gradients onto each other's normal plane to reduce gradient interference
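Of these, uncertainty weighting is the simplest to sketch in isolation. A minimal PyTorch version (illustrative only; the class name and initialization are assumptions, not code from this benchmark) learns one log-variance per task and down-weights noisier tasks:

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Kendall et al. (2018): weight each task loss by a learned
    homoscedastic uncertainty, parameterized as a log-variance s_i.
    Combined loss = sum_i( exp(-s_i) * L_i + s_i / 2 )."""

    def __init__(self, num_tasks: int = 2):
        super().__init__()
        # One log-variance per task, initialized to 0 (unit variance).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, *losses: torch.Tensor) -> torch.Tensor:
        total = losses[0].new_zeros(())
        for i, loss in enumerate(losses):
            precision = torch.exp(-self.log_vars[i])
            # Regularizer s_i / 2 prevents the trivial solution s_i -> inf.
            total = total + precision * loss + 0.5 * self.log_vars[i]
        return total
```

Because `log_vars` is an `nn.Parameter`, it is trained by the same optimizer as the model, which is exactly what this benchmark's setup allows.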
The coarse labels encode a semantic hierarchy. The task is to balance the auxiliary coarse signal against the primary fine-class objective across different architectures and training stages.
What You Can Modify
The MultiTaskLoss class (lines 195-216) in custom_mtl.py. This class receives individual task losses and must combine them into a single scalar loss.
You can modify:
- The `__init__` method: add learnable parameters (log-variances, weights, etc.)
- The `forward` method: implement any combination strategy
- Use `epoch` and `total_epochs` for curriculum/scheduling approaches
- Add any auxiliary state (e.g., loss history buffers)
The forward method receives:
- `fine_loss`: scalar tensor, cross-entropy for the 100-class fine prediction
- `coarse_loss`: scalar tensor, cross-entropy for the 20-class coarse prediction
- `epoch`: int, current epoch (0-indexed)
- `total_epochs`: int, total number of training epochs
Note: The MultiTaskLoss parameters are included in the optimizer, so learnable parameters will be trained.
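One simple strategy that fits this interface is an epoch-based curriculum: keep the fine loss at full weight and anneal the coarse weight to zero, so the auxiliary signal helps early and fades as the fine head converges. The cosine schedule below is an illustrative choice, not part of the benchmark:

```python
import math
import torch
import torch.nn as nn

class MultiTaskLoss(nn.Module):
    """Example strategy: fine loss at full weight, coarse loss annealed
    from 1.0 to 0.0 with a cosine schedule over training."""

    def forward(self, fine_loss, coarse_loss, epoch, total_epochs):
        # Cosine decay: w_coarse = 1 at epoch 0, 0 at the final epoch.
        w_coarse = 0.5 * (1.0 + math.cos(math.pi * epoch / max(total_epochs - 1, 1)))
        return fine_loss + w_coarse * coarse_loss
```

This variant has no learnable parameters, so including it in the optimizer is a no-op; a learnable strategy would simply register `nn.Parameter`s in `__init__` as noted above.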
Evaluation
- Metric: Best fine-class test accuracy (%, higher is better)
- Architectures (all on CIFAR-100 with fine+coarse heads):
- ResNet-20 (shallow residual network)
- ResNet-56 (deeper residual network)
- VGG-16-BN (deep non-residual with BatchNorm) — hidden, evaluated on final submission only
- Training: SGD (lr=0.1, momentum=0.9, wd=5e-4), cosine annealing, 200 epochs
- Data augmentation: RandomCrop(32, pad=4) + RandomHorizontalFlip
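The fixed training recipe above maps onto a standard PyTorch setup roughly as follows (a sketch; the function name is illustrative and the data-augmentation pipeline is omitted):

```python
import torch
from torch import nn, optim

def make_optimizer(model: nn.Module, mtl_loss: nn.Module, total_epochs: int = 200):
    """Fixed recipe from the benchmark: SGD(lr=0.1, momentum=0.9,
    wd=5e-4) with cosine annealing over 200 epochs. The MultiTaskLoss
    parameters are optimized jointly with the model's, as the task notes."""
    params = list(model.parameters()) + list(mtl_loss.parameters())
    opt = optim.SGD(params, lr=0.1, momentum=0.9, weight_decay=5e-4)
    sched = optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_epochs)
    return opt, sched
```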
Code
1"""CV Multi-Task Loss Benchmark.23Train vision models (ResNet, VGG) on CIFAR-100 with TWO classification heads4(fine: 100 classes, coarse: 20 superclasses) to evaluate multi-task loss5combination strategies.67FIXED: Model architectures, data pipeline, training loop.8EDITABLE: MultiTaskLoss class.910Usage:11python custom_mtl.py --arch resnet20 --seed 4212"""1314import argparse15import math
Results
| Model | Type | resnet20-cifar100mt test acc (%) ↑ | resnet56-cifar100mt test acc (%) ↑ | vgg16bn-cifar100mt test acc (%) ↑ |
|---|---|---|---|---|
| dwa | baseline | 67.960 | 72.390 | 73.780 |
| pcgrad | baseline | 64.310 | 70.200 | 74.170 |
| uncertainty | baseline | 66.810 | 70.940 | 72.670 |
| anthropic/claude-opus-4.6 | vanilla | 68.600 | 72.660 | 73.800 |
| deepseek-reasoner | vanilla | 66.550 | 71.760 | 72.290 |
| google/gemini-3.1-pro-preview | vanilla | 68.480 | 71.080 | 74.080 |
| openai/gpt-5.4 | vanilla | 69.110 | 72.120 | 74.360 |
| qwen/qwen3.6-plus | vanilla | 68.290 | 72.380 | 73.460 |
| anthropic/claude-opus-4.6 | agent | 68.600 | 72.660 | 73.800 |
| deepseek-reasoner | agent | 67.100 | 71.900 | 72.750 |
| google/gemini-3.1-pro-preview | agent | 68.870 | 72.640 | 74.550 |
| openai/gpt-5.4 | agent | 68.960 | 72.370 | 74.310 |
| qwen/qwen3.6-plus | agent | 68.710 | 72.700 | 74.010 |