Spectrum of singular values for a single layer weight update matrix obtained by merging using Task Arithmetic (top) compared to our approaches: Iso-C (middle) and Iso-CTS (bottom). Task Arithmetic sums the task-specific matrices, which results in a spectrum with a few dominant components. Iso-C instead replaces this spectrum with a uniform one, which results in a significant performance improvement. Iso-CTS enhances the common subspace with task-specific subspaces and yields state-of-the-art model merging performance.
Use the checkpoints provided by Task Singular Vectors (which are the same as provided by Tall Masks).
Most datasets being used should be downloaded automatically with torchvision or huggingface. For the datasets requiring manual preparation (like Cars, DTD, EuroSAT, SUN397), please follow the instructions in this issue. Depending on the torchvision version, some issues might arise when downloading specific datasets like here or here. In this case, using a different torchvision version might solve the issue.
Modify model_location and data_location in config/config.yaml before evaluation.
conda env create
conda activate iso-merging
tldr ✅: Merge by Task Arithmetic (summation) and make the spectrum of singular values uniform.
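The Iso-C step can be illustrated on a single layer with a few lines of NumPy. This is a minimal sketch and not the repository's implementation: the function name `iso_c_merge` is hypothetical, the random matrices stand in for real task-specific weight updates, and using the mean singular value as the uniform value is an assumption of this sketch.

```python
import numpy as np


def iso_c_merge(task_matrices):
    """Sum per-task update matrices, then flatten the singular value spectrum."""
    # Task Arithmetic step: sum the task-specific update matrices.
    delta_ta = np.sum(task_matrices, axis=0)
    # Thin SVD of the summed update.
    u, s, vt = np.linalg.svd(delta_ta, full_matrices=False)
    # Assumption of this sketch: the uniform value is the mean singular value.
    s_iso = np.full_like(s, s.mean())
    # Rebuild the update with the same singular vectors and a flat spectrum.
    return (u * s_iso) @ vt


# Toy stand-ins for three task-specific updates of one 6x4 layer.
rng = np.random.default_rng(0)
deltas = [rng.standard_normal((6, 4)) for _ in range(3)]
merged = iso_c_merge(deltas)

# The merged update keeps the shape of the layer and has equal singular values.
print(np.linalg.svd(merged, compute_uv=False))
```

The singular vectors of the summed matrix are kept unchanged; only the singular values are replaced, so every direction in the merged update carries the same weight.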
tldr ✅: Merge by Task Arithmetic (common subspace), replace the least significant singular vectors with task-specific ones (task-specific subspaces) and make the spectrum of singular values uniform.
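A rough single-layer sketch of the Iso-CTS idea is shown below. It is an illustration under stated assumptions and may differ from the repository's code: the names `iso_cts_merge` and `orthonormalize` are hypothetical, the common subspace size is taken as `int(common_space_fraction * rank)`, the remaining directions are split evenly across tasks, the concatenated vectors are orthonormalized via the polar factor, and the uniform value is the mean of the retained singular values.

```python
import numpy as np


def orthonormalize(m):
    """Return a matrix with orthonormal columns (the polar factor of m)."""
    p, _, qt = np.linalg.svd(m, full_matrices=False)
    return p @ qt


def iso_cts_merge(task_matrices, common_space_fraction=0.8):
    """Common subspace from the summed update plus per-task directions."""
    num_tasks = len(task_matrices)
    delta_ta = np.sum(task_matrices, axis=0)
    u, s, vt = np.linalg.svd(delta_ta, full_matrices=False)
    rank = s.shape[0]
    # Assumption: top-k directions of the sum form the common subspace.
    k = int(common_space_fraction * rank)
    # Assumption: the remaining slots are shared equally between tasks.
    per_task = (rank - k) // num_tasks
    u_parts, s_parts, v_parts = [u[:, :k]], [s[:k]], [vt[:k].T]
    for delta in task_matrices:
        # Remove what the common subspace already explains for this task.
        residual = delta - u[:, :k] @ (u[:, :k].T @ delta)
        u_t, s_t, vt_t = np.linalg.svd(residual, full_matrices=False)
        u_parts.append(u_t[:, :per_task])
        s_parts.append(s_t[:per_task])
        v_parts.append(vt_t[:per_task].T)
    # Concatenated vectors are not orthogonal in general, so fix that up.
    u_all = orthonormalize(np.concatenate(u_parts, axis=1))
    v_all = orthonormalize(np.concatenate(v_parts, axis=1))
    # Assumption: the uniform value is the mean retained singular value.
    sigma = np.concatenate(s_parts).mean()
    return sigma * (u_all @ v_all.T)


# Toy stand-ins for two task-specific updates of one 16x12 layer.
rng = np.random.default_rng(0)
deltas_cts = [rng.standard_normal((16, 12)) for _ in range(2)]
merged_cts = iso_cts_merge(deltas_cts, common_space_fraction=0.5)
print(np.linalg.svd(merged_cts, compute_uv=False))
```

With these toy sizes the common subspace takes 6 of the 12 directions and each task contributes 3 more, so all 12 singular values of the merged update come out equal.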
model=ViT-B-16
num_tasks=8
# Merge and evaluate Iso-C
python main.py method="iso_c" model=${model} num_tasks=${num_tasks}
# Merge and evaluate Iso-CTS
python main.py method="iso_cts" model=${model} num_tasks=${num_tasks} method.common_space_fraction=0.8
If you find this code useful, please cite the following paper:
@article{marczak2025notaskleftbehind,
title = {{N}o {T}ask {L}eft {B}ehind: {I}sotropic {M}odel {M}erging with {C}ommon and {T}ask-{S}pecific {S}ubspaces},
author = {Daniel Marczak and Simone Magistri and Sebastian Cygert and Bartłomiej Twardowski and Andrew D. Bagdanov and Joost van de Weijer},
year = {2025},
journal = {arXiv preprint arXiv:2502.04959}
}
Code adapted from Task Singular Vectors and Tall Masks.