The Small Model Cliff
Source: F5
Published:
<p>May's CASI run added 12 new models, including eight Qwen3.5 variants from 0.8B to 397B parameters and two new Gemma 4 entries. The two smallest Qwen3.5 variants each scored below 2.0 on CASI; the largest scored 92.37. The test suite gained a new attack, BiasJailbreak, which uses safety align