Comment on: Liquid AI reveals 8B-A1B MoE trained on 38T
Some of the coding-specific fine-tunes were really impressive boosts. Qwen2.5-3B-Instruct is also available [0] -- if it's not too much to ask, I'd be curious how more general models stack up in your benchmark?