Comment on: Liquid AI reveals 8B-A1B MoE trained on 38T

Some of the coding-specific fine-tunes gave really impressive boosts. Qwen2.5-3B-Instruct is also available [0] -- if it's not too much to ask, I'd be curious how more general models stack up in your benchmark.

[0] - https://huggingface.co/Qwen/Qwen2.5-3B-Instruct
