Yeah, well, Alibaba nearly (and sometimes actually) beat GPT-4 with a comparatively microscopic model you can run on a desktop. And released a whole series of them. For free! With a tiny fraction of the GPUs any of the American trainers have.
Bigger is not better, but OpenAI has also just lost their creative edge, and all Altman’s talk about scaling up training with trillions of dollars is a massive con.
o1 is kind of a joke; CoT and reflection strategies have been known for a while. You can do it for free yourself, to an extent, and some models have tried to finetune this in: https://github.com/codelion/optillm
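To make the "do it yourself" part concrete, here's a rough sketch of a CoT plus reflection loop. It's not how o1 or optillm actually work, just the general idea. `generate` is a hypothetical stand-in for whatever local model call you use (any function that takes a prompt string and returns text), and the prompt wording and the `rounds` parameter are my own assumptions.

```python
from typing import Callable


def reflect_and_revise(
    question: str,
    generate: Callable[[str], str],  # hypothetical: your own model call
    rounds: int = 2,                 # assumed cap on critique/revise passes
) -> str:
    """Draft an answer with step-by-step reasoning, then critique and revise it."""
    # Chain-of-thought pass: ask the model to reason before answering.
    answer = generate(
        f"Question: {question}\nThink step by step, then give a final answer."
    )
    for _ in range(rounds):
        # Reflection pass: ask the model to look for mistakes in its own draft.
        critique = generate(
            f"Question: {question}\nDraft answer:\n{answer}\n"
            "List any mistakes in the draft. Reply with just OK if there are none."
        )
        if critique.strip().upper() == "OK":
            break  # the model found nothing to fix, so stop early
        # Revision pass: rewrite the draft using the critique.
        answer = generate(
            f"Question: {question}\nDraft answer:\n{answer}\n"
            f"Critique:\n{critique}\nRewrite the answer, fixing the issues raised."
        )
    return answer
```

Each round costs two extra model calls, so this trades speed for (hopefully) better answers. How much it helps depends heavily on the model.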
But one sad thing OpenAI has seemingly accomplished is to “salt” the open LLM space. There’s way less hacky experimentation going on than there used to be, which makes me sad, as many of its “old” innovations still run circles around OpenAI.
… “Alibaba (LLM)” … is it this ? … ?
Qwen2.5: A Party of Foundation Models!
https://qwenlm.github.io/blog/qwen2.5/
BTW, as I wrote that post, Qwen 32B coder came out.
Now a single 3090 can beat GPT-4o, and do it way faster! In coding, specifically.
Great news 😁🥂, someone should make a new post on this !
Yep.
32B fits on a “consumer” 3090, and I use it every day.
72B will fit neatly on 2025 APUs, though we may have an even better update by then.
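For anyone wondering how the "fits on a 3090" math works out, here's a back-of-the-envelope sketch. The 24 GB figure is the 3090's VRAM; the roughly 4.5 bits per weight is my assumption for a typical 4-bit quant, not a number from this thread. It only counts the weights, so KV cache and runtime overhead come on top.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    # params * bits / 8 gives bytes; the 1e9 factors cancel when working in billions.
    return params_billions * bits_per_weight / 8


RTX_3090_VRAM_GB = 24   # VRAM on a single 3090
ASSUMED_QUANT_BITS = 4.5  # assumption: average bits per weight for a ~4-bit quant

for name, params in [("32B", 32), ("72B", 72)]:
    quant = weight_gb(params, ASSUMED_QUANT_BITS)
    fp16 = weight_gb(params, 16)
    fits = "fits" if quant < RTX_3090_VRAM_GB else "does not fit"
    print(f"{name}: ~{quant:.1f} GB quantized, ~{fp16:.1f} GB at FP16, {fits} in 24 GB")
```

Under those assumptions 32B lands at about 18 GB of weights, which leaves some room for context on a 24 GB card, while 72B is around 40 GB and needs something with a bigger memory pool.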
I’ve been using local LLMs for a while, but Qwen 2.5, specifically 32B and up, really feels like an inflection point to me.