When I first got into local LLMs nearly 3 years ago, in mid 2023, the frontier closed models were ofcourse impressively capable.
I then tried my hand on running 7b size local models, primarily one called Zephyr-7b (what happened to these models?? Dolphin anyone??), on my gaming PC with 8GB AMD RX580 GPU. Fair to say it was just a curiosity exercise (in terms of model performance).
Fast forward to this month, I revisit local LLM. (Although I no longer have the gaming PC, cost-of-living-crisis anyone 😫 )
And, the 31b size models look very sufficient. #Qwen has taken the helm in this order. Which is still very expensive to setup locally, although within grasp.
I’m rooting for the edge-computing models now - the ~2b size models. Due to their low footprint, they are practical to run in a SBC 24/7 at home for many people.
But these edge models are the ‘curiosity category’ now.


For small model bonsai series seems getting the spotlight. Natively trained on1bit and ternary 1.58bit, 8B runs on ~1GB memory. I’m curios on local models but haven’t tried because of lack of gaming rig but it seems work enough for regular pc
deleted by creator