My experience with local LLM

ntn888@lemmy.ml · 2 months ago

ZoteTheMighty@lemmy.zip · 2 months ago

This weekend I had an LLM walk me through setting up some home server stuff and networking. I tried using Proton’s Lumo and Qwen 3.6 locally. I have to say Qwen was the more impressive of the two models. When I first tried running models locally like llama 4, I remember thinking to myself that this was a dead end and big servers would always have the advantage, but it seems like we’re hitting a turning point where many things can be done locally.

ntn888@lemmy.ml · 2 months ago

cool, what was your hardware, and which qwen size did you use? thanks

ZoteTheMighty@lemmy.zip · 2 months ago

I have a 24GB AMD 7900XTX, and it’s a 35b parameter model.

ericwdhs@discuss.online · 2 months ago

Ooo… I’m running a 7900 XTX as well. Having 24GB without the Nvidia tax has been super nice for AI stuff. I have a 16GB 6900 XT running in another computer, and a lot of my AI model selection is still sized for it. I may need to stop procrastinating and copy your setup sooner rather than later.

NoiseColor@lemmy.world · 2 months ago

For what stuff do you want to use them? I don’t think they come remotely close to today’s commercial models. Maybe for a specific purpose?

ntn888@lemmy.ml · 2 months ago

hey, thanks for your response… yeah that’s what I meant, the 2b models aren’t usable in their current state, but they would be more practical for everyday use if they worked out…

I actually meant the 31b models are useful for my purpose. I don’t do full-on agentic coding, just interactive chat/prompting. For example, I make good use of them for writing linux shell scripts (as I don’t know how to myself). Currently I use qwen3.5-flash via cloud. It’s as good as the frontier models back then, if not better…

SuspciousCarrot78@lemmy.world · edit-2 2 months ago

deleted by creator

NoiseColor@lemmy.world · 2 months ago

I wanted to use smaller models, but then do more work on the “thinking” process. I didn’t get far, because it gets so slow on normal hardware and too expensive on dedicated hardware. It was time consuming (I’m also not a programmer) but a fun project; in the end I just decided to satisfy the privacy angle with Proton’s AI, Lumo.

inari@piefed.zip · 2 months ago

Proton has AI? Damn, that’s gotta be bleeding their coffers

SuspciousCarrot78@lemmy.world · edit-2 2 months ago

deleted by creator

NoiseColor@lemmy.world · 2 months ago

They have been working on this. Only 3 months ago it was pretty terrible. Today it’s almost on par with ChatGPT. A bit worse on RAG, slower… but good enough for normal use.

☂️-@lemmy.ml · 2 months ago

is it just me, or are the smaller models that fit in my vram very dumb?

Samskara@sh.itjust.works · 2 months ago

Do you have 24 GB?

☂️-@lemmy.ml · 2 months ago

not of vram

Samskara@sh.itjust.works · 2 months ago

That’s your issue.

☂️-@lemmy.ml · edit-2 2 months ago

SuspciousCarrot78@lemmy.world · edit-2 2 months ago

deleted by creator

☂️-@lemmy.ml · 2 months ago

what are some other ways to make it better beyond just adding a search tool? is 16gb vram sufficient for usable results?

where do you think is the best place to start down this rabbit hole?

SuspciousCarrot78@lemmy.world · edit-2 2 months ago

deleted by creator

☂️-@lemmy.ml · 2 months ago

commenting so i can come back to this later

ntn888@lemmy.ml · 2 months ago

I didn’t try any 7b ones lately; they may be a better fit for 16gb, I think. I was able to try the 2b ones as I mentioned (on cpu), and they are subpar. Like I mentioned, the usable ones were 31b, and I think you need at least 24gb vram for most models, though. Maybe someone else can suggest something better.
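
For a rough sanity check on the sizes mentioned in this thread, here is a back-of-the-envelope sketch. It assumes 4-bit quantized weights and a flat 2 GB allowance for context cache and runtime; both numbers are illustrative assumptions, not measurements from anyone's setup, and real usage varies with context length and quantization format.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in the given VRAM.

    overhead_gb is a hypothetical allowance for KV cache and runtime buffers.
    """
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # Model sizes and VRAM sizes taken from the thread; 4-bit is an assumption.
    for params in (7, 31, 35):
        for vram in (16, 24):
            size = weights_gb(params, 4)
            verdict = "fits" if fits_in_vram(params, 4, vram) else "does not fit"
            print(f"{params}b at 4-bit: ~{size:.1f} GB of weights, {verdict} in {vram} GB")
```

Under these assumptions a 7b model fits easily in 16 GB, while the 31b and 35b sizes only fit in 24 GB, which lines up with the experience described above.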

☂️-@lemmy.ml · 2 months ago

i’m assuming swapping to ram is unusably slow. yeah, bummer.

fozid@feddit.uk · 2 months ago

For me, anything less than gpt oss 20b (a2b) is just for messing around with or for basic categorisation and basic text or data processing with highly structured prompts.

inconel@lemmy.ca · 2 months ago

For small models, the Bonsai series seems to be getting the spotlight. They’re natively trained at 1-bit and ternary 1.58-bit, and the 8B runs in ~1GB of memory. I’m curious about local models but haven’t tried them because I lack a gaming rig, but it seems they work well enough on a regular PC.
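
The ~1GB figure follows from simple arithmetic on bits per weight. The sketch below is only an estimate of weight storage for a generic 8B-parameter model at a few bit widths; it says nothing about how any particular model actually packs its weights, and it ignores context cache and runtime overhead. The "1.58-bit" label comes from ternary weights, since three possible values carry log2(3) bits of information.

```python
import math

# A ternary weight takes one of three values, so it carries log2(3) bits.
TERNARY_BITS = math.log2(3)


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Bit widths chosen for illustration; fp16 and 4-bit are common baselines.
    for label, bits in [("fp16", 16), ("4-bit", 4),
                        ("ternary", TERNARY_BITS), ("1-bit", 1)]:
        print(f"8B at {label}: ~{weights_gb(8, bits):.2f} GB")
```

So an 8B model at 1 bit per weight needs about 1 GB for its weights, versus about 16 GB at fp16.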
