When I first got into local LLMs nearly 3 years ago, in mid-2023, the frontier closed models were of course impressively capable.

I then tried my hand at running 7B-size local models, primarily one called Zephyr-7B (what happened to these models?? Dolphin, anyone??), on my gaming PC with an 8GB AMD RX 580 GPU. Fair to say it was just a curiosity exercise (in terms of model performance).

Fast forward to this month, and I’m revisiting local LLMs. (Although I no longer have the gaming PC; cost-of-living crisis, anyone? 😫)

And the ~31B-size models now look very sufficient. #Qwen has taken the helm in this class. That’s still quite expensive to set up locally, although within grasp.

I’m rooting for the edge-computing models now: the ~2B-size models. Due to their low footprint, they are practical for many people to run on an SBC 24/7 at home.

But these edge models are in the ‘curiosity category’ for now.
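
To give a concrete picture of what “run on an SBC 24/7” looks like: the usual setup is a llama.cpp-based server such as Ollama on the board, with everything else talking to it over HTTP. A minimal sketch, assuming an Ollama server and an example ~2B model tag (placeholders, not recommendations):

```python
# Query a local Ollama server running a small model on an SBC.
# Assumes `ollama serve` is running and a ~2B model has been pulled,
# e.g. `ollama pull qwen2.5:1.5b` (example tag only).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's default endpoint
    json={
        "model": "qwen2.5:1.5b",  # placeholder ~2B model
        "prompt": "Write a one-line summary of what an SBC is.",
        "stream": False,  # single JSON response instead of a stream
    },
    timeout=120,
)
print(resp.json()["response"])
```

A quantised ~2B model fits comfortably in the RAM of a 4-8GB board, which is what makes the always-on home-server angle plausible.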

  • ZoteTheMighty@lemmy.zip · 14 days ago

    This weekend I had an LLM walk me through setting up some home server stuff and networking. I tried Proton’s Lumo and Qwen 3.6 locally, and I have to say Qwen was the more impressive of the two. When I first tried running models like Llama 4 locally, I remember thinking to myself that this was a dead end and big servers would always have the advantage, but it seems like we’re hitting a turning point where many things can be done locally.

    • ntn888@lemmy.mlOP · 14 days ago

      Cool, what was your hardware, and which Qwen size did you use? Thanks.

        • ericwdhs@discuss.online · 14 days ago

          Ooo… I’m running a 7900 XTX as well. Having 24GB without the Nvidia tax has been super nice for AI stuff. I have a 16GB 6900 XT running in another computer, and a lot of my AI model selection is still sized for it. I may need to stop procrastinating and copy your setup sooner rather than later.

  • NoiseColor @lemmy.world · 15 days ago

    What do you want to use them for? I don’t think they come remotely close to today’s commercial models. Maybe for a specific purpose?

    • ntn888@lemmy.mlOP · 15 days ago

      hey, thanks for your response… yeah, that’s what I meant: the 2B models aren’t usable in their current state, but they’d be more practical for everyday use if they work out…

      I actually meant that the 31B models are useful for my purpose. I don’t do full-on agentic coding, just interactive chat/prompting. For example, I make good use of them for writing Linux shell scripts (as I don’t know how to write them myself). Currently I use qwen3.5-flash via the cloud. It’s as good as the frontier models back then, if not better…
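
      Roughly the same workflow as code, for the curious. This is only a sketch assuming an OpenAI-compatible endpoint (most cloud providers expose one, and local servers like Ollama do too at /v1); base_url, api_key, and the model tag are placeholders, not my actual setup:

      ```python
      # Ask a model for a shell script via an OpenAI-compatible chat API.
      # base_url / api_key / model are placeholders.
      from openai import OpenAI

      client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
      reply = client.chat.completions.create(
          model="qwen2.5:1.5b",  # stand-in tag; I use qwen3.5-flash in the cloud
          messages=[
              {"role": "system", "content": "You write POSIX shell scripts. Output only the script."},
              {"role": "user", "content": "A script that backs up ~/docs into a dated tarball."},
          ],
      )
      print(reply.choices[0].message.content)
      ```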

      • NoiseColor @lemmy.world · 15 days ago

        I wanted to use smaller models, but then do more work on the “thinking” process. I didn’t get far, because it gets so slow on normal hardware and too expensive on dedicated hardware. It was time-consuming (I’m also not a programmer) but a fun project; in the end I just decided to satisfy the privacy angle with Proton’s Lumo.

  • ☂️-@lemmy.ml · 15 days ago

    Is it just me, or are the smaller models that fit in my VRAM very dumb?

  • fozid@feddit.uk · 15 days ago

    For me, anything less than gpt-oss-20b (a2b) is just for messing around with, or for basic categorisation and basic text or data processing with highly structured prompts.
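
    For what it’s worth, “highly structured” here means pinning the model to a fixed label set and output format. A toy sketch; the label set, model tag, and the local Ollama server are all made up for illustration:

    ```python
    # Toy structured-prompt categorisation against a local Ollama server.
    import requests

    LABELS = ["billing", "bug", "feature-request", "other"]  # example label set
    prompt = (
        f"Classify the ticket into exactly one label from {LABELS}. "
        "Reply with the label only, nothing else.\n\n"
        "Ticket: The invoice total doesn't match my order."
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "gpt-oss:20b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    label = resp.json()["response"].strip()
    print(label if label in LABELS else "other")  # fall back if it rambles
    ```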

  • inconel@lemmy.ca · 15 days ago

    For small models, the Bonsai series seems to be getting the spotlight. Natively trained at 1-bit and ternary 1.58-bit, the 8B model runs in ~1GB of memory. I’m curious about local models but haven’t tried them for lack of a gaming rig, though they seem to work well enough on a regular PC.
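
    The memory figure checks out as rough arithmetic for the weights alone (my own back-of-envelope, not from any model card):

    ```python
    # Weight memory for an 8B-parameter model at various precisions
    # (weights only; KV cache and activations are extra).
    params = 8e9
    for name, bits in [("fp16", 16), ("int4", 4), ("ternary", 1.58), ("1-bit", 1)]:
        print(f"{name:>8}: {params * bits / 8 / 1e9:.2f} GB")
    # fp16 ~16 GB, int4 ~4 GB, ternary 1.58-bit ~1.6 GB, 1-bit ~1 GB
    ```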