32 GB of VRAM for less than $1k sounds like a steal these days, and I’m sure it’s not getting cheaper any time soon.
Does anyone here use this GPU? Or any recent Arc Pros? I basically want someone to talk me out of driving to the nearest place that has it in stock and getting $1k poorer.


I find llama.cpp with Vulkan EXTREMELY reliable. I can have it running for days at a time without a problem. As for tokens/sec, that’s a complicated question because it depends on the model, quant, speculative decoding, KV-cache quantization, context length, and how layers are distributed across cards. Generally:
These are typical model speeds at deep context for agentic use; simple chats will be faster.
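For anyone who hasn’t tried it, here’s roughly how those knobs map onto the llama.cpp CLI. This is a sketch, not a recipe: the model path and all values are placeholders, and exact flag syntax can vary between llama.cpp versions, so check `llama-server --help` for your build.

```shell
# One-time setup: build llama.cpp with the Vulkan backend
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Launch the server. Flags shown correspond to the knobs above:
#   -m    model file (the quant is baked into the GGUF you pick)
#   -ngl  number of layers to offload to the GPU (99 = everything)
#   -c    context length
#   -ctk / -ctv  quantize the KV cache to save VRAM
#         (some builds require flash attention, -fa, for this)
./build/bin/llama-server \
  -m models/my-model-q4_k_m.gguf \
  -ngl 99 \
  -c 32768 \
  -ctk q8_0 -ctv q8_0 \
  --port 8080
```

With multiple cards, llama.cpp splits layers across them by default; `--tensor-split` lets you control the ratio if one card is smaller.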
that’s sick, thanks for sharing