I've realized I need to upgrade my little NUC to something bigger for faster inference with bigger llama models. I want something you can still have on your living room's TV bench, so no monster rack please, but that also has the necessary muscle when needed for llama. Budget doesn't matter right now, I just want to understand what's good and what's out there. Thanks
EDIT: Wow, thanks for the inspiration, guess I need to read up a bit on "how to stuff a huge graphics card into a mini box". To clarify a bit more what I want to do with it: I want to build a responsive personal assistant. I am dreaming of models bigger than 8B, good tool calling for things like memory, web search etc., no coding, no image generation, no video generation required. Image recognition would be good but not a must. Regarding footprint: again, no monster ;) Something you can have in your living room, and that could be wife approved - so no big gaming rig with exhaust pipes and stuff, it needs to be good looking ;)


Thanks, I will also ask in the other group you mentioned. I still have a gaming rig here with an RX 6900 XT as well, but it's way too big to get it wife approved into the living room, and I have no man cave to run it 24/7. ;) But maybe it's good for testing what I actually need in terms of model size. I think it's just one generation before all the AI hype took off, but I'm going to try it right away.
It's pretty trivial to make use of an LLM compute box remotely; in fact, most of the software out there is designed around doing this, since lots of people use cloud-based LLM compute machines. I use my Framework Desktop this way — I leave it headless, purely as an LLM compute node for whatever machine is running software that needs number-crunching done. So if your gaming machine is fine for you in terms of compute capability, you might just use it remotely from a smaller machine sitting in the living room.
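To illustrate how little the remote setup takes: llama.cpp's `llama-server` (and most other local-inference servers) expose an OpenAI-compatible HTTP endpoint, so the living-room machine just makes ordinary HTTP requests to the compute box. A minimal sketch — the hostname, port, and `SERVER_URL` are assumptions for your LAN, not anything prescribed:

```python
import json
import urllib.request

# Hypothetical address of the headless compute box on your LAN;
# llama-server's OpenAI-compatible chat endpoint lives at /v1/chat/completions.
SERVER_URL = "http://compute-box.local:8080/v1/chat/completions"

def build_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completion request aimed at the remote box."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Hello from the living room")
# urllib.request.urlopen(req) would actually send it; skipped here
# since the hostname above is just a placeholder.
```

Any OpenAI-style client library can be pointed at the same URL, which is why so much assistant software works unchanged against a box in another room versus a cloud API.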
Another benefit of sticking the compute box elsewhere is noise: while my Framework Desktop is very quiet (single large fan, about 120W TDP, and notably quieter than other AI Max-based systems), keeping my 7900 XTX loaded will spin up the fans. You may not want a heavy-duty number-crunching machine in the living room from a noise standpoint.