• SabinStargem@lemmy.today
    5 hours ago

    You can use something like KoboldCPP on Linux, which lets you split a model between RAM and VRAM and run it with both combined. O’course it’s not as fast as pure VRAM or the Mac unified-memory approach, but it is an option. I run models using my 128 GB of RAM plus some GPUs.
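
    A minimal sketch of what that looks like (the model filename and layer count here are just placeholders — pick a layer count that fits your VRAM, and the rest of the model spills over into system RAM):

    ```shell
    # Offload 20 layers to the GPU via CUDA; remaining layers run from RAM on the CPU.
    # --gpulayers controls the RAM/VRAM split; --contextsize sets the context window.
    python koboldcpp.py \
      --model my-model.gguf \
      --usecublas \
      --gpulayers 20 \
      --contextsize 8192
    ```

    If it runs out of VRAM, lower `--gpulayers`; more layers on the GPU means faster generation, fewer means more of the model lives in RAM.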