Article: https://proton.me/blog/deepseek
Calls it “Deepsneak”, failing to make it clear that the reason people love Deepseek is that you can download and it run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs.
I can’t speak for Proton, but the last couple weeks are showing some very clear biases coming out.
To be fair, most people can’t actually self-host Deepseek, but there already are other providers offering API access to it.
There are plenty of step-by-step guides to run Deepseek locally. Hell, someone even had it running on a Raspberry Pi. It seems to be much more efficient than other current alternatives.
That’s about as openly available to self host as you can get without a 1-button installer.
You can run an imitation of the DeepSeek R1 model, but not the actual one unless you literally buy a dozen of whatever NVIDIA’s top GPU is at the moment.
A server grade CPU with a lot of RAM and memory bandwidth would work reasonable well, and cost “only” ~$10k rather than 100k+…
I saw posts about people running it well enough for testing purposes on an NVMe.
Can you link that post?
https://old.reddit.com/r/LocalLLaMA/comments/1idseqb/deepseek_r1_671b_over_2_toksec_without_gpu_on/
Thanks!
That’s cool! I’m really interested to know how many tokens per second you can get with a really good U.2. My gut is that it won’t actually be better than the 24VRAM+96RAM cache setup this user already tested with though.
Running R1 locally isn’t realistic. But you can rent a server and run it privately on someone else’s computer. It costs about 10 per hour to run. You can run it on CPU for a little less. You need about 2TB of RAM.
If you want to run it at home, even quantized in 4 bit, you need 20 4090s. And since you can only have 4 per computer for normal desktop mainboards, that’s 5 whole extra computers too, and you need to figure out networking between them. A more realistic setup is probably running it on CPU, with some layers offloaded to 4 GPUs. In that case you’ll need 4 4090s and 512GB of system RAM. Absolutely not cheap or what most people have, but technically still within the top top top end of what you might have on your home computer. And remember this is still the dumb 4 bit configuration.
Edit: I double-checked and 512GB of RAM is unrealistic. In fact anything higher than 192 is unrealistic. (High-end) AM5 mainboards support up to 256GB, but 64GB RAM sticks are much more expensive than 48GB ones. Most people will probably opt for 48GB or lower sticks. You need a Threadripper to be able to use 512GB. Very unlikely for your home computer, but maybe it makes sense with something else you do professionally. In which case you might also have 8 RAM slots. And such a person might then think it’s reasonable to spend 3000 Euro on RAM. If you spent 15K Euro on your home computer, you might be able to run a reduced version of R1 very slowly.
You don’t need that much ram to run this
How much do you need? Show your maths. I looked it up online for my post, and the website said 1747GB, which is completely in-line with other models.
https://apxml.com/posts/gpu-requirements-deepseek-r1