Article: https://proton.me/blog/deepseek
Calls it “Deepsneak”, failing to make it clear that the reason people love Deepseek is that you can download it and run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs.
I can’t speak for Proton, but the last couple weeks are showing some very clear biases coming out.
Pretty rich coming from Proton, who shoved an LLM into their mail client mere months ago.
wait, what? How did I miss that? I use protonmail, and I didn’t see anything about an LLM in the mail client. Nor have I noticed it when I check my mail. Where/how do I find and disable that shit?
Thank you. I’ve saved the link and will be disabling it next time I log in. Can’t fucking escape this AI/LLM bullshit anywhere.
The combination of AI, crypto wallet and CEO’s pro-MAGA comments (all within six months or so!) is why I quit Proton. They’ve completely lost the plot. I just want a reliable email service and file storage.
I’m considering leaving proton too. The two things I really care about are simplelogin and the VPN with port forwarding. As far as I understand it, proton is about the last VPN option you can trust with port forwarding
Once all that crap came out, I felt incredibly justified by never having switched to Proton.
It was entirely out of laziness, but still
I know, I was on a big anti-google crusade and Proton seemed like an easy plug-n-play for a lot of the same services. That’s OK, I’m not really an “all your eggs in one basket” kind of person anyway.
Crypto and AI focus was a weird step before all this came out. But now we know Andy is pro republican… completes a very unappealing picture. We should have a database tho, plenty of c level execs and investor groups do far worse and get no scrutiny simply because they don’t post about it on the internet.
they were also caught praising a nazi party so that’s that too
DeepSeek is open source, meaning you can modify code on your own app to create an independent, and more secure, version. This has led some to hope that a more privacy-friendly version of DeepSeek could be developed. However, using DeepSeek in its current form (as it exists today, hosted in China) comes with serious risks for anyone concerned about their most sensitive, private information.
Any model trained or operated on DeepSeek’s servers is still subject to Chinese data laws, meaning that the Chinese government can demand access at any time.
What??? Whoever wrote this sounds like he has 0 understanding of how it works. There is no “more privacy-friendly version” that could be developed, the models are already out and you can run the entire model 100% locally. That’s as privacy-friendly as it gets.
“Any model trained or operated on DeepSeek’s servers are still subject to Chinese data laws”
Operated, yes. Trained, no. The model is MIT licensed, China has nothing on you when you run it yourself. I expect better from a company whose whole business is on privacy.
To be fair, most people can’t actually self-host Deepseek, but there already are other providers offering API access to it.
There are plenty of step-by-step guides to run Deepseek locally. Hell, someone even had it running on a Raspberry Pi. It seems to be much more efficient than other current alternatives.
That’s about as openly available to self host as you can get without a 1-button installer.
You can run an imitation of the DeepSeek R1 model, but not the actual one unless you literally buy a dozen of whatever NVIDIA’s top GPU is at the moment.
A server grade CPU with a lot of RAM and memory bandwidth would work reasonably well, and cost “only” ~$10k rather than 100k+…
I saw posts about people running it well enough for testing purposes on an NVMe.
Can you link that post?
That’s cool! I’m really interested to know how many tokens per second you can get with a really good U.2. My gut is that it won’t actually be better than the 24GB VRAM + 96GB RAM cache setup this user already tested with though.
Thanks!
Running R1 locally isn’t realistic. But you can rent a server and run it privately on someone else’s computer. It costs about 10 per hour to run. You can run it on CPU for a little less. You need about 2TB of RAM.
If you want to run it at home, even quantized in 4 bit, you need 20 4090s. And since you can only have 4 per computer for normal desktop mainboards, that’s 5 whole extra computers too, and you need to figure out networking between them. A more realistic setup is probably running it on CPU, with some layers offloaded to 4 GPUs. In that case you’ll need 4 4090s and 512GB of system RAM. Absolutely not cheap or what most people have, but technically still within the top top top end of what you might have on your home computer. And remember this is still the dumb 4 bit configuration.
Edit: I double-checked and 512GB of RAM is unrealistic. In fact anything higher than 192 is unrealistic. (High-end) AM5 mainboards support up to 256GB, but 64GB RAM sticks are much more expensive than 48GB ones. Most people will probably opt for 48GB or lower sticks. You need a Threadripper to be able to use 512GB. Very unlikely for your home computer, but maybe it makes sense with something else you do professionally. In which case you might also have 8 RAM slots. And such a person might then think it’s reasonable to spend 3000 Euro on RAM. If you spent 15K Euro on your home computer, you might be able to run a reduced version of R1 very slowly.
You don’t need that much ram to run this
How much do you need? Show your maths. I looked it up online for my post, and the website said 1747GB, which is completely in-line with other models.
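For anyone who wants to check the numbers in this subthread, here is a rough back-of-the-envelope sketch in Python. It assumes the published total of 671 billion parameters for R1 and counts the weights only, ignoring KV cache, activations and runtime overhead, so real requirements will be higher than what it prints. The function names are made up for illustration.

```python
import math

def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8

def gpus_needed(total_gb: float, vram_per_gpu_gb: float) -> int:
    """Smallest number of identical GPUs whose combined VRAM holds total_gb."""
    return math.ceil(total_gb / vram_per_gpu_gb)

R1_PARAMS_BILLION = 671   # assumption: published total parameter count of DeepSeek-R1
RTX_4090_VRAM_GB = 24     # VRAM of one RTX 4090

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(R1_PARAMS_BILLION, bits)
    count = gpus_needed(gb, RTX_4090_VRAM_GB)
    print(f"{label}: {gb:.1f} GB of weights, at least {count} x 4090")
```

The weights-only totals come out below both the 1747GB figure and the 20-GPU figure quoted above, which is what you would expect if those estimates also budget for context and runtime overhead.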
What??? Whoever wrote this sounds like he has 0 understanding of how it works. There is no “more privacy-friendly version” that could be developed, the models are already out and you can run the entire model 100% locally. That’s as privacy-friendly as it gets.
Unfortunately it is you who has 0 understanding of it. Read my comment below. Tldr: good luck having the hardware
I understand it well. It’s still relevant to mention that you can run the distilled models on consumer hardware if you really care about privacy. 8GB+ VRAM isn’t crazy, especially if you have a ton of unified memory on macbooks or some Windows laptops releasing this year that have 64+GB unified memory. There are also websites re-hosting various versions of Deepseek like Huggingface hosting the 32B model which is good enough for most people.
Instead, the article is written like there is literally no way to use Deepseek privately, which is literally wrong.
So I’ve been interested in running one locally but honestly I’m pretty confused what model I should be using. I have a laptop with a 3070 mobile in it. What model should I be going after?
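A rough way to answer the "which model fits my GPU" question is to compare the quantized weight size against available VRAM. The sketch below is only an estimate: it assumes the distilled R1 sizes of 1.5B, 7B, 8B, 14B, 32B and 70B parameters, 4-bit quantization, and a 20% headroom factor that is a guess rather than a measured value. The function name is hypothetical.

```python
from typing import Optional

# Assumption: parameter counts (in billions) of the distilled R1 models
DISTILL_SIZES_BILLION = [1.5, 7, 8, 14, 32, 70]

def largest_fitting(vram_gb: float, bits: float = 4, headroom: float = 1.2) -> Optional[float]:
    """Return the largest distilled size (in billions) whose estimated footprint fits in VRAM.

    Footprint is estimated as weights-only memory times a headroom factor.
    The headroom value is a guess, not a measurement.
    """
    fitting = [
        size for size in DISTILL_SIZES_BILLION
        if size * bits / 8 * headroom <= vram_gb
    ]
    return max(fitting) if fitting else None

# An 8 GB card, such as a laptop 3070, under these assumptions
print(largest_fitting(8))
```

Under these assumptions an 8GB card lands on the 8B distill; larger ones would need layers offloaded to system RAM, which is slower.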
as I said in my original comment, it’s not only VRAM that matters.
I honestly doubt that even gaming laptops can run these models at a usable speed, but even if we add up the people who have such a laptop and those who have a PC powerful enough to run these models, they are a tiny fraction of those who use the internet in the world. It is basically not available to those who want to use it. It is available to some of them, but not nearly all who may want it.
Is it Open Source? I cannot find the source code. The official repository https://github.com/deepseek-ai/DeepSeek-R1 only contains images, a PDF file, and links to download the model. But I don’t see any code. What exactly is Open Source here?
I don’t see the source either. Fair cop.
Thanks for confirmation. I made a top level comment too, because this important information gets lost in the comment hierarchy here.
Open source is in general the wrong term for all of these “open source” LLMs (like LLAMA and R1). The model is shared, but there is no real way of reproducing it, because the training data is never shared.
In my mind open source means that you can reproduce the same binary from source. The models are shared for free, but not “open”.
There are already other providers like Deepinfra offering DeepSeek. So while the average person (like me) couldn’t run it themselves, they do have alternative options.
which probably also collects and keeps everything you say in the chat. just look in ublock origin’s expanded view to see their approach to privacy, by having a look at all the shit they are pushing to your browser
Obviously you need lots of GPUs to run large deep learning models. I don’t see how that’s a fault of the developers and researchers, it’s just a fact of this technology.
and that is not what I was complaining about
Downvotes be damned, you are right to call out the parent. They clearly don’t articulate their point in a way that confirms they actually understand what is going on, and how an open source model can still have privacy implications if the masses use the company’s hosted version.
People got flak for saying Proton is the CIA, Proton is NSA, Proton is a joint five-eyes country intelligence operation despite the convenient timing of their formation and lots of other things.
Maybe they’re not, maybe their CEO is just acting this way.
But consider for a moment if they were. IF they were then all of this would make more sense. The CIA/NSA/etc have a vested interest in discrediting and attacking Chinese technology they have no ability to spy or gather data through. The CIA/NSA could also for example see a point to throwing in publicly with Trump as part of a larger agreed upon push with the tech companies towards reactionary politics, towards what many call fascism or fascism-ish.
My mind is not made up. It’s kind of unknowable. I think they’re suspicious enough to be wary of trusting them but there’s no smoking gun, yet there wasn’t a smoking gun that CryptoAG was a CIA cut-out until some unauthorized leaks nearly a half century after they gained control and use of it. We know they have an interest in subverting encryption, in going fishing among “interesting” targets who might seek to use privacy-conscious services and among dissidents outside the west they may wish to vet and recruit.
True privacy advocates should not be throwing in with the agenda of any regime or bloc, especially those who so trample human and privacy rights as that of the US and co. They should be roundly suspicious of all power.
In other words, honeypot. And a US plant in Switzerland…
OpenAI, Google, and Meta, for example, can push back against most excessive government demands.
Sure they “can” but do they?
Why do that when you can just score a deal with the government to give them whatever information they want for sweet perks like foreign competitors getting banned?
“Pushing back against the government” doesn’t even make sense. These people are oligarchs. They largely are the government. Who attended Trump’s inauguration? Who hosted Trump’s inauguration party? These US tech oligarchs.
They cannot. When big daddy FBI knocks on the door and you get that forced NDA, you will build in backdoors and comply with anything the US government tells you.
Even then the US might want you to shut down because they want to control your company.
TikTok.
It’s simple:
bad.
Well you just made me choke on my laughter. Well done, well done.
🤣
Since ditching Proton for Tuta and Mailbox…I haven’t missed anything and I’m saving money.
I got a proton vpn subscription a while ago and they upgraded me to unlimited for the same price. So I think I’m paying like $6.25/month for an unlimited plan. I feel like it’s too good to leave. If I do tuta’s plan that’s $3, then another $4 for simplelogin, and $5 for mullvad. So that’s $12 a month if I leave my plan.
I get it, and please, you do you. There’s no issue.
I’d just add that I can save money using Amazon, but I try to avoid it when I can. I’ll pay a little extra when I can, for the greater good.
You have two email addresses in both Tuta and Mailbox? Any particular reason for that, that you could share with us? 🙏
I have two domains, one in each of Tuta and Mailbox. It was originally so I could try both out, but now I figure it doesn’t hurt to keep 'em separated. I’m still new to non-proton so I am sort of still feeling things out.
Nothing really too interesting or tricky about it, just bred out of curiosity.
Ah I see. So now to the possibly tough question, if you had to choose only one, or recommend only one of them to someone who wants to make a minimal amount of new email addresses, which one would you recommend over the other? 😅 Or maybe a third option?
I think I’d need some more time to really answer, but at the outset, I find Mailbox.org’s interface more intuitive, with more settings, and it generally feels cleaner and more streamlined. Creating aliases and domain aliases in mailbox seems more proton-like in its simplicity.
Tuta I think is more private and secure, but bits of their interface and app need polish. One reason I think Tuta is more secure despite them both touting security and privacy is that Mailbox search works immediately, whereas Tuta requires you to agree to a permission and states it stores everything locally to you so it may take up space. I think Tuta isn’t doing any server-side indexing of any kind? Unsure.
edit: Mailbox doesn’t have a native app, and Tuta has a native app but I think it’s largely a webview. Notifications work OK but you’ll click on a notification and then have to wait for the app to actually connect and resync before you can view it.
Just following up on this, I stumbled on this: https://tuta.com/mailbox
I think it might help. I could definitely see that depending on your use case, Mailbox may be a better choice. I think for general privacy they’re both good, with Tuta having a few “a step above” offerings security-wise but maybe not necessary for most users.
To be fair, it’s correct, but it’s poor writing to skip the self-hosted component. These articles target the company, not the model.
Goddammit I had such high hopes for Proton. Was planning on that being my post-Google main. Now what. 💀
Anything European-based to recommend? I’d like something as far-removed from America as possible, respecting GDPR, privacy, etc., but with a good-sized free-tier storage. I don’t think I need more than a couple GB for email. Calendar included would be a big plus as well. 😅 Probably asking for a lot here…
I use Infomaniak Mail or ikmail for short. They give you 20GB free, have a whole suite (calendar and others), and are Swiss based. It can also link to other mail clients under the free tier. Only hurdle is using a VPN or proxy for initial sign up, but that can be turned off for daily usage.
Tutanota is gdpr but only 1GB free storage. They do offer calendar for free as well with open sourced apps.
Thanks! I saw Tuta from the previous comment and thought 1 GB is a bit on the small side, kind of like Proton. But not too expensive to go up a tier either. 👍
I found this while searching on my own. Might help someone else. 🤷‍♂️
honestly probably worth paying for something if it means enough to you
I’ve been happy with Fastmail for 10 years, though they’re Australian and not European. Might look into a European alternative at some point but so far I’ve had no reason to switch.
Jesus fuckin Christ, just marry Trump at this point, Mister proton CEO.
I want to preface this question by saying that I’m not trolling and I’m not defending Proton. I’m genuinely confused at the reaction to this article.
I’m also upset with Proton’s recent comments, specifically the December tweet and subsequent responses, and I’m evaluating my use of Proton.
Near as I can tell, this article (which I did read) lays out the facts about Deepseek as an LLM originating in China and the implications of that.
Why is this article a reason to pile on proton?
Proton had a reputation for being the good guy. In the span of a month, we saw them bend the knee, flip flop and throw shade at competition; all while pretending to be the hero. We essentially have to trust them with our data and they are showing signs that they are willing to act against that trust with worrisome agendas and biases. It’s not a good look, and since this marketing targets users’ key issues, it’s going to cause some responses.
That’s fair. I suppose people will have their pitchforks and will pile on anything at this point
Calls it “Deepsneak”, failing to make it clear that the reason people love Deepseek is that you can download it and run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs. I can’t speak for Proton, but the last couple weeks are showing some very clear biases coming out.
The reason a very small subset of users love it*
All the downloads making it the top app in the app stores are from people using their centralized service. The people behind these downloads have no clue that you can run it locally or can even start to understand what that would even mean. It is this usage the article is addressing.
Like the thread starter, I am also confused as to why this in particular draws so much hate.
I remember mullvad has fewer servers than proton and I hear they get blacklisted often. Have you encountered anything like this?
I’ve been “blocked by network security” on reddit. Switching to the next server resolves the issue.
Same here with proton
I think maybe Multihop is the Mullvad equivalent?
this is obviously talking about their web app, which most people will be using. In this particular instance, it was clearly not the LLM itself censoring Tiananmen Square, but a layer on top.
i have not bothered downloading and asking deepseek about Tiananmen Square. so i cannot know what the model would have generated. however, it is possible that certain biases are trained into any model.
i am pretty sure this blog is aimed at the average user. while i wouldn’t trust any LLM company with my data, i certainly wouldn’t want the chinese government to have it. anyone who knows how to use ollama (https://github.com/ollama/ollama) should know these telemetry concerns don’t apply to running locally. but for sure, pointing it out in the blog would help.
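For readers who have not tried it, talking to a locally running ollama server looks roughly like this. It is a minimal sketch, assuming ollama is installed and listening on its default local port 11434, and that a distilled model has already been pulled; the tag deepseek-r1:8b and the helper names are assumptions for illustration. The request goes to localhost, so the prompt never leaves your machine.

```python
import json
import urllib.request

# Assumption: ollama's default local endpoint for one-shot generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "deepseek-r1:8b") -> urllib.request.Request:
    """Build a POST request for the local ollama server (model tag is an assumption)."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object instead of a stream
    }).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_local(prompt: str, model: str = "deepseek-r1:8b") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]

# Usage, with the ollama server running locally:
#   print(ask_local("Summarize the MIT license in one sentence."))
```

Building the request separately from sending it makes it easy to inspect exactly what would be transmitted, and to where, before anything is sent.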
How is this Open Source? The official repository https://github.com/deepseek-ai/DeepSeek-R1 contains images only, a PDF file, and links to download the model. I don’t see any code. What exactly is Open Source here? And if so, where to get the source code?
Open source in AI is usually posted to HuggingFace instead of GitHub: https://huggingface.co/deepseek-ai/DeepSeek-R1
In deep learning, open source generally doesn’t include actual training or inference code. Rather it means they publish the model weights and parameters (necessary to run it locally/on your own hardware) and publish academic papers explaining how the model was trained. I’m sure Stallman disagrees, but from the standpoint of deep learning research DeepSeek definitely qualifies as an “open source model”.
Just because they call it Open Source does not make it so. DeepSeek is not Open Source; it only provides model weights and parameters, not any source code or training data. I still don’t know what’s in the model, and we only get “binary” data, not any source code. This is not Libre software.
There is a nice (even if by now already a bit outdated) analysis about the openness of different “open source” generative AI projects in the following article: Liesenfeld, Andreas, and Mark Dingemanse. “Rethinking open source generative AI: open washing and the EU AI Act.” The 2024 ACM Conference on Fairness, Accountability, and Transparency. 2024.
So “Open Source” to AI is just releasing a .psd file used to export a jpeg, and you need some other proprietary software like Photoshop in order to use it.
What other proprietary software is necessary to use model weights?
There are many llms you can use offline
Including DeepSeek: https://huggingface.co/deepseek-ai