Article: https://proton.me/blog/deepseek

Calls it “Deepsneak”, failing to make it clear that the reason people love DeepSeek is that you can download it and run it securely on any of your own private devices or servers, unlike most of the competing SOTA AIs.

I can’t speak for Proton, but the last couple of weeks have shown some very clear biases.

  • @simple@lemm.ee

    DeepSeek is open source, meaning you can modify code on your own app to create an independent — and more secure — version. This has led some to hope that a more privacy-friendly version of DeepSeek could be developed. However, using DeepSeek in its current form — as it exists today, hosted in China — comes with serious risks for anyone concerned about their most sensitive, private information.

    Any model trained or operated on DeepSeek’s servers is still subject to Chinese data laws, meaning that the Chinese government can demand access at any time.

    What??? Whoever wrote this sounds like he has 0 understanding of how it works. There is no “more privacy-friendly version” that could be developed; the models are already out, and you can run the entire model 100% locally. That’s as privacy-friendly as it gets.

    “Any model trained or operated on DeepSeek’s servers is still subject to Chinese data laws”

    Operated, yes. Trained, no. The model is MIT-licensed; China has nothing on you when you run it yourself. I expect better from a company whose whole business is built on privacy.

    • @lily33@lemm.ee

      To be fair, most people can’t actually self-host Deepseek, but there already are other providers offering API access to it.

      • @halcyoncmdr@lemmy.world

        There are plenty of step-by-step guides to run Deepseek locally. Hell, someone even had it running on a Raspberry Pi. It seems to be much more efficient than other current alternatives.

        That’s about as openly available to self-host as you can get without a one-button installer.

        • @tekato@lemmy.world

          You can run an imitation of the DeepSeek R1 model, but not the actual one unless you literally buy a dozen of whatever NVIDIA’s top GPU is at the moment.

        • @Aria@lemmygrad.ml

          Running R1 locally isn’t realistic. But you can rent a server and run it privately on someone else’s computer. It costs about 10 per hour to run. You can run it on CPU for a little less. You need about 2TB of RAM.

          If you want to run it at home, even quantized to 4-bit, you need 20 4090s. And since normal desktop mainboards only take 4 GPUs per computer, that’s 5 whole computers, and you need to figure out networking between them. A more realistic setup is probably running it on CPU, with some layers offloaded to 4 GPUs. In that case you’ll need 4 4090s and 512GB of system RAM. Absolutely not cheap or what most people have, but technically still within the very top end of what you might have on a home computer. And remember this is still the dumb 4-bit configuration (rough arithmetic below).

          Edit: I double-checked, and 512GB of RAM is unrealistic. In fact anything higher than 192GB is unrealistic: (high-end) AM5 mainboards support up to 256GB, but 64GB RAM sticks are much more expensive than 48GB ones, so most people will opt for 48GB or smaller sticks. You need a Threadripper to be able to use 512GB. Very unlikely for your home computer, but maybe it makes sense with something else you do professionally, in which case you might also have 8 RAM slots. And such a person might then think it’s reasonable to spend 3000 Euro on RAM. If you spent 15K Euro on your home computer, you might be able to run a reduced version of R1, very slowly.
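
          Hedged back-of-the-envelope arithmetic behind those figures, assuming R1’s published total of roughly 671B parameters; exact numbers depend on the quantization format and runtime overhead (KV cache, activations), but it shows roughly where the “2TB of RAM” and “20 4090s” estimates come from:

          ```python
          # Rough memory estimate for holding DeepSeek R1's weights (assumption: ~671B params).
          # Real deployments need extra headroom for the KV cache and framework buffers.
          PARAMS = 671e9

          def weight_gb(bits_per_param: float) -> float:
              """GB needed just to hold the weights at a given precision."""
              return PARAMS * bits_per_param / 8 / 1e9

          print(f"FP16 : {weight_gb(16):6.0f} GB")   # ~1342 GB -> the ~2TB-of-RAM ballpark
          print(f"FP8  : {weight_gb(8):6.0f} GB")    # ~671 GB
          print(f"4-bit: {weight_gb(4):6.0f} GB")    # ~336 GB
          # ~14 cards just for weights; ~20 once you add KV cache and overhead
          print(f"RTX 4090s (24 GB) at 4-bit: {weight_gb(4) / 24:.0f}")
          ```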

    • @ReversalHatchery@beehaw.org

      “What??? Whoever wrote this sounds like he has 0 understanding of how it works. There is no ‘more privacy-friendly version’ that could be developed; the models are already out, and you can run the entire model 100% locally. That’s as privacy-friendly as it gets.”

      Unfortunately it is you who has 0 understanding of it. Read my comment below. TL;DR: good luck having the hardware.

      • @simple@lemm.ee

        I understand it well. It’s still relevant to mention that you can run the distilled models on consumer hardware if you really care about privacy. 8GB+ of VRAM isn’t crazy, especially with the unified memory on MacBooks or on some Windows laptops releasing this year with 64GB+ of unified memory. There are also websites re-hosting various versions of DeepSeek, like Hugging Face hosting the 32B model, which is good enough for most people (see the sketch below).

        Instead, the article is written as if there were literally no way to use DeepSeek privately, which is wrong.
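
        A minimal sketch of what running one of the distilled models locally can look like, assuming the Hugging Face transformers library and the published 7B distill checkpoint (swap in the 32B one if you have the memory); this is an illustration, not the only way to do it:

        ```python
        # Local inference with a DeepSeek R1 distill: nothing leaves your machine.
        # Requires: pip install transformers torch accelerate
        # The 7B distill wants roughly 16 GB in fp16; use the 1.5B distill or a
        # quantized build if you have less memory.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # published distill checkpoint

        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,  # halves memory vs fp32
            device_map="auto",          # spread across GPU/CPU as available
        )

        messages = [{"role": "user", "content": "Why does local inference protect privacy?"}]
        inputs = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)

        output = model.generate(inputs, max_new_tokens=256)
        print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
        ```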

        • @superglue@lemmy.dbzer0.com

          So I’ve been interested in running one locally, but honestly I’m pretty confused about which model I should be using. I have a laptop with a 3070 mobile in it. What model should I be going after?

        • @ReversalHatchery@beehaw.org

          As I said in my original comment, it’s not only VRAM that matters.

          I honestly doubt that even gaming laptops can run these models at a usable speed. But even if we add up the people who have such a laptop and those who have a PC powerful enough to run these models, they are a tiny fraction of the world’s internet users. It is basically not available to the people who want to use it; it is available to some of them, but not nearly all who may want it.
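
          For a rough sense of why speed is the bottleneck (a hedged approximation, not a benchmark): at batch size 1, every generated token has to stream the model’s weights from memory, so tokens per second is roughly memory bandwidth divided by model size. The bandwidth figures below are assumed, illustrative numbers:

          ```python
          # Crude decode-speed estimate for a dense model at batch size 1:
          # tokens/sec ~ memory bandwidth / bytes of weights read per token.
          # Bandwidth numbers below are rough assumptions, not measurements.

          def tokens_per_sec(params_billion: float, bits: int, bandwidth_gb_s: float) -> float:
              model_gb = params_billion * 1e9 * bits / 8 / 1e9
              return bandwidth_gb_s / model_gb

          print(f"32B, 4-bit, CPU w/ dual-channel DDR5 (~80 GB/s): {tokens_per_sec(32, 4, 80):.1f} tok/s")
          print(f"32B, 4-bit, GPU w/ ~400 GB/s VRAM (if it fits):  {tokens_per_sec(32, 4, 400):.1f} tok/s")
          print(f" 7B, 4-bit, CPU w/ dual-channel DDR5 (~80 GB/s): {tokens_per_sec(7, 4, 80):.1f} tok/s")
          ```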

          • @thingsiplay@beehaw.org

            Thanks for the confirmation. I made a top-level comment too, because this important information gets lost in the comment hierarchy here.

            • @Hotzilla@sopuli.xyz

              “Open source” is in general the wrong term for all of these “open source” LLMs (like LLaMA and R1): the model is shared, but there is no real way of reproducing it, because the training data is never shared.

              In my mind, open source means that you can reproduce the same binary from source. The models are shared for free, but they are not “open”.

      • @lily33@lemm.ee

        There are already other providers, like Deepinfra, offering DeepSeek. So while the average person (like me) couldn’t run it themselves, they do have alternative options.
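
        For illustration, most of these providers expose an OpenAI-compatible API, so using hosted DeepSeek looks roughly like this (the base URL and model name are placeholders to check against the provider’s docs, and of course your prompts go to their servers):

        ```python
        # Hedged sketch: querying DeepSeek R1 through a third-party hosting provider's
        # OpenAI-compatible endpoint. Unlike local inference, the provider sees your prompts.
        from openai import OpenAI  # pip install openai

        client = OpenAI(
            base_url="https://example-provider.com/v1",  # placeholder: use the provider's real endpoint
            api_key="YOUR_API_KEY",                      # placeholder
        )

        response = client.chat.completions.create(
            model="deepseek-ai/DeepSeek-R1",             # model name as listed by the provider
            messages=[{"role": "user", "content": "Summarize the MIT license in one sentence."}],
        )
        print(response.choices[0].message.content)
        ```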

        • @ReversalHatchery@beehaw.org

          Which probably also collects and keeps everything you say in the chat. Just look at uBlock Origin’s expanded view to see their approach to privacy: have a look at all the shit they push to your browser.

      • v_krishna

        Obviously you need lots of GPUs to run large deep learning models. I don’t see how that’s a fault of the developers and researchers; it’s just a fact of this technology.

      • azron

        Downvotes be damned, you are right to call out the parent: they clearly don’t articulate their point in a way that confirms they actually understand what is going on, and how an open-source model can still have privacy implications if the masses use the company’s hosted version.