I don’t consider myself very technical. I’ve never taken a computer science course and don’t know Python. I’ve learned some things like Linux, the command line, Docker, and networking/pfSense because I value my privacy. My point is that anyone can do this, even if you aren’t technical.

I tried both LM Studio and Ollama, and I prefer Ollama. You then download models and use them as your own private, personal GPT. I access it on my local machine through the command line, but I also installed Open WebUI in a Docker container so I can reach it from any device on my local network (I don’t expose services to the internet).
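
Roughly what that looks like, in case anyone wants to follow along. The Docker flags below follow Open WebUI’s quick-start docs, so double-check them against the current README; the model name, port, and volume are just my choices:

    # Pull a model and chat with it from the terminal
    ollama pull llama3
    ollama run llama3

    # Run Open WebUI in Docker, pointed at Ollama on the host
    docker run -d -p 3000:8080 \
      --add-host=host.docker.internal:host-gateway \
      -v open-webui:/app/backend/data \
      --name open-webui --restart always \
      ghcr.io/open-webui/open-webui:main

After that, Open WebUI is reachable at http://your-machine:3000 from the local network.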

Having a private AI/GPT is pretty cool. You can download and test new models, and it is private. Yes, there are ethical concerns about how these models were trained, and I’m not minimizing those concerns. But if you want your own AI/GPT assistant, give it a try. I set it up in a couple of hours, and as I said… I’m not even that technical.

  • @HumanPerson@sh.itjust.works · 9 months ago

    Yeah, I like it too. My only issue is Ollama’s lack of Intel support; I have been following issue 1590 on their GitHub. For now I have a 1050 Ti in a cardboard-box PC whose other hardware is 10+ years old, with a mixed set of RAM totalling 12 GB. It also has a 100 Mbit NIC, so I can’t take advantage of my full internet speed when downloading models. The worst part is that they could support Intel but haven’t merged the fix because of an issue with the Windows Intel drivers. Linux is fine, but I can’t have it. I wasn’t planning to rant, but I already typed it, so… enjoy?

    • @chagall@lemmy.world (OP) · 9 months ago

      Yeah, I have an NVIDIA GPU and it is magic. The best part: while you are using Ollama, open a second terminal window, enter the command watch -n 0.5 nvidia-smi, and you can watch your GPU usage go up and down in real time as you ask the GPT questions. Pretty cool.
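
      If you’d rather not watch the whole table redraw, nvidia-smi can also log just the interesting fields on a loop (the field list here is just one possible pick):

          # GPU utilization and memory use, printed once per second
          nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used --format=csv -l 1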

      Hopefully they get the Intel Arc folks up and running soon.

  • @Goodtoknow@lemmy.ca · 9 months ago

    Have you found much practical use for small models yet? I love the idea that even the 1.1B TinyLlama model can run on my phone, but I haven’t found much real-world use for it yet. Llama 3 8B feels better, but not much better; it’s a bit dumb even for emails.

    • @chagall@lemmy.world (OP) · 9 months ago

      I use my phone all the time, but I just use a WireGuard VPN to tunnel into the Open WebUI container at home. Then I can interact with my desktop machine and its NVIDIA GPU. I’m currently testing mistral-nemo. It’s pretty great, but it gets a bit verbose sometimes.
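
      The phone side is just a stock WireGuard client config along these lines (keys, addresses, and the endpoint are placeholders; yours will differ):

          [Interface]
          # The phone’s key and its address inside the tunnel
          PrivateKey = <phone-private-key>
          Address = 10.0.0.2/32

          [Peer]
          # The WireGuard endpoint at home
          PublicKey = <server-public-key>
          Endpoint = my.home.example:51820
          # Route only the home LAN through the tunnel
          AllowedIPs = 192.168.1.0/24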

      • @kureta@lemmy.ml · 9 months ago

        I am also using Open WebUI. Most LLMs are too verbose for me, so I created a model in Open WebUI with the system prompt “Do not repeat the questions. Avoid giving lists as answers. Do not summarize the answer at the end. If asked a follow-up question, respond with only new information, do not repeat previously stated information.” and named it No Nonsense.
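
        If you live in the terminal, roughly the same thing can be done with an Ollama Modelfile (the base model here is just an example):

            # no-nonsense.Modelfile — bake the system prompt into a named model
            FROM llama3
            SYSTEM """Do not repeat the questions. Avoid giving lists as answers. Do not summarize the answer at the end. If asked a follow-up question, respond with only new information, do not repeat previously stated information."""

        Then build and run it with ollama create no-nonsense -f no-nonsense.Modelfile and ollama run no-nonsense.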

        • @chagall@lemmy.world (OP) · 9 months ago

          That’s really smart. I just found out about fabric yesterday, and it is helping me with exactly the kind of thing you describe. Prompt engineering is a huge thing.

    • @coffee_with_cream@sh.itjust.works · 9 months ago

      Imo it’s worthwhile to just run the biggest model available and rent expensive GPU time. It still amounts to very little overall, and you get much better results. Project-dependent, of course.

      • @NotMyOldRedditName@lemmy.world · 9 months ago

        Kinda defeats the purpose of doing it private and local.

        I wouldn’t trust any claims a 3rd party service makes with regards to being private.

      • @31337@sh.itjust.works · 9 months ago

        IDK, it looks like 48 GB cloud pricing would be about $0.35/hr, which comes to roughly $255/month running around the clock ($0.35 × ~730 hr). Used 3090s go for $700, and two 3090s would give you 48 GB of VRAM for $1400 (I’m assuming you can do “model parallel” with Llama; I’ve never tried running an LLM, but it should be possible and work well). So the break-even point would be under 6 months ($1400 ÷ $255/month ≈ 5.5 months). Hmm, but if serverless works well, that could be pretty cheap. It would probably take a few minutes to process and load a ~48 GB model on every cold start, though?

        • ffhein · 9 months ago

          Assuming they already own a PC: if someone buys two 3090s for it, they’ll probably also have to upgrade their PSU, so that might be worth including in the budget. But it’s definitely a relatively low-cost way to get more VRAM; there are people who run three or four RTX 3090s, too.

    • @Swedneck@discuss.tchncs.de · 9 months ago

      You hear that said about AI because companies are desperately throwing more and more resources at it to get 0.3% better results, and people are collectively running an insane number of prompts all the time.

      But on a personal level it’s not really any different from any other computation; people render videos all the time and no one complains about the resource usage from that, because companies aren’t trying to sell bloated video-rendering services to gardening businesses.

  • @chasingtheflow@lemmy.world · 9 months ago

    Very cool! You can use something like Tailscale to access your local services remotely without exposing them to the internet.
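
    Roughly, assuming Tailscale is installed on both the server and your phone (hostname and port here are made up):

        # On the machine running Open WebUI
        sudo tailscale up

        # Then, from any device on your tailnet, browse to the
        # machine's Tailscale hostname; nothing is opened to the
        # public internet, e.g. http://homebox:3000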

    • @Appoxo@lemmy.dbzer0.com · 9 months ago

      “Very technical” vs. not can be very subjective.
      It could be a 50-year-old sysadmin vs. Adam, whom I pulled off the street, or a greybeard Linux admin vs. a beginner sysadmin who is only in it for the career instead of the passion (those can be very non-technical but good problem-solver folks).

      I know my comparison is flawed.