I have a 56 TB local Unraid NAS that is parity protected against single drive failure, and while I think a single drive failing and being parity recovered covers data loss 95% of the time, I’m always concerned about two drives failing or a site-/system-wide disaster that takes out the whole NAS.

For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?

(Sorry if this standard scenario has been discussed - searching didn’t turn up anything.)

  • Shadow@lemmy.ca · 59 points · 11 days ago

    I don’t. Of my 120tb, I only care about the 4tb of personal data and I push that to a cloud backup. The rest can just be downloaded again.

    • NekoKoneko@lemmy.world (OP) · 11 points · 11 days ago

      Do you have logs or software that keeps track of what you’d need to re-download? A big stress for me with that method is knowing what was lost once neither I nor any software can see the filesystem anymore.

      • Sibbo@sopuli.xyz · 24 up, 3 down · 11 days ago

        If you can’t remember what you lost, did you really need it to begin with?

        Unless it’s personal memories of course.

        • NekoKoneko@lemmy.world (OP) · 3 points · 11 days ago

          For me, I have a bad memory. I might remember a childhood movie (a nickname I give to special Linux ISOs) that I hadn’t thought of in 10 years and track down a copy, sometimes digging through obscure sources; that can be hours of one-time inspiration and effort, repeated many times over. Having a complete list is a good helper, but a full backup is of course best.

      • BakedCatboy@lemmy.ml · 11 points · 11 days ago

        My *arrstack DBs are part of my backed up portion, so they’ll remember what I have downloaded in my non-backed up portion.

      • Kurotora@lemmy.world · 9 points · 11 days ago

        In my case, for Linux ISOs, I only need to log in to my usual private trackers and re-download my leeched torrents. For more niche content, like old-school TV shows in my local language, I would rely on the community. For even more niche content, like tankoubons only available at the time on DD services, I have a specific job, but it relies on the same backup provider that I’m using for personal data.

        Also, an important reminder for everyone: encrypt your backups no matter where you store them.

      • i_stole_ur_taco@lemmy.ca · 3 points · 11 days ago

        Set up a job to write the file names of everything in your file system to a text file and make sure that text file gets backed up. I did that on my Unraid server for years in lieu of fully backing up the whole array.
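A job like that can be a single small shell function. This is just a sketch; the paths and function name are made up, and the output folder should itself live inside the portion of the array you actually back up:

```shell
#!/bin/sh
# Sketch of a nightly "manifest" job. Paths are examples, not Unraid defaults.
make_manifest() {
    src="$1"   # array root to index, e.g. /mnt/user
    out="$2"   # directory that is part of your real backup
    mkdir -p "$out"
    # One line per file with size, mtime, and path, so after a total loss you
    # can tell what existed and roughly which version you had.
    find "$src" -type f -printf '%s\t%T@\t%p\n' > "$out/file-manifest.txt"
}
```

Run it from cron; note that `-printf` assumes GNU find, which Linux-based NAS systems like Unraid ship.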

      • ShortN0te@lemmy.ml · 1 point · 11 days ago

        That should be part of the backup configuration: in your backup tool of choice, you select what gets backed up. Then, when you lose your array, you just download that stuff again.

    • givesomefucks@lemmy.world · 2 points · 11 days ago

      I only care about the 4tb of personal data and I push that to a cloud backup

      I have doubles of the data. Some of 'em. That way I know I have a pristine one in backup. Then I can use it, it gets corrupted, I don’t care.

      Actually, I have triples of the W2s. I have triples, right? If I don’t, the other stuff’s not true.

      See, the W2s the one I have triples of. Oh, no, actually, I also have triples of the kids photos, too. But just those two. And your dad and I are the same age, and I’m rich and I have triples of the W2s and the kids photos.

      Triples makes it safe.

      Triples is best.

      https://www.youtube.com/watch?v=8Inf1Yz_fgk

      • NekoKoneko@lemmy.world (OP) · 2 points · 11 days ago

        Bob Odenkirk has never steered us wrong, thanks. I downloaded three copies of this from YouTube in case I forget.

    • BakedCatboy@lemmy.ml · 2 points · 11 days ago

      Same here, ~30TB currently, but my personal-artifacts portion is only about 2TB, which is very affordable with rsync.net. It conveniently has an alert setting for when the backup hasn’t changed by at least X KB in Y days. (I have my Synology set up to spit out daily security reports to meet that threshold, so even if I don’t change anything myself I won’t get bugged.)
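If you use a staleness alert like that but don’t have a daily report to feed it, a tiny heartbeat file works too. This is a hypothetical sketch (function name and path are made up); schedule it nightly from cron:

```shell
#!/bin/sh
# Heartbeat sketch: rewrite a small file daily so a provider-side
# "backup hasn't changed in Y days" alert only fires when backups
# are genuinely stale, not just during a quiet week.
heartbeat() {
    date > "$1"   # e.g. /mnt/user/backups/heartbeat.txt
}
```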

  • dmention7@midwest.social · 26 points · 10 days ago

    Personally I deal with it by prioritizing the data.

    I have about the same total size Unraid NAS as you, but the vast majority is downloaded or ripped media that would be annoying to replace, though not disastrous to lose.

    My personal photos, videos and other documents which are irreplaceable only make up a few TB, which is pretty manageable to maintain true local and cloud backups of.

    Not sure if that helps at all in your situation.

    • Burninator05@lemmy.world · 1 point · 10 days ago

      I have the data that I actually care about in a RAIDZ1 array with a hot standby, and it is synced to the cloud. The rest (the vast majority) is in a RAIDZ5. If I lose it, I “lose” it. It’s recoverable if I decide I want it again.

  • PieMePlenty@lemmy.world · 24 points · 10 days ago

    Not all data is equal. I back up the things I absolutely cannot lose and YOLO everything else. My love for this hobby does not extend to buying racks of hard drives.

    • Zetta@mander.xyz · 3 points · 10 days ago

      Same. My Unraid server is over 40 TB, but I only have ~1.5 TB of critical data: my Immich photos and some files. I have an on-site and an off-site Raspberry Pi, each with a 4 TB NVMe SSD, for nightly backups.

  • Brkdncr@lemmy.world · 11 points · 11 days ago

    Back up to a second NAS.

    Important stuff gets backed up to cloud storage. Whatever is cheapest.

    In my case, Synology C2 cloud was cheapest.

      • Brkdncr@lemmy.world · 1 point · 10 days ago

        It offers some other features, like hybrid access to data: if my NAS isn’t available, I can access it from their cloud. There are also some identity services.

  • worhui@lemmy.world · 10 points · edited · 10 days ago

    LTO tape. But I only have 15 TB.

    It quickly becomes cost-effective when you actually need the data to be safe, and it’s far easier to have off-site backups. I have never had a problem, but I like to have an offline backup. Most of my data is static, so for the most part I am only backing up project files and changes.

    If you have 40+ TB of dynamic data, I can’t help there.

    Edit: I buy used drives that are usually two generations old, so I got LTO-5 drives when LTO-7 was new. The used drives may be less reliable, but they can be 1/10th the price of the newest ones.

  • irmadlad@lemmy.world · 9 points · 11 days ago

    I’m not sure if I qualify as a ‘larger local hoster’, but I would go through your 20 TB and decide what really is important enough to back up in case the wheels fall off. Linux ISOs can be re-downloaded, although it would take a bit of time. The things that can’t be readily re-acquired, such as my music collection that I have been accumulating for decades, converting to FLAC, and meticulously tagging, are my backup priorities. Pictures, business documents, and personal documents can’t be re-downloaded either, so those go on the ‘must back up’ list, and so on. Just cull what is and isn’t replaceable. I would bet that once you do, your 20 TB will be a bit slimmer, and you’re not trying to push all of it up the pipe to a cloud backup.

    I use Backblaze’s Personal unlimited tier for $99 USD per year, which is a pretty sweet deal. One thing to remember about Backblaze is that the drives being backed up must be physically connected to the PC doing the backup/uploading. I get around that because I have a hot-swap bay on my main PC, but there are other methods and software that will masquerade your NAS or other storage as a physically connected drive.

      • irmadlad@lemmy.world · 1 point · 11 days ago

        There are many ways to skin the cat. Here’s just one:

        This Docker container runs the Backblaze personal backup client via WINE, so that you can back up your files with the separation and portability capabilities of Docker on Linux.

        It runs the Backblaze client and starts a virtual X server and a VNC server with Web GUI, so that you can interact with it.

        https://github.com/JonathanTreffler/backblaze-personal-wine-container

        There are also other apps that will ‘fool’, for lack of a better word, Backblaze into thinking a NAS drive is physically connected.

        • WhyJiffie@sh.itjust.works · 2 points · 10 days ago

          Better would be something that can just eat a ZFS send stream, but I guess for an emergency it’s fine. I would still want to encrypt everything somehow, though.

  • ShawiniganHandshake@sh.itjust.works · 6 points · 9 days ago

    For me, I only back up data I can’t replace, which is a small subset of the capacity of my NAS. Personal data like photos, password manager databases, personal documents, etc. get locally encrypted, then synced to a cloud storage provider. I have my encryption keys stored in a location that’s automatically synced to various personal devices and one off-site location maintained by a trusted party. I have the backups and encryption key sync configured to keep n old versions of the files (where the value of n depends on how critical the file is).
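As a concrete sketch of that “encrypt locally, then sync” step: the commenter doesn’t name their tools, so tar and openssl below are stand-ins, the cloud push is left as a placeholder comment, and all paths are hypothetical. Dedicated tools like restic or borg bundle encryption and versioning in one step.

```shell
#!/bin/sh
# Sketch: archive the irreplaceable folder and encrypt it symmetrically
# *before* it leaves the machine, so the cloud provider only ever sees
# ciphertext. Tool choice is illustrative, not a recommendation.
backup_encrypted() {
    src="$1"        # folder of irreplaceable data
    out="$2"        # local staging folder for encrypted archives
    passfile="$3"   # file holding the passphrase (keep copies off-site!)
    stamp=$(date +%F)
    tar -czf - -C "$src" . |
        openssl enc -aes-256-cbc -pbkdf2 -salt \
            -pass "file:$passfile" -out "$out/backup-$stamp.tar.gz.enc"
    # Next step (placeholder): push "$out" to your cloud storage provider.
}
```

To restore, reverse the pipeline: `openssl enc -d -aes-256-cbc -pbkdf2 -pass file:PASSFILE -in ARCHIVE | tar -xz`.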

    Incremental synchronization really keeps the bandwidth and storage costs down and the amount of data I am backing up makes file level backup a very reasonable option.

    If I wanted to back up everything, I would set up a second system off-site and run backups over a secure tunnel.

  • Treczoks@lemmy.world · 5 points · 11 days ago

    As someone who has experienced double failure twice in my lifetime, I seriously recommend doing backups.

    The problem is that, at this size, the only serious backup medium is another HDD. A robotic tape library or WORM-drive array is probably out of budget.

  • Daniel Quinn@lemmy.ca · 5 points · 11 days ago

    Honestly, I’d buy six external 20 TB drives, make two copies of your data (three drives each), and then leave them somewhere safe but not at home. If you have friends or family able to store them, that’d do, but a safe-deposit box is also good.

    If you want to make frequent updates to your backups, you could attach the drives to a Raspberry Pi and put it on Tailscale, then just rsync changes regularly. Of course, that means wherever you’re storing the backup needs room for such a setup.

    I often wonder why there isn’t a sort of collective backup sharing thing going on amongst self hosters. A sort of “I’ll host your backups if you host mine” sort of thing. Better than paying a cloud provider at any rate.

    • Joelk111@lemmy.world · 3 points · 10 days ago

      That NAS software company Linus (of Linus Tech Tips) funded has a feature like this planned, I think.

      An open-source standalone implementation would be dope as hell. Sure, it’d mean you’d need to double your NAS capacity (as you’d have to provide as much storage as you use), but that’s way easier than building a second NAS and storing/maintaining it somewhere else, or constantly paying for and managing a cloud backup.

      • WhyJiffie@sh.itjust.works · 1 point · 10 days ago

        Such a system would need a strict time limit for restoration after a catastrophe. Otherwise, leeching would be too easy.

        • Joelk111@lemmy.world · 1 point · 10 days ago

          That’s an incredibly good point. Bad actors are the worst. Some ideas:

          • Maybe you’d need to contribute your storage capacity +10% (or more), to account for your own and others’ downtime during disasters.
          • A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.
          • Maybe a payment system could be set up where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails). Sure, that’d suck, but it’d be better than losing your data, and cheaper overall than paying for cloud backups. I’m not sure where that money would go. Maybe distributed to those who didn’t experience a disaster, or maybe to the software project, though that would mean people profiting from a disaster. Maybe it could go to a charity of your choice or something.

          Definitely a difficult problem to solve. I’m sure people smarter than me have ideas beyond mine.

          • WhyJiffie@sh.itjust.works · 1 point · 10 days ago

            A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.

            And also accounting for low-bandwidth connections… what’s more, some shitty providers even have monthly data caps.

            Maybe a payment system could be set up to where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails).

            Yeah, that would almost be a necessary feature: being able to hold on to the backup when someone really can’t restore yet.

  • 👍Maximum Derek👍@discuss.tchncs.de · 3 points · 11 days ago

    Like others, I have a two-tier system.

    About 2TB of my (Synology) NAS is critical files. Those get sent via Hyper Backup to cloud storage on at least a weekly basis, some daily. I have them broken up into multiple tasks with staggered schedules, so no single day ever has much to do.

    The other 16TB gets synced (again with Hyper Backup, but not as a scheduled backup task) to a 20TB external drive roughly once per quarter. Then that drive lives in the closet of a family member.

  • 𝚝𝚛𝚔@aussie.zone · 3 points · 9 days ago

    I have a 120TB unraid server at home, and a 40TB unraid server at work. Both use 2 x parity disks.

    The critical work stuff backs up to home, and the critical home stuff backs up to work.

    The media is disposable.

    Both servers then back up to CrashPlan on separate accounts: work uses the Australian server on a business account, home uses the US server on a personal account.

    I figure I should be safe unless Australia and the US are nuked simultaneously… At which point my data integrity is probably not the most pressing issue.

      • 𝚝𝚛𝚔@aussie.zone · 2 points · 8 days ago

        Yeah, I guess it probably makes more sense when it’s my business… maybe not if you’re an employee at some corporation, randomly hosting backups of your dog photos.

        • clif@lemmy.world · 1 point · 8 days ago

          I dunno. At a big company they probably won’t notice an extra TB of storage cost… so long as you’re discreet with the transfers.

  • Konraddo@lemmy.world · 3 points · 11 days ago

    Similar to most responses, I back up whatever I created myself, not what was shared by someone or downloaded from somewhere. I care about the pictures I took, documents, financial records, etc., which don’t take up much space at all.

  • OR3X@lemmy.world · 2 points · 11 days ago

    So you have 56TB of total storage, but how much of that 56TB is actually used? Take the amount of storage used and add 10-12% to that figure. Now you create a new NAS (preferably off-site) with that amount of storage and that becomes your backup target. Take an initial backup (locally if possible to speed up the process) and then you can use something like rsync to create incremental backups going forward. This is the method I’ve used and so far it has worked out well. I target 10-12% more than the amount of used storage for my backup capacity because my storage use grows reasonably slowly. If your usage grows faster you might want to increase your “buffer” a little more so that you’re not having to constantly add drives to your backup target.

    • NekoKoneko@lemmy.world (OP) · 2 points · 11 days ago

      Yeah, this is certainly a viable “brute-force”-ish option. While I have 56 TB, I’m only using 26 or so. But I’d actually be hesitant to do anything less than a full-capacity mirror, because I do expect to eventually use it all (and more, by adding drives to Unraid).

      I’ve balked because of cost and upkeep (maintaining the same capacity, additional chances for drive failure, two separate sites I need physical access to with a high bandwidth connection), so I admit I was hoping I was missing an easier option.

      • OR3X@lemmy.world · 4 points · 11 days ago

        I mean, if you want a full mirror, rolling your own backup target is going to be the cheapest option even with the current high price of hardware. Other options are cloud storage, or another medium like tape. Cloud storage is an ongoing cost, which rules it out for me, not to mention the privacy concerns. There are “cold storage” options from cloud hosts that are considerably cheaper, but they have limitations on how the data can be accessed and how often. The tape route is possible, but it’s not really viable for home use due to the high upfront cost of the drives. Outside of that, backing up a subset of your storage, as others have suggested, is the only other option. Creating viable backups without breaking the bank is a challenge as old as computers, unfortunately.