I have a 56 TB local Unraid NAS that is parity protected against single drive failure, and while I think a single drive failing and being parity recovered covers data loss 95% of the time, I’m always concerned about two drives failing or a site-/system-wide disaster that takes out the whole NAS.

For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?

(Sorry if this standard scenario has been discussed - searching didn’t turn up anything.)

  • PieMePlenty@lemmy.world
    English
    arrow-up
    33
    ·
    6 days ago

    Not all data is equal. I back up the things I absolutely cannot lose and yolo everything else. My love for this hobby does not extend to buying racks of hard drives.

    • Zetta@mander.xyz
      English
      arrow-up
      3
      ·
      6 days ago

      Same, my Unraid server is over 40 TB but I only have ~1.5 TB of critical data: my Immich photos and some files. I have an on-site and an off-site Raspberry Pi with a 4 TB NVMe SSD for nightly backups.

  • dmention7@midwest.social
    English
    arrow-up
    36
    ·
    7 days ago

    Personally I deal with it by prioritizing the data.

    I have about the same total size Unraid NAS as you, but the vast majority is downloaded or ripped media that would be annoying to replace, but not disastrous.

    My personal photos, videos and other documents, which are irreplaceable, only make up a few TB, which is pretty manageable to keep true local and cloud backups of.

    Not sure if that helps at all in your situation.

    • Burninator05@lemmy.world
      English
      arrow-up
      2
      ·
      6 days ago

      I keep the data I actually care about in a RAIDZ1 array with a hot standby, and it is synced to the cloud. The rest (the vast majority) is in a RAIDZ5. If I lose it, I “lose” it. It’s recoverable if I decide I want it again.

  • ShawiniganHandshake@sh.itjust.works
    English
    arrow-up
    9
    ·
    6 days ago

    For me, I only back up data I can’t replace, which is a small subset of the capacity of my NAS. Personal data like photos, password manager databases, personal documents, etc. get locally encrypted, then synced to a cloud storage provider. I have my encryption keys stored in a location that’s automatically synced to various personal devices and one off-site location maintained by a trusted party. I have the backups and encryption key sync configured to keep n old versions of the files (where the value of n depends on how critical the file is).

    Incremental synchronization really keeps the bandwidth and storage costs down and the amount of data I am backing up makes file level backup a very reasonable option.

    If I wanted to back up everything, I would set up a second system off-site and run backups over a secure tunnel.
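
A rough sketch of the "keep n old versions" idea from this comment, in Python. The file names, timestamps and the `split_versions` helper are made up for illustration; real backup tools handle retention with their own settings, so treat this as the logic only, not as any particular tool's behaviour.

```python
from typing import List, Tuple


def split_versions(
    versions: List[Tuple[str, float]], keep: int
) -> Tuple[List[str], List[str]]:
    """Return (kept, pruned) names, keeping the `keep` newest versions.

    Each version is a (name, modification_timestamp) pair.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Sort newest first by timestamp.
    ordered = sorted(versions, key=lambda v: v[1], reverse=True)
    kept = [name for name, _ in ordered[:keep]]
    pruned = [name for name, _ in ordered[keep:]]
    return kept, pruned


# Hypothetical versions of one file; a more critical file would get a larger n.
versions = [
    ("passwords.kdbx.v1", 100.0),
    ("passwords.kdbx.v2", 200.0),
    ("passwords.kdbx.v3", 300.0),
]
kept, pruned = split_versions(versions, keep=2)
print("keep:", kept)
print("prune:", pruned)
```

Picking `keep` per file (or per folder) is what lets the critical stuff carry a deeper history without paying for it everywhere.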

  • INeedMana@piefed.zip
    English
    arrow-up
    13
    ·
    6 days ago

    I’ve been following this post since the first comment.

    And I have just put together my own RAID1 1TB NAS. And I did not think that 1TB would serve me forever, more like “a good start”.

    But the numbers I’ve been seeing in here… you guys are nuts 😆

  • GenderNeutralBro@lemmy.sdf.org
    English
    arrow-up
    21
    ·
    7 days ago

    You’ll think I’m crazy, and you’re not wrong, but: sneakernet.

    Every time I run the numbers on cloud providers, I’m stuck with one conclusion: shit’s expensive. Way more expensive than the cost of a few hard drives when calculated over the life expectancy of those drives.

    So I use hard drives. I periodically copy everything to external, encrypted drives. Then I put those drives in a safe place off-site.

    On top of that, I run much leaner and more frequent backups of more dynamic and important data. I offload those smaller backups to cloud services. Over the years I’ve picked up a number of lifetime cloud storage subscriptions from not-too-shady companies, mostly from Black Friday sales. I’ve already gotten my money’s worth out of most of them and it doesn’t look like they’re going to fold anytime soon. There are a lot of shady companies out there so you should be skeptical when you see “lifetime” sales, but every now and then a legit deal pops up.

    I will also confess that a lot of my data is not truly backed up at all. If it’s something I could realistically recreate or redownload, I don’t bother spending much of my own time and money backing it up unless it’s, like, really really important to me. Yes, it will be a pain in the ass when shit eventually hits the fan. It’s a calculated risk.

    I am watching this thread with great interest, hoping to be swayed into something more modern and robust.

    • irmadlad@lemmy.world
      English
      arrow-up
      3
      ·
      edit-2
      7 days ago

      That is old-old-school. It works tho. You have to be a bit scheduled about it, to encompass current and future important data. IIRC AWS created a 100 petabyte drive and a truck to haul it around to basically do the same thing, just in much larger amounts.

    • MightyLordJason@lemmy.world
      English
      arrow-up
      3
      ·
      7 days ago

      Sneakernet crew here too. My work offsite backup is in my backpack. A few times per week I do a sync, which takes a few minutes, and take it home again. (The sync archives old versions of files and the drive is encrypted.)

      We tried several cloud-based solutions and they were all rather expensive or just plain hard to run to completion or both.

  • Seefra 1@lemmy.zip
    English
    arrow-up
    7
    ·
    6 days ago

    Well, first while raid is great, it’s not a replacement for backups. Raid is mostly useful if uptime is imperative, but does not protect against user errors, software errors, fs corruption, ransomware or a power surge killing the entire array.

    Since uptime isn’t an issue on my home nas, instead of parity I simply have cold backups which (supposedly) I plug in from time to time to scrub the filesystems.

    If an online drive dies I can simply restore it from backup and accept the downtime. For my anime I have just a single backup, but for my most important files I have 2 backups in case one fails. (Unfortunately both onsite.)

    On the other hand, for a client of mine’s server where uptime is imperative, in addition to raid I have 2 automatic daily backups (ideally one of them would be offsite, but it isn’t; at least they are on different floors of the same building).

  • irmadlad@lemmy.world
    English
    arrow-up
    11
    ·
    7 days ago

    I’m not sure if I qualify as a ‘larger local hoster’ but I would go through your 20 TB and decide what really is important enough to backup in case the wheels fall off. Linux ISOs, those can be re-downloaded, although it would take a bit of time. The things that can’t be readily downloaded such as my music collection that I have been accumulating for decades, converted to flac, and meticulously tagged, can’t be re-downloaded. So that is one of my priorities to back up. Pictures, business documents, personal documents, can’t be re-downloaded, so that goes on the ‘must back up’ list…and so on. Just cull out what is and isn’t replaceable. I would bet that once you do that, your 20 TB will be a bit more slim, and you’re not trying to push 20TB up the pipe to a cloud backup.

    I use Backblaze’s Personal, unlimited tier for $99 USD per year, which is a pretty sweet deal. One thing to remember about Backblaze is that the drives being backed up must be physically connected to the PC doing the backup/uploading. I get around that because I have a hot swap bay on my main PC, but there are other methods and software that will make your NAS or other storage appear as a physically connected drive.

      • irmadlad@lemmy.world
        English
        arrow-up
        1
        ·
        7 days ago

        There are many ways to skin the cat. Here’s just one:

        This Docker container runs the Backblaze personal backup client via WINE, so that you can back up your files with the separation and portability capabilities of Docker on Linux.

        It runs the Backblaze client and starts a virtual X server and a VNC server with Web GUI, so that you can interact with it.

        https://github.com/JonathanTreffler/backblaze-personal-wine-container

        There are also other apps that will ‘fool’, for lack of a better word, Backblaze into thinking a NAS drive is physically connected.

        • WhyJiffie@sh.itjust.works
          English
          arrow-up
          2
          ·
          6 days ago

          better would be something that can just eat a zfs send stream, but I guess for an emergency it’s fine. but I would still want to encrypt everything somehow.

    • countstex@feddit.dk
      English
      arrow-up
      2
      ·
      7 days ago

      I use backblaze too, started with the personal back up, but swapped to the B2 solution as it was supported by my NAS. The cost of the actual storage isn’t much, most of the cost is in access, so for data that doesn’t alter much it worked out just as cheap, and easier to do things that way.

      • irmadlad@lemmy.world
        English
        arrow-up
        2
        ·
        7 days ago

        and easier to do things that way.

        I’m cheap and my labor is free. LOL But you do have a point.

      • FreedomAdvocate@lemmy.net.au
        English
        arrow-up
        1
        ·
        7 days ago

        The cost of B2 storage is very high, what are you talking about? USD$6 per terabyte per month would be like $4k a year for me.
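
For anyone wanting to run the same numbers on their own array, here is a quick back-of-the-envelope calculation in Python. It assumes the flat $6 per TB per month figure quoted in the comment above and counts storage only (no download or transaction charges); the function name is just an illustration, so check the current price list before relying on it.

```python
def b2_annual_cost_usd(stored_tb: float, usd_per_tb_month: float = 6.0) -> float:
    """Storage-only yearly cost at a flat per-TB monthly rate (assumed rate)."""
    if stored_tb < 0 or usd_per_tb_month < 0:
        raise ValueError("inputs must be non-negative")
    return stored_tb * usd_per_tb_month * 12


# A few sizes mentioned in this thread: a couple of TB of critical data,
# a 15 TB archive, and the OP's full 56 TB NAS.
for tb in (2, 15, 56):
    print(f"{tb} TB: ${b2_annual_cost_usd(tb):,.0f} per year")
```

At that rate a $4k yearly bill works out to roughly 55 TB stored, while a few TB of truly critical data stays in the low hundreds of dollars, which is why both sides of this exchange can be right.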

  • worhui@lemmy.world
    English
    arrow-up
    10
    ·
    edit-2
    7 days ago

    LTO tape. But I only have 15 TB.

    It quickly becomes cost effective when you actually need the data to be safe. It’s far easier to have off-site backups. I have never had a problem, but I like to have an offline backup. Most of the time my data is static, so I am only backing up project files and changes for the most part.

    If you have 40+ tb of dynamic data I can’t help there.

    Edit: I buy used drives that are usually 2 generations old, so I got LTO-5 drives when LTO-7 was new. The used drives may be less reliable, but they can be 1/10th the price of the newest ones.

  • randombullet@programming.dev
    English
    arrow-up
    6
    ·
    6 days ago

    I have 3 main NASes

    78TB (52TB usable) hot storage. ZFS1

    160TB (120TB) warm storage ZFS2

    48TB (24TB) off site. ZFS mirror

    I rsync every day from hot to off site.

    And once a month I turn on my warm storage and sync it.

    Warm and hot storage is at the same location.

    Off site storage is with a family friend who I trust. Data isn’t encrypted aside from in transit. That’s something else I’d like to mess with later.

    Core vital data is sprinkled around different continents with about 10TB. I have 2 nodes in 2 countries for vital data. These are with family.

    I think I have 5 total servers.

    Cost is a lot obviously, but pieced together over several years.

    The world will end before my data gets destroyed.

  • 𝚝𝚛𝚔@aussie.zone
    English
    arrow-up
    3
    ·
    6 days ago

    I have a 120TB unraid server at home, and a 40TB unraid server at work. Both use 2 x parity disks.

    The critical work stuff backs up to home, and the critical home stuff backs up to work.

    The media is disposable.

    Both servers then back up to Crashplan on separate accounts - work uses the Australian server on a business account, home uses the US server on a personal account.

    I figure I should be safe unless Australia and the US are nuked simultaneously… At which point my data integrity is probably not the most pressing issue.

      • 𝚝𝚛𝚔@aussie.zone
        English
        arrow-up
        2
        ·
        5 days ago

        Yeah I guess it probably makes more sense when it’s my business… Maybe not if you’re an employee at some corporate randomly hosting backups of your dog photos.

        • clif@lemmy.world
          English
          arrow-up
          1
          ·
          5 days ago

          I dunno. At a big company they probably won’t notice an extra TB of storage cost… So long as you’re discreet with the transfers.

  • SayCyberOnceMore@feddit.uk
    English
    arrow-up
    5
    ·
    7 days ago

    What are your recovery needs?

    It’s ok to take 6 months to back up to a cloud provider, but do you need all your data to be recovered in a short period of time? If so, cloud isn’t the solution; you’d need a duplicate set of drives nearby (but not close enough to be hit by the same flood, fire, etc.).

    But, if you’re ok waiting for the data to download again (and check the storage provider costs for that specific scenario), then your main factor is how much data changes after that initial 1st upload.
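
To put rough numbers on the upload and restore times, here is a small Python estimate. The 30 Mbps and 1000 Mbps link speeds are only example values, not anyone's actual connection, and `efficiency` is a fudge factor for protocol overhead and throttling; 1 TB is taken as 10^12 bytes.

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move `data_tb` terabytes over a `link_mbps` link.

    `efficiency` (0-1] scales the link speed down for real-world overhead.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8                      # TB -> bits
    seconds = bits / (link_mbps * 1e6 * efficiency)  # Mbps -> bits per second
    return seconds / 86_400                         # seconds -> days


# Example: the OP's 56 TB on a hypothetical 30 Mbps uplink vs. a gigabit link.
for mbps in (30, 1000):
    print(f"56 TB at {mbps} Mbps: about {transfer_days(56, mbps):.1f} days")
```

On the 30 Mbps example the full 56 TB lands close to the six-month mark mentioned above, and a restore over a similar downlink takes just as long, so it is worth running this for both directions.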

    • NekoKoneko@lemmy.world (OP)
      English
      arrow-up
      1
      ·
      5 days ago

      Sorry. Shortly after posting this and the initial QA I left for a trip.

      I could definitely wait those time periods for a first backup and a restore, since I assume it’ll be a once-in-10-years situation at worst. Data changes after the first upload should be slow enough to keep up with.

      • SayCyberOnceMore@feddit.uk
        English
        arrow-up
        2
        ·
        3 days ago

        No worries, I don’t have a time limit on responses 😉

        But… it took me something like ~3 days to get an initial backup done.

        Then ~3 years later I was at a different provider doing the same thing.

        What I did do differently was to split the data into different backup pools (ie photos, music, work, etc) rather than 1 monolithic pool… that’ll make a difference.

        • NekoKoneko@lemmy.world (OP)
          English
          arrow-up
          2
          ·
          3 days ago

          That does make sense - it also matches how I currently have my files separated, so it’s a valuable idea. Thanks!

  • Daniel Quinn@lemmy.ca
    English
    arrow-up
    5
    ·
    7 days ago

    Honestly, I’d buy 6 external 20tb drives and make 2 copies of your data on it (3 drives each) and then leave them somewhere-safe-but-not-at-home. If you have friends or family able to store them, that’d do, but also a safety deposit box is good.

    If you want to make frequent updates to your backups, you could patch them into a Raspberry Pi and put it on Tailscale, then just rsync changes regularly. Of course, that means that wherever you’re storing the backup needs room for such a setup.
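
A minimal sketch of what that periodic rsync job could look like, written as a small Python wrapper you might call from cron. The host name `backup-pi`, the user and both paths are placeholders I made up; it assumes the Pi is reachable over SSH by its tailnet name. It only builds and prints the command, and defaults to `--dry-run` because `--delete` removes files on the remote that no longer exist locally.

```python
import shlex
from typing import List


def build_rsync_command(
    source_dir: str, remote: str, dest_dir: str, dry_run: bool = True
) -> List[str]:
    """Build an rsync-over-SSH command that mirrors source_dir to remote:dest_dir."""
    cmd = ["rsync", "--archive", "--partial", "--delete", "-e", "ssh"]
    if dry_run:
        # Show what would change without touching the remote copy.
        cmd.append("--dry-run")
    # Trailing slash: copy the directory's contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return cmd


# Hypothetical share, user and host; replace with your own.
cmd = build_rsync_command("/mnt/user/photos", "backup@backup-pi", "/mnt/backup/photos")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True) after reviewing the dry run.
```

Because rsync only sends changed files, the nightly traffic stays small once the initial copy has been seeded locally before the drives go off-site.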

    I often wonder why there isn’t a sort of collective backup sharing thing going on amongst self hosters. A sort of “I’ll host your backups if you host mine” sort of thing. Better than paying a cloud provider at any rate.

    • Joelk111@lemmy.world
      English
      arrow-up
      4
      ·
      7 days ago

      That NAS software company Linus (of Linus Tech Tips) funded has a feature for this planned I think.

      An open-source standalone implementation would be dope as hell. Sure, it’d mean you’d need to double your NAS capacity (as you’d have to provide as much storage as you use), but that’s way easier than building a second NAS and storing/maintaining it somewhere else or constantly paying for and managing a cloud backup.

      • WhyJiffie@sh.itjust.works
        English
        arrow-up
        1
        ·
        6 days ago

        such a system would need a strict time limit for restoration after the catastrophe. Otherwise leeching would be too easy.

        • Joelk111@lemmy.world
          English
          arrow-up
          1
          ·
          6 days ago

          That’s an incredibly good point. Bad actors are the worst. Some ideas:

          • Maybe you’d need to contribute your storage capacity +10% (or more), to account for your and others’ downtime during disasters.
          • A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.
          • Maybe a payment system could be set up to where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails). Sure, that’d suck, but it’d be better than losing your data, and cheaper overall than paying for cloud backups. I’m not sure where that money would go. Maybe distributed to those who didn’t experience a disaster, or maybe to the software project, though that would mean people are profiting from a disaster. Maybe it could go to a charity of your choice or something.

          Definitely a difficult problem to solve. I’m sure people smarter than me have ideas beyond mine.
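
Purely as a thought experiment, the first and third ideas above could be written down like this in Python. Nothing here is an existing project: the function names are invented, and the 10% overhead and one-week silence limit are just the numbers floated in this comment.

```python
from datetime import datetime, timedelta


def required_contribution_tb(used_tb: float, overhead: float = 0.10) -> float:
    """Storage a member must offer: what they use plus a safety margin."""
    if used_tb < 0 or overhead < 0:
        raise ValueError("inputs must be non-negative")
    return used_tb * (1 + overhead)


def is_overdue(
    last_ping: datetime, now: datetime, grace: timedelta = timedelta(days=7)
) -> bool:
    """True once a member's server has been silent for longer than `grace`."""
    return now - last_ping > grace


# Example member: stores 10 TB with peers, last seen nine days ago.
now = datetime(2024, 1, 10)
print(f"must offer {required_contribution_tb(10):.1f} TB")
print("overdue:", is_overdue(datetime(2024, 1, 1), now))
```

The hard parts (how long the grace period should really be, and what happens once someone is overdue) are policy questions the code can't answer.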

          • WhyJiffie@sh.itjust.works
            English
            arrow-up
            1
            ·
            6 days ago

            A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.

            and also accounting for low bandwidth connections… what’s more, some shitty providers even have monthly data caps

            Maybe a payment system could be set up to where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails).

            yeah, that would be almost a necessary feature. being able to hold on to the backup when you really can’t restore.

  • Treczoks@lemmy.world
    English
    arrow-up
    5
    ·
    7 days ago

    As someone who has experienced double failure twice in my lifetime, I seriously recommend doing backups.

    The problem is that, at this size, the only serious backup solution is another set of HDDs. A robot array for tapes or WORM drives is probably out of budget.