I just got two of these. Fully loaded. Disks, sleds, rails.
Fiber cards + 4 onboard NICs and 4 more on another card.
It’s a dual-proc board with a bunch of RAM slots. (I think it’s Sandy Bridge procs, DDR3.)
20 HDD bays. These things are (older) beastly storage boxen.

Board Manufacturer: Supermicro
Chassis Part Number: CSE-846BTS-R920BP
Board Part Num: X9DRi-LN4+/X9DR3-LN4+
Product PartNum: SSG-6047R-E1R24N

I got them because they were at a remote colo, and they crashed a bunch of times.
They cost us more downtime than they were worth.
I happened to be in town and made my boss an offer.
He didn’t have to pay for e-waste fees, and I removed his problem for the low, low cost of $0.

So now they are my problem.
I don’t need 200 TB of redundant storage. I’m gonna shop em out and sell em.
No idea if the dual 920-watt PSUs will blow my apt breakers. Takes a lot of juice to spin 20 HDDs.
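
A rough back-of-the-envelope check on the breaker question. This is only a sketch: the 120 V / 15 A circuit, the 80% continuous-load rule of thumb, the 90% PSU efficiency, and the idea that the two supplies are a redundant pair sharing one load (rather than adding up) are all assumptions, not measurements from these boxes. The function names are made up for illustration.

```python
# Worst-case breaker load estimate. Every constant here is an assumption
# for illustration except the 920 W PSU rating quoted in the post.
MAINS_VOLTS = 120.0        # assumed US apartment circuit
BREAKER_AMPS = 15.0        # assumed breaker rating
CONTINUOUS_FRACTION = 0.8  # common rule of thumb for continuous loads
PSU_OUTPUT_WATTS = 920.0   # rated output of one supply
PSU_EFFICIENCY = 0.9       # assumed; wall draw = output / efficiency


def wall_amps(output_watts: float) -> float:
    """Convert PSU output watts to amps drawn at the wall."""
    return output_watts / PSU_EFFICIENCY / MAINS_VOLTS


def worst_case_amps(servers: int) -> float:
    """Assume a redundant PSU pair shares one load: count 920 W once per server."""
    return servers * wall_amps(PSU_OUTPUT_WATTS)


def report(servers: int) -> str:
    budget = BREAKER_AMPS * CONTINUOUS_FRACTION
    draw = worst_case_amps(servers)
    verdict = "ok" if draw <= budget else "over budget"
    return f"{servers} server(s): {draw:.1f} A worst case vs {budget:.1f} A budget ({verdict})"


for count in (1, 2):
    print(report(count))
```

Under those assumptions, one box at full rated load stays under the continuous budget, and two on the same circuit do not. Real draw is usually well below the PSU rating, so a plug-in watt meter would give a better answer than this estimate.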

So far, I’ve hauled them across half the US, up my stairs, and admired them.
I found a youtuber ‘Art of the Server’ with some helpful vids. Watched a bunch.
No real idea what I’m doing next.

I’ve configured them several times in the past. They always died after months of steady service.
Dead disks, etc. Maybe bad controllers?
A fault that intermittent is hard to diagnose, but they are in front of me now.
I can do whatever I need to. These are complicated devices.
My original plan of teardown and rebuild seems unwise now.

I’m interested in any practical feedback.

  • dbtng@eviltoast.org (OP)
    2 days ago

    Thanks! So you don’t think I’m gonna blow my breakers? Alright, we will see.

    “TrueNAS or ProxMox … triage the issues. … set the drive controllers to HBA mode or flash an HBA firmware to them.”

    • Right. I’ve installed TrueNAS on em a couple times previously. They were running ZFS software raid. So … maybe just use the raid controller instead? Honestly, I’ve not tried that yet.
    • I’ve installed a couple different Supermicro firmware versions on them. Got em up to date with the HTML5 (not Java) remote console. That did not fix the crashes. Supermicro’s driver download services are a bit weird, perhaps I missed something they need.
    • All of my prior troubleshooting has been from 1200 miles away. Yes, I’ll do my best to triage. Spin up an OS, and then one by one, check each drive and bay.
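
The "check each drive, one by one" step could be sketched roughly like this. It assumes smartmontools 7.0 or newer is installed (for `smartctl -j` JSON output) and that the script runs as root; the function names are hypothetical. Drives hidden behind a RAID controller may not show up in the scan at all, which is one more reason to put the controller in HBA mode first.

```python
import json
import subprocess


def smart_passed(report: dict):
    """Return True/False from a smartctl JSON report, or None if no verdict is present."""
    return report.get("smart_status", {}).get("passed")


def run_smartctl(args: list) -> dict:
    """Run smartctl with JSON output and parse it.

    smartctl sets non-zero exit bits for warnings, so the exit code is
    deliberately not treated as an error here.
    """
    proc = subprocess.run(["smartctl", "-j", *args], capture_output=True, text=True)
    return json.loads(proc.stdout) if proc.stdout.strip() else {}


def triage() -> dict:
    """Map each scanned device name to its overall SMART health verdict."""
    results = {}
    for dev in run_smartctl(["--scan"]).get("devices", []):
        report = run_smartctl(["-H", "-d", dev["type"], dev["name"]])
        results[dev["name"]] = smart_passed(report)
    return results


def format_results(results: dict) -> list:
    """Turn verdicts into printable lines; None means smartctl gave no verdict."""
    labels = {True: "PASSED", False: "FAILED", None: "UNKNOWN"}
    return [f"{name}: {labels[ok]}" for name, ok in sorted(results.items())]
```

Calling `print("\n".join(format_results(triage())))` would give one line per drive. A passing health verdict only rules out the obvious failures, so it is a first filter, not proof that a drive or bay is good.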

    I’m gonna enjoy working with them, but I have a couple Dell Gen13 (Broadwell) servers already in my lab. My main host, running Proxmox, is a Dell R230 (8 vCPU, 64 GB). I run up to 8 VMs there, and it’s really all I need.
    I never run my Dell R430 (80 vCPU, 180 GB). No need for that much juice. I really enjoyed upgrading it to the max, and now I don’t use it. After I finish shopping out these new Supermicro monsters, I’m gonna be happy to sell em off to somebody that wants a big chassis with a bunch of disks.

    • fulg@lemmy.world
      2 days ago

      “They were running ZFS software raid. So … maybe just use the raid controller instead?”

      It is generally a bad idea to do that nowadays, because it ties the array to that controller. If it dies, you will need to find an identical or compatible replacement, or accept that the whole array is lost. With software RAID you can run any hardware.

      Wendell from Level1Techs is a good reference:

      https://youtube.com/watch?v=l55GfAwa8RI

      https://youtube.com/watch?v=Q_JOtEBFHDs

      BTW: good score, you will have fun with those for sure.

      • dbtng@eviltoast.org (OP)
        24 hours ago

        Those were a couple really good vids. I’ve never been a storage specialist, but I do manage all the storage for a small MSP, so I’m not ignorant. Like, I know ZFS pretty darn well, and I apparently collect storage servers for fun.
        That Wendell guy tho, he really knows his shit.
        I don’t know that I got any final answers from him, but it left me with a lot to consider.

        Honestly, a good chunk of what he had to say had me questioning my build with my Highpoint SSD7540 PCIe 4.0 x16 / 8x M.2 Ports NVMe card … on a completely different machine, a build I was quite satisfied with until now. (It’s on my gamer/server, my main box.)

        I put a lot of research and performance testing into the Highpoint build. It’s an 8-port card supporting Gen 4 NVMe in an (actually) 16-lane slot. I populated 4 bays. Each stick gets 4 lanes, which is great for Gen 4. (I figured some day in the future when Gen 4 NVMe is dirt cheap, I’ll fill the rest, and each stick will just get 2 lanes.) After some testing, I decided to use the hardware RAID controller on the card. Considering what old Wendell had to say, I suspect that perhaps it should be software RAID instead … still, that would mean relying on Windows to run the RAID, and I don’t trust Windows. And then there’s the fact that after reviewing all the spec sheets, I’ve realized there’s a lot I don’t know about the card. But the Highpoint smokes, and I mostly just store video games there. So maybe bit-rot isn’t a big deal anyway.
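
The lane math above can be sketched as a quick calculation. It uses the published PCIe 4.0 figures (16 GT/s per lane, 128b/130b encoding) and the x16 slot from the post; it treats the slot as one shared uplink split evenly across the populated sticks, which is an assumption about how the card divides bandwidth, not something taken from its spec sheet. The function names are made up for illustration.

```python
# Theoretical PCIe 4.0 bandwidth share per NVMe stick behind a x16 slot.
# Assumes an even split of the uplink; real throughput will be lower.
GT_PER_SECOND = 16.0   # PCIe 4.0 signalling rate per lane
ENCODING = 128 / 130   # 128b/130b line encoding overhead
UPLINK_LANES = 16      # slot width quoted in the post


def lane_gbytes_per_second() -> float:
    """Usable GB/s for one PCIe 4.0 lane (bits to bytes is the /8)."""
    return GT_PER_SECOND * ENCODING / 8


def per_drive_share(drives: int, lanes: int = UPLINK_LANES) -> float:
    """Even split of the uplink bandwidth across the populated drives."""
    return lanes * lane_gbytes_per_second() / drives


for populated in (4, 8):
    lanes_each = UPLINK_LANES / populated
    print(f"{populated} sticks: {lanes_each:.0f} lanes' worth each, "
          f"about {per_drive_share(populated):.2f} GB/s per stick")
```

The split only matters when every stick is busy at once. With a striped array that is the common case, so filling all 8 bays would roughly halve the per-stick ceiling while leaving the total unchanged.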

        All very interesting stuff. Thanks.