We seem to have run into some issues with one of the NVMe drives the main Lemmy database runs on. It’s still being investigated, but we might have to take down the site for a few hours later today to temporarily transfer the database to another SSD RAID.

Let’s see…

Edit: Did some preparation work today to minimize the impact/downtime, but the main work, which will require a hopefully short downtime, will happen tomorrow. The NVMe drive in question is probably toast, quite literally, as it seems to have been an overheating issue. The new one will need better cooling, I guess.

Edit2: Decided to wait for some spare parts to arrive first.

Edit3: Got the first spare part today, and the other one was supposed to arrive by mail yesterday… so probably tomorrow (edit: has arrived). Currently a bit busy with other things, but the new parts will go into the server by the weekend at the latest.

  • poVoq@slrpnk.netOPM
    2 months ago

    Looks like the hardware replacement went mostly as expected. The only problem was that the additional cooler for the still-working SSD in the RAID pool didn’t fit because a plastic part on the mainboard was in the way, so that drive will have to keep running at somewhat elevated temperatures. But I removed an old network adapter that was producing a lot of heat and improved the overall airflow in the case, so it should be OK. The new NVMe SSD came with a small cooler, and so far its temperature looks much better.

  • poVoq@slrpnk.netOPM
    2 months ago

    Just a heads-up: the replacement parts have arrived, and I will probably find some time on the weekend to install them.

    This will mean a downtime of roughly 30 minutes if everything goes well.