Lemmy.world was down between 02:00 UTC and 05:45 UTC. This was caused by the database spiking to 100% CPU (all 32 cores/64 threads!) due to inefficient queries being fired at the db very often.

I’ve collected the logs and we’ll be checking what caused this and how to prevent it.

  • flubba86@lemmy.world
    arrow-up
    50
    arrow-down
    1
    ·
    3 years ago

    Every Lemmy update:

    “We fixed some performance issues by optimising some queries.”

    Also: “To balance it out, we added some new even more inefficient queries.”

  • Possible_EmuWrangler@lemmy.world
    English
    arrow-up
    15
    ·
    edit-2
    3 years ago

    Thanks for keeping us updated. FYI, I noticed an issue: there was an error message saying to check (the Matrix) and (somewhere else, Lemmy community support?). Both of them pointed to the same URL, but I’m sure they were meant to point to different places.

    Edit: Happened again and I took notes. Both point to lemmy.ml community support.

  • Yoz@lemmy.world
    arrow-up
    4
    arrow-down
    26
    ·
    3 years ago

    People move to smaller instances so that with an outage like this not everyone is affected. Use the fediverse as it’s supposed to be used.

    • SpaceBar@lemmy.world
      English
      arrow-up
      15
      ·
      3 years ago

      What’s the name of the server you are running?

      A large instance today will be a small instance in the future. There are hardly any users on Lemmy compared to other, more established platforms. So if Lemmy is ever to handle a lot more users, stress testing the code makes a lot of sense.

      What’s going to happen in the future, do you expect there to be 50,000 servers? That’s unrealistic.

      • Yoz@lemmy.world
        arrow-up
        1
        arrow-down
        21
        ·
        3 years ago

        You’re not taking into account that some people are dumb as fuck. They will sit on one instance and when the instance goes down, they’ll start whining.

    • cerevant@lemmy.world
      English
      arrow-up
      6
      arrow-down
      3
      ·
      3 years ago

      I can’t claim to know what the designers intended, but having users spread across a large number of servers is terribly inefficient for how Lemmy works: each server maintains a copy of each community that its users are subscribed to, and changes to those communities need to be communicated across each of those instances.

      Given this architecture, it is much more efficient and robust to have users concentrate on what are effectively high-performance caching servers, and communities spread out on smaller, interest-focused instances.

      • ewe@lemmy.world
        arrow-up
        2
        arrow-down
        4
        ·
        3 years ago

        Yeah, I think this is the way things should move in the future. Have servers focus on either communities or users, instead of having the same server get hit with both high community/comment usage and lots of login/audit/user browsing requests. Servers with big communities could focus on stability and performance. Servers with users could focus on cool UIs and features for their users.