An activist group has claimed to have scraped millions of tracks from Spotify and is preparing to release them online.

Observers said the apparent leak could boost AI companies looking for material to develop their technology.

A group called Anna’s Archive said it had scraped 86m music files from Spotify and 256m rows of metadata such as artist and album names. Spotify, which hosts more than 100m tracks, confirmed that the leak did not represent its entire inventory.

The Stockholm-based company, which has more than 700 million users worldwide, said it had “identified and disabled the nefarious user accounts that engaged in unlawful scraping”.

“An investigation into unauthorised access identified that a third party scraped public metadata and used illicit tactics to circumvent DRM [digital rights management] to access some of the platform’s audio files,” said Spotify.

Spotify does not believe the music taken by Anna’s Archive has been released yet. Anna’s Archive, which is known for providing links to pirated books, said in a blog it wanted to create a “‘preservation archive’ for music”.

The group claimed the audio files represented 99.6% of all music listened to by Spotify users and would be shared via “torrents”, a means of sharing large digital files online.

“Of course Spotify doesn’t have all the music in the world, but it’s a great start,” said Anna’s Archive, which describes its mission as “preserving humanity’s knowledge and culture”.

“With your help, humanity’s musical heritage will be forever protected from destruction by natural disasters, wars, budget cuts and other catastrophes,” said the group.

    • ITeeTechMonkey@lemmy.world
      3 months ago

      Yeah, this is surely the beginning of the end for them. They aren’t an “AI” company, so the full force of the government will come after them now that they have been named in a mainstream publication.

      • Truscape@lemmy.blahaj.zone
        3 months ago

        They’re decentralized, though. Hammer them down and a mirror will pop right up. Clearly they are also willing to work with places that are out of reach of Western Copyright law as well, given their prior interactions with Deepseek’s development.

        • ITeeTechMonkey@lemmy.world
          3 months ago

          TIL they are decentralized. That does make keeping them offline harder, but it also makes issues like honeypots and malicious mirrors more likely as sites come and go.

    • TrackinDaKraken@lemmy.world
      3 months ago

      My thoughts, too. Now, there will be a court case, and Anna’s will be shut down. Because, in court, money almost always wins.

      • GissaMittJobb@lemmy.ml
        3 months ago

        Seems unlikely that they will be successful in shutting it down. If that were the case, they would have been shut down over the books already.

    • Truscape@lemmy.blahaj.zone
      3 months ago

      Theoretically a node could be (since Anna’s is decentralized and not consolidated), but in practice I think it’s reasonable to believe none exists. The website’s just accessible by US internet users and hosted somewhere outside the DMCA’s grip.

      • douglasg14b@lemmy.world
        3 months ago

        Yeah, but it’s still operated and organized by people, people who, if they are within US jurisdiction, can be punished and made “an example of”, effectively killing the archive by cutting off its organization.

  • stealth_cookies@lemmy.ca
    3 months ago

    My question is “Why?” Pretty much everything on Spotify is already available elsewhere in FLAC, a format better suited to archiving than Spotify’s bad lossy compression.

    • fruitycoder@sh.itjust.works
      3 months ago

      The funny thing is that, from what I’ve read, they got a lot of raw audio files too, so the people torrenting are probably getting higher-quality versions than what Spotify transcodes to.

  • Brewchin@lemmy.world
    3 months ago

    The same Anna’s Archive that allows free anonymous downloads that are throttled to the speed of a 1990-era modem unless you pay?

    Yes, I’m sure preservation and social good is their goal. Definitely not about making money.

    • Phoenixz@lemmy.ca
      3 months ago

      Any idea what it costs to reliably store this data, let alone have the bandwidth to upload it to others?

      This ain’t a cheap game, no matter what the intentions are. I have no problem with paid content because it costs money to have it there. I pay Spotify, but I’d rather pay Anna’s Archive.