• @7fb2adfb45bafcc01c80@lemmy.world
    English
    -56
    8 months ago

    I just sent a DMCA takedown last week to remove my site. They’ve claimed to follow meta tags and robots.txt since 1998, but no, they had over 1,000,000 of my pages going back that far. They even had the robots.txt configured for them archived from 1998.

    I’m tired of people linking to archived versions of things that I worked hard to create. Sites like Wikipedia were archiving URLs and then linking to the archive, effectively removing branding and blocking user engagement.

    Not to mention that I’m losing advertising revenue if someone views the site in an archive. I have fewer problems with archiving if the original site is gone, but to mirror and republish active content with no supported way to prevent it short of legal action is ridiculous. Not to mention that I lose control over what’s done with that content – are they going to let Google train AI on it with their new partnership?

    I’m not a fan. They could easily allow people to block archiving, but they choose not to. They offer a way to circumvent artist or owner control, and I’m surprised that they still exist.

    So… That’s what I think is wrong with them.

    From a security perspective it’s terrible that they were breached. But it is kind of ironic – maybe they can think of it as an archive of their passwords or something.

    • @Duamerthrax@lemmy.world
      English
      35
      8 months ago

      > Not to mention that I’m losing advertising revenue if someone views the site in an archive.

      No one is using Internet Archive to bypass ads. Anyone who would think of doing that already has ad blockers on.

        • @Duamerthrax@lemmy.world
          English
          11
          8 months ago

          I completely understood. No one is going to IA as their first stop. They’re only going there if they want to see how a page has changed or if the original site is gone.

            • @ikidd@lemmy.world
              English
              8
              8 months ago

              Because if you’re referencing something specific, why would you take the chance that someone changes that page? Are you going to monitor that from then on and make sure it’s still correct/relevant? No, you take what is effectively a screenshot and link to that.

              You aren’t really thinking about this from any standpoint except your advertising revenue.

              • @7fb2adfb45bafcc01c80@lemmy.world
                English
                -6
                8 months ago

                I’m thinking about it from the perspective of an artist or creator under existing copyright law. You can’t just take someone’s work and republish it.

                It’s not allowed with books, it’s not allowed with music, and it’s not even allowed with public sculpture. If a sculpture shows up in a movie scene, they need the artist’s permission and may have to pay a licensing fee.

                Why should the creation of text on the internet have lesser protections?

                But copyright law is deeply rooted in damages, and if advertising revenue is lost that’s a very real example.

                And I have recourse; I used it. I used current law (DMCA) to remove over 1,000,000 pages because it was my legal right to remove infringing content. If it had been legal, they wouldn’t have had to remove it.

                • @ikidd@lemmy.world
                  English
                  2
                  8 months ago

                  This conversation makes me want to throw up, as most discussions that revolve around the DMCA usually do. Using rights under the DMCA doesn’t put you in very good company.

                • Richard
                  English
                  1
                  8 months ago

                  > It’s not allowed with books

                  Have you ever heard of the mysterious places called “libraries”? IA does not “republish” anything, it is an archive.

                  • @7fb2adfb45bafcc01c80@lemmy.world
                    English
                    0
                    8 months ago

                    Technically, each time that it is viewed it is a republication from a copyright perspective. It’s a digital copy that is redistributed; the original copy that was made doesn’t go away when someone views it. There’s not just one copy that people pass around like a library book.

    • Red Army Dog Cooper
      English
      11
      8 months ago

      How do you expect an archive to happen if they are not allowed to archive while it is still up? How are you supposed to track changes or see how the world has shifted? This is a very narrow and, in my opinion, selfish way to view the world.

      • @7fb2adfb45bafcc01c80@lemmy.world
        English
        -2
        8 months ago

        > how do you expect an archive to happen if they are not allowed to archive while it is still up.

        I don’t want them publishing their archive while it’s up. If they archive but don’t republish while the site exists then there’s less damage.

        I support the concept of archiving and screenshotting. I have my own linkwarden server set up and I use it all the time.

        But I don’t republish anything that I archive because that dilutes the value of the original creator.

        • @zarkanian@sh.itjust.works
          English
          1
          8 months ago

          A couple of good examples are lifehacker.com and lifehack.org. Both sites used to have excellent content. The sites are still up and running, but the first one has turned into a collection of listicles and the second is an ad for an “AI-powered life coach”. All of that old content is gone and is only accessible through the Internet Archive.

          In fact, many domains never shut down, they just change owners or change direction.

          • @7fb2adfb45bafcc01c80@lemmy.world
            English
            0
            edit-2
            8 months ago

            Again, isn’t that the site’s prerogative?

            I think there should at least be a recognized way to opt out that archive.org actually follows. For years they told people to put

            User-agent: ia_archiver
            Disallow: /

            in robots.txt, but they still archived content from those sites. They refuse to publish what IP addresses they pull content down from, but that would be a trivial thing to do. They refuse to use a UserAgent that you can filter on.
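
For anyone who wants to check what that rule is supposed to do, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how a compliant crawler would interpret the file; it says nothing about what any real crawler actually does. It assumes the opt-out rule was meant to read `Disallow: /`, since an empty `Disallow:` value permits everything. The `example.com` URL, the `SomeOtherBot` agent, and the `allowed` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; assumes the opt-out rule is "Disallow: /".
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

# Placeholder URL, not a real site from this thread.
PAGE = "https://example.com/page.html"


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named agent is excluded from the whole site.
blocked = allowed(ROBOTS_TXT, "ia_archiver", PAGE)

# An agent with no matching entry is not restricted by this file.
other = allowed(ROBOTS_TXT, "SomeOtherBot", PAGE)

# An empty Disallow value means "nothing is disallowed".
empty_rule = allowed("User-agent: ia_archiver\nDisallow:\n", "ia_archiver", PAGE)

print(blocked, other, empty_rule)
```

Keep in mind that robots.txt is advisory: the file only matters if the crawler chooses to read and obey it, which is the whole complaint here.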

            If you want to be a library, be open and honest about it. There’s no need to sneak around.

    • @jqubed@lemmy.world
      English
      4
      8 months ago

      About the only thing I can agree with you on here is I don’t like when people on Wikipedia archive a link and then list that as the primary source in the reference instead of the original link. Wikipedia (at least in English) has a proper method to follow for citations with links and the archived version should only become the primary if the original source is dead or has changed and no longer covers the reference.

      They should also honor a DMCA takedown and robots.txt, but at least with the DMCA I’m sure there’s a backlog. Personally I’ve always appreciated the archive’s existence, though, and would think their impact is small enough that it’s better to have them than block them.