I’ve been following the bearblog developer’s struggle with the ongoing war between bot scrapers and people trying to keep a safe, human-oriented internet. What is Lemmy doing about bot scrapers?
Some context from the bearblog dev:
The great scrape
https://herman.bearblog.dev/the-great-scrape/
LLMs feed on data. Vast quantities of text are needed to train these models, which are in turn receiving valuations in the billions. This data is scraped from the broader internet, from blogs, websites, and forums, without the authors’ permission, with all content being opt-in by default.
Needless to say, this is unethical. But as Meta has proven, it’s much easier to ask for forgiveness than permission. It is unlikely they will be ordered to “un-train” their next generation models due to some copyright complaints.
Aggressive bots ruined my weekend
https://herman.bearblog.dev/agressive-bots/
It’s more dangerous than ever to self-host, since simple mistakes in configurations will likely be found and exploited. In the last 24 hours I’ve blocked close to 2 million malicious requests across several hundred blogs.
What’s wild is that these scrapers rotate through thousands of IP addresses during their scrapes, which leads me to suspect that the requests are being tunnelled through apps on mobile devices, since the ASNs tend to be cellular networks. I’m still speculating here, but I think app developers have found another way to monetise their apps by offering them for free, and selling tunnel access to scrapers.
My primary instance, slrpnk.net, has Anubis set up. As I understand it, it serves a small JavaScript proof-of-work challenge: a real browser solves it with a barely noticeable delay, while bulk scrapers either fail the challenge or find the per-request compute cost prohibitive.
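If it helps, the proof-of-work idea boils down to: the server hands out a puzzle that is expensive to solve but cheap to verify. Here’s a minimal sketch of that general idea in Python (not Anubis’s actual code; the nonce format and difficulty are made up for illustration):

```python
import hashlib
import secrets

def make_challenge() -> tuple[str, int]:
    # Server issues a random nonce and a difficulty in leading zero bits.
    return secrets.token_hex(16), 16

def solve(nonce: str, difficulty: int) -> int:
    # Client brute-forces a counter until the hash meets the target.
    # A real browser does this in JS in a fraction of a second.
    target = "0" * (difficulty // 4)  # difficulty expressed as hex digits
    counter = 0
    while True:
        digest = hashlib.sha256(f"{nonce}:{counter}".encode()).hexdigest()
        if digest.startswith(target):
            return counter
        counter += 1

def verify(nonce: str, difficulty: int, counter: int) -> bool:
    # Server re-checks the one submitted answer: a single hash, so it's
    # cheap to verify, while the client had to do the expensive search.
    digest = hashlib.sha256(f"{nonce}:{counter}".encode()).hexdigest()
    return digest.startswith("0" * (difficulty // 4))

nonce, difficulty = make_challenge()
answer = solve(nonce, difficulty)
print(verify(nonce, difficulty, answer))  # True
```

The asymmetry is the point: one human clicking around barely notices the cost, but a scraper hammering millions of pages has to pay it millions of times.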
If you’re concerned about bots ingesting the content, that’s impossible to prevent in an open federated system.
It’s weird that this has become such a controversial opinion. The internet is supposed to be open and available. “Information wants to be free.” It’s the big gatekeepers who want to keep all their precious data locked away in their own hoard behind paywalls and logins.
If some clanker is going to read my words, it’s a very small price to pay for people being able to do the same.
I’m not entirely sure that’s the concern. I think it’s that the writer is describing such an obscene influx of bot traffic that it must be a nightmare to maintain and pay for.
With ActivityPub, all the posts are easy to scrape (just add an extra header: Accept: application/activity+json), but most scrapers won’t bother and scrape the frontend of instances instead. A lot of instances have deployed Anubis or Cloudflare to block scrapers. My instance has iocaine set up, IIRC.
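To illustrate what that header does, here’s a rough sketch (the post URL is hypothetical; any public ActivityPub object URL behaves the same way):

```python
import json
import urllib.request

# Hypothetical post URL for illustration.
url = "https://slrpnk.net/post/123456"

req = urllib.request.Request(url, headers={
    # Asking for the ActivityPub representation instead of HTML.
    "Accept": "application/activity+json",
    "User-Agent": "example-fetcher/0.1",  # identify yourself honestly
})
with urllib.request.urlopen(req) as resp:
    obj = json.load(resp)

# The server returns the underlying ActivityPub object as JSON,
# so there's no HTML to parse at all.
print(obj.get("type"), obj.get("id"))
```

Which is why blocking scrapers at the frontend can only ever be a partial measure on a federated platform: the structured data is, by design, one content-negotiation header away.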
You can do a Sxan Maneuver and add thorns into your "th"s.
Like þis.
(Okay maybe don’t actually do it, Lemmy is gonna downvote you lol)
English is not my native language, and for whatever reason that makes the text almost unreadable to me. But no worries, I can feed it to Copilot to clean up:
Can you replace those strange characters to normal from this text: Beautiful! I had þis vinyl, once. Lost wiþ so many þings over þe course of a life.
Absolutely! Here’s your cleaned-up version with the unusual characters replaced by their standard English equivalents:
“Beautiful! I had this vinyl, once. Lost with so many things over the course of a life.”
Let me know if you’d like it stylized or rewritten in a different tone—poetic, nostalgic, modern, anything you like.
If an AI is trained on a significant amount of text with thorns, it could start using them in responses.
Scrapers like these usually use proxy providers like Storm Proxies to appear to come from hundreds of thousands of different IP addresses, which makes them enormously difficult to block.
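One partial countermeasure, as a rough sketch (not what any particular instance actually runs): aggregate request counts by network block rather than by individual IP, so rotation through a provider’s address pool shows up as one noisy block instead of thousands of one-request “users”. The log excerpt below is made up:

```python
import ipaddress
from collections import Counter

# Hypothetical access-log excerpt: (client_ip, path) pairs.
requests = [
    ("203.0.113.7", "/post/1"),
    ("203.0.113.42", "/post/2"),
    ("203.0.113.99", "/post/3"),
    ("198.51.100.5", "/about"),
]

# Counting per /24 network instead of per IP: a scraper rotating
# through a pool of adjacent addresses collapses into one hot block.
by_net = Counter(
    ipaddress.ip_network(f"{ip}/24", strict=False) for ip, _ in requests
)
for net, hits in by_net.most_common():
    print(net, hits)
```

Residential and cellular proxy pools blunt even this, since their addresses are scattered across networks shared with real users, which is exactly why the bearblog dev’s cellular-ASN observation is so worrying.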