søren peter mørch

Conversation

Recent posts in reply to #px274va

movq (www.uninformativ.de)

Anyone got a link to a robots.txt that “blocks” all the “AI” stuff?

8 months ago
movq (www.uninformativ.de)

… or maybe I should do this based on allowlisting rather than blocklisting. 🤔 Only allow a couple of bots that I think are fine …
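A minimal sketch of what the allowlisting idea could look like, checked with Python's standard `urllib.robotparser`. The bot name `ExampleGoodBot` is a placeholder assumption, not a real crawler; only well-behaved bots would honor any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist: "ExampleGoodBot" is a placeholder, not a real crawler.
# Everything not explicitly named falls through to the catch-all "*" group.
ALLOWLIST_ROBOTS_TXT = """\
User-agent: ExampleGoodBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, path: str = "/") -> bool:
    """Return True if this robots.txt lets the given agent fetch the path."""
    parser = RobotFileParser()
    parser.parse(ALLOWLIST_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("ExampleGoodBot", "GPTBot"):
        print(agent, is_allowed(agent))
```

The upside of this shape is that new, unknown bots are excluded by default instead of needing a blocklist entry each.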

In reply to: #px274va 8 months ago
movq (www.uninformativ.de)

@prologic Ahhh, right, now I remember. That ai.txt boils down to this, I guess:

User-Agent: *
Disallow: /

In reply to: #px274va 8 months ago
aelaraji (aelaraji.com)

@movq I have this one, as per some article I read some time ago... But just like with robots.txt, I don't think you have any guarantee that it will be honored. You might even have a better chance hunting for and blocking user agents.

In reply to: #px274va 8 months ago
movq (www.uninformativ.de)

@aelaraji Yeah, there is no guarantee with any of these things, it can all be faked or ignored. 🫤 I’m still going to do it in the hopes that some of those bots respect it.

In reply to: #px274va 8 months ago
aelaraji (aelaraji.com)

@movq It looks like this one actually reads the robots.txt ... it did a couple of times over the past few weeks.

"GET /robots.txt HTTP/1.1" 304 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"
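For anyone who wants to check their own logs for this, here is a rough sketch that pulls the path, status and user agent out of a log fragment shaped like the one above. The regex assumes exactly that layout (request, status, size, referer, user agent); real log formats vary, so treat it as a starting point. The 304 status means the bot sent a conditional request and was told its cached copy is still current.

```python
import re

# Matches a fragment laid out like the line quoted above:
# "METHOD PATH PROTOCOL" STATUS SIZE "REFERER" "USER-AGENT"
LOG_FRAGMENT = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

SAMPLE = (
    '"GET /robots.txt HTTP/1.1" 304 0 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; '
    'GPTBot/1.0; +https://openai.com/gptbot)"'
)


def parse_fragment(line: str):
    """Return the named fields as a dict, or None if the line does not match."""
    match = LOG_FRAGMENT.search(line)
    return match.groupdict() if match else None


def fetched_robots_txt(line: str) -> bool:
    """True if the request in this line was for /robots.txt."""
    fields = parse_fragment(line)
    return fields is not None and fields["path"] == "/robots.txt"


if __name__ == "__main__":
    print(parse_fragment(SAMPLE))
```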

In reply to: #px274va 8 months ago
aelaraji (aelaraji.com)

Hey @movq!! Here's an article you might find interesting: Blocking Bots with Nginx ... This person is actually blocking AI bots based on a list of user agents in an interesting way. 👍

In reply to: #px274va 7 months ago
prologic (twtxt.net)

@aelaraji Hmmm, looks like the core idea is to intercept requests, inspect the User-Agent header and respond accordingly.
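The article does this in Nginx; as an illustration of the same idea only, here is a small Python WSGI middleware sketch instead. The blocklist is a hypothetical example (GPTBot is the only bot named in this thread), and `hello_app` and `status_for` are made-up helpers for trying it out.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical blocklist of lowercase User-Agent substrings.
BLOCKED_AGENT_SUBSTRINGS = ("gptbot",)


def block_bots(app, blocked=BLOCKED_AGENT_SUBSTRINGS):
    """Wrap a WSGI app and answer 403 when the User-Agent matches the blocklist."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in blocked):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Not on the blocklist: hand the request to the wrapped app unchanged.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in application used only for this demonstration."""
    body = b"hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def status_for(user_agent: str) -> str:
    """Send one fake request through the middleware and return the status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []
    list(block_bots(hello_app)(environ, lambda status, headers: seen.append(status)))
    return seen[0]


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"))
    print(status_for("Mozilla/5.0 Firefox/120.0"))
```

This only catches bots that announce themselves honestly in the header.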

In reply to: #px274va 7 months ago
prologic (twtxt.net)

Can we trust the bots not to fake their identity? 🤔

In reply to: #px274va 7 months ago
movq (www.uninformativ.de)

@aelaraji @prologic Hmm, yeah, looks a bit better than ai.txt / robots.txt, but I wouldn’t trust that they don’t spoof their user agent. 🤔

In reply to: #px274va 7 months ago
prologic (twtxt.net)

@movq me neither 🤦‍♂️

In reply to: #px274va 7 months ago