I had this issue on one of my personal sites. It was a blog I used to write maybe 7-8 years ago. All of a sudden, I saw insane traffic spikes in analytics. I thought some article had gone viral, but realized the traffic was too robotic to be real.
And so I narrowed it down to some developer trying to test their bot/crawler on my site. I tried asking nicely, several times, over several months.
I was so pissed off that I set up a redirect rule to send them over to random porn sites. That actually stopped it.
This is the best approach, honestly. Redirect them somewhere that undermines their efforts: back to themselves, to their own provider, or to nasty crap that no one wants to find in their crawler logs.
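For anyone wondering what a rule like that looks like, here's a rough sketch of the "back to themselves" variant as a tiny Python WSGI app. It's only an illustration, not what the poster above actually ran: the `examplebot` user-agent substring and the helper name `redirect_target` are made up, and it assumes the crawler sends an identifiable User-Agent and connects over IPv4.

```python
from wsgiref.simple_server import make_server

# Hypothetical value: substitute whatever identifies the crawler in your logs.
BLOCKED_AGENT_SUBSTRINGS = ("examplebot",)


def redirect_target(environ):
    """Return a redirect URL for an unwanted crawler, or None for everyone else."""
    agent = environ.get("HTTP_USER_AGENT", "").lower()
    if any(s in agent for s in BLOCKED_AGENT_SUBSTRINGS):
        # Assumption: REMOTE_ADDR is an IPv4 address (IPv6 would need brackets).
        # This sends the crawler back to its own address.
        return "http://{}/".format(environ.get("REMOTE_ADDR", "127.0.0.1"))
    return None


def app(environ, start_response):
    target = redirect_target(environ)
    if target is not None:
        # Matching crawler: answer with a temporary redirect and no body.
        start_response("302 Found", [("Location", target), ("Content-Length", "0")])
        return [b""]
    # Everyone else gets the normal page.
    body = b"Hello, human reader.\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


if __name__ == "__main__":
    # Serve on port 8000 for local experimentation.
    make_server("", 8000, app).serve_forever()
```

In practice you'd more likely do this in the web server config than in application code, and a crawler that spoofs its User-Agent will sail right past it, so matching on IP range may be needed instead.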
I googled a lot of shock sites after seeing them referenced and not knowing what they were. Luckily Google and Wikipedia tended to shield my innocent eyes while explaining what I should be seeing.
The first goatse I actually saw was in ASCII form, funnily enough.
I use the ASCII form to reply to spammers, since it usually won't trip an attachment filter or anything like that. I get mixed results from them, but the results are usually funny.
I've never seen it in ASCII form, and I don't want to search for it, as Google will inevitably disregard my instructions and show me the 4K version in full color.