We should be fighting back. So far I have been using Poison Fountain[1] on many of my websites to feed gibberish to LLM scrapers. Its effectiveness is supported by an Anthropic study showing that a small number of poisoned samples can compromise LLMs of any size[2].
Disclaimer: I'm not affiliated with Poison Fountain or its creators, just found it useful.
[1] https://news.ycombinator.com/item?id=46926485
[2] https://www.anthropic.com/research/small-samples-poison