
My worst offender for scraping one of my sites was Anthropic. I deployed an AI tar pit (https://news.ycombinator.com/item?id=42725147) to see what it would do with it, and Anthropic's crawler kept scraping it for weeks. I added up the numbers from the logs and I think I wasted nearly a year of their crawler time in total, since they were crawling in parallel. Other scrapers weren't so persistent.


For me it was OpenAI. GPTBot hammered my honeypot with 0.87 requests per second for about 5 weeks. Other crawlers only made up 2% of the traffic. 1.8 million requests, 4 GiB of traffic. Then it just abruptly stopped for whatever reason.


Use tar pits and serve fake but legitimate-looking content. Poison it.
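
For anyone wondering what the page-generation part of a tar pit might look like, here is a rough sketch, not how any particular tool does it. The function name `fake_page`, the `WORDS` list, and the link format are all made up for illustration. It seeds a random generator from the requested path, so the same URL always returns the same filler page, and every page links to more fake pages underneath it.

```python
import hashlib
import random

# Hypothetical filler vocabulary; a real setup would want something less obvious.
WORDS = ["crawler", "archive", "signal", "harbor", "method", "garden",
         "ledger", "window", "static", "report", "valley", "engine"]

def fake_page(path: str, n_links: int = 5, n_words: int = 60) -> str:
    """Return deterministic filler HTML for `path`, linking to deeper fake paths."""
    # Seed from the path so the same URL always yields the same page.
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    text = " ".join(rng.choice(WORDS) for _ in range(n_words))
    base = path.rstrip("/")
    # Each link points one level deeper, so the crawl never runs out of URLs.
    links = "".join(
        f'<a href="{base}/{rng.randrange(16**8):08x}">{rng.choice(WORDS)}</a>\n'
        for _ in range(n_links)
    )
    return f"<html><body><p>{text}</p>\n{links}</body></html>"
```

You would still need to wire this into whatever server you run and, if you want the "tar" part, send the response slowly. Nothing is stored, since every page is recomputed from its path.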


That's hilarious. I need to set up one of these myself.



