Not to mention they have to store the data after they download it. In theory, storing garbage data is costly to them. However, I have a nagging feeling that the attitude of these scrapers is that they get paid the same amount per gigabyte whether it's nonsense or not.
If they even are AI crawlers. It could just as well be exploit scanners searching for endpoints they'd try to exploit. That wouldn't require storing the content, only the links.
If you look at which pages are hit, and how many pages are hit by any one address in a given period of time, it's pretty easy to identify features that are reliable proxies for e.g. exploit scanners, trawlers, and agents. I publish a feed of what's being hit on my servers; contact me for details (you need to be able to make DNS queries to a particular server, directed at a domain which is not reachable from ICANN's root). A rough sketch of that kind of per-address feature extraction is below.
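For anyone curious what that per-address analysis might look like in practice, here's a minimal Python sketch against a combined-format access log. The file name, bait paths, and thresholds are illustrative assumptions, not what the parent comment actually uses:

```python
# Sketch: group requests by client address, then use simple features
# (total volume, distinct paths, hits on scanner-bait URLs) as rough
# proxies for "exploit scanner", "trawler", or ordinary client.
# Assumes an Apache/nginx combined-format log at access.log.
import re
from collections import defaultdict

LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST|HEAD) (\S+)[^"]*"')

# Paths that are almost exclusively requested by exploit scanners
# (hypothetical examples, adjust for your own site).
SCANNER_PATHS = ("/wp-login.php", "/.env", "/phpmyadmin", "/xmlrpc.php")

def classify(logfile="access.log", bulk_threshold=500):
    hits = defaultdict(int)          # total requests per address
    paths = defaultdict(set)         # distinct paths per address
    scanner_hits = defaultdict(int)  # requests for scanner-bait paths

    with open(logfile) as f:
        for line in f:
            m = LOG_LINE.match(line)
            if not m:
                continue
            addr, _ts, path = m.groups()
            hits[addr] += 1
            paths[addr].add(path)
            if any(path.startswith(p) for p in SCANNER_PATHS):
                scanner_hits[addr] += 1

    for addr in sorted(hits, key=hits.get, reverse=True):
        if scanner_hits[addr] > 0:
            label = "exploit scanner"
        elif hits[addr] > bulk_threshold and len(paths[addr]) > 0.9 * hits[addr]:
            label = "trawler (bulk fetch, rarely repeats a URL)"
        else:
            label = "agent / ordinary client"
        print(f"{addr}\t{hits[addr]} hits\t{label}")

if __name__ == "__main__":
    classify()
```

Obviously real traffic needs more than three buckets, but even crude features like these separate the "fetch everything once" crawlers from the "probe the same handful of admin URLs everywhere" scanners pretty cleanly.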