How to tell. What to do. Costs you nothing.
Pull your access logs and look for these patterns. If 3 or more match, you're being scraped for AI training data. Probably right now.
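A minimal sketch of that log check, assuming your server writes the common "combined" log format, where the user agent is the last quoted field. The watch list is a placeholder: Go-http-client is the only signature named in this thread, so add your own. The sample lines are made up and use documentation-only IP ranges.

```python
import re
from collections import Counter

# Assumes combined log format: the user agent is the last double-quoted field.
UA_AT_END = re.compile(r'"([^"]*)"\s*$')

# Hypothetical watch list. Only Go-http-client comes from this thread.
SUSPECT_SUBSTRINGS = ["Go-http-client"]


def suspicious_agents(lines, needles=SUSPECT_SUBSTRINGS):
    """Count requests whose user agent contains any watched substring."""
    hits = Counter()
    for line in lines:
        match = UA_AT_END.search(line)
        if not match:
            continue  # line is not in the assumed format, skip it
        agent = match.group(1)
        if any(n.lower() in agent.lower() for n in needles):
            hits[agent] += 1
    return hits


# Made-up sample lines for illustration only.
sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /post/1 HTTP/2.0" 200 512 "-" "Go-http-client/2.0"',
    '203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET /post/2 HTTP/2.0" 200 734 "-" "Go-http-client/2.0"',
    '198.51.100.4 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

for agent, count in suspicious_agents(sample).most_common():
    print(count, agent)
```

To run it on a real file, pass the open file object instead of `sample`. A user agent match is a hint, not proof: agents are trivially spoofed, so treat the counts as a starting point.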
Your content enters a pipeline. Here's what that pipeline looks like.
Go-http-client/2.0: same bot, forgot the mask.
Most site owners never look. That's what they're counting on.
The IP range lists you need to build blocklists, updated daily:
github.com/ipverse/asn-ip
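A rough sketch of turning a range list into a blocklist check. It assumes the list is plain text with one CIDR range per line and optional `#` comments; check the repo for the actual file layout. Nothing is downloaded here, and the ranges below are documentation-only placeholders, not real scraper networks.

```python
import ipaddress


def load_networks(text):
    """Parse one CIDR range per line, ignoring blanks and '#' comments."""
    networks = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            # strict=False tolerates entries with host bits set
            networks.append(ipaddress.ip_network(line, strict=False))
    return networks


def is_blocked(ip, networks):
    """True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to check in one pass.
    return any(addr in net for net in networks)


# Placeholder ranges reserved for documentation, for illustration only.
SAMPLE_LIST = """
# hypothetical range list
203.0.113.0/24
2001:db8::/32
"""

blocklist = load_networks(SAMPLE_LIST)
print(is_blocked("203.0.113.7", blocklist))
print(is_blocked("198.51.100.4", blocklist))
```

A linear scan is fine for a few thousand ranges. For anything bigger, or for real traffic, feed the ranges to your firewall or web server instead of checking in application code.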