Forums > Animenano.com Issues and Requests > stopping a feed scraper
That only checks for links, so sites like tubeless would still get through (since they actually provide links back).
I agree with Hung's hypothesis that Tubeless is scraping from a secondary RSS source. With no ads, it seems rather pointless, although the constant links are a bit annoying.
The adsense code is still there so they were using it once. I guess the site got banned from Adsense. Good thing is, once you're banned from Adsense, you can never come back in.
You shouldn't be too concerned about your content being stolen. Your blog won't suffer any penalties and tubeless' pages won't ever outrank you in the search engines.
Additionally, the person in charge doesn't seem to be very knowledgeable about scraping. There are far better ways to do this.
Secondary RSS source? My guess is that blocking doesn't work because he is using a proxy.
Well, if anyone used Anime Nano's feed to scrape, I can assure you that I'd take care of it ASAP. But I checked all of my server logs (BGB and AN) and that IP shows up in zero.
Maybe if everyone marks trackbacks from tubeless as spam, Akismet will start blocking them? I dunno, maybe it's worth a shot.
Looks like tubeless is back after a few days of peace. And is running non-Google ads. Grrr.
For the SK2 users, just add a blacklist entry for that stupid site if you don't want to ban IPs. Works like a charm and blocks by domain name, so there's no need to worry when they change IPs.
Kurogane, SK2 only blocks the trackbacks but can't stop the scraping. In any case it has stopped sending out the trackbacks.
Checked my access logs and it was scraping via my xml-rpc feed. Have implemented .htaccess IP banning and so far it looks like it's working.
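For anyone curious why a domain blacklist survives an IP change, here is a minimal Python sketch of the general idea. It is not SK2's actual code; the `BLACKLIST` contents, the `scraper.example` domain and the `is_blacklisted` helper are made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical blacklist; "scraper.example" is a placeholder domain.
BLACKLIST = {"scraper.example"}

def is_blacklisted(url, blacklist=BLACKLIST):
    """Return True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blacklist)

# The check looks only at the host name in the trackback URL,
# so it keeps working no matter which IP the request came from.
print(is_blacklisted("http://www.scraper.example/post/1"))
print(is_blacklisted("http://example.org/"))
```

As noted in the replies, this kind of check only filters incoming trackbacks; it does nothing to stop the feed itself from being fetched.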
Zyl, could you provide a sample battle plan so that the rest of us may emulate your fine example?
Kabitzin, I got the IP (EDIT: 71.18.216.51) via the trackbacks it sent earlier and double-checked this against my access logs.
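If you want to do the same double-check on your own logs, something like this rough Python sketch should work. It assumes your host writes Apache-style Common/Combined Log Format; the `hits_by_path` helper and the sample lines (dates, paths, second IP) are invented for illustration.

```python
import re
from collections import Counter

# Start of a Common/Combined Log Format line:
# client IP, identity, user, [timestamp], "METHOD path ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)')

def hits_by_path(log_lines, ip):
    """Count how many requests one client IP made to each path."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(1) == ip:
            counts[match.group(3)] += 1
    return counts

# Made-up sample lines; in practice, pass an open access log file instead.
sample = [
    '71.18.216.51 - - [10/Oct/2007:13:55:36 -0700] "GET /feed/ HTTP/1.1" 200 5120',
    '71.18.216.51 - - [10/Oct/2007:14:55:36 -0700] "GET /feed/ HTTP/1.1" 200 5120',
    '192.0.2.10 - - [10/Oct/2007:14:56:01 -0700] "GET /index.php HTTP/1.1" 200 2326',
]

for path, count in hits_by_path(sample, "71.18.216.51").most_common():
    print(count, path)
```

Whichever path the suspect IP hits over and over is most likely the feed it is scraping.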
My domain host provides cPanel, so I used the IP Deny Manager to blacklist that IP, but you could also edit the .htaccess file manually, as pointed out by Kurisu earlier in this thread. The suggestion from Hung (and Kurisu) of redirecting the feed somewhere else is definitely more fun, but I don't have the expertise to set up a fake feed...
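For those doing the manual edit, here is a small Python sketch that prints the lines to paste into .htaccess. It assumes the older Apache 2.2-style access directives (`Order`, `Allow`, `Deny`); the `render_deny_block` helper is a made-up name, and the exact lines cPanel writes may differ.

```python
import ipaddress

def render_deny_block(ips):
    """Return Apache 2.2-style .htaccess lines that deny the given IPs."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for ip in ips:
        # Raises ValueError on a malformed address, so a typo
        # never ends up in the .htaccess file.
        ipaddress.ip_address(ip)
        lines.append(f"Deny from {ip}")
    return "\n".join(lines) + "\n"

print(render_deny_block(["71.18.216.51"]))
```

On Apache 2.4 without the compatibility module, the equivalent is a `Require all granted` plus `Require not ip 71.18.216.51` pair inside a `<RequireAll>` block. Either way, the ban only holds while the scraper keeps the same IP.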