As some of you may know, my blog content was recently ripped off by web content scrapers. But what does that actually mean, and why is it bad for your blog?
CONTENT SCRAPING 101
Content scraping is when an individual takes content from many other sites, collates it and publishes it to their own site as if it's their own work. It's theft of original work and it can negatively affect the SEO of the site it was stolen from. Many sites are made up entirely of work stolen - scraped - from other people. There are a lot of content scrapers out there - this is the second time it's happened to me. Scrapers may steal a few of your articles, or they might copy your whole damn website.
This is an example of what a scraped website looks like. It was the site of the last person to scrape my content. As you can see, it has content from several different people, but nowhere on the site does it LINK to the people whose work it has stolen.
No matter what a scraper says, taking other people's work without permission is theft, it is illegal, and it is copyright infringement.
WHY SCRAPE CONTENT?
Content scrapers want to become the authority on their chosen subject so people visit their site instead of going to the many blogs and websites the articles originated from. Why? So they can make money from ad revenue or affiliate marketing. The more eyes on their page, the more people they can sell to. When you confront your scraper they will probably act butthurt, as if they're doing it out of the goodness of their hearts. My scraper told me they were doing it as a 'free service' and couldn't understand why I was unhappy. Oh wow, they don't want to charge me for stealing my work and passing it off as their own? How kind of them. The cheek of the devil!
HOW DO THEY DO IT?
It's usually done via your blog's RSS feed. We all have one, to syndicate our blogs to Bloglovin' or to email subscribers. There are numerous web scraping tools which let thieves pull your content straight from that feed and collate it on their own sites.
HOW CAN IT BE PREVENTED?
When you set up an RSS feed for your blog you can set it to send a full or a partial feed to services such as Bloglovin'. I've changed my Bloglovin' feed to a partial feed - it only sends data up to the 'Read More' break in each blog post, and that makes it more trouble than it's worth for scrapers. You can change this in your Blogger or WordPress settings. You can also alter your Feedburner feed so it only partially syndicates your blog posts for email subscribers. You do this in Optimize - Summary Burner. I've set mine to 1000 characters, which is enough to describe the post to come.
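For the curious, here's a rough Python sketch of the idea behind a partial feed. It's only an illustration, not how Blogger, WordPress or Feedburner actually do it. The function name, the `<!--more-->` marker and the sample post are all assumptions made up for this example; the 1000 character limit is the one mentioned above.

```python
def make_feed_summary(post_text, marker="<!--more-->", max_chars=1000):
    """Return the part of a post that a partial feed might send out.

    `marker` stands in for the 'Read More' break; the real marker
    depends on your blogging platform (assumption for this sketch).
    """
    # If the post has a 'Read More' break, keep only what comes before it.
    head, found, _ = post_text.partition(marker)
    summary = head.strip() if found else post_text.strip()

    # Short enough already: send it as it is.
    if len(summary) <= max_chars:
        return summary

    # Otherwise cut at the character limit, backing up to the last
    # space so a word isn't chopped in half.
    cut = summary[:max_chars]
    if " " in cut:
        cut = cut.rsplit(" ", 1)[0]
    return cut.rstrip() + "..."


# Hypothetical post: only the intro before the break reaches the feed.
post = "Intro paragraph about my new recipe.<!--more-->The full recipe goes here."
print(make_feed_summary(post))  # Intro paragraph about my new recipe.
```

A scraper reading a feed built like this only ever gets the teaser, so the full post stays on your own site.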
WHAT CAN YOU DO IF IT HAPPENS TO YOU?
Firstly, contact the scraper directly and tell them you want them to remove all of your content at once and never take content from your site again. Hopefully that'll be enough to end the matter. This is what I did with my second scraper. It took a couple of emails but they got the hint soon enough. If this doesn't work, file a DMCA (Digital Millennium Copyright Act) takedown notice with Google (for Blogger blogs) or WordPress. This should get the offending page taken down from Google. This is what happened with my first incident of scraping. A bunch of us got together and reported the site and it was taken down. You can also do a WHOIS lookup to find out who their web host is and report them for copyright infringement, which could get their site closed down or at least hurry along the removal of your work from their site.
HOW DOES IT AFFECT SEO?
Google likes original content. If it sees your content duplicated elsewhere, it may assume you're a spammer and penalise your site. If you're a blogger, that's bad news. All the hard work you've put into your blog over the years to create original content can be temporarily undermined by web scraping thieves. This is why it's so important to get your duplicated content off scraper sites as soon as possible. If you are penalised by Google you may notice a drop in traffic. Here's an article which explains more about Google sanctions and how to have them lifted.
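If you want a quick way to gauge how much of one of your posts has ended up on a suspect page, a small script can help. The Python sketch below compares overlapping runs of five words between the two texts. To be clear, this is not how Google detects duplicates; the function names, the run length of five and the sample texts are all assumptions made up for illustration, and you'd need to paste in the real text yourself.

```python
import re


def shingles(text, size=5):
    """Split text into overlapping runs of `size` words, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(my_post, their_page, size=5):
    """Fraction of my post's word runs that also appear on their page."""
    mine = shingles(my_post, size)
    if not mine:
        # Post is shorter than one run of words: nothing to compare.
        return 0.0
    return len(mine & shingles(their_page, size)) / len(mine)


# Hypothetical texts: the scraped page wraps my post in its own filler.
my_post = ("Content scraping is when someone takes your posts "
           "and republishes them as their own.")
scraped_page = "Welcome to my site! " + my_post + " Buy stuff here."
unrelated_page = "Today I baked a lemon cake and it was lovely and moist."

print(copied_fraction(my_post, scraped_page))    # 1.0
print(copied_fraction(my_post, unrelated_page))  # 0.0
```

A result close to 1.0 suggests the page reproduces your post more or less word for word, which is a useful detail to include when you report it.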
Have you ever been scraped? If it happens to you, feel free to contact me on Twitter for advice. Link below.
Thanks for reading.