Wordfence is a security plugin for WordPress that adds firewall features to the publishing platform. In its basic form, Wordfence protects WordPress from foes and “friends,” and in advanced use it adds amalgamated threat protection and selective international access. I’ve been using this product for a while, and part of what I’m going to reveal is anecdotal--the result of volunteering for a busy website that is not only extra large (150GB-plus), but also wasn’t maintained very well for almost a decade. By sheer luck, the website had suffered only minor issues before I started volunteering my time, and it was never the victim of ransomware.
The WordPress-based website content grew along the lines of the Alice’s Restaurant Massacree posit (when you have a huge spot, you’re not motivated to take out the trash). Daily content published on the site included MP3s, podcasts, and other interesting bits.
This site, like many WordPress sites, was on a shared host system. Shared systems are great for those who don’t want to maintain the host operating system, or simply don’t care to take responsibility for maintenance. In my example, there were at least 400 other logons on the same host shared by the website, along with lots of "dirt" and "junk" on the site itself. The attack surface was huge, and the many people who had worked on the site over the decade it had been online knew the passwords.
Slowly but surely, the site started to slog and stall with too much traffic, huge uploads and downloads, and the incredible weight of thirsty crawlers, spiders and podcast-harvesters. The site would get slow enough to lock out admins, editors and internal updaters. During peak periods, the site would go 503--preventing legitimate public use of the site as crawlers sucked down changes.
Clearly, some maintenance was in order. The breadth of the content of the site couldn’t be shaved or cut down without a lot of work, and finding people to do this for a volunteer-run not-for-profit was difficult. I decided to step in, and used Wordfence to help me mitigate a decade of neglect.
Block and Tackle
The Wordfence security plugin for WordPress has a nearly perfect method of blocking crawlers and bots, and allows administrators to find sites that have been attempting hacks. One Wordfence issue is that it doesn’t allow total blocks, as it exempts Facebook. Although some referrals come from links clicked within Facebook, Facebook also runs its own crawlers. For reasons unknown, Wordfence totally refuses to block them.
Personal experience says that Wordfence will block everything else that an administrator desires in the current release. Google? MSN? BingBot? You can block them by IP address, although these and other crawlers and bots have many IP addresses available, and each one needs to be blocked individually if this is your goal.
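Because a single crawler can arrive from many addresses, it can help to think in terms of address ranges rather than one IP at a time. What follows is a minimal Python sketch of that idea, not Wordfence's own code or API. The names BLOCKED_NETWORKS and is_blocked are hypothetical, and the ranges are reserved documentation addresses, not the real ranges of any crawler.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges (RFC 5737),
# used here only as placeholders; they are not real crawler addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(is_blocked("192.0.2.77"))   # True: inside 192.0.2.0/24
    print(is_blocked("203.0.113.5"))  # False: not in any listed range
```

Matching on a range means one entry covers every address in it, which is the difference between dozens of hours of individual blocks and a short list.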
Many administrators reading this are currently shrieking--thinking that their business depends on ranking in Google and MSN/Bing, and on the crawlers that aid in monetizing a site or otherwise attracting users. That's true for a lot of sites, but there are many that don't actively seek to monetize their sites, and don’t want to become part of someone else’s money-making opportunity. This is the case with the site I volunteer for--this NFP isn’t interested in AdWords, doesn’t take money through purchases (only donations) and has no interest in becoming part of an income scheme.
And so, to free up the site, I started blocking crawlers and bots diligently. It took some time--dozens of hours--to block them (all except Facebook, of course). Wordfence displays the site visit activity divided by All, Humans, Bots, Crawlers, Google Crawlers and Pages Not Found, and then into Blocked categories. The Blocked categories are important, because accidental blockings are possible as traffic rapidly scrolls by the eyes of the blocker.
Automation of this activity is part of the Premium version of Wordfence, which is not inexpensive. One very helpful component of the Premium edition of the security plugin is the ability to geo-fence a site internationally. The website I’m using as an example benefits only a narrow geography. For example, there is no reason why a Ukrainian website should be interested in its content, especially when the Ukrainian website is attempting to log in as administrator to the WordPress site in a crack attempt.
Along with international geo-fencing comes the ability to slow down any crawler or human that does too much, too quickly. Administrators can slow them down, or decide that if they misbehave past a threshold over a period of time, to actively and permanently block the offender’s IP address. This feature is a bit tricky because well-intentioned humans sometimes do things very quickly and mistakenly, resulting in an administrative block that then generates a support contact. It also means you can throttle any connection, human or bot, that’s dominating the resources of your website host without actively thinking about it (or, as I did for a while, watching site stress and correlating it to crawler and bot blocking).
And this is my only other problem with Wordfence--that it sometimes has trouble distinguishing between actual fast humans and bots. Some humans surf a website in ways that make them look like bots, but they’re just silly humans making mistakes. Wordfence, in its nervousness, sometimes classifies these humans as bots, yet at other times it very accurately picks out bots run by people on otherwise innocuous networks such as Comcast or Verizon.
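To make the throttle-then-block idea concrete, here is a rough Python sketch of a sliding-window rate limiter. It is an illustration under my own assumptions, not how Wordfence implements the feature: the RateLimiter class, its parameter names and the "allow", "throttle" and "block" verdicts are all hypothetical. It also shows why fast humans get caught, since the limiter sees only request timing and nothing about who is clicking.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Hypothetical sliding-window limiter: throttle first, then block."""

    def __init__(self, max_requests, window_seconds, strikes_before_block):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.strikes_before_block = strikes_before_block
        self.requests = defaultdict(deque)  # ip -> timestamps of allowed requests
        self.strikes = defaultdict(int)     # ip -> number of times over the limit
        self.blocked = set()                # ips blocked for good

    def check(self, ip, now):
        """Return 'allow', 'throttle' or 'block' for a request at time `now`."""
        if ip in self.blocked:
            return "block"
        recent = self.requests[ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            self.strikes[ip] += 1
            if self.strikes[ip] >= self.strikes_before_block:
                self.blocked.add(ip)
                return "block"
            return "throttle"
        recent.append(now)
        return "allow"


if __name__ == "__main__":
    limiter = RateLimiter(max_requests=2, window_seconds=60, strikes_before_block=2)
    for second in range(4):
        print(second, limiter.check("192.0.2.10", now=float(second)))
```

In this sketch the strike count never resets, so a permanent block sticks until an administrator removes it. A more forgiving setup would expire strikes after a while, which is the trade-off between catching bots and locking out hasty humans.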
Amalgamated blocking behavior is perhaps the most interesting feature of Wordfence, in that newly detected zero-day malware sources are pushed to each protected website automatically. This provides umbrella-like protection to WordPress sites covered by the optional premium protection.
Wordfence comes in two editions: free and premium. Many casual sites will be able to make do with the free edition. The premium edition is suitable for production sites that need to be up and actively blocking bad guys, and for those whose owners rate uptime as important.