What is a bot?
A bot is a software application that runs automated tasks over the Internet. Typically, bots perform tasks that are both simple and structurally repetitive, at a much higher rate than would be possible for a human alone.
Good Bots vs. Bad Bots
Good bots:
- Web crawling – Google, Bing, and other search engines employ bots to crawl and index the web. They read sites to check their content for updates and keyword relevancy.
- R2D2, C3PO, RoboCop, Autobots, The Terminator in T2
Bad bots:
- Spammers – bots that crawl the web looking for places to leave spam comments.
- Scrapers – bots that crawl the web looking for content to steal and repost.
- DDoS – bots that request the website over and over to slow down or overwhelm the servers.
- Decepticons, Ultron, The Terminator in The Terminator
Recognizing Bot Traffic
You will generally not notice the good bots. They do not register in analytics and they do not slow down your site. Bad bots are supposed to be kept out of your reports as well, because Google Analytics is designed to filter out non-human visitors. Some get through, however, and then have to be filtered out manually by website administrators. You may also notice spam coming from pages that have a form without a CAPTCHA (an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”). A CAPTCHA is a type of challenge-response test used in computing to determine whether or not the user is human.
Detecting and Filtering Bot Traffic
To detect bot traffic you need to know how Google Analytics categorizes traffic:
- Organic Search is people who searched a term in a search engine and then clicked a link on the search results page. This is typically the largest source of traffic.
- Paid Search is people who searched a term and then clicked an ad on the search results page. This is also known as PPC Traffic.
- Referral Traffic is visitors who clicked a link to your site on a non-search page. This may come from a manufacturer site or something like Yelp.
- Direct Traffic is people typing “yoursite.com” directly into their browsers or clicking a bookmark in their browser.
- Email Traffic comes from clicking a link in an email advertisement.
Bot traffic is typically categorized as Direct Traffic in Google Analytics. Monitoring for a spike in Direct Traffic (as shown below) will show you when your site is being read by a bot.
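Watching for a spike in Direct Traffic can also be done on exported numbers. The following is a minimal sketch, not a Google Analytics feature: it assumes you already have a list of daily Direct Traffic session counts, and the function name, the 7-day window, and the 3x factor are all illustrative choices.

```python
from statistics import median

def find_direct_traffic_spikes(daily_sessions, window=7, factor=3.0):
    """Return the indices of days that look like a spike.

    A day counts as a spike when its session count is more than
    `factor` times the median of the previous `window` days.
    Both thresholds are assumptions to tune for your own site.
    """
    spikes = []
    for i in range(window, len(daily_sessions)):
        baseline = median(daily_sessions[i - window:i])
        if baseline > 0 and daily_sessions[i] > factor * baseline:
            spikes.append(i)
    return spikes

# Hypothetical daily Direct Traffic sessions; day 8 is the outlier.
daily_direct = [40, 42, 38, 45, 41, 39, 44, 43, 400, 41]
print(find_direct_traffic_spikes(daily_direct))  # [8]
```

The median is used as the baseline so that a single earlier spike does not inflate it and hide the next one.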
Identifying Bot Traffic
It is possible that your site suddenly became popular. However, much like high school, it is unlikely. Here are a few characteristics of bot traffic that normal traffic will not have. You can look at these three metrics after a spike in direct traffic to identify bot traffic.
- Time on site will be close to zero.
- Bounce rate will be abnormally low. Bots often hit multiple pages.
- Most bot visits will show as new sessions (more than 90%).
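The three characteristics above can be combined into a rough check. This is only a sketch under stated assumptions: the `TrafficSegment` type and its field names are hypothetical, the 90% new-session figure comes from the list above, and the cut-offs for "close to zero" time on site and "abnormally low" bounce rate are placeholder values you would need to adjust.

```python
from dataclasses import dataclass

@dataclass
class TrafficSegment:
    avg_time_on_site_s: float  # average time on site, in seconds
    bounce_rate: float         # fraction between 0 and 1
    new_session_rate: float    # fraction between 0 and 1

def looks_like_bot_traffic(seg, max_time_s=5.0, max_bounce=0.10, min_new=0.90):
    """Return True when all three bot characteristics are present.

    max_time_s and max_bounce are assumed thresholds, not values
    published by Google Analytics.
    """
    return (
        seg.avg_time_on_site_s <= max_time_s
        and seg.bounce_rate <= max_bounce
        and seg.new_session_rate > min_new
    )

# Hypothetical numbers for the traffic behind a Direct Traffic spike.
spike = TrafficSegment(avg_time_on_site_s=1.0, bounce_rate=0.02, new_session_rate=0.97)
normal = TrafficSegment(avg_time_on_site_s=95.0, bounce_rate=0.45, new_session_rate=0.60)
print(looks_like_bot_traffic(spike), looks_like_bot_traffic(normal))  # True False
```

Requiring all three conditions keeps the check conservative: a real burst of new visitors may match one of them, but is unlikely to match all three.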
Filtering Bot Traffic
Using filters in Google Analytics, you can remove bots from your reported traffic. This keeps your reporting focused on the customers who actually visit your website. Blocking the bots from loading your website at all would have to be done on the server side and is not recommended.
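Google Analytics filters are set up in its admin interface rather than in code, but the idea of an exclude filter can be illustrated on exported data. The sketch below is an offline analogue only: the record layout, the `source` field, and the `spam-bot.example` pattern are hypothetical.

```python
import re

def exclude_matching(sessions, field, pattern):
    """Return the sessions whose `field` does not match `pattern`.

    Mirrors the idea of an exclude filter: matching rows are dropped
    from reporting, while the site itself still serves every visitor.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    return [s for s in sessions if not rx.search(str(s.get(field, "")))]

# Hypothetical exported session records.
sessions = [
    {"source": "google", "pages": 3},
    {"source": "spam-bot.example", "pages": 12},
    {"source": "(direct)", "pages": 1},
]
clean = exclude_matching(sessions, "source", r"spam-bot\.example")
print([s["source"] for s in clean])  # ['google', '(direct)']
```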