OnCrawl was pleased to host Bill Hartzer for a webinar on log file analysis and why it matters for SEO audits on September 25th. He crawled his own website to demonstrate the effect of optimizations on bot activity and crawl frequency.
Introducing Bill Hartzer
Bill Hartzer has more than 20 years of experience as an SEO consultant and domain name expert. Bill is a world-renowned specialist in his industry, and he was recently profiled on CBS News as one of the country’s top search gurus.
During this hour-long webinar, Bill shows us his log files and discusses how he uses them in the context of a site audit. He discusses the various technologies he used to test site performance and bot behaviour on his site.
Finally, Bill responds to queries about how to use OnCrawl to visualise meaningful findings and offers advice to other SEOs.
How To Use The cPanel Plugin For WordPress To Read Your Log Files
If you use WordPress and the cPanel plugin, you can access your server logs directly from the WordPress interface.
Go to Metrics and then to Raw Access. You can download daily log files from the file manager, as well as compressed archives of earlier log files, from that location.
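The compressed archives can be inspected without fully unpacking them. As a rough sketch (the filename and log contents below are hypothetical stand-ins for what cPanel produces):

```python
import gzip

# Create a tiny stand-in for a compressed cPanel log archive
# (hypothetical filename and contents, for illustration only)
sample = '66.249.66.1 - - [25/Sep/2019:10:00:00 +0000] "GET / HTTP/1.1" 200 5120\n'
with gzip.open("sample-access.log.gz", "wt") as f:
    f.write(sample * 3)

# Read the archive back without permanently decompressing it
with gzip.open("sample-access.log.gz", "rt") as f:
    lines = f.readlines()

print(len(lines))  # number of requests recorded in the archive
```

The same pattern works on the daily archives downloaded from the cPanel file manager: open them in text mode and iterate line by line.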
Investigate The Contents Of A Log File
A log file is a large text file containing a record of every visit to your website, including visits from bots. You can open it with any simple text editor, free of charge.
Visits from bots such as Googlebot or Bingbot are easy to recognise because they identify themselves in the user agent recorded in the log files. However, user agent strings can be spoofed, so it’s a good idea to validate bot identification using IP lookups.
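To make this concrete, here is a minimal sketch of pulling the audit-relevant fields out of a single log entry, assuming the common Apache “combined” log format (the entry itself is hypothetical). Recognising “Googlebot” in the user agent is only the first step; full validation would then reverse-resolve the IP to confirm it belongs to Google.

```python
import re

# One hypothetical entry in Apache "combined" log format
line = ('66.249.66.1 - - [25/Sep/2019:10:15:32 +0000] '
        '"GET /blog/ HTTP/1.1" 200 5120 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')

# Extract the fields an SEO audit cares about
pattern = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

m = pattern.match(line)
print(m.group("ip"), m.group("path"), m.group("status"))
print("Googlebot" in m.group("agent"))  # claimed identity, not yet verified
```

Your own server may use a slightly different log format, in which case the regular expression would need adjusting.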
Other bots that crawl your site but aren’t valuable to you may also be present. You can prevent these bots from visiting your website.
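One common way to do this is a robots.txt rule, sketched below with two example crawler names; note that robots.txt only deters well-behaved bots, and abusive crawlers have to be blocked at the server or firewall level instead.

```
# robots.txt — ask specific crawlers to stay away
# (only honoured by well-behaved bots; the names below are examples)
User-agent: MJ12bot
Disallow: /

User-agent: AhrefsBot
Disallow: /
```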
OnCrawl will analyse the raw statistics in your log files to provide you with a detailed picture of the bots that visit your website.
Using Your Log Files To Learn More About Crawl Stats
For comparison with the information in your log files, crawl statistics are available in the old version of Google Search Console under Crawl > Crawl Stats.
You should be aware that the data displayed in Google Search Console is not limited to Google’s SEO bots and, as a result, may be less relevant than the more precise information obtained by reviewing your log files.
Recent Reports Of Odd Crawling Activity
Bill examines three recent jumps in Google Search Console crawl numbers. These correspond to major site events that trigger increased crawl activity.
The September 7th rise in Google Search Console may appear unrelated to website activities at first. However, a glance at the log analysis in OnCrawl revealed some information:
Examining the log files lets us see the breakdown of the various bots Google uses to crawl pages. It is evident that the desktop Googlebot’s activity had fallen drastically before this date, and that this spike, unlike the other, smaller spikes, consisted almost entirely of hits by the mobile Googlebot on unique, previously indexed pages.
A 50% increase in organic traffic recorded in Google Analytics confirmed that this spike corresponds to the site’s switch to mobile-first indexing in early September, weeks before Google’s notification!
Changes To The Site’s URL Structure
Bill changed his URL structure in mid-August to make it more SEO-friendly.
Google Search Console recorded two huge increases immediately following this change, demonstrating that Google detects major site events and uses them as signals to recrawl the website’s URLs.
When we look at the breakdown of these hits in OnCrawl, we can see that the second spike is less a spike than a sustained increase: the high crawl rate on this website persists for several days.
Bill can validate that Google has picked up on the adjustments by analysing variances in crawl activity over the days after his revisions.
OnCrawl Reports And Features That Are Useful For Conducting A Technical Audit
SEO Active Pages and SEO Visits
OnCrawl analyses log file data to provide precise information on SEO Visits, or human visitors who arrive via Google SERP listings.
You can count visits or look at SEO active pages, which are individual pages on a website that receive organic traffic.
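A simplified sketch of how SEO visits and SEO active pages can be derived from log data: a hit counts as an SEO visit when the referrer is a Google SERP and the user agent is not a bot (the entries and the heuristic below are illustrative assumptions, not OnCrawl’s exact method).

```python
# Hypothetical log entries reduced to the fields we need
hits = [
    {"path": "/blog/post-1", "referrer": "https://www.google.com/", "agent": "Mozilla/5.0"},
    {"path": "/blog/post-1", "referrer": "-", "agent": "Googlebot/2.1"},
    {"path": "/about", "referrer": "https://example.org/", "agent": "Mozilla/5.0"},
]

def is_seo_visit(hit):
    # Simplified heuristic: Google referrer + human user agent
    return "google." in hit["referrer"] and "bot" not in hit["agent"].lower()

seo_visits = [h for h in hits if is_seo_visit(h)]
# SEO active pages: distinct pages that received at least one SEO visit
seo_active_pages = {h["path"] for h in seo_visits}
print(len(seo_visits), seo_active_pages)
```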
One issue to investigate as part of an audit is why some ranking pages don’t receive organic traffic (or, in other words, aren’t SEO active pages).
OnCrawl’s Fresh Rank metric, for example, provides critical information: the average number of days between the time Google first crawls a page and the time the page receives its first SEO visit.
Content promotion tactics and backlink development can help a new page gain traffic faster. Some pages on the audited site, such as blog posts promoted on social media, had a substantially lower Fresh Rank.
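The underlying calculation is straightforward. A minimal sketch, assuming you have extracted each page’s first crawl date and first organic visit date from your logs (the pages and dates below are hypothetical):

```python
from datetime import date

# Hypothetical per-page dates: (first Googlebot crawl, first organic visit)
pages = {
    "/blog/promoted-post": (date(2019, 8, 1), date(2019, 8, 4)),
    "/services": (date(2019, 8, 1), date(2019, 8, 31)),
}

def days_to_first_visit(first_crawl, first_seo_visit):
    # Per-page delay in days; a lower value means traffic arrived faster
    return (first_seo_visit - first_crawl).days

delays = {path: days_to_first_visit(c, v) for path, (c, v) in pages.items()}
average = sum(delays.values()) / len(delays)
print(delays, average)
```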
Bot Visits To Pages And Resources By Status Code
During an audit, it’s useful to keep notes on elements that need fixing, such as URLs that return error status codes to bots. Redirecting these URLs and eliminating internal links to them can yield immediate results.
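A quick way to build such a list from log data is to tally status codes per URL and flag the errors. A small sketch with hypothetical bot hits:

```python
from collections import Counter

# (path, status code) pairs extracted from hypothetical bot hits in a log file
bot_hits = [
    ("/", 200), ("/old-page", 404), ("/old-page", 404),
    ("/moved", 301), ("/blog/", 200),
]

# Overall breakdown of status codes served to bots
status_counts = Counter(status for _, status in bot_hits)

# URLs to flag for the audit: those returning errors (4xx/5xx) to bots
error_urls = sorted({path for path, status in bot_hits if status >= 400})
print(status_counts, error_urls)
```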
Custom Reports In Data Explorer
The OnCrawl Data Explorer provides quick filters to generate reports that may be of interest to you, but you can also generate your own reports depending on the criteria that are of interest to you. For example, you might wish to survey SEO Active Pages that have a high bounce rate and a long load time.
Active Orphan Pages In Data Explorer Reports
OnCrawl can help you find pages with organic, human visitors that don’t necessarily provide value to your site by combining analytics, crawl, and log file data. The benefit of using log file data is that it allows you to discover every page on your site that has been visited, including pages that may not have Google Analytics code on them.
Bill was able to identify organic SEO visits to RSS feed pages, most likely arriving via external links. These are orphan pages on his website: no “parent” page links to them. These pages add no value to his SEO strategy, although they do garner a few visits from organic traffic.
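Conceptually, active orphan pages fall out of a set difference between the two data sources: pages seen in the logs minus pages reachable through internal links in the crawl. A sketch with hypothetical URLs:

```python
# Crawl data: pages reachable through the site's internal links
linked_pages = {"/", "/blog/", "/blog/post-1"}

# Log data: pages that actually received visits
visited_pages = {"/", "/blog/post-1", "/feed/", "/category/news/feed/"}

# Active orphan pages: visited, but no parent page links to them
orphans = visited_pages - linked_pages
print(sorted(orphans))
```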
These Are Excellent Places To Begin Optimising
Keyword Ranking Search Analytics
Google Search Console can be used to obtain ranking data. You may check Clicks, Impressions, CTR, and Positions for the previous 90 days directly in the old version of Google Search Console by going to Search Traffic, then Search Analytics.
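That data can be exported as a CSV for further analysis. A small sketch of reading such an export and recomputing CTR from the raw counts (the queries and figures below are made up, and the column names assume the old Search Analytics export format):

```python
import csv
import io

# A hypothetical Search Analytics CSV export
export = """Query,Clicks,Impressions,CTR,Position
domain names,120,2400,5%,3.2
seo audit,40,4000,1%,8.7
"""

rows = list(csv.DictReader(io.StringIO(export)))
for row in rows:
    # Recompute CTR from the raw counts rather than trusting the rounded column
    ctr = int(row["Clicks"]) / int(row["Impressions"])
    print(row["Query"], round(ctr * 100, 2), "%")
```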
OnCrawl generates clear reports on how this information relates to the overall site, allowing you to compare the total number of pages on the site, the number of ranking pages, and the number of clicks received by each page.
Impressions, Clicks, And CTRs
Site segmentation enables you to confirm which sorts or groups of pages on your site are ranking and on which page of the results at a single glance.
Bill uses OnCrawl’s statistics in this audit to identify the types of pages that tend to rank well. These are the kinds of pages he knows he should keep creating in order to attract traffic to the website.
Clicks on ranking pages are strongly correlated with ranking position: positions greater than 10 are no longer on the first page of search results, at which point the number of clicks for most keywords drops precipitously.
Segmentation Of A Website
OnCrawl’s segmentation allows you to organise your pages into relevant groups. While there is an automatic segmentation, you can adjust the filters or create your own segmentations from scratch. You can use the OnCrawl Query Language filters to include or exclude pages from a group based on a variety of criteria.
The segmentation on the website Bill looks at in the webinar is based on the numerous directories on the website.
Structured pages > crawled pages > ranking pages > active pages
The “Pages in structure > crawled > ranked > active” table in the OnCrawl Ranking report can alert you to issues with having your pages ranked and visited.
This graph shows:
Pages in the structure: pages accessible via your website’s internal links.
Crawled pages: pages that have been crawled by Google.
Ranked pages: pages that appear in Google’s SERPs.
Active pages: pages that have received organic visits.
Your audit will investigate the reasons for the differences between the bars in this graph.
Some differences between the number of pages in the structure and the number of pages crawled may be intentional, such as when you prevent Google from crawling specific pages by disallowing them in the robots.txt file. This is something you should verify during your audit.
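The graph describes four nested sets, and the gaps between adjacent bars are exactly what the audit digs into. A sketch with hypothetical URLs:

```python
# The four nested sets behind "pages in structure > crawled > ranked > active"
structure = {"/", "/blog/", "/blog/post-1", "/blog/post-2", "/private/"}
crawled = {"/", "/blog/", "/blog/post-1", "/blog/post-2"}
ranked = {"/", "/blog/post-1"}
active = {"/"}

# Differences between adjacent bars are what the audit investigates
not_crawled = structure - crawled       # intentionally blocked in robots.txt?
crawled_not_ranked = crawled - ranked   # thin or duplicate content?
ranked_not_active = ranked - active     # low position, weak titles/snippets?
print(not_crawled, crawled_not_ranked, ranked_not_active)
```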
The webinar’s key takeaways include:
Large changes in the structure of a website can result in significant changes in crawl activity.
Google’s free tools present data that has been aggregated, averaged, or rounded in ways that may appear erroneous.
Log files enable you to see actual bot behaviour as well as genuine visits. They are a great technique for detecting spikes when combined with crawl data and daily monitoring.
Accurate data is required to understand what happened and why, which can only be accomplished through cross-analysis of analytics, crawl, rankings, and, specifically, log file data in a single solution.