Hamlet Batista – Search Engine Land

Catching SEO errors during development using automated tests

Avoid the high cost of catching SEO issues in production by using automated testing techniques during development.

Last June I had the pleasure of presenting at SMX Advanced on one of my favorite topics: improving the collaboration between SEOs and developers.

While my session was about JavaScript for SEO, I took the opportunity to introduce a practice that I think can solve a painful business problem: the high cost of catching SEO issues in production when they could have been caught during development using automated testing techniques.

How often do you learn that a noindex meta robots tag was released to production on the wrong pages, causing a massive SEO traffic drop?

Let’s learn how we can prevent this error and similar ones from happening in the first place.

Automated testing in professional development

Modern professional developers need to add new features or fix bugs at a fast pace and often rely on automated testing to keep their code quality high.

During my session, I mentioned this as a perfect place to catch some SEO errors early, before the damage becomes too expensive.

In this article, we are going to explore this concept in detail, review some practical examples and outline the responsibilities of the developer and the SEO.

The anatomy of the front end of a modern web application

The front end of modern web applications is generally built in a modular way using controllers, views, and components.

Controllers route page requests to the correct view of the app and the views are what you see when the page loads.

The views are further broken down into components. For example, in a search page, the grid of search results could be powered by one component.

These components can be rendered on the server side, on the client side or on both, as is the case with hybrid rendering solutions.

SEO scope

It is important to understand these concepts because not every app controller, view or component requires SEO input or automated tests.

One way to tell is to ask whether the component’s functionality should be visible to search engine crawlers.

For example, all components or actions behind a login form are not in the scope of SEO because search engine crawlers can’t see them.

The different types of automated tests

Automated testing is a broad topic, but when it comes to SEO concerns, there are two main types of automated tests we need to learn about: unit tests and end-to-end tests.

Developers generally write unit tests to perform individual component and method level checks. The idea is to verify each part of the application works as expected separately and in isolation.

However, while the individual parts can operate correctly, they could fail when put to work together. That is where integration tests (a.k.a. end-to-end tests) come into play. They test that the components work together too.

We should write both types of tests to check for SEO issues during development.

Let’s review some practical examples.

Writing SEO unit tests

In preparation for my presentation, I coded an AngularJS app that monitors Google Trends topics. I focused on trying to optimize it for basic SEO best practices.

In Angular, we can use Jasmine to write unit tests. Let’s review what unit tests look like and what we can do with them.

As an example, let’s look at the Category Topics component in our app, which is responsible for listing the Google Trends topics for a selected category. 

We added these unit tests to check for basic SEO tags.
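
Here is a sketch of what those tests could look like. The component name, its use of Angular’s built-in Title and Meta services, and the expected values are illustrative assumptions, not the app’s exact code:

```typescript
import { TestBed, ComponentFixture } from '@angular/core/testing';
import { Title, Meta } from '@angular/platform-browser';
import { CategoryTopicsComponent } from './category-topics.component';

describe('CategoryTopicsComponent SEO tags', () => {
  let fixture: ComponentFixture<CategoryTopicsComponent>;
  let title: Title;
  let meta: Meta;

  beforeEach(() => {
    TestBed.configureTestingModule({ declarations: [CategoryTopicsComponent] });
    fixture = TestBed.createComponent(CategoryTopicsComponent);
    title = TestBed.get(Title);
    meta = TestBed.get(Meta);
    fixture.detectChanges(); // let the component set its SEO tags
  });

  it('sets a page title', () => {
    expect(title.getTitle()).toBeTruthy();
  });

  it('sets a meta description', () => {
    const tag = meta.getTag('name="description"');
    expect(tag).toBeTruthy();
    expect(tag!.content.length).toBeGreaterThan(0);
  });

  it('sets a canonical URL', () => {
    const link = document.querySelector('link[rel="canonical"]');
    expect(link).toBeTruthy();
    expect(link!.getAttribute('href')).toContain('/category/');
  });
});
```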

The tests above make sure the component sets proper canonical URLs, page titles and meta descriptions. 

You could easily extend this list to include other meta tags like meta robots and hreflang tags.

After you write tests like these, you generally need to execute them every time you update the app.

Here is how you run them using Jasmine: in Angular, you type the command ng test, and the test runner reports each passing or failing spec.

As developers add new features to the website or app and then run the tests, they can get immediate feedback when they forget to add important SEO tags or introduce incorrect ones.

Part of your ongoing work as an SEO is to make sure new relevant components are covered by unit tests.

Writing SEO integration tests

Next, let’s review some of the integration tests that I coded for our app so you can see what they look like.

In Angular, we can use Protractor to run end-to-end tests.

You might be wondering why we need two different tools to run automated tests.

End-to-end tests run exclusively in a web browser: we automate the browser so it performs the scripted actions we specify. This is very different from unit testing, where we run just the specific back-end or front-end code that we are testing.

If we look at our example app’s category topics page, you can see we added end-to-end tests to check for prerendering issues.
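
Here is a sketch of what those Protractor specs could look like (the route and the expected values are illustrative assumptions):

```typescript
import { browser, by, element } from 'protractor';

describe('Category topics page SEO tags (after rendering)', () => {
  beforeEach(async () => {
    await browser.get('/category/technology'); // assumed route
  });

  it('renders a non-empty page title', async () => {
    expect(await browser.getTitle()).toBeTruthy();
  });

  it('renders a meta description', async () => {
    const description = element(by.css('meta[name="description"]'));
    expect(await description.getAttribute('content')).toBeTruthy();
  });

  it('renders a canonical link', async () => {
    const canonical = element(by.css('link[rel="canonical"]'));
    expect(await canonical.getAttribute('href')).toContain('/category/');
  });
});
```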

The example tests check that our basic SEO tags work correctly after the page is rendered. This kind of test requires loading the page in the browser and waiting for the JavaScript code to execute.

One simple check we added was to make sure that key meta tags like the title and meta description didn’t come back null after rendering. Another test would be to check that the server-side tags and the client-side rendered tags don’t differ, as a mismatch could cause cloaking issues.

Here is how you run them using Protractor: in Angular, you type the command ng e2e.

Prerendering JavaScript-based sites can lead to SEO issues that are hard to detect in production. Robust integration tests can provide a strong first line of defense.

Continuous integration

I didn’t cover this topic during my talk, but it is worth mentioning. Most development teams that write automated tests also implement a technique called continuous integration.

Continuous integration allows developers to push their code changes to a code repository and have each commit trigger a suite of automated tests. If the tests pass, the code is packaged for release and deployed automatically. But, if any of the tests fail, the packaging and release pipeline is halted.

Some continuous integration tools, like CircleCI, require you to add a simple test definitions file to your code repository and add the project to their service; they will then run all the automated tests, handle the deployment pipeline and provide reporting.
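
For illustration, a minimal CircleCI configuration that runs both test suites on every commit could look like the sketch below; the Node image and npm scripts are assumptions you would adapt to your project:

```yaml
version: 2.1
jobs:
  test:
    docker:
      - image: circleci/node:12-browsers  # browsers are needed for Karma and Protractor
    steps:
      - checkout
      - run: npm ci
      - run: npm run test -- --watch=false  # unit tests (Jasmine)
      - run: npm run e2e                    # end-to-end tests (Protractor)
workflows:
  build-and-test:
    jobs:
      - test
```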

As an SEO practitioner, you could ask your dev team for access so you can review failing SEO tests and check SEO test coverage to recommend any missing tests.

Shared responsibilities

Catching SEO errors during development can save companies a lot of money and headaches, and it is a shared responsibility between developers and technical SEOs.

I created these two tables to help define some of the different responsibilities for unit tests and integration tests.

Resources to learn more

I used Angular examples, but automated testing is an established discipline in professional development. You can find equivalent tools and processes in most frameworks.

Here are a few to investigate further for your specific dev stack.

PWA: How to avoid partial rendering issues with service workers

When issues prevent pages from rendering fully and correctly server side, the rendered content can have discrepancies from what is shown to end users (or search bots).

In preparation for my upcoming SMX Advanced session about The New Renaissance of JavaScript, I decided to code a progressive web app and try to optimize it for SEO. In particular, I was interested in reviewing all key rendering options (client side, server side, hybrid and dynamic) from a development/implementation perspective.

I learned six interesting insights that I will share during my talk. One of the insights addresses a painful problem that I see happening so often that I thought it was important to share it as soon as possible. So, here we go.

How partial rendering kills SEO performance

When you need to render JavaScript server side, there is a chance that the page content won’t be fully rendered. Let’s review a concrete example.

The category view-all page from the AngularJS site I tested hadn’t finished loading all its product images after 20 seconds. In my tests, it took about 40 seconds to load fully.

Here is the problem with that: rendering services won’t wait forever for a page to finish loading. For example, Google’s dynamic rendering service, Rendertron, won’t wait more than 10 seconds by default.

View-all pages are generally preferred by both users and search engines when they load fast. But how do you load a page with over 400 product images fast?

Service workers to the rescue

Before I explain the solution, let’s review service workers and how they are applicable in this context. Detlev Johnson, who will be moderating our panel, wrote a great article on the topic.

When I think about service workers, I think about them as a content delivery network running in your web browser. A CDN helps speed up your site by offloading some of the website functionality to the network. One key functionality is caching, but most modern CDNs can do a lot more than that, like resizing/compressing images, blocking attacks, etc.

A mini-CDN in your browser is similarly powerful. It can intercept and programmatically cache the content from a PWA. One practical use case is that this allows the app to work offline. But what caught my attention is that, because a service worker operates separately from the main browser thread, it can also be used to offload the processes that slow page loading (and rendering) down.

So, here is the idea:

  1. Make an XHR request to get an initial list of products that returns fast (for example, page 1 of the full set).
  2. Register a service worker that intercepts this request, caches it, passes it through and makes subsequent requests in the background for the rest of the pages in the set, caching them all as well.
  3. Once all the results are loaded and cached, notify the page so that it gets updated.

The first time the page is rendered, it won’t get all the results, but it will get them on subsequent ones. Here is some code you can adapt to get this started.
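
Below is a simplified sketch of that idea. The API endpoint, the page parameter and the total page count are assumptions, and a production version would also need error handling, cache invalidation and event.waitUntil() for the background work:

```javascript
const CACHE_NAME = 'product-pages-v1';

self.addEventListener('fetch', (event) => {
  const url = new URL(event.request.url);
  if (url.pathname !== '/api/products') return; // assumed endpoint

  event.respondWith(
    caches.open(CACHE_NAME).then(async (cache) => {
      // Serve from cache when possible so repeat renders are fast and complete.
      const cached = await cache.match(event.request);
      if (cached) return cached;

      // Pass the first request through and cache its response.
      const response = await fetch(event.request);
      await cache.put(event.request, response.clone());

      // In the background, request and cache the rest of the pages in the set.
      const totalPages = 10; // assumption: read this from page 1's payload
      for (let page = 2; page <= totalPages; page += 1) {
        const pageUrl = `/api/products?page=${page}`;
        fetch(pageUrl).then((res) => cache.put(pageUrl, res));
      }

      // Once everything is cached, notify the page so it can update,
      // e.g., via self.clients.matchAll() and client.postMessage().
      return response;
    })
  );
});
```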

I checked the site to see if it was doing something similar, but sadly it isn’t.

This approach will prevent the typical timeouts and errors from disrupting the page rendering, at the cost of possibly some missing content during the initial page load. Subsequent page loads should have the latest information and load faster from the browser cache.

I checked Rendertron to see if this idea would be supported, and I found a pull request merged into their codebase that confirms support for the required feature.

However, as Google removed Googlebot from the list of bots supported in Rendertron by default, you need to add it back to get this to work.

Service worker limitations

When working with service workers and moving background work to them, you need to consider some constraints:

  1. Service workers require HTTPS.
  2. Service workers intercept requests at the “directory level” they are installed in. For example, /test/ng-sw.js would only intercept requests under /test/*, while /ng-sw.js would intercept requests for the whole site.
  3. The background work shouldn’t require DOM access. There is also no access to the window, document or parent objects.

Some example tasks that could run in the background using a service worker are data manipulation or traversal (like sorting or searching), as well as loading and generating data.

More potential rendering issues

More generally, when using hybrid or server-side rendering (using Node.js), some of the issues can include:

  1. XHR/Ajax requests timing out.
  2. Server overloaded (memory/CPU).
  3. Third party scripts down.

When using dynamic rendering (using Chrome), in addition to the issues above, you can also face:

  1. The browser failing to load.
  2. Images taking long to download and render.
  3. Longer latency.

The bottom line is that when you render pages server side and issues prevent full, correct rendering, the rendered content can have important discrepancies from the content shown to end users (or search bots).

There are three potential problems with this: 1) important content not getting indexed, 2) accidental cloaking and 3) compliance issues.

We haven’t seen any client affected by accidental cloaking, but it could be a risk. However, we see compliance issues often. One example of a compliance issue is the one affecting sites selling on Google Shopping: the information in the product feed needs to match the information on the website. Google uses the same Googlebot for organic search and Google Shopping, so something as simple as missing product images can cause ads to get disapproved.

Additional resources

Please note that this is just one example of the insights I will be sharing during my session. Make sure to stop by so you don’t miss out on the rest.

I found the inspiration for my idea in this article. I also found other useful resources while researching for my presentation, which I list below. I hope you find them helpful.

Developing Progressive Web Apps (PWAs) Course
JavaScript Concurrency
The Service Worker Lifecycle
Service Worker Demo

Brands can better understand users on third-party sites by using a keyword overlap analysis

These scripts can help analyze cross-site branded traffic with overlapping keywords to capture untapped audiences.

If you are a manufacturer selling on your own site as well as through retail partners, it is likely you don’t have visibility into who is buying your products beyond your own site, or why they buy. More importantly, you probably don’t have enough insights to improve your marketing messaging.

One technique you can use to identify and understand your users buying on third-party websites is to track your brand through organic search. You can then compare the brand searches on your site and on the retail partner’s site, see how big the overlap is, and see for how many of the overlapping keywords you rank above the retailer and vice versa. More importantly, you can see whether you are appealing to different audiences or competing for the same ones. Armed with these new insights, you could restructure your marketing messaging to unlock audiences you didn’t tap into before.

In previous articles, I’ve covered several useful data blending examples, but in this one, we will do something different. We will take a deeper dive into just one data blending example and perform what I call a cross-site branded keyword overlap analysis. As you will learn below, this type of analysis will help you understand your users buying from third-party retail partners.

The visualization we will put together in this article is a Venn diagram representing the number of overlapping keywords in organic search for the brand “Tommy Hilfiger” between the main brand site and Macy’s, a retail partner.

We recently had to perform this analysis for one of our clients, and our findings surprised us. We discovered that, with 60% of our client’s organic SEO traffic coming from branded searches, as much as 30% of those searches were captured by four retail partners that also sell their products.

Armed with this evidence and with the knowledge that selling through their retail partners still made business sense, we provided guidance on how to improve their brand searches so they can compete more effectively, and change their messaging to appeal to a different customer than the one that buys from the retailers.

After my team conducted this analysis manually and I saw how valuable it was, I set out to automate the whole process in Python so we could easily reproduce it for all our manufacturing clients. Let me share the code snippets I wrote and walk you through their use.

Pulling branded organic search keywords

I am using the Semrush API to collect the branded keywords from their service. I created a function to take their response and return a pandas data frame. This function simplifies the process of collecting data for multiple domains.
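
Here is a simplified sketch of such a function. It assumes the Semrush domain_organic report and a display_filter that keeps only phrases containing the brand; double-check the parameters and the returned column names against the current API documentation:

```python
import requests
import pandas as pd
from io import StringIO

API_KEY = "YOUR_SEMRUSH_API_KEY"  # placeholder

def get_branded_keywords(domain, brand, database="us", limit=10000):
    """Pull organic keywords containing the brand phrase for a domain
    and return them as a pandas data frame."""
    params = {
        "type": "domain_organic",
        "key": API_KEY,
        "domain": domain,
        "database": database,
        "display_limit": limit,
        "export_columns": "Ph,Po,Tr",          # phrase, position, traffic
        "display_filter": f"+|Ph|Co|{brand}",  # keep phrases containing the brand
    }
    response = requests.get("https://api.semrush.com/", params=params)
    response.raise_for_status()
    # The API returns semicolon-separated text, which pandas can parse.
    return pd.read_csv(StringIO(response.text), sep=";")
```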

Here is the code to get organic searches for “Tommy Hilfiger” going to Macy’s.
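
Assuming the helper above (the domain is an assumption):

```python
df_macys = get_branded_keywords("macys.com", "tommy hilfiger")
```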

Here is the code to get organic searches for “Tommy Hilfiger” going to Tommy Hilfiger directly.
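
And the same call against the brand’s own site (again, the exact domain is an assumption):

```python
df_tommy = get_branded_keywords("tommy.com", "tommy hilfiger")
```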

Visualizing the branded keyword overlap

After we pull the searches for “Tommy Hilfiger” from both sites, we want to understand the size of the overlap. We accomplish this in the following lines of code:
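
A sketch of that computation using Python sets ("Keyword" is the assumed column name in the Semrush export):

```python
macys_kws = set(df_macys["Keyword"])
tommy_kws = set(df_tommy["Keyword"])

common = macys_kws & tommy_kws      # keywords both sites rank for
only_tommy = tommy_kws - macys_kws  # unique to Tommy Hilfiger
only_macys = macys_kws - tommy_kws  # unique to Macy's

print(len(common), len(only_tommy), len(only_macys))
```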

We can quickly see that the overlap is significant, with 4,601 keywords in common, 515 unique to Tommy Hilfiger and 125 unique to Macy’s.

Here is the code to visualize this overlap as a Venn diagram.
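
One way to draw it is with the matplotlib-venn package (the library choice is an assumption; any Venn diagram library will do):

```python
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

# venn2 takes (only in A, only in B, in both) plus the set labels.
venn2(
    subsets=(len(only_tommy), len(only_macys), len(common)),
    set_labels=("Tommy Hilfiger", "Macy's"),
)
plt.show()
```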

Who ranks better for the overlapping keywords?

The most logical next question to ask is: given how significant the overlap is, who commands the higher rankings for those shared keywords? How can we figure this out? With data blending, of course!

First, as we learned in my first data blending article, we will merge the two data frames, and we will use an inner join to keep only the keywords common in the two sets.

When we merge data frames that share column names, pandas repeats the columns and appends _x to the ones from the first data frame and _y to the ones from the second. Since Macy’s data frame comes first, its columns end with _x.
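
The merge itself is a one-liner; Macy’s data frame goes first, so it takes the _x suffix:

```python
df_overlap = df_macys.merge(df_tommy, on="Keyword", how="inner")
```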

Here is how we create a new data frame with the overlapping branded keywords where Macy’s ranks higher.
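
A sketch, assuming "Position" is the ranking column (a lower number means a higher ranking):

```python
macys_higher = df_overlap[df_overlap["Position_x"] < df_overlap["Position_y"]]
```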

Here is the corresponding data frame where Tommy Hilfiger ranks higher.
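
And the mirror image for Tommy Hilfiger:

```python
tommy_higher = df_overlap[df_overlap["Position_y"] < df_overlap["Position_x"]]
```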

Here we can see that while the overlap is big, Tommy ranks higher for many more branded keywords than Macy’s (3,173 vs. 1,075). So, is Tommy doing better? Not quite!

As you remember, we also pulled traffic numbers from the API. In the next snippet of code, we will check which keywords are pulling more traffic.
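
A sketch, assuming "Traffic" is the column holding Semrush’s estimated visits:

```python
print(df_overlap["Traffic_x"].sum())  # traffic to Macy's from the common keywords
print(df_overlap["Traffic_y"].sum())  # traffic to Tommy Hilfiger from the same keywords
```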

Surprisingly, we see that while Macy’s ranks higher for fewer keywords than Tommy Hilfiger, when we add up the traffic, Macy’s attracts more visitors (75,026 vs. 66,415).

As you can see, sweating the details matters a lot in this type of analysis!

How different are the audiences?

Finally, let’s use the branded keywords unique to each site to learn about any differences in the audiences that visit each site. We will simply strip the branded phrase from the keywords and create word clouds to understand them better. When we remove the branded phrase “Tommy Hilfiger,” we are left with the additional qualifiers users add to indicate their intent.

I created a function to create and display the word clouds. Here is the code:
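
This sketch uses the wordcloud and matplotlib packages (the library choices are assumptions; the original function may differ):

```python
import matplotlib.pyplot as plt
from wordcloud import WordCloud

def show_word_cloud(keywords, brand="tommy hilfiger"):
    """Strip the brand phrase from each keyword and plot the remaining qualifiers."""
    qualifiers = " ".join(kw.lower().replace(brand, "").strip() for kw in keywords)
    cloud = WordCloud(width=800, height=400, background_color="white").generate(qualifiers)
    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")
    plt.show()

show_word_cloud(only_macys)  # qualifiers unique to Macy's
show_word_cloud(only_tommy)  # qualifiers unique to the brand site
```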

Here is the word cloud with the most popular words left after you remove the phrase “Tommy Hilfiger” from Macy’s keywords.

Here is the corresponding word cloud when you do the same for the Tommy Hilfiger ones.

The main difference I see is that people looking for Tommy Hilfiger products at Macy’s have specific products in mind, like boots and curtains, while on the brand site, people primarily have the outlets in mind. This might indicate that they intend to visit a store rather than purchase online. It may also indicate that people going to the brand site are bargain hunters, while the ones going to Macy’s might not be. These are very interesting and powerful insights!

Given these insights, Tommy Hilfiger could review the SERPs, compare the difference in messaging between Macy’s and the brand site, and adjust it to appeal to their unique audience’s interests.

5 additional data blending examples for smarter SEO insights

Once you preprocess columns to consistent formatting, additional data blending options include prioritizing pages with search clicks, mining internal site search for content gaps, analyzing traffic issues with 404 pages and more.

As I covered in my previous article, data blending can uncover really powerful insights that you would not be able to see otherwise.

When you start shifting your SEO work to be more data-driven, you will naturally look at all the data sources in your hands and might find it challenging to come up with new data blending ideas. Here is a simple shortcut that I often use: I don’t start with the data sources I have (bottom-up), but with the questions I need to answer, and then I compile the data I need (top-down).

In this article, we will explore five additional SEO questions that we can answer with data blending, but before we dive in, I want to address some of the challenges you will face when putting this technique into practice.

Tony McCreath raised a very important frustration you can experience when blending data:

When you join separate datasets, the common columns need to be formatted in the same way for this technique to work. However, this is hardly the case. You often need to preprocess the columns ahead of the join operation.

It is relatively easy to perform advanced data joins in Tableau, Power BI and similar business intelligence tools, but preprocessing the columns beforehand is where learning a little bit of Python pays off.

Here are some of the most common preprocessing issues you will often see and how you can address them in Python.

URLs

Absolute or relative. You will often find both absolute and relative URLs. For example, Google Analytics URLs are relative, while URLs from SEO spider crawls are absolute. You can convert in either direction.

Here is how to convert relative URLs to absolute:
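
For example, with urllib from the standard library (the base domain is an assumption):

```python
from urllib.parse import urljoin

base = "https://www.example.com"
relative_urls = ["/shoes/", "/shirts/?color=blue"]
[urljoin(base, url) for url in relative_urls]
# ['https://www.example.com/shoes/', 'https://www.example.com/shirts/?color=blue']
```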

Here is how to convert absolute URLs to relative:
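
And the reverse, keeping only the path and query string:

```python
from urllib.parse import urlparse

def to_relative(url):
    parts = urlparse(url)
    return parts.path + ("?" + parts.query if parts.query else "")

to_relative("https://www.example.com/shirts/?color=blue")
# '/shirts/?color=blue'
```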

Case sensitivity. Most URLs are case sensitive, but if a site is hosted on a Windows server, you will often find URLs with different capitalization that return the same content. You can convert them all to lowercase or uppercase.

Here is how to convert them to lowercase (or uppercase):
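
Using Python’s built-in string methods:

```python
urls = ["/Shoes/", "/SHOES/"]
[url.lower() for url in urls]  # ['/shoes/', '/shoes/']
[url.upper() for url in urls]  # ['/SHOES/', '/SHOES/']
```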

Encoding. Sometimes URLs come from a URL parameter of another source URL, and if they have query strings, they will be URL-encoded. When you extract the parameter value, the library you use might or might not decode it for you.

Here is how to decode URL-encoded URLs:
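
Again with urllib:

```python
from urllib.parse import unquote

unquote("/products%3Fcolor%3Dblue%20shirt")
# '/products?color=blue shirt'
```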

Parameter handling. If the URLs have more than one URL parameter, you can face some of these issues:

  1. You might have parameters with no values.
  2. You might have redundant/unnecessary parameters.
  3. You might have parameters ordered differently.

Here is how we can address each one of these issues.
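
A sketch that handles all three cases with urllib; the list of parameters worth keeping is an assumption:

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def normalize_params(url, allowed=("color", "size")):
    parts = urlparse(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query)  # 1. drops parameters with no values
        if key in allowed                         # 2. drops redundant/unnecessary parameters
    ]
    params.sort()                                 # 3. enforces a consistent ordering
    return urlunparse(parts._replace(query=urlencode(params)))

normalize_params("/shirts/?size=m&utm_source=x&color=blue&empty=")
# '/shirts/?color=blue&size=m'
```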

Dates

Dates can come in many different formats. The main strategy is to parse them from their source format into Python datetime objects. You can optionally manipulate the datetime objects, for example, to sort dates correctly or to localize them to a specific time zone. Most importantly, you can then format the dates using one consistent convention.

Here are some examples:
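
A sketch with two common source formats (Search Console dates come as YYYY-MM-DD, while the Google Analytics API returns YYYYMMDD):

```python
from datetime import datetime

gsc_date = datetime.strptime("2019-03-27", "%Y-%m-%d")  # Google Search Console
ga_date = datetime.strptime("20190327", "%Y%m%d")       # Google Analytics API

# Format both with the same convention before joining.
gsc_date.strftime("%Y-%m-%d")  # '2019-03-27'
ga_date.strftime("%Y-%m-%d")   # '2019-03-27'
```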

Keywords

Correctly matching keywords across different datasets can also be a challenge. You need to review the columns to see whether the keywords appear as users would type them or whether there has been any normalization.

For example, it is not uncommon for users to search by copying and pasting text. These keyword searches can include hyphens, quotes, trademark symbols and other characters that would not normally appear when typed. And when typing, spacing and capitalization might be inconsistent across users.

In order to normalize keywords, you need to at least remove any unnecessary characters and symbols, remove extra spacing and standardize on lowercase (or uppercase).

Here is how you would do that in Python:
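
A sketch of such a normalization function:

```python
import re

def normalize_keyword(keyword):
    keyword = re.sub(r"[^\w\s]", " ", keyword)  # strip quotes, hyphens, symbols
    keyword = re.sub(r"\s+", " ", keyword)      # collapse extra spacing
    return keyword.strip().lower()

normalize_keyword("“Tommy Hilfiger®”  Boots")
# 'tommy hilfiger boots'
```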

Now that we know how to preprocess columns, let’s get to the fun part of the article. Let’s review some additional SEO data blending examples.

Error pages with search clicks

You have a massive list of 404 errors that you pulled from your web server logs because Google Search Console doesn’t make it easy to get the full list. Now you need to redirect most of them to recover lost traffic. One approach is to prioritize the pages with search clicks, starting with the most popular ones!

Here is the data you’ll need:

Google Search Console: page, clicks

Web server log: HTTP request, status code = 404

Common columns (for the merge function): left_on: page, right_on: HTTP request.
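
A sketch of the blend, assuming df_gsc and df_log data frames built from the sources above:

```python
df_404 = df_log[df_log["status code"] == 404]
priority = df_gsc.merge(df_404, left_on="page", right_on="HTTP request", how="inner")
priority = priority.sort_values("clicks", ascending=False)  # most popular first
```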

Pages missing Google Analytics tracking code

Some sites choose to insert tracking codes manually instead of placing them in web page templates. This can lead to traffic underreporting due to pages missing tracking codes. You could crawl the site to find such pages, but what if the pages are not linked from within the site? One approach is to compare the pages in Google Analytics and Google Search Console over the same time period. Any pages in the GSC dataset that are missing from the GA dataset are potentially missing the GA tracking script.

Here is the data you’ll need:

Google Search Console: date, page

Google Analytics: ga:date, ga:landingPagePath, filtered to Google organic searches.

Common columns (for the merge function): left_on: page, right_on: ga:landingPagePath.
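
A sketch: a left join keeps every GSC page, and the rows with no GA match are the suspects:

```python
merged = df_gsc.merge(df_ga, left_on="page", right_on="ga:landingPagePath", how="left")
missing_tracking = merged[merged["ga:landingPagePath"].isna()]["page"]
```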

Excluding 404 pages from Google Analytics reports

One disadvantage of inserting tracking codes in templates is that Google Analytics pageviews can trigger when users end up on 404 pages. This is generally not a problem, but it can complicate your life when you are trying to analyze traffic issues and can’t tell which traffic is good (ending on actual page content) and which is bad (ending in errors). One approach is to compare pages in Google Analytics with pages crawled from the website that return a 200 status code.

Here is the data you’ll need:

Website crawl: URL, status code = 200

Google Analytics: ga:landingPagePath

Common columns (for the merge function): left_on: URL, right_on: ga:landingPagePath
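
A sketch: an inner join keeps only the GA landing pages the crawl confirmed return a 200:

```python
df_ok = df_crawl[df_crawl["status code"] == 200]
clean_ga = df_ok.merge(df_ga, left_on="URL", right_on="ga:landingPagePath", how="inner")
```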

Mining internal site search for content gaps

Let’s say that you review your internal site search reports in Google Analytics and find people coming from organic search and still performing one or more internal searches until they find their content. It might be the case that content pieces are missing that could drive those visitors directly from organic search. One approach is to compare your internal search keywords with the keywords from Google Search Console. The two datasets should use the same date range.

Here is the data you’ll need:

Google Analytics: ga:date, ga:searchKeyword, filtered to Google organic search.

Google Search Console: date, keyword

Common columns (for the merge function): left_on: ga:searchKeyword, right_on: keyword
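
A sketch: internal search keywords with no Search Console match are candidate content gaps:

```python
merged = df_internal.merge(df_gsc, left_on="ga:searchKeyword", right_on="keyword", how="left")
content_gaps = merged[merged["keyword"].isna()]["ga:searchKeyword"]
```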

Checking Google Shopping organic search performance

Google announced last month that products listed in Google Shopping feeds can now show up in organic search results. It would be useful to check how much traffic these listings get versus the regular organic listings. If you add extra tracking parameters to the URLs in your feed, you can use Google Search Console data to compare the same products appearing in regular listings vs. organic shopping listings.

Here is the data you’ll need:

Google Search Console: date, page, filtered to pages with the shopping tracking parameter

Google Search Console: date, page, filtered to pages without the shopping tracking parameter

Common columns (for the merge function): left_on: page, right_on: page
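
A sketch, assuming the tracking parameter is src=shopping (an assumption) and stripping it so the page URLs line up:

```python
df_shopping["page"] = df_shopping["page"].str.replace(r"\?src=shopping$", "", regex=True)

compare = df_shopping.merge(df_regular, on="page", how="inner", suffixes=("_shopping", "_regular"))
compare[["page", "clicks_shopping", "clicks_regular"]]
```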

5 practical data blending examples for smarter SEO insights

Here's a step-by-step guide to blending data tables from different tools to uncover valuable new insights using Python (or SQL).

Sometimes we face questions that are hard to answer with the information from isolated tools. One powerful technique we can use is to combine data from different tools to discover valuable new insights.

You can use Google Data Studio to perform data blending, but note that it’s limited to only one blending technique: a left outer join (discussed below). I will cover a more comprehensive list of data blending techniques that you can apply in Python (or SQL, if you prefer).

Let’s explore some practical SEO applications.

Overall approach

In order to blend separate data tables (think spreadsheets in Excel), the tables need one or more columns in common. For example, we could match the column ga:landingPagePath in a Google Analytics table with the page column in a Google Search Console table.

When we combine data tables this way, we have several options to compute the resulting table.

Venn diagrams are a handy way to illustrate the standard set theory used to represent the membership of elements in the resulting table. Let’s discuss each join type:

Full Outer Join: The elements in the resulting set include the union of all the elements in the source sets. All elements from both sides of the join are included, with joined information if they share a key, and blanks otherwise.

Inner Join: The elements in the resulting set include the intersection of all elements in the source sets. Only elements that share a key on both sides are included.

Left (Outer) Join: The elements in the resulting set include the intersection of all elements in the source sets plus the elements only present in the first set. All elements on the left-hand side are present, with additional joined information only if a key is shared with the right-hand side.

Right (Outer) Join: The elements in the resulting set include the intersection of all elements in the source sets plus the elements only present in the second set. All elements on the right-hand side are present, with additional joined information only if a key is shared with the left-hand side.

I’ll walk through an example of these joins below, but this topic is easier to learn by doing. Feel free to practice with this interactive tutorial.

Here are some practical SEO data blending use cases:

Adding conversion/revenue data to Google Search Console

Google Search Console is my must-have tool for technical SEO, but like me, you are probably frustrated that you can’t see revenue or conversion data in its reports. This is relatively easy to fix for landing pages by blending in data from Google Analytics.

Both data tables must use the same date range.

First, we’ll set up a Pandas DataFrame with some example Google Analytics data and call it df_a.

Google Analytics data table containing ga:landingPagePath, ga:revenue, ga:transactions (filtered to google organic search traffic)
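
For example (the numbers are illustrative):

```python
import pandas as pd

df_a = pd.DataFrame({
    "ga:landingPagePath": ["/shoes/", "/shirts/", "/jackets/"],
    "ga:revenue": [1200.0, 800.0, 450.0],
    "ga:transactions": [24, 16, 9],
})
```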

Next, we’ll set up a DataFrame with some example Search Console data and call it df_b.

Google Search Console data table containing page, impressions, clicks, position
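
Again with illustrative numbers; note that /jackets/ is missing here and /hats/ is missing from the GA table, so the non-inner joins will produce NaNs:

```python
df_b = pd.DataFrame({
    "page": ["/shoes/", "/shirts/", "/hats/"],
    "impressions": [12000, 8000, 3000],
    "clicks": [600, 400, 150],
    "position": [3.2, 5.1, 8.4],
})
```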

Now, we’ll use the Pandas merge function to combine the two, using first an inner join (the intersection of the two sets), and then using an outer join (the union).
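
With pandas, each variation is just a different how argument:

```python
inner = df_a.merge(df_b, left_on="ga:landingPagePath", right_on="page", how="inner")
outer = df_a.merge(df_b, left_on="ga:landingPagePath", right_on="page", how="outer")
left = df_a.merge(df_b, left_on="ga:landingPagePath", right_on="page", how="left")
right = df_a.merge(df_b, left_on="ga:landingPagePath", right_on="page", how="right")
```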

You can see that the outer, left, and right joins contain missing data (“NaN”) when no key is shared by the other side.

You can now divide transactions by clicks to get the conversion rate per landing page, and divide revenue by transactions to get the average order value.
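
Using the inner join result:

```python
inner["conversion_rate"] = inner["ga:transactions"] / inner["clicks"]
inner["avg_order_value"] = inner["ga:revenue"] / inner["ga:transactions"]
```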

Correlating links and domains over time with traffic increase

Are increasing backlinks responsible for an increase in traffic or not? You can export the latest links from Google Search Console (which include the last time Googlebot crawled them), then combine this data table with Google Analytics organic search traffic over the same time frame.

Similar to the first example, both data tables must use the same date range.

Here is the data you’ll need:

Google Search Console: Linking page, Last crawled

Google Analytics: ga:date, ga:newUsers

Common columns (for the merge function): left_on: Last crawled, right_on: ga:date

You can plot traffic and links over time. Optionally, you can add a calculated domain column to the Search Console data table. This will allow you to plot linking domains by traffic.
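
A sketch, assuming df_links and df_ga data frames with the date columns already normalized to the same format:

```python
from urllib.parse import urlparse

merged = df_links.merge(df_ga, left_on="Last crawled", right_on="ga:date", how="inner")

# Optional calculated domain column, so you can plot linking domains by traffic.
merged["domain"] = merged["Linking page"].apply(lambda url: urlparse(url).netloc)
```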

Correlating new user visits to content length

What is the optimal length for your content articles? Instead of offering rule-of-thumb answers, you can actually calculate this per client. We will combine a data table from your favorite crawler with performance data from Google Analytics or Google Search Console. The idea is to group pages by their word count and check which groups get the most organic search visits.

Both data tables must use the same set of landing pages.

Screaming Frog crawl: Address, Word count

Google Analytics: ga:landingPagePath, ga:newUsers

Common columns: left_on:Address, right_on: ga:landingPagePath

You need to create word count bins, group the pages by bin and then plot the traffic per bin.
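
A sketch with pandas.cut (the bin edges are assumptions; adjust them to your content):

```python
import pandas as pd
import matplotlib.pyplot as plt

merged = df_crawl.merge(df_ga, left_on="Address", right_on="ga:landingPagePath", how="inner")

bins = [0, 500, 1000, 1500, 2000, 5000]
merged["word_count_bin"] = pd.cut(merged["Word count"], bins=bins)

merged.groupby("word_count_bin")["ga:newUsers"].sum().plot(kind="bar")
plt.show()
```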

Narrowing down the pages that lost (or gained) traffic

Why did traffic drop (or increase)? This is a common and sometimes painful question to answer. We can learn which specific pages lost (or gained) traffic by combining data tables from two separate time periods.

Both data tables must use the same number of days before and after the drop (or increase).

First period in Google Analytics: ga:landingPagePath, ga:newUsers

Second period in Google Analytics: ga:landingPagePath, ga:newUsers

Common columns: left_on:ga:landingPagePath, right_on: ga:landingPagePath

We first need to aggregate new users by page and subtract the first period from the second one. Let’s call this subtraction the delta. If the delta is greater than zero, the page gained traffic; if it is less than zero, it lost traffic; and if it is zero, it didn’t change.
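
A sketch of the delta computation:

```python
agg1 = df_period1.groupby("ga:landingPagePath")["ga:newUsers"].sum()
agg2 = df_period2.groupby("ga:landingPagePath")["ga:newUsers"].sum()

delta = agg2.sub(agg1, fill_value=0)  # pages absent in one period count as zero
gained = delta[delta > 0]
lost = delta[delta < 0]
```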

Here is an example where we have grouped pages by the page type (Collections, Products, or N/A) and calculated the delta before and after a drop in traffic.

Finding high-converting paid search keywords with poor SEO rankings

Do you have high-converting keywords in AdWords that rank poorly in organic search? You can find out by combining Google AdWords data with Google Search Console data.

Both data tables must use the same date range.

Google Analytics: ga:adMatchedQuery, ga:transactions (filtered by transactions greater than zero)

Google Search Console: query, position, clicks (filtered by keywords with position greater than 10)

Common columns: left_on: ga:adMatchedQuery, right_on: query
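
A sketch of the blend:

```python
opportunities = df_ads.merge(df_gsc, left_on="ga:adMatchedQuery", right_on="query", how="inner")
opportunities[["query", "ga:transactions", "position", "clicks"]]
```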

The result will list low-ranking organic keywords with transactions, position and clicks columns.
