There are many types of migrations, such as changing, merging or splitting the domains, redesigning the website or moving to a new framework.
Firstly, Google uses Chrome 41 for rendering pages. This is a three-year-old browser that does not support all the modern features needed to render advanced functionality. Even if Google can render JS websites in general, some important parts may not be discovered because they rely on technology that Google can’t process.
Below you will find information that will help you navigate through the process of changing the current framework. I do not provide “ready-to-go” solutions because your situation will be the result of different factors and there is no universal recipe. However, I want to stress the elements you need to pay particular attention to.
You can’t count on a miracle: Google will not understand the change without your help. The whole process of migration should be planned in detail.
I want to keep the focus on JS migration for this article, so if you need detailed migration guidelines, Bastian Grimm has already covered this.
This step should be done before anything else. You need to decide how Google will receive the content of your website. You have two options:
1. Client-side rendering: You rely on Googlebot to execute your JavaScript and render the pages itself during indexing.
2. Server-side rendering: This solution relies on an external service or an additional component responsible for rendering the JS website, creating a static snapshot and serving it to search engine crawlers. At the Google I/O conference, Google announced that serving a separate version of your website only to the crawler is fine. This is called Dynamic Rendering: you detect the crawler’s User Agent and send it the server-side rendered version. This option also has its disadvantages: creating and maintaining additional infrastructure, possible delays if a heavy page is rendered on the server, and possible issues with caching (Googlebot may receive a stale version of the page).
Before migration, you need to decide whether you need option 1 or 2.
If the success of your business is built around fresh content (news, real estate offers, coupons), I can’t imagine relying only on the client-side rendered version. It may result in dramatic delays in indexing so your competitors may gain an advantage.
If you have a small website and the content is not updated very often, you can try to leave it as client-side rendered, but you should test before launching the website if Google really does see the content and navigation. The most useful tools to do so are Fetch as Google in GSC and the Chrome 41 browser.
However, Google officially stated that it’s better to use Dynamic Rendering to make sure they will discover frequently changing content correctly and quickly.
If your choice is to use Dynamic Rendering, it’s time to answer how to serve the content to the crawlers. There is no one universal answer. In general, the solution depends on the technology AND developers AND budget AND your needs.
Below you will find a review of the options you have from a few approaches, but the choice is yours:
I’d probably go for pre-rendering, for example with prerender.io. It’s an external service that crawls your website, renders your pages and creates static snapshots to serve when a specific User Agent makes a request. A big advantage of this solution is that you don’t need to create your own infrastructure.
You can schedule recrawling and create fresh snapshots of your pages. However, for bigger and frequently changing websites, it might be difficult to make sure that all the pages are refreshed on time and show the same content both to Googlebot and users.
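To make the dynamic rendering idea more concrete, here is a minimal sketch of the kind of middleware that serves snapshots to crawlers, assuming a Node/Express server. The snapshot-store module and getSnapshot function are hypothetical placeholders for wherever your snapshots live (prerender.io, your own Puppeteer renderer, a cache), and the list of bot user agents is only an example; prerender.io also ships ready-made middleware for popular servers, so you may not need to write this yourself.

```javascript
const express = require('express');
const app = express();

// Example list of crawler user agents; extend it to match your needs.
const BOT_UA = /googlebot|bingbot|yandexbot|duckduckbot|baiduspider/i;

// Hypothetical module that returns a stored, fully rendered HTML snapshot.
const getSnapshot = require('./snapshot-store');

app.use(async (req, res, next) => {
  const ua = req.headers['user-agent'] || '';
  if (BOT_UA.test(ua)) {
    // Crawler: serve the static, pre-rendered snapshot of this URL.
    const html = await getSnapshot(req.originalUrl);
    return res.send(html);
  }
  // Regular user: fall through to the normal client-side rendered app.
  next();
});

app.listen(3000);
```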
If you build the website with one of the popular frameworks like React, Vue or Angular, you can use one of the server-side rendering methods dedicated to that framework. Here are some popular matches: Next.js for React, Nuxt.js for Vue and Angular Universal for Angular.
Using one of these frameworks on top of React or Vue results in a universal application, meaning that the exact same code can be executed both on the server (server-side rendering) and in the client (client-side rendering). It minimizes the content-gap issues you could have if you rely on creating snapshots and heavy caching, as with prerendering.
It may happen that the framework you are going to use does not have a ready-to-use solution for building a universal application. In that case, you can build your own rendering infrastructure: install a headless browser on your server that renders all the subpages of your website and creates the snapshots served to search engine crawlers. Google provides a solution for this: Puppeteer, a library that does a similar job to prerender.io, except that everything happens on your own infrastructure.
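As an illustration, a snapshot generator built on Puppeteer could look roughly like the sketch below. It is a minimal example: renderSnapshot is a hypothetical helper, and storing the snapshots and serving them to crawlers is left out.

```javascript
const puppeteer = require('puppeteer');

// Render a single URL in headless Chrome and return the serialized DOM.
async function renderSnapshot(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Wait until network activity settles so client-side content is in place.
  await page.goto(url, { waitUntil: 'networkidle0' });
  const html = await page.content();
  await browser.close();
  return html; // store this snapshot and serve it to crawler user agents
}
```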
The option that you choose will depend on many factors like technology, developers and budgets. In some cases, you may have a few options, but in many cases, you may have many restrictions, so picking a solution will be a single-choice process.
If you decide to use one of the methods of serving the content to the crawler, you will need a staging site with this solution installed. Below, I’ll outline the most important elements that should be checked before going live with the website:
1. Content parity
You should always check if users and crawlers are seeing exactly the same content. To do that, you need to switch the user agents in the browser to see the version sent to the crawlers. You should verify the general discrepancies regarding rendering. However, to see the whole picture you will also need to check the DOM (Document Object Model) of your website. Copy the source code from your browser, then change the User Agent to Googlebot and grab the source code as well. Diffchecker will help you to see the differences between the two files. You should especially look for the differences in the content, navigation and metadata.
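If you want to script this comparison instead of copying the source by hand, a rough sketch could look like the one below. It assumes Node with the node-fetch package; the user-agent strings are only examples, and the two saved files can then be compared in Diffchecker.

```javascript
const fetch = require('node-fetch');
const fs = require('fs');

const GOOGLEBOT_UA =
  'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
const BROWSER_UA =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/69.0 Safari/537.36';

// Save the HTML served to a regular browser and to Googlebot side by side.
async function saveVersions(url) {
  const agents = [['user', BROWSER_UA], ['googlebot', GOOGLEBOT_UA]];
  for (const [name, ua] of agents) {
    const res = await fetch(url, { headers: { 'User-Agent': ua } });
    fs.writeFileSync(`${name}.html`, await res.text());
  }
}

saveVersions('https://www.example.com/').catch(console.error);
```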
An extreme situation is when you send an empty HTML file to Googlebot, just as Disqus does.
This is what their SEO Visibility looks like:
They’ve seen better days. Now the homepage is not even indexed.
2. Navigation and hyperlinks
To be 100 percent sure that Google sees, crawls and passes link juice, you should follow the clear recommendation of implementing internal links shared at Google I/O Conference 2018.
If you rely on server-side rendering methods, you need to check if the HTML of a prerendered version of a page contains all the links that you expect; in other words, whether it has the same navigation as your client-side rendered version. Otherwise, Google will not see the internal linking between pages. Critical areas where you may have problems are faceted navigation, pagination and the main menu.
3. Metadata
While testing the staging site, always check if the SSR version has the canonical tag in the head section. If yes, confirm that the canonical tag is the correct one. A rule of thumb is to always send consistent signals to the search engine, whether you use client- or server-side rendering.
While checking the website, always verify if both CSR and SSR versions have the same titles, descriptions and robots instructions.
4. Structured data
Structured data helps the search engine to better understand the content of your website.
Before launching the new website, make sure that the SSR version of your website displays all the elements that you want to mark up with structured data and that the markup is included in the prerendered version. For example, if you want to add markup to the breadcrumb navigation: first, check if the breadcrumbs are displayed on the SSR version; second, run the test in the Rich Results Tester to see if the markup is valid.
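For illustration, breadcrumb markup in JSON-LD could look like the snippet below; the names and URLs are placeholders. The same markup must be present in the prerendered (SSR) version, not only in the client-side rendered one.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home",
      "item": "https://www.example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Shoes",
      "item": "https://www.example.com/shoes/" }
  ]
}
</script>
```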
5. Lazy loading
My observations show that modern websites love loading images and content (e.g., products) with lazy loading, where additional elements are loaded on a scroll event. It might be a nice feature for users, but Googlebot can’t scroll, so as a consequence these items will not be discovered.
Seeing that so many webmasters are having problems implementing lazy loading in an SEO-friendly way, Google published a guideline with best practices for lazy loading. If you want to load images on scroll, make sure you support paginated loading. This means that as the user scrolls, the URL should change (e.g., by adding pagination identifiers: ?page=2, ?page=3, etc.) and, most importantly, the URL should be updated with the proper content, for example by using the History API.
Do not forget about adding rel=”prev” and rel=”next” markups in the head section to indicate the sequence of the pages.
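Here is a minimal sketch of scroll-triggered, paginated loading that keeps the URL in sync via the History API. The loadItems function and the #load-more-sentinel element are hypothetical placeholders for your own fetch-and-append logic.

```javascript
let page = 1;
const sentinel = document.querySelector('#load-more-sentinel');

const observer = new IntersectionObserver(async (entries) => {
  if (!entries[0].isIntersecting) return;
  page += 1;
  // Hypothetical helper that fetches and appends the next chunk of items.
  await loadItems(page);
  // Reflect the newly loaded chunk in the URL, e.g. /category?page=2
  history.pushState({ page }, '', `?page=${page}`);
});

observer.observe(sentinel);
```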
If you decided to create a snapshot for search engine crawlers, you need to monitor a few additional things.
You must check if the snapshot is an exact copy of the client-side rendered version of your website. You can’t load additional content or links that are not visible to a standard user, because it might be assessed as cloaking. If the process of creating snapshots is not efficient (e.g., your pages are very heavy or your server is slow), it may result in broken snapshots, and you may end up serving, for example, partially rendered pages to the crawler.
There are some situations when the rendering infrastructure must work at high speed, such as Black Friday, when you want to update prices very quickly. You should test the rendering in extreme conditions and see how much time it takes to update a given number of pages.
The last thing is caching. Setting the cache properly is something that will help you to maintain efficiency because many pages might be quickly served directly from the memory. However, if you do not plan the caching correctly, Google may receive stale content.
Monitoring post-migration is a natural step. However, in the case of moving to a JS framework, sometimes there is an additional thing to monitor and optimize.
Moving to a JS framework may affect web performance. In many cases, the payload increases which may result in longer loading times, especially for mobile users. A good practice is monitoring how your users perceive the performance of the website and compare the data before and after migration. To do so you can use Chrome User Experience Report.
It will provide information if the Real User Metrics have changed over time. You should always aim at improving them and loading the website as fast as possible.
With so many people clearly afraid that Google isn’t rendering their pages correctly, I thought I’d write about the cache to help readers understand why checking Google Cache is not a reliable method of analyzing how Google sees the page.
I will also provide information on when Google Cache might be useful and what tools you should use to check how Google renders the page.
In most cases, if you go to the Google Cache for your page, you will see the version of your page from when Google last crawled it. But what exactly are you seeing? Google Cache contains the snapshot of the raw HTML that Googlebot received from your server. Then the HTML captured by Google is rendered by your browser.
The idea behind Google storing cached pages is simple: it lets users browse a page when the page is down or in the event of a temporary timeout.
There are a few methods that will allow you to check Google Cache. The choice is yours:
In Search results, click the arrow next to the URL in the search results and pick Google Cache. Google provides even better instructions:
You can also type the address directly in your browser. Use cache:URL and you’ll be redirected to the cache hosted by http://webcache.googleusercontent.com. Additionally, you can use one of the tools that allows for checking multiple URLs at once, such as Google Cache Checker.
Browser plugins are also an option. For example, you can use Web Cache Viewer.
Now, go to a page you want to check. Click anywhere on the page and pick Web Cache Viewer > Google Cache Archive.
Now, let’s slice and dice Google Cache. The cache view shows a few elements:
The full version shows a rendered view of the page. Keep in mind that what you see in the rendered view is the page rendered by YOUR browser, not by Google.
How do I know that this view was rendered by the browser installed on my computer rather than Web Rendering Service (WRS) used by Google? Here is a small experiment. If what I see in Google Cache is rendered by Google’s WRS, I would see the same content in the full version that Google captured while re-indexing the page.
Check Google Cache for this page — Online-Stopwatch and compare the date of the last re-indexing and the time and date displayed in the cache.
As you can see, the time and date when the site was re-indexed is different than what’s displayed on the clock. The clock shows when I checked the cache, so it is displaying the content in real time.
If the page was rendered by WRS, the time and date would be frozen and would display the same time as you see in the gray box.
It’s very easy to misinterpret the information presented in Google Cache. We should keep a healthy distance between what we are seeing there and how we use the data from Google Cache.
Now, it’s time to explain why Google Cache doesn’t show how Google “sees” your website.
As shown above, the view source in cache shows the raw HTML served to Googlebot. At the same time, the full version shows the rendered page, as rendered by your browser. These two pieces of information significantly impact how we should interpret what we see in Google Cache.
Let me guess. You probably more or less use the up-to-date version of the browser. You can check it by visiting this page. My browser is Chrome version 69.
Google, for rendering purposes, uses Web Rendering Service based on Chrome 41. Chrome 41 is a three-year-old browser and it doesn’t support all the modern features needed for proper rendering. The gap between these versions is huge, which you can see by simply comparing the supported and unsupported features in caniuse.
So rendering with Chrome 41 and a more up-to-date browser is not comparable. Even if you can see the correctly rendered version of the page in Google Cache, you can’t be sure that it also works in Chrome 41, and vice versa.
The second reason why you shouldn’t rely on Google Cache while auditing the website is content freshness. Google doesn’t always create a new snapshot while re-indexing the page. It may happen that they use an older version, even though the content may have changed twice since then. As a result, the content in the cache might be stale.
Google does not provide detailed information on how Google Cache works, but they give us hints on how we should interpret the issues discovered in Google Cache. Below you will find a review of the common issues and their causes.
Important note: some of the anomalies observed in the cache are rather harmless, but it doesn’t mean that you should ignore them. If something isn’t working in the expected way, you should still dedicate some attention and perform a deeper investigation.
Possible reason: a resource like CSS or .js has changed.
When you visit a cached version of the page you may see that it has crashed. Some elements might not be rendered properly; some images might be missing; the fonts might differ from what you see on your website.
Google webmaster trends analyst John Mueller says that it happens sometimes, but it’s not something to worry about.
However, to make sure that Google doesn’t see a page that looks like a mess after a big party, I’d rather go into Google Search Console and perform a “fetch and render” function.
Reason: a website was switched to mobile-first indexing.
There was a lot of panic when Google started rolling out mobile-first indexing and it appeared that many websites were displaying 404 error pages in the cache.
It’s hard to explain why this issue occurs, because Google doesn’t provide details, but the Google Webmasters Twitter account clearly states that, although this may happen, the missing cache view won’t affect your rankings.
Note: some have noticed that you can use a workaround to see the correct results. Click in the address bar of the 404 page, change the site name to something else (like “x.xyz,” for example) and press Enter.
Reason: internal duplication
One of the most confusing situations is when you open the cache view and you see a different page than expected.
You make a “site:” query to check the cached version, and the first strange symptom you can see in the search results is the meta title and meta description belonging to a different subpage.
When two pages are too similar to keep them separate in the index, Google may decide to fold the two pages together. If they don’t see significant differences between two pages and can’t understand what differentiates one from the other, they may keep only one version. This seems to be one of Google’s methods for dealing with duplicate pages.
If you want to have these two pages indexed separately, you need to review the content and answer the question: why are they marked as duplicates? In the next step, make sure that the content published in these pages is unique and responds to the users’ intent.
Reasons: external duplication, incorrect canonicalization.
When looking into Google Cache you may sometimes see a page belonging to a different domain. It might be really confusing.
Google conflates one site with another.
During one of the Google Hangouts, John Mueller mentioned a specific situation when this may happen. Sometimes Google tries to assess content uniqueness only by looking at patterns in the URLs (and probably some other signals, but without visiting a given page). For example, if two e-commerce sites have almost the same URL structure and they share the same product IDs, Google may fold them together.
Incorrect rel=canonical tag.
Another scenario that leads to the same result is when someone has implemented a rel=canonical tag incorrectly. For example, if a developer accidentally adds a canonical tag pointing to a different domain on a page, it most probably results in a different page being displayed in the Google Cache view. In this case, you have sent a signal to Google that these two pages are identical and should be folded together.
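For illustration, this is what such a misplaced tag might look like (the domain is a placeholder):

```html
<!-- A canonical pointing at another domain tells Google to treat that
     external URL as the main version of this page. -->
<link rel="canonical" href="https://www.other-domain.com/some-page/">
```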
My personal nightmare happened when I was diagnosing a similar issue. Apparently, before I started working on the website, some pages had an external canonical tag — only for a while, but long enough to be discovered by Google. After that, the canonical disappeared and there was no sign of their presence, but the Cache was still showing the page once cited as canonical.
Solving this mysterious issue was possible after an Inspect URL feature was added in GSC (Thank you, Google!). This allowed me to determine that Google picked an external URL as a canonical version, and it was the same URL as the user had declared. That user, a developer for the site, was in trouble.
International sites with the same content.
The last example of this issue may appear on international sites that use the same content on different domains (TLDs). For example, if you decide to publish the same content on both the German and Austrian versions of your site, Google may have problems with understanding what the relationship between them is. Even hreflang markup may not help, and Google will combine these URLs together.
In this example, take a look at the search results shown in the animated GIF below. The URL belongs to google.fr, but if you go to the cache view, you will see google.ca as the requested URL.
Reason: the page is not cached.
You can also see the 404 error page in Google Cache for a page, even if the site hasn’t yet been switched to mobile-first indexing. This may happen because Google doesn’t store a cached view for all the pages they crawl and index. Google has a huge amount of resources at its disposal, but they aren’t unlimited, so they may forego storing everything.
So just because a page is indexed, that doesn’t mean that the snapshot is taken. But if you have a snapshot in Google Cache, that definitely means that the page was indexed.
If you have a JS-based website and you do not render the content in such a way to serve the rendered version to Google (e.g. with prerender or dynamic rendering), you probably will see an empty cache.
But even if you see an empty cache, that doesn’t mean that the content is not indexed. The rule regarding the two waves of indexing (see below) makes it so that whatever you want to load with JS probably will be indexed, but it might be deferred.
Reason: noarchive meta tag is in use.
Using a noarchive meta tag prevents Google from creating snapshots that could be displayed in Google Cache. In most cases, it’s an intentional step. It’s instructing the tools or applications that they shouldn’t store the snapshots of the page.
This might be useful if the page presents sensitive data that shouldn’t be accessible. If you decide to use a noarchive meta tag, it doesn’t impact the rankings, only whether a snapshot is created and kept.
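For reference, the tag looks like this:

```html
<!-- Prevents Google from storing and displaying a cached copy of the page -->
<meta name="robots" content="noarchive">
```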
Google Cache shows a lot of information, but is it actionable? Not always. Yes, I check Google Cache while analyzing websites, but I’m not focused on solving issues with Google Cache itself. I treat any problems I find there as symptoms of other issues.
Here is some information that is always valuable to me:
I don’t want to discourage you from checking Google Cache while auditing websites. We can’t ignore the anomalies observed there, because we don’t know the mechanisms behind Google Cache. But we should keep calm.
Rather than panic, I would recommend using one of these tools, which can provide more actionable data:
You should keep in mind that Google Cache is a feature for users and its ability to create and display snapshots has no impact on ranking. That said, a discrepancy that you see in Google Cache might be a symptom of other issues that may impact the ranking process, so it’s worth double checking.
You probably also know how to do the basics, like image optimization, to help make your site run a little faster.
Since most of the basics have been covered, I want to talk about different ways to approach performance factors and how they can help you organize and prioritize the process of speed improvement on your site.
Numerous case studies show site speed has a significant impact on user behavior. One of the biggest issues a webmaster faces is how to avoid long loading times so visitors don’t leave the site before the page loads.
According to Think with Google, if the page loading time increases from 1 to 5 seconds, the probability a user will leave without interaction increases by 90 percent! This is huge, and it shows that each second matters.
Bounce rate is one of the metrics representing user satisfaction. However, conversions are the Holy Grail of SEO. If you annoy users with a laggy website, they will go to your competitor, and you lose the sale.
In November 2017, Google Engineer Addy Osmani wrote an overview of a Pinterest case study which showed how speeding up the mobile version of the Pinterest site improved all the metrics related to user engagement.
Pinterest rebuilt their mobile app in three months, and the results were spectacular. The old version of their mobile site became interactive for most users in 23 seconds. They cut it to 5.6 seconds. As a result:
Delays on a website can be very expensive. If you want to see how much you stand to lose, check out the Impact Calculator from Think With Google. This tool will help you estimate how improving your site speed could impact revenue.
For many years, Google did a lot to improve how users perceive page speed by backing projects such as Accelerated Mobile Pages (AMP) and Progressive Web Apps. Both aim at delivering websites to the user as quickly and smoothly as possible.
Accelerated Mobile Pages builds web pages with an open-source library of three components that help pages load almost instantly: AMP HTML, the AMP JS library and the Google AMP Cache.
In 2015, Google introduced Progressive Web Apps (PWA). The concept is built on the idea of creating web apps that act like a native app. A huge advantage of PWA is the fact that they are smaller in comparison to a standard app and their performance is boosted by Service Workers.
In 2010, Google announced that performance had become a ranking factor for desktop searches and started to adjust rankings for slow pages. A more recent and important update happened on July 9, 2018, when Google rolled out the Speed Update for all users. Google acknowledged that speed has been used as a ranking factor for desktop searches; but now, page speed would also be a ranking factor for mobile searches.
Since implementing mobile-first indexing, a desktop web page has become a secondary version in Google’s eyes. The mobile version is recognized as the primary version when it comes to the process of indexing and ranking. Whatever we do in terms of changes to a web page, we need to think about users with smaller (mobile) screens running a less efficient processor and a less stable internet connection.
Another crucial factor Google introduced with the Speed Update was page performance. Google stressed the importance of understanding how page performance affects a user’s web experience.
There are a number of important factors to understand as they relate to the speed update and the role performance now plays in the way a page ranks. According to Google Senior Webmaster Trends Analyst John Mueller:
You may think you’re safe because your pages don’t take minutes to load, and you may be right. But also keep in mind:
Answering this question gets complicated when we take into consideration that users represent a non-homogenous group of people with different devices, different CPUs and different browsing locations. Even their internet speeds and connections are different. Looking at speed from two different perspectives can help us understand what having a “fast” website means.
While analyzing and measuring performance, we should look at it from two different angles:
Google-oriented performance. This is not a ranking factor. This type of performance directly impacts Googlebot’s behavior on your website. While crawling, Googlebot makes several requests to your server in order to receive pages. If it “notices” that the server can’t handle the load, Googlebot slows down or stops crawling.
The faster the website (more precisely, the server) is, the more effective crawling can be. So this type of performance relates to the back end (the server) and refers to the server’s “endurance.”
The following charts show what you will see in your Crawl Stats report in Google Search Console when your website slows down or speeds up. In the first chart, “Time spent downloading a page,” the time drops as the website speeds up; in the second, the number of requests increases accordingly.
This is a clear signal to Google that it can crawl more pages.
Generally, the bigger the structure, the more important these metrics are. If you have a few thousand pages, it shouldn’t be a problem for Google to crawl them all. But if your structure consists of millions of pages, you need to care about these crawling issues.
User-oriented performance. This one is a ranking factor. It’s all about users’ needs for speed and their perception.
From Google: “… these screenshots of a load timeline should help you better visualize where the load metrics fit in the load experience:”
When it comes to page load speeds, you should follow this rule:
On the other hand, if users need one-time or urgent support, your website must be fast and light on initial load. In this case, here’s another point to keep in mind:
Let’s take a look at some examples of not-so-fast web pages so you know what to avoid.
(Note: The testing environment for both websites was: Dulles, VA – Chrome – Emulated Motorola G (gen 4) – 3GFast – Mobile.)
I ran a performance test with WebPagetest.org on USA Today, a news portal with many users who revisit the website regularly.
The USA Today page loaded in 25.1 seconds. I could have run to the newsstand and back in that time!
My second site is Thumbtack.com, a website that helps users locate services and professionals in different niches. Let’s imagine I have a major water leak and urgently need to find a plumber near me.
The Thumbtack page loads in 7 seconds — which could be a lifetime if I’m standing in six inches of water.
Which page is faster? Before I give you the answer, let’s go through the concept of capturing performance by Google.
Google repeatedly states they want to return web pages that load quickly and meet users’ expectations in their search results. My first thought was: How can Google “see” and measure performance as normal human users do?
Website performance is relative and not a stable value for all users and devices. We can’t say a web page that loads in 2 seconds has the same value for users as it does for Google, because the two are very different and both have limitations.
We should keep in mind the limitations Google has which significantly impact performance:
Using cache-control headers for the static resources on a website allows for storing web pages in the browser’s cache. These resources don’t need to be downloaded again if they were not changed. As a result, the user’s browser doesn’t need to make an additional request.
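As a simple illustration, in an Express-based setup long-lived caching for static assets could be configured roughly like this; the path and lifetime are placeholders you would tune to your own release process.

```javascript
const express = require('express');
const app = express();

// Serve files from ./public with a long Cache-Control lifetime so
// returning visitors' browsers don't re-download unchanged assets.
app.use('/static', express.static('public', {
  maxAge: '30d',    // Cache-Control: public, max-age=2592000
  immutable: true   // versioned files are never modified in place
}));
```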
We know this is accurate, since it’s been confirmed by John Mueller:
If your server is located in Germany, Google (which crawls mostly from the United States) won’t notice how fast or slowly your server responds to users in Germany.
Is it possible for Google to imitate all the possible combinations of devices and connections? I don’t think so. They probably try to check web pages with different devices, but it’s impossible to check how the page loads on multiple combinations of central processing units (CPUs) or screen size.
Google updates the user-agent string so their renderer understands web pages that use newer technologies. Here is what the current mobile user agent (Googlebot) looks like:
In most cases, Google visits the page on a Nexus 5X, with Android 6.0.1 Marshmallow as an operating system.
What about your users? You can check what types of devices your website is accessed with by looking into the Google Analytics data in the Audience reports.
The problem is that WRS is based on Chrome 41, a version that is three years old.
In technology years, that’s 30 years! As a result, Google doesn’t benefit from all the modern features available to your users. So, if you want to speed up your website with the following features, unlike your users, Google won’t see the change while crawling your website:
If you want to check the details of what is supported by Chrome 41, visit CanIuse.com and search for a given feature.
Looking at these Googlebot limitations, it seems unfair to assess performance when the crawler lacks the capability to discern whether a website is fast or slow for real users. To sort this out, Google incorporated data from the Chrome User Experience Report into the performance analysis.
The Chrome User Experience Report (CrUX) contains data about real users captured by their Chrome browsers. The user experience data, which includes performance metrics, is hosted on Google BigQuery, an enterprise data warehouse. These metrics are aggregated, so you can’t identify which users were satisfied or not; however, you can filter the data to see segments, like timing for users with 2G connections.
You can access the data in Page Speed Insights by checking at a page level. You can also see the aggregated data for the whole site by using a query: origin:domain.com.
If you do not see data for a subpage, it probably means Google doesn’t have enough data about that page.
You can also use Google BigQuery to retrieve data relating to your domain and browse it in a more granular way. Here is a review of all the metrics available in BigQuery.
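If BigQuery feels like overkill, the aggregated field data can also be pulled programmatically. Here is a rough sketch assuming the PageSpeed Insights v5 API and Node with node-fetch; for heavier use you would add an API key.

```javascript
const fetch = require('node-fetch');

// Fetch the real-user (CrUX) metrics that PageSpeed Insights reports for a URL.
async function getFieldData(url) {
  const api = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed';
  const res = await fetch(`${api}?url=${encodeURIComponent(url)}`);
  const data = await res.json();
  // loadingExperience holds the aggregated real-user metrics,
  // e.g. FIRST_CONTENTFUL_PAINT_MS with its distribution and category.
  return data.loadingExperience;
}

getFieldData('https://www.example.com/').then(console.log);
```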
CrUX is the first step toward getting closer to real users’ perception of a website’s performance. Even though it includes some metrics that are not the best representation of page load performance, it is still one of the best ways to look at real users’ metrics and compare your performance with your competitors’.
Now it’s time to answer the question I asked earlier:
Which site loads faster, USAToday.com (25 seconds) or Thumbtack.com (7 seconds)?
Looking at the CrUX data (taken into account in the ranking process), we see that USA Today is perceived to be fast:
While Thumbtack is assessed as average by users:
The examples show that performance is not a stable value and getting the seconds down is not enough to satisfy users.
Below you will find some tips on how to assess the performance of a website. Each analysis needs specific context, so be prepared: there is no single answer that will satisfy all possible scenarios.
To make data-driven decisions, you will need to collect performance data for your website. You should use different methods of collecting and analyzing data:
You should collect as much data about your users as possible and remember that page speed is relative.
You may test your site’s performance and receive satisfying results, but people equipped with less powerful devices or browsing your website with slow internet connections will not share your opinion. Keep in mind that geography, devices and connections all play a role.
Unfortunately, performance is something that can give the advantage to your competitors. It is highly probable a user will leave your website for a similar site if your pages load slowly. Your goal should always be to load faster than your competitor.
You may ask, how much faster should I be? You can determine a number by doing a little benchmarking. Run three of your competitors through a tool like WebPageTest and look at the results. Which site was the fastest, and by how much? For example, if the fastest competitor loads in 6 seconds and your pages load in 8 seconds, you need to cut roughly 25 percent off your load time to catch up. Once you determine a percentage, use it to set a goal and improve by that amount.
In order to deliver the best performance experience to users, you need to pick the correct metrics to work on.
You may frustrate users if they have to look at an empty page for several seconds. Something which could help to minimize “the blank page effect” is optimizing the critical rendering path. This means you should focus on delivering all the resources needed for rendering “above the fold” (in the first section of your web page) as quickly as possible.
Three general actions should be taken:
For example, make sure the footer section is not rendered before the header of your page.
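As a small illustration of these ideas, the head of a page could inline the critical CSS and defer the rest; main.css and app.js are placeholder names for your non-critical assets.

```html
<head>
  <!-- Inline only the CSS needed to paint the above-the-fold section -->
  <style>
    header { /* critical, above-the-fold styles */ }
  </style>

  <!-- Load the full stylesheet without blocking the first render -->
  <link rel="preload" href="/css/main.css" as="style"
        onload="this.onload=null;this.rel='stylesheet'">

  <!-- Defer non-critical JavaScript so it doesn't block rendering -->
  <script src="/js/app.js" defer></script>
</head>
```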
However, don’t be blindly focused on delivering a smooth rendering experience. While the rendering process is the first step towards having a fast website, most webmasters put the bulk of their efforts into user interactions.
What if the content in the first fold is delivered but takes 25 additional seconds to become interactive? I am doubtful your visitors would be satisfied.
In the mobile world, page speed plays a role in the way a web page ranks. As SEOs, we need to take a broader approach and focus our efforts not only on optimizing the crawling process and watching for Googlebot timeouts but also providing the best page speed experience to users.
In the last 20 years, Google’s search engine has changed a lot. If we take a look at technology and web development as a whole, we can see the pace of change is pretty spectacular.
This website from 1998 was informative, but not very attractive or easy to use:
Modern websites not only look much better, but they are equipped with powerful features, such as push notifications, working partially offline and loading in a blink of the eye.
At the very beginning, when the World Wide Web was built with websites made up of only static hypertext markup language (HTML), Google had a simple task to complete:
Make a request to the server → get the static HTML response → index the page
I know this is a super-simple description of the process, but I want to show the differences between processing websites back when and processing websites today.
Google solved the issue by trying to render almost all the pages they visit. So now, the process looks more or less like this:
Make a request to the server → GET the static HTML response → Send it to the indexer → Render the page → Index and send the extracted links to Googlebot → Googlebot can crawl the next pages.
A. What’s the scale of the problem?
B. Where is the website built?
Static HTML websites are built on your server. After an initial request from Googlebot (and users, too), it receives a static page in response.
C. What limits does Google have?
Some time ago, Google revealed how it renders websites: shared Web Rendering Services (WRS) are responsible for rendering the pages. Behind them stands a headless browser based on Chrome 41, which was introduced in 2015, so it’s a little out of date. The fact that Google uses a three-year-old browser has a real impact on rendering modern web applications, because it doesn’t support all the current features used by modern apps.
Eric Bidelman, an engineer at Google, confirmed that they are aware of the limits Google has with JS. Based on unofficial statements, we can expect that Chrome 41 will be updated to a more recent version at the end of 2018.
To get significant insight into what is supported and not supported, visit Caniuse.com and compare Chrome 41 with the most recent version of Chrome. The list is long:
Timeouts are the next thing that makes JS and SEO a difficult match.
Google needs to manage its processing resources reasonably because of the massive amount of data it processes. The World Wide Web consists of over a billion websites, and it’s growing every day. The chart below shows that the median size of the desktop version of pages increased by almost 100 percent in the last five years. The same metric for the mobile version increased by 250 percent!
Google knows SEOs and developers are having problems understanding search behavior, and they are trying to give us a helping hand. Here are some resources from Google you should follow and check to help with any JS issues you may have:
Three years ago, Google announced that it is able to render and understand websites like modern browsers. But if we look at the articles and the comments on rendering JS websites, you will notice they contain many cautionary words like: “probably,” “generally” and “not always.”
This should highlight the fact that while Google is getting better and better in JS execution, it still has a lot of room for improvement.
The source code is what Googlebot sees after entering the page. It’s the raw HTML, before any JavaScript has been executed. An important thing to keep in mind is that Googlebot itself does not render the pages.
The “Inspect Element” shows the document object model. Rendering is done by Web Rendering Service, which is a part of Google’s Indexer. Here are some important points to keep in mind:
However, John Mueller recently said that if Google gets stuck during the rendering of a page, the raw HTML might be used for indexing.
Even if you see that a particular URL is indexed, it doesn’t mean the content was discovered by the indexer. I know that it might be confusing, so here’s a small cheat sheet:
Google officially confirmed we can rely on these two methods of checking how Google “sees” the website: the Rich Results test and the Mobile-Friendly test.
Now, it’s time to analyze the code and the DOM.
In the first step, compare them in terms of indexability, and check if the source code contains the crucial elements: the main content, robots meta instructions, canonical tags, hreflang annotations and internal links.
Then see if they are compliant with the rendered version of the website.
To spot the differences, you can use a tool like Diff Checker, which will compare text differences between two files.
Using Diff Checker, grab the raw hypertext transfer protocol (HTTP) response from the Google Search Console and compare it with the DOM from the tools mentioned in Point 3 above (the Rich Results test and the Mobile-Friendly test).
While looking at the DOM, it’s also worth verifying the elements dependent on events like clicking, scrolling and filling forms.
Going back to those two waves I mentioned earlier, Google admits that metadata is taken into consideration only in the first wave of indexing. If the source code doesn’t contain robots instruction, hreflangs or canonical tags, it might not be discovered by Google.
To check how Google sees the rendered version of your website, go to the Fetch as Google tool in Google Search Console and provide the URL you want to check and click Fetch and Render.
For complex or dynamic websites, it’s not enough to verify if all the elements of the website are in their place.
Google officially says that Chrome 41 is behind the Fetch and Render tool, so it’s best to download and install that exact version of the browser.
I’d like to mention some common and trivial mistakes to avoid:
Be careful while analyzing mega menus. Sometimes they are packed with fancy features which are not always good for SEO. Here is a tip from John Mueller on how to see if the navigation works for Google:
Also be careful with “load more” pagination and infinite scroll. These elements are also tricky. They load additional pieces of content in a smooth way, but it happens only after an interaction with the website, which means the extra content won’t be found in the initially rendered DOM.
At the Google I/O conference, Tom Greenway mentioned two acceptable solutions for this issue: You can preload these links and hide them via the CSS or you can provide standard hyperlinks to the subsequent pages so the button needs to link to a separate URL with the next content in the sequence.
The next important element is the method of embedding internal links. Googlebot follows only standard hyperlinks, which means internal links need to appear in the code as plain anchor tags with an href attribute. Links that exist only as OnClick handlers will not be discovered.
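A minimal illustration of the difference (goToPage is a hypothetical example of JS-only navigation):

```html
<!-- Discoverable: a standard hyperlink with an href attribute -->
<a href="/category/product-name">Product name</a>

<!-- Not discoverable: the URL only exists inside an onclick handler -->
<a onclick="goToPage('/category/product-name')">Product name</a>
```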
So, while browsing through the source code and the DOM, always check to be sure you are using the proper method on your internal links.
The fundamental rule to get content indexed is to provide clean and unique URLs for each piece of content.
Many times, JS-powered websites use a hash fragment (#) in the URL. Google has clearly stated that in most cases this type of URL won’t be discovered by the crawler.
While analyzing the website, check that the structure is not built with URLs like http://example.com/#/products or http://example.com/#about.
Everything after the # sign in the URL will be trimmed and ignored by Google, so the content won’t be indexed!
Unfortunately, diagnosing problems with timeouts is not easy. If we don’t serve the content fast enough, we can fail to get the content indexed.
How can we spot these problems? We can crawl the website with a tool like Screaming Frog with the delays set to 5 seconds. In rendering mode, you can see if everything is fine with the rendered version.
John Mueller suggests we can check if Google rendered the page on time in the Mobile-friendly test, and if the website works it should be OK for indexing.
While analyzing the website, look to see if it implements artificial delays, such as loaders that force users (and Googlebot) to wait for content delivery.
There is no reason to use elements like these; they can have dramatic effects on indexing, because the delayed content may never be discovered.
You gain nothing if the content is not indexed. It’s the easiest element to check and diagnose and is the most important!
The most useful method of checking indexation is the well-known query:
site:yourdomain.com "a few lines of the content from your website"
If you search for a bit of content and find it in the search results, that’s great! But if you don’t find it, roll up your sleeves and get to work. You need to find out why it’s not indexed!
If you want to conduct a complex indexation analysis, you need to check the parts of the content from different types of pages available on the domain and from different sections.
Google says there may be issues with discovering images and content that are loaded lazily, so verify that lazy-loaded elements are referenced in a way Googlebot can find. Another option that makes lazy content discoverable to Google is structured data.
Don’t use this article as the only checklist you’ll use for JS websites. While there is a lot of information here, it’s not enough.
This article is meant to be a starting point for deeper analysis. Every website is different, and when you consider the unique frameworks and individual developer creativity, it is impossible to close an audit with merely a checklist.