Chapter 5
In This Chapter
Behind every web page viewed in a browser is a host of technologies and services known as the backend that work to make the star performers look good. Just as a Hollywood blockbuster has a crew of people supporting the actors, your website has servers, code, shopping carts, and, most important, your content management system, which all must perform at their best to turn out a superior experience for your customers. A web content management system (CMS) is a software program that helps simplify website creation. A CMS uses a database (such as your database of products, if you have a store) and publishes web pages in an orderly, consistent fashion. One thing a CMS can do is pull information from your database and build pages dynamically, which means the pages don’t actually exist until someone asks for them. If you have 10,000 products, you don't want to build 10,000 individual pages by hand. Instead, you can use a CMS to build them dynamically on the fly.
In this chapter, you discover some common problems that may occur when using a CMS to build your website. For all their advantages, content management systems demand some attention to get in line with your search engine optimization (SEO) efforts. You also can discover some technical solutions that can help you overcome these CMS issues, such as rewriting URLs to have names that are more search engine–friendly. We also give you tips for picking a good CMS and how to modify its settings to work better for SEO. For those of you who use a hosted e-commerce solution for your website, like the Yahoo Store, we tackle how to optimize those product pages. Last, we warn you about the JavaScript website framework that is growing in popularity but risky for search indexing and rankings — along with what you can do to improve a JavaScript site’s SEO friendliness.
Content management systems can be a website owner’s best friend. A CMS gets a website operational fast and keeps it running smoothly. It can manage data, image files, audio files, documents, and other types of content, and it puts them together into web pages. A CMS creates the pages based on templates, which are standard layouts that you design, so that your website has a consistent and cohesive look. Large sites that manage thousands of items use a CMS because it keeps everything organized and systematic. Small-site owners benefit because if they use a CMS, they don’t even have to know HTML (HyperText Markup Language, the predominant markup language used on the web): The CMS can do the technical work for them.
That said, you don’t want to run a CMS on autopilot. To optimize your website for the search engines, you must be able to customize your pages down to the smallest detail, and to catch and manage any machine-generated issues that work against your SEO efforts.
If you have a store with several thousand products for sale, you don't want to create a page for each item by hand. Instead, you’re going to use a CMS to assemble web pages with product descriptions, pictures, prices, and other content pulled directly out of your product database. These dynamic pages look unique to the end user, but behind the scenes, they’re usually not.
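To make the idea of a dynamically built page concrete, here is a minimal sketch of a CMS filling a template from a product record. The template, the product data, and the function name are all hypothetical, and a real CMS does far more than this:

```python
from html import escape
from string import Template

# Hypothetical page template; a real CMS template would be much richer.
PAGE_TEMPLATE = Template(
    "<html><head><title>$name</title></head>"
    "<body><h1>$name</h1><p>$description</p><p>Price: $price</p></body></html>"
)

# Hypothetical stand-in for the product database.
PRODUCTS = {
    1234: {
        "name": "Navy Blue Pump",
        "description": "Classic pump with a two-inch heel.",
        "price": 89.0,
    },
}


def render_product_page(product_id):
    """Build the HTML for one product page at request time."""
    product = PRODUCTS[product_id]
    return PAGE_TEMPLATE.substitute(
        name=escape(product["name"]),
        description=escape(product["description"]),
        price=f"${product['price']:.2f}",
    )
```

The page for product 1234 doesn't exist anywhere until `render_product_page(1234)` is called, which is what "dynamic" means here. Notice also that every page shares the same skeleton, so any element you don't fill from the database comes out identical on every page.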
CMSs can create all kinds of duplicate content problems. Left unattended, they may build nontargeted content: generic text that isn’t customized for your various subject themes and keywords. Make sure that every one of your web pages has unique text in all of its parts, including
Title tags: The Title tag is part of the HTML code behind each web page, and the search engines pay a lot of attention to it. The Title tag usually gets displayed as the bold heading in a SERP result, so it should specifically contain that page’s keywords.
If you don’t specify otherwise, CMSs often put the same Title tag on every page. It might be the company name, the domain name (the root part of the website URL, such as wiley.com), or the company name plus a few keywords applied as one-size-fits-all.
Headings: Your H# heading tags are HTML-style codes applied to your page’s headings and subheadings to make them stand out. The search engines look at these heading tags as clues to what a page’s main points are. They need to be keyword-rich and unique.
A CMS may create heading tags that are generic (such as Features Overview or More Details) rather than specific and full of your targeted keywords.
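One way to catch one-size-fits-all Title tags is to compare the titles across your pages. The following is a minimal sketch that assumes you already have each page's HTML in hand; the function names and the `pages` dictionary are illustrative, not part of any particular CMS:

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the page's <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def find_duplicate_titles(pages):
    """pages maps URL -> HTML. Returns {title: [urls]} for reused titles."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html)].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

Any title that comes back from `find_duplicate_titles` is shared by more than one page and is a candidate for a page-specific rewrite. The same approach could be extended to heading tags.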
Content management systems create pages that search engines may consider duplicates in another way, and that’s through dynamic URLs (the web addresses of pages, usually starting with http:// or https://).
You can set up a CMS to build the URL string dynamically for every page request. Dynamic URLs created by a CMS often contain variables (characters that vary). Adding variables to the end of a URL forms a new URL. If the search engines treat each URL as a distinct page, duplicate content issues may arise when the same content shows up under many different URLs.
Here are two common types of variables that CMSs often add to URLs, but there are many others:
www.shoe-site.com/pumps.asp?color=red&brand=myers
www.shoe-site.com/pumps.asp?brand=myers&color=red
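The two URLs above differ only in the order of their parameters, yet they are different strings. As a rough illustration of how such variants can be recognized as the same page, this sketch sorts the query parameters into a fixed order. The function name is made up for the example, and it assumes that parameter order never changes the content:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url):
    """Return the URL with its query parameters sorted alphabetically."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )
```

Both `http://www.shoe-site.com/pumps.asp?color=red&brand=myers` and `http://www.shoe-site.com/pumps.asp?brand=myers&color=red` normalize to the same string, so a tool that compares normalized URLs sees one page instead of two.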
There are many good reasons not to like dynamic URLs:
Now that we’ve made a case against the use of dynamic URLs, we want to explain how you can compensate for them on your website.
One way to address the indexing issues that result from parameters in dynamic URLs is to tell the search engines which parameters to ignore. The major search engines let you specify when parameters caused by dynamically loaded pages should be ignored for crawling and indexing purposes. This trick works in Google, Bing, and (by way of Bing) Yahoo. (As of 2009, Bing and Yahoo share the same crawlers and index.)
Know that the search engines already try to detect duplicate content caused by URL parameters. They group detected duplicate pages into a cluster and do some fancy detective work to pick the best URL to represent the cluster in search results. They also consolidate ranking and relevance signals to that page. If your parameters are kept to a minimum, the search engine is likely to pick the right version of the URL as the canonical, or preferred, version of a page. But the search engines are just machines, and they can get things wrong. It’s much better for you, the wise human webmaster, to tell the robots what URL is the canonical version of a page and which dynamically generated URLs are none of their business.
Bing’s Ignore URL Parameters tool is in the Configure My Site section of the free Bing Webmaster Tools (http://www.bing.com/toolbox/webmaster). From here, you can submit the exact parameters that are causing multiple variations of a URL pointing to the same page. Here you also find parameters that Bingbot has come across while crawling your site and that it thinks may be safe to ignore. Just click the suggested parameter to add it to your list of URL parameters to ignore.
Google’s URL Parameters tool is also free to use as part of Google Search Console (https://www.google.com/webmasters/tools/crawl-url-parameters). From your site dashboard, go to Crawl and then URL Parameters in the navigation. From here, click Add parameter or select from the parameters listed. Google asks for a little more information here than Bing does. The next step is to indicate whether that parameter is active or passive. An active parameter changes the actual content displayed on the page, for example, by narrowing the page content the way a pagination parameter does. A passive parameter, which is commonly a session ID or affiliate ID, doesn’t impact the page. If the parameter you’re adding is passive, you click Save. If you’re adding an active parameter, next pick an option for how you want Google to treat the parameter. These are your options:
Your last step in using the Google URL Parameters tool is to tell Google which version of a page with parameters is the canonical, your preferred version for serving in results and consolidating ranking signals. (Note: If the canonical page you specify has significantly different content than a URL with parameters, Google reserves the right to ignore your indications and keep the page indexed. Oh, that know-it-all Google.)
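To see what ignoring a passive parameter amounts to, consider this minimal sketch, which strips passive parameters from a URL and leaves active ones alone. The parameter names `sessionid` and `affid` are hypothetical; substitute whatever your CMS actually appends. This only illustrates the idea and says nothing about how the search engines implement it:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical passive parameters: they don't change the page content.
PASSIVE_PARAMS = {"sessionid", "affid"}


def strip_passive_params(url, passive=PASSIVE_PARAMS):
    """Drop passive parameters and keep the active ones in their original order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in passive
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

A URL such as `http://www.shoe-site.com/pumps.asp?color=red&sessionid=abc123` comes out as `http://www.shoe-site.com/pumps.asp?color=red`. The `color` parameter stays because it changes what the page shows.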
Another option you have, if your CMS insists on building URLs that are long and ugly, is to go over its head and rewrite the URLs at the web server layer. (The web server is the software application that runs your website, which receives each user request and serves back the requested pages to the user’s browser.) An advantage of this solution is that the search engines never see the URLs with parameters and so never have the chance to crawl and index them (provided that redirects are in place and your URLs with parameters aren’t linked to anywhere).
At the server layer (the viewable layer, or how the URL appears to the user and to search engines), you can rewrite complex URLs as clean, concise, static-looking URLs. Rewriting doesn’t rename any physical file on your web server or create any new physical directories; the directories in the rewritten URL don’t have to exist on disk. Rewriting changes only the page’s URL as the server presents it, and that rewritten URL is what users and search engines see. So, for example, if you have a shoe website and your CMS spits out product pages that have long, parameter-laden URLs, like this:
http://www.shoe-site.com/product.cfm?product_id=1234&line=womens&style=pumps&color=navyblue&size=7
you could rewrite them to something simpler, like this:
http://www.shoe-site.com/womens/pumps/productname.cfm
Notice how much more readable the rewritten URL is. The directory structure shown is just an example, but it illustrates how the domain name, directories, and filename can all give information about the web page. In this case, not only have you gotten rid of the ugly query string, but also the directories “womens” and “pumps” are short, understandable labels. People seeing this URL have a good idea what the web page contains before they even click to view it. Presenting a concise, informative URL like this to search engines can help your web page’s ranking, because you’ve basically got the makings of a keyword phrase right in the URL. Additionally, presenting this type of short, readable URL to users can make them more likely to click through to your page from a SERP, which increases traffic to your site.
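In practice, a rewrite like this lives in your web server's configuration, not in application code. Still, the logic is easier to see in a few lines of Python. This sketch maps the clean URL back to the dynamic one that the CMS understands. The URL pattern, the `SLUG_TO_PRODUCT_ID` lookup, and the function name are all assumptions made for illustration:

```python
import re
from urllib.parse import urlencode

# Hypothetical lookup from the filename in the clean URL to a product ID.
SLUG_TO_PRODUCT_ID = {"productname": 1234}

# Matches clean URLs of the form /<line>/<style>/<slug>.cfm
CLEAN_URL = re.compile(
    r"^/(?P<line>[a-z0-9-]+)/(?P<style>[a-z0-9-]+)/(?P<slug>[a-z0-9-]+)\.cfm$"
)


def rewrite(path):
    """Map a clean URL path to the internal dynamic URL, or None if no match."""
    match = CLEAN_URL.match(path)
    if match is None:
        return None
    product_id = SLUG_TO_PRODUCT_ID.get(match["slug"])
    if product_id is None:
        return None
    query = urlencode(
        {"product_id": product_id, "line": match["line"], "style": match["style"]}
    )
    return f"/product.cfm?{query}"
```

Visitors and search engines only ever see `/womens/pumps/productname.cfm`; the server quietly hands the CMS the parameter-laden version it expects.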
The process of rewriting a URL is often called a mod_rewrite, which stands for module rewrite because that’s what it was originally called on the Apache server. Today, that term is used generally to refer to any URL rewrite, regardless of which server brand is involved. You need someone who’s trained to work with your server software to create mod_rewrites. If you're determined to try it out yourself, we list a few websites that you can look at for reference, based on your server:
Apache (http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html).
nginx (http://nginx.org/en/docs/http/ngx_http_rewrite_module.html).
Microsoft IIS, using ISAPI_Rewrite (www.isapirewrite.com). From the same site, you can access extensive documentation that includes a lot of examples.
Microsoft IIS, using the URL Rewrite module (http://www.iis.net/download/URLRewrite).
Here are some other solutions for dynamic URLs:
Despite the extra maintenance a content management system may require to keep your site SEO-friendly, many websites simply can’t do without one. For large stores, social media sites, forums, and other sites that have a large amount of page content that changes frequently, a CMS that can produce a site dynamically is a practical necessity.
To ensure that a CMS won’t impede your SEO efforts, the main thing you want to find is a customizable system. You need to be able to change anything and everything on a per-page basis and not have your hands tied. SEO requires a lot of tweaking as you monitor each page’s performance, your competitors’ pages, the user experience on your site, and so forth. You must be able to modify a Title tag here, a Meta keywords tag there.
Here are some attributes to look for when you’re shopping for a CMS:
The shopping list we just laid out can help you pick out a good CMS if you plan to purchase one. Or, if you already have a website that runs on a CMS, the preceding section should help you figure out the strengths or weaknesses of that purchase. Better yet, if your site doesn’t have lots of changing content, you can avoid the CMS issue altogether and code the whole website from scratch! But for those websites that need a content management system, this section gives you tips for making your CMS work for you.
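The two main principles are setting up rules for how the CMS builds each page element and customizing individual pages as needed.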
Creating rules for each of your important SEO elements is a key part of making a CMS work for you. You should be able to define how the CMS puts together the Title tags, Meta description and keywords tags, heading tags, hyperlink anchor text, image Alt attributes, and everything else on your pages.
For instance, if you have an e-commerce store, you have many fields in your database that pertain to each product, such as the product name, product ID, and product description. You’ve also done some categorization work and probably have each product assigned to a product category, style, type, size, color, flavor … you get the idea.
Create rules that define how the Title tag, Meta description tag, and Meta keywords tag should be put together on each product page. These rules should produce tags that meet the best practice guidelines for SEO, including the proper length, capitalization, ordering, and so on. (You can find best practice details in Book V, Chapter 3.)
Also, create rules that apply H# heading tags appropriately throughout your page. Headings should be hierarchical, with an H1 at the top of the page and other heading tags (H2, H3, and so on) throughout the page. Search engines look at the heading tags to confirm that the keywords shown in the Title and Meta tags at the top are accurate, so make sure that they contain the page’s main keywords and are unique to that page.
You should specify rules for every output element possible. You want to take advantage of the CMS’s ability to automate your site, but you also want to control that efficiency. Make sure that your resulting site is search engine–friendly and user-friendly, full of pages that are each unique.
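As an illustration of what such rules might look like, here is a minimal sketch that builds a Title tag, a Meta description, and an H1 from product database fields. The field names, the site name, and the length limits (60 and 155 characters) are assumptions for the example, not fixed requirements:

```python
def build_title(product, site_name="Shoe Site", max_length=60):
    """Rule: product name, then category, then site name; drop category if too long."""
    title = f"{product['name']} - {product['category']} | {site_name}"
    if len(title) > max_length:
        title = f"{product['name']} | {site_name}"
    return title


def build_meta_description(product, max_length=155):
    """Rule: use the product description, trimmed at a word boundary."""
    text = " ".join(product["description"].split())
    if len(text) <= max_length:
        return text
    return text[: max_length - 3].rsplit(" ", 1)[0] + "..."


def build_h1(product):
    """Rule: the H1 is the product name, which is unique to the page."""
    return product["name"]
```

Because each rule draws on fields that differ from product to product, the output is unique per page without anyone writing the tags by hand.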
After you have rules set up for how the CMS should construct your pages, the second part is customization. You should be able to tweak individual pages, applying all the SEO principles covered throughout this book as needed. Here are a few scenarios to consider:
SEO is often a balancing act. Those last two bullet points illustrate this — these two scenarios explain why you want to optimize the same shoe product page for specific (long-tail) keywords and for generic keywords simultaneously. To practice effective SEO, you must be able to override the default output created by the CMS and modify individual pages as needed.
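Overriding the CMS defaults for a single page can be pictured as a per-page layer on top of the rule-generated values. This is only a sketch; the `overrides` structure and field names are hypothetical, and your CMS will have its own way of storing page-level edits:

```python
def page_metadata(product, overrides):
    """Start from rule-generated defaults, then apply any per-page overrides."""
    defaults = {
        "title": f"{product['name']} | Shoe Site",
        "h1": product["name"],
    }
    custom = overrides.get(product["id"], {})
    return {**defaults, **custom}
```

A page with no entry in `overrides` gets the default output, while a page you have tuned by hand keeps your custom Title tag and still inherits every element you didn't touch.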
Many service providers make it easy for people to set up an e-commerce storefront quickly. Online options for proprietors range from joining an online marketplace (like Etsy for retailers and Fiverr for those offering creative services) to building your own website. You can run your online storefront on your own domain; some e-commerce platforms offer store owners design templates, easy wizards for inputting products, and functionality to accept credit card, debit card, and PayPal payments. These e-commerce content management systems come in two flavors: hosted and self-hosted. With a self-hosted e-commerce platform, you get software to build and manage your site, but you have to get your own web host. One popular self-hosted e-commerce CMS is Magento. The SEO concerns and considerations for such websites are covered throughout this chapter.
The SEO considerations for a hosted e-commerce solution (which means that the same company that you use to create and manage your website also hosts it) are a little different. The biggest difference affecting your SEO campaign if you’re using what’s sometimes called an e-commerce service system is a loss of flexibility: You’re limited to the constraints of the shopping cart software and the provided hosting. On the upside, a hosted e-commerce platform is an attractive solution for small and mid-sized retailers because it gives you just one vendor to turn to for all website support. Hosted e-commerce platforms include Yahoo Small Business (https://smallbusiness.yahoo.com/ecommerce), Shopify (www.shopify.com/), Squarespace (www.squarespace.com), and Wix (www.wix.com).
If you’ve got a site built with a hosted e-commerce service, read on: This section shows you how to get the most SEO value out of your hosted online store.
It is possible to make a hosted store rank highly for certain keywords. You can modify things, such as the look of the site, the domain name, and some of the important page elements. When you’re researching hosted e-commerce solutions, check to see whether you have control over as many of the following as possible:
Meta tags: A hosted platform may create Meta description and Meta keywords tags by default. You’ll want the ability to modify them as needed.
For guidelines on how to write effective Title, Meta description, and Meta keywords tags that help your pages rank with the search engines, see Book IV, Chapter 4.
JavaScript frameworks are the new kid on the block in the web development world and are growing in popularity among developers. This method of building a website or application is attractive because JavaScript is a programming language that allows for interactive effects that can look cool and impress your visitors. In the last few years, advancements in modern browsers like Mozilla Firefox and Google Chrome have made processing and rendering JavaScript faster than ever. That’s why today it’s possible to build a whole website or application in JavaScript. But hold your horses! Possible is not the same as recommended, and a website built in a JavaScript framework is problematic from a search engine ranking and visibility standpoint.
To understand the problem, you need to understand that JavaScript frameworks are rendered in the browser only. This is different from traditional web frameworks and content management systems, in which every page load is a separate request to the server. When a website is built in a JavaScript framework, the HTML and content are fetched via JavaScript and put together by the browser. This causes problems for search engine bots.
For one thing, search engine bots don’t get the HTML with all the data and content in it. Instead, search engines may see an empty shell of HTML. Among all the search engines, Google has made strides in the area of rendering JavaScript, but even Google crawlers can’t execute it the same way a browser does. If your site is rendered in JavaScript, search engines could miss your content and index your pages with as little as an empty HTML template. To see how much of your JavaScript site is lost to Google, use the handy Fetch and Render tool found within the Crawl reports in Google Search Console. Free for all site owners at www.google.com/webmasters/tools/, the tool simulates what Googlebot does when it arrives at your web page. (Checking to see your site as Google sees it is covered more in Book IV, Chapter 5.) But don’t forget that there are more search engines than Google, and a website built in a JavaScript framework is definitely not friendly to them.
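For a rough feel of the "empty shell" problem, you can look at how much text is present in a page's raw HTML before any JavaScript runs. The sketch below extracts that text using Python's standard library. It approximates what a bot that doesn't execute JavaScript has to work with; it is not a model of how any specific search engine behaves, and the class and function names are made up:

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text from raw HTML, skipping script and style contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text available in the HTML without running any JavaScript."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Feed it the HTML your server actually sends. A server-rendered product page yields its headings and copy, whereas a typical JavaScript application shell (an empty container plus a script tag) yields little or nothing.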
Another issue for your site’s search engine rankings is that URLs in this scenario are either not unique or are absent altogether in the case of single-page websites. When a JavaScript application needs to get more data, it uses an AJAX request, which doesn’t require a reload of the page to get the data from the server. This means that links to other “pages” and other general assumptions about navigation are thrown out the window! When an entire website’s content is essentially contained on one URL, you can’t organize content into themed silos with landing pages and support pages. You can’t create pages with keyword-specific optimization. You’re basically losing the ability to do all the SEO best practices. Ouch.
If your website falls into the category of a JavaScript framework, all is not lost. You have some ways to help make your website more search engine friendly.
One option is to prerender your pages, for example with a service such as Prerender.io (https://prerender.io/). Basically, you create a copy of your page as JavaScript renders it in a browser by taking HTML “snapshots” and then make those snapshots available to search engines in a specific way. You can find more info in the Google Developers guide to making AJAX applications crawlable: https://developers.google.com/webmasters/ajax-crawling/.
We generally counsel you to stay away from JavaScript frameworks if you want to rank well. But if you have a JavaScript site and are sticking with it, take the steps to help get your content indexed and make ranking possible.
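The snapshot approach can be sketched as follows. Under the AJAX crawling scheme that Google's guide describes, a crawler that finds a `#!` URL requests the same page with an `_escaped_fragment_` query parameter instead, and the server answers with the stored HTML snapshot. Google has since deprecated that scheme, so treat this purely as an illustration of the idea; the `snapshots` store and function names are hypothetical:

```python
from urllib.parse import parse_qs, urlsplit


def snapshot_key(url):
    """Return (path, fragment) for a crawler snapshot request, else None."""
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    if "_escaped_fragment_" not in query:
        return None
    return (parts.path or "/", query["_escaped_fragment_"][0])


def respond(url, snapshots, app_shell):
    """Serve a prerendered snapshot to crawlers and the normal app to everyone else."""
    key = snapshot_key(url)
    if key is not None and key in snapshots:
        return snapshots[key]
    return app_shell
```

A regular visitor requesting `http://www.shoe-site.com/` gets the JavaScript application, while a request for `http://www.shoe-site.com/?_escaped_fragment_=womens/pumps` gets the full HTML snapshot for that view.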