As we referenced several times earlier in this chapter, Flash is popular on the Web, but it presents challenges to the search engines in terms of indexing the related content. This creates a gap between the user experience with a site and what the search engines can find on that site.
It used to be that search engines did not index Flash content at all. In June 2008, Google announced that it was offering improved indexing of this content (http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html). This announcement indicates that Google can index text content and find and follow links within Flash files. However, Google still cannot tell what is contained in images within the Flash file. Here are some reasons why Flash is still not fully SEO-friendly:
Different content often does not live at different URLs. This is the same problem you encounter with AJAX-based pages. You could have unique frames, movies within movies, and so on that appear to be completely unique portions of the Flash site, yet there’s often no way to link to these individual elements.
Google can index the output files in the SWF file to see words and phrases, but in Flash, a lot of your text is not inside clean <h1> or <p> tags; it is jumbled up into half-phrases for graphical effects and will often be output in the incorrect order. Worse still are text effects that often require “breaking” words apart into individual letters to animate them.
A lot of Flash content is only linked to by other Flash content wrapped inside shell Flash pages. This chain of links, where no other internal or external URLs reference the interior content, results in documents with very low PageRank/link juice. Even if they manage to stay in the main index, they probably won’t rank for anything.
An all-Flash site might get a large number of links to the home page, but interior pages almost always suffer. For embeddable Flash content, it is the HTML host page earning those links when they do come.
Anchor text, headlines, bold/strong text, img alt attributes, and even title tags are not simple elements to properly include in Flash. Developing Flash with SEO in mind is just more difficult than doing it in HTML. In addition, it is not part of the cultural lexicon of the Flash development world.
Google has indicated that it doesn’t execute external JavaScript calls (which many Flash-based sites use) or index the content from external files called by Flash (which, again, a lot of Flash sites rely on). These limitations could severely impact what a visitor can see versus what Googlebot can index.
Note that it used to be that you could not test the crawlability of Flash, but the Adobe Search Engine SDK does allow you to get an idea as to how the search engines will see your Flash file.
If Flash is a requirement for whatever reason, there are best practices you can implement to make your site more accessible to search engine spiders. What follows are some guidelines on how to obtain the best possible results.
Beginning with Adobe/Macromedia Flash version 8, there has been support for the addition of title and description meta tags to any .swf file. Not all search engines are able to read these tags yet, but it is likely that they will soon. Get into the habit of adding accurate, keyword-rich title tags and meta tags to files now so that as search engines begin accessing them, your existing .swf files will already have them in place.
Flash developers may find the SDK useful for server-based text and link extraction and conversion purposes, or for client-side testing of their Flash content against the basic Adobe (formerly Macromedia) Flash Search Engine SDK code.
Tests have shown that Google and other major search engines now extract some textual content from Flash .swf files. It is unknown whether Google and others have implemented Adobe’s specific Search Engine SDK technology into their spiders, or whether they are using some other code to extract the textual content. Again, tests suggest that what Google is parsing from a given .swf file is very close to what can be extracted manually using the Search Engine SDK.
The primary application of Adobe’s Search Engine SDK is in the desktop testing of .swf files to see what search engines are extracting from a given file. The program cannot extract files directly from the Web; the .swf file must be saved to a local hard drive. The program is DOS-based and must be run in the DOS Command Prompt using DOS commands.
By running a .swf file through the Flash SDK swf2html program during development, the textual assets of the file can be edited or augmented to address the best possible SEO practices, homing in primarily on keywords and phrases along with high-quality links. Because of the nature of the Flash program and the way in which it deals with both text and animation, it is challenging to get exacting, quality SEO results. The goal is to create the best possible SEO results within the limitations of the Flash program and the individual Flash animation rather than to attempt the creation of an all-encompassing SEO campaign. Extracted content from Flash should be seen as one tool among many in a larger SEO campaign.
There are several things to keep in mind when preparing Flash files for SEO:
Search engines currently do not read traced text (using the trace() function) or text that has been transformed into a shape in Flash (as opposed to actual characters). Only character-based text that is active in the Flash stage will be read (see Figure 6-42).
Animated or affected text often creates duplicate content. Static text in Flash movies, by contrast, is not read as the duplicate instances that “tweening” and other effects can create. Use static text, especially with important content, so that search engines do not perceive the output as spam (see Figure 6-43).
Search engine spiders do not see dynamically loaded content (text added from an external source, such as an XML file).
The font size of text does not affect search engines; they read any size font.
Special characters such as <, >, &, and " are converted to HTML character references (&lt;, &gt;, &amp;, and &quot;) and should be avoided.
Search engines find and extract all URLs stored within the getURL() command.
Search engines have the ability to follow links in Flash, though it is an “iffy” proposition at best. They will not, however, follow links to other Flash .swf files. (This is different from loading child .swf files into a parent .swf file.) Therefore, links in Flash should always point to HTML pages, not other .swf files.
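Two of the points above (avoid special characters, and link only to HTML pages) lend themselves to a quick automated check. The following is a minimal sketch, assuming the text and link targets have already been extracted from the .swf file by some other means; the function names are hypothetical and are not part of Adobe's SDK.

```javascript
// Hypothetical helpers for reviewing text and links extracted from a .swf file.
// Neither function is part of Adobe's SDK; the names are illustrative only.

// Characters that, per the guideline above, get converted to HTML character references.
const RISKY_CHARACTERS = ['<', '>', '&', '"'];

// Return the risky characters that occur in a piece of extracted text.
function findRiskyCharacters(text) {
  return RISKY_CHARACTERS.filter((ch) => text.includes(ch));
}

// Return the link targets that point at another .swf file rather than an HTML page.
// The query string and fragment are ignored when checking the extension.
function findSwfLinks(urls) {
  return urls.filter((url) => {
    const path = url.split(/[?#]/)[0];
    return path.toLowerCase().endsWith('.swf');
  });
}
```

Any character or URL these functions return is a candidate for rewording or relinking before the Flash file is published.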
Because “alternative content” workarounds for SEO of Flash files have been historically abused by spammers, it is challenging to recommend these tactics to optimize your Flash files without a critical disclaimer.
Both the SWFObject and NoScript methods were originally designed to be legitimate, graceful degradation tactics readily accepted by the search engines as a way to accommodate older browsers or people with special needs. But many unscrupulous sites have used the code to trick search engine spiders. In other words, the methods are used in such a way that browsers display one thing to users, but something completely different is shown to search engine spiders. All of the major search engines disapprove of such tactics.
Websites using such methods today are often penalized or removed from search engine indexes altogether. This makes graceful degradation risky on some level, but if the methods are used clearly within the boundaries for which they were intended, getting penalized or banned is highly unlikely.
Intent is an essential element search engines take into consideration. If your intent is to provide all users with a positive experience while visiting your site, you should be fine. If your intent is to game the search engines, all it takes is one online rival to report your site for spam to incur the wrath of the search engines.
Google and other search engines do not algorithmically ban sites for using SWFObject and NoScript tags; it usually requires human intervention to trigger a penalty or outright ban.
With regard to Flash optimization, SWFObject is the better of the two options because it is JavaScript code designed specifically for Flash .swf purposes, and it has been abused to a lesser extent than the NoScript tag option.
SWFObject is Flash detection code written in JavaScript that checks whether a browser has the Flash plug-in. If the browser does have the Flash plug-in, the .swf file is displayed secondary to that detection. If the browser does not have the Flash plug-in or the JavaScript to detect it, the primary, alternative content contained within <div> tags is displayed instead.
The key here is that search engine spiders do not render the JavaScript. They read the primary content in the <div> tags.
The opportunity for abuse is obvious upon viewing the code. This small piece of code is placed within the <head> tags:
<script type="text/javascript" src="swfobject.js"></script>
In the body of the text, the code looks something like Figure 6-44.
Search engine spiders will read text, links, and even alt attributes within the <div> tags, but the browser will not display them unless the Flash plug-in isn’t installed (about 95% of browsers now have the plug-in) or JavaScript isn’t available.
Once again, the key to successfully implementing SWFObject is to use it to the letter of the law; leverage it to mirror the content of your Flash .swf file exactly. Do not use it to add content, keywords, graphics, or links that are not contained in the file. Remember, a human being will be making the call as to whether your use of SWFObject is proper and in accordance with that search engine’s guidelines. If you design the outcome to provide the best possible user experience, and your intent is not to game the search engines, you are probably OK.
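One way to keep yourself honest about mirroring is to compare the two texts before publishing. The sketch below assumes you already have the Flash text and the alternative content as plain strings; the function names are hypothetical, and the word matching is deliberately crude (lowercase ASCII letters and digits only).

```javascript
// Hypothetical check: does the alternative content mirror the Flash text?
// Both arguments are plain strings the caller has already extracted.

// Lowercase the text and split it into words, ignoring punctuation.
function toWords(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Return the distinct words that appear in the alternative content
// but nowhere in the Flash text.
function wordsNotInFlash(flashText, alternativeText) {
  const flashWords = new Set(toWords(flashText));
  const extras = toWords(alternativeText).filter((w) => !flashWords.has(w));
  return [...new Set(extras)];
}
```

An empty result does not prove the two versions match exactly, but a non-empty one points to words in the alternative content that the Flash movie does not contain.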
You can download the SWFObject JavaScript free of charge at http://code.google.com/p/swfobject/. Included in this download is the swfobject.js file, which is placed in the same directory as the web pages upon which the corresponding calling code resides.
The NoScript tag has been abused in “black hat” SEO attempts so frequently that caution should be taken when using it. Just as SWFObject and DIV tags can be misused for link and keyword stuffing, so too can the NoScript tag. Certain companies have promoted the misuse of the NoScript tag widely; consequently there have been many more problems with its use.
With that being said, conservative and proper use of NoScript tags specifically with Flash .swf files can be an acceptable and good way to get content mirrored to a Flash file read by search engine spiders. As it is with SWFObject and corresponding DIV tags, content must echo that of the Flash .swf movie exactly. Do not use them to add content, keywords, graphics, or links that are not in the movie. Again, it is a human call as to whether a site or individual page is banned for the use or misuse of NoScript tags.
You use NoScript tags with Flash .swf files in the following manner:
<script type="text/javascript" src="YourFlashFile.swf"></script>
Followed at some point afterward by:
<noscript> <h1>Mirror content in Flash file here.</h1> <p>Any content within the NoScript tags will be read by the search engine spiders, including links http://www.mirroredlink.com, graphics, and corresponding Alt attributes.</p> </noscript>
For browsers that do not have JavaScript installed or functioning, content alternatives to JavaScript-required entities are displayed. So, for use with Flash .swf files, if a browser does not have JavaScript and therefore cannot display Flash, it displays instead the content inside the NoScript tags. This is a legitimate, graceful degradation design. For SEO purposes, as it is with SWFObject, the search engine spiders do not render the JavaScript and do read the content contained in the HTML. Here, it is the content in the NoScript tags.
sIFR is a technique that uses JavaScript to read in HTML text and render it in Flash instead. The essential fact to focus on here is that the method guarantees that the HTML content and the Flash content are identical. One great use for this is to render headline text in an anti-aliased font (this is the purpose for which sIFR was designed). This can provide a great improvement in the presentation of your site.
At a Search Engine Marketing New England (SEMNE) event in July 2007, Dan Crow, head of Google’s Crawl team, said that as long as this technique is used in moderation, it is OK. However, extensive use of sIFR could be interpreted as a poor site quality signal. Since sIFR was not really designed for large-scale use, such extensive use would not be wise in any event.
As we mentioned previously, the search engines cannot see any content that does not appear immediately in the source code for the web page. Content and links that are dynamically retrieved in response to a user action or some other event and then embedded in the page cannot be seen by the search engines. However, content that is embedded in the HTML of the page, and JavaScript that is simply used to render it, can be seen by the search engines. There is also some evidence that Google follows links implemented in JavaScript, as detailed in the article on SEOmoz at http://www.seomoz.org/ugc/new-reality-google-follows-links-in-javascript-4930.
AJAX implementations can also be very challenging. The main challenge is that most implementations render new content on the page, but the URL does not change. To build AJAX applications that present indexable content to the search engines, make sure to create new URLs that point to the various versions of the content, and then make sure you link to those locations from somewhere on your site.
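The idea of giving each version of the content its own URL can be sketched as a pair of pure functions that convert between an application state and a URL. This is only an illustration under stated assumptions: the base URL and the "view" and "page" parameters are hypothetical, and a real application would choose its own scheme.

```javascript
// Hypothetical mapping between an application state and a crawlable URL.
// BASE_URL and the "view"/"page" parameters are illustrative assumptions.
const BASE_URL = 'http://www.example.com/products';

// Build a distinct URL for a given state so it can be linked to and indexed.
function stateToUrl(state) {
  const url = new URL(BASE_URL);
  url.searchParams.set('view', state.view);
  url.searchParams.set('page', String(state.page));
  return url.toString();
}

// Recover the state from a URL so the same content can be rendered
// when a visitor or spider requests that URL directly.
function urlToState(href) {
  const params = new URL(href).searchParams;
  return { view: params.get('view'), page: Number(params.get('page')) };
}
```

Because the mapping works in both directions, every state the application can display has an address that ordinary HTML links elsewhere on the site can point to.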
Use of the NoScript tags to render alternative content for the benefit of the search engines is useful in addressing basic JavaScript applications, but less useful with AJAX, since there may be an infinite (or very large) number of different types of output from the tool, and it would not be feasible to render them all within the NoScript tags.