Optimize Flash for Search Engines With PHP

Nothing makes a Flash designer shudder quite like a client who says "Now, this will be indexable by Google, right?"

The unscrupulous Flash designer might say "Yes, as a matter of fact, Google is able to read and index .swf files." Unfortunately, in practice, Google's indexing of .swfs is far less than ideal. Being the scrupulous Flash designer that you are, you are naturally using some sort of database and scripting language combination to feed data to your Flash Movie. Because Google only indexes the first few frames of any .swf it sees, and because your data is being loaded externally anyway, Google's indexing of your .swf is actually counter-productive. Not only does it miss the real data, which is in your database, but it indexes your code. You can imagine many reasons why this is not good.

So, you ask, what do I say to this client who wants both svelte animated content and flawless search engine indexing? Well, my friend, go ahead and assure the client that their animations will be crawled, ranked and clicked.

The first thing to do in this situation is to ask a couple of hard questions: What do I need to get indexed? And, secondly, how is my Flash movie accessing the data I want indexed? For the purposes of example, I'll be assuming that your data is in a database of some sort and that you're pulling it out using a scripting language, then sending the data to Flash to display it. Your index page should also hold the .swf file. If you need some help setting up this sort of system, please refer to my article Build a Website With Flash and MySQL - Lesson 1.

Now, most scripting languages, such as PHP and ASP, have some sort of built-in agent detection function which can tell you what sort of browser is looking at your site. This is the key to our solution because an agent detection function can also distinguish between a browser (a human) and a robot (a search engine). Using this function we can script a basic if-else conditional statement to deliver what we want to the right person. If the user agent is a browser, and therefore most likely a person, we give that person the Flash movie. If, on the other hand, we detect the user agent to be a search engine bot that's crawling your page, we can feed it the raw data with some minimal XHTML markup. Maybe some XHTML that uses fancy things like <h> tags to indicate importance.

In today's lesson, I'll be showing you how to do this in PHP. Why PHP? Because it is open-source, free and pretty common. While the syntax will be different in other languages, the same methodology should work whether you're using ASP or some other language.

Alright then, let's get started.

Lots o' Bots

The default installation of PHP has a collection of built-in ways to get user agent information. In fact, a little research with a good search engine will return literally hundreds of ways to gather browser information using PHP.

Since we're not interested in the browser's eye color or measurements, we'll stick to a very basic method. The one we're interested in is the HTTP_USER_AGENT server variable, which holds the name of the user agent as a string. If you're interested in returning more complete or sophisticated information about a user's browser, check out the function get_browser() in the PHP manual. The get_browser() function is a handy one to know if you're working with CSS because it gives a method of working around browser formatting inconsistencies. Since we simply need a way to see if it's a browser or a robot looking at the .swf file, HTTP_USER_AGENT will do.

In order to single out those robots, we're going to create a PHP file. Fire up your favorite text editor and copy this code.

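 <?php
 // Substrings that identify common search engine bots in a user agent string
 $botlist = array("alexa", "appie", "Ask Jeeves", "crawler", "FAST",
 "froogle", "Firefly", "girafabot", "Googlebot", "InfoSeek", "inktomi",
 "looksmart", "NationalDirectory", "rabaz", "Scooter", "Slurp", "Spade",
 "TECNOSEEK", "Teoma", "WebBug", "WebFindBot", "URL_Spider_SQL",
 "ZyBorg");
 
 // Returns "Bot" if $agent matches any entry in $botlist, otherwise "Browser".
 // Note: eregi() was removed in PHP 7; on newer versions use preg_match()
 // with the "i" flag instead.
 function detectBrowser($agent) {
     global $botlist; // the list is defined outside the function
     // Join the list into one alternation pattern: "alexa|appie|..."
     $pattern = implode("|", $botlist);
     if (eregi($pattern, $agent)) {
         $browser = "Bot";
     } else {
         $browser = "Browser";
     }
     return $browser;
 }
 ?>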

At this point, you can either save the file as detect.inc and include it using PHP's include() function, or you could even make it its own class. Since my scenario only has one file, I'm going to save it as index.php and then add the rest of the code to it later.

Let's examine this detectBrowser() function and see what it does. Simply put, it takes an input variable $agent and uses the eregi() function to test it against the array $botlist, which is our defined list of search engine bots. The eregi() function is one of PHP's built-in regular expression functions. The "i" at the end signifies case-insensitive searching ("ereg" would be case-sensitive). It takes the passed parameter and compares it to all the options in the $botlist array. If it finds a match, the variable $browser is set to "Bot". If there is no match, then we can assume the requesting agent is a browser, and the variable $browser is set to "Browser". Lastly, the function returns the variable $browser to wherever the function was called from. In this case, we'll be calling it later in the same file, but we could move the function into a separate file if need be.
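For readers on PHP 7 or later, where eregi() no longer exists, the same test can be sketched with preg_match(). This is a minimal sketch, not the article's original code: the function name detectBrowserModern() and the shortened bot list are assumptions made for illustration.

```php
<?php
// Hypothetical short list for illustration; build your own from your logs.
$botlist = array("Googlebot", "Slurp", "Teoma", "Ask Jeeves");

// Returns "Bot" if any list entry appears in $agent (case-insensitive),
// otherwise "Browser". The list is passed in rather than read as a global.
function detectBrowserModern($agent, array $botlist) {
    // An empty list would produce a pattern that matches everything.
    if (count($botlist) === 0) {
        return "Browser";
    }
    // preg_quote() escapes any regex metacharacters in the bot names.
    $quoted = array();
    foreach ($botlist as $name) {
        $quoted[] = preg_quote($name, "/");
    }
    // The trailing "i" makes the match case-insensitive, like eregi().
    $pattern = "/" . implode("|", $quoted) . "/i";
    return preg_match($pattern, $agent) === 1 ? "Bot" : "Browser";
}

echo detectBrowserModern("Mozilla/5.0 (compatible; Googlebot/2.1)", $botlist), "\n";
echo detectBrowserModern("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0", $botlist), "\n";
```

Passing the list as a parameter avoids the global lookup and makes the function easier to test with different lists.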

The workhorse of this code is actually the array $botlist, but it's important that you realize the list I have provided here is woefully incomplete. It recognizes only a few bots, when in fact there are literally tens of thousands of bots crawling around the web. For a more complete listing (almost 30,000 bots and a whole class library of browser detection functions in several languages) check out Gary Keith's list. Unfortunately, using an array of 30,000 bots in your code would slow your page down to a snail's pace. I suggest looking over your server log files (Gather User Data From Server Logs), checking out what bots are regularly crawling your site, then assembling your own $botlist array.
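One way to see which agents visit most often is to tally the user agent field in your access log. The sketch below assumes the common "combined" log format, in which the user agent is the last double-quoted field on each line; the function name countUserAgents() and the sample lines are hypothetical.

```php
<?php
// Counts how often each user agent appears in an array of log lines.
// Assumes the combined log format: the user agent is the last quoted field.
function countUserAgents(array $lines) {
    $counts = array();
    foreach ($lines as $line) {
        // Capture the final "..." field, allowing trailing whitespace.
        if (preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            $agent = $m[1];
            $counts[$agent] = isset($counts[$agent]) ? $counts[$agent] + 1 : 1;
        }
    }
    arsort($counts); // most frequent agents first
    return $counts;
}

// Made-up sample lines; in practice you might use file("/path/to/access.log").
$lines = array(
    '1.2.3.4 - - [17/Nov/2009:11:09:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '5.6.7.8 - - [17/Nov/2009:11:10:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 Firefox"',
    '1.2.3.4 - - [17/Nov/2009:11:11:00 +0000] "GET /products.html HTTP/1.1" 200 900 "-" "Googlebot/2.1"',
);
print_r(countUserAgents($lines));
```

The agents at the top of the result that are clearly crawlers are the candidates for your own $botlist.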

Detect and Serve

Now that we have our detectBrowser() function all set up and ready to go, how do we use it? All we need to do is add the following code to our index.php file:


 $user_agent = $_SERVER["HTTP_USER_AGENT"];
 $isBrowser = detectBrowser($user_agent);
 if ($isBrowser == "Bot") {
     // Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO there
     mysql_connect("hostname", "username", "password");
     mysql_select_db("bananasINC");
     // Placeholder, not valid SQL: query every table you want indexed
     $result = mysql_query("SELECT * FROM *");
     // Code to render the results in nicely structured html or xml for the robot to index
 } else {
     // Display HTML with Flash object/embed code so our users see the Flash movie
 }

I've omitted the HTML in this example, but what you're serving depends on the structure of your own pages. For the raw code section, you just need to insert the PHP query you use to generate your Flash content, and for the Flash embed section you can paste in the code generated by Flash when you published your movie.
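As a rough idea of what the rendering step for the robot could look like, here is a minimal sketch that turns rows into headings and paragraphs. The column names "title" and "body" and the function name renderRowsForBot() are assumptions; your own tables will differ.

```php
<?php
// Turns database rows into simple markup for a crawler.
// Each row is assumed to have "title" and "body" keys (hypothetical columns).
function renderRowsForBot(array $rows) {
    $html = "";
    foreach ($rows as $row) {
        // htmlspecialchars() keeps stray <, > and & in the data from breaking the page.
        $html .= "<h2>" . htmlspecialchars($row["title"]) . "</h2>\n";
        $html .= "<p>" . htmlspecialchars($row["body"]) . "</p>\n";
    }
    return $html;
}

// Made-up rows standing in for the result of a database query.
$rows = array(
    array("title" => "Bananas & More", "body" => "Super-tasty bananas."),
    array("title" => "About", "body" => "Who we are."),
);
echo renderRowsForBot($rows);
```

Heading tags give the crawler a hint about which text matters most, which is the point of serving it structured markup instead of the .swf.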

The way we have things set up now, we're giving our curious search engine bots one simple page containing all of the information in our Flash movie. While this is better than nothing at all, it makes it somewhat more cumbersome for our end user to get at the info they actually might be interested in.

Let's say we have a Flash movie for our website at bananas.com. Our Flash movie has three sections which are movie clips. Each clip remains hidden until the user selects that section from the navigation menu. For bananas.com, we present these clips in the "Home", "About" and "Products" sections. The chief product of bananas.com is super-tasty bananas. All the data for each section is stored in a database called bananasINC and divided into tables. Each data table has the same name as the section it corresponds to.

So, as it is now, our PHP script dumps all the data into one HTML file for the robot to index, which, like a good little robot, it does. Now our potential customer does a search for vendors of super-tasty bananas and our site comes up first. The potential customer dutifully clicks the link and lands on our page. We sniff them out with our script, and their user agent string identifies them as (most likely) human, so they get our Flash presentation.

So far so good, but less than ideal.

They came to bananas.com hungry for bananas and they landed on the "Home" section of our Flash movie. Now, there may be some degree of information about bananas in the "Home" section, but what the potential customer really wanted was the "Products" section, which features mouth-watering photos of bananas. Since our content is wrapped in Flash, we have to do a little bit of extra work to get the monkey... er, potential customer, to the desired page. We could simply trust our customers to click the "Products" section in our navigation menu after the Flash piece loads, but wouldn't it be cooler if we could just send them there to begin with?

Well, guess what? We can.

Engineering a Banana Split

The trick lies in creating a separate, search-indexable version of each section within our Flash front end.

We need to create several pages that utilize our bot detection script, but this time, we make more specific queries to our database and place the results in the versions that we show to the robot. Instead of displaying all our database info for the whole site on one page, we'll break it up into sections that correspond to the sections in our Flash movie. We now have three pages of data, each linked to one another for robots to trawl through. This simple trick allows the search engines to better index our content.

The resultant code for our "Products" page might look something like this:


 <?php
 /*
 detectBrowser() code goes here
 */
 $user_agent = $_SERVER["HTTP_USER_AGENT"];
 $isBrowser = detectBrowser($user_agent);
 if ($isBrowser == "Bot") {
     mysql_connect("hostname", "username", "password");
     mysql_select_db("bananasINC");
     $result = mysql_query("SELECT * FROM products");
     // and some code to render the results in html or xml
 } else {
     // flash embed code
 }
 ?>

 

Notice that we've added a specific data table to our query statement. In this case it is "products." The code for our "About" page would be similar, but would read "SELECT * FROM about". Repeat this step for as many sections as our Flash movie contains.
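Since the section pages are meant to link to one another for robots to trawl through, the bot version of each page could end with a short list of links to the other sections. The sketch below is one way to do that; the file names in $sections and the function name renderSectionLinks() are hypothetical.

```php
<?php
// Hypothetical map of section names to page URLs (the three example sections).
$sections = array(
    "home"     => "index.php",
    "about"    => "about.php",
    "products" => "products.php",
);

// Builds a list of links to every section except the one being viewed,
// so a crawler that lands on one page can reach the others.
function renderSectionLinks(array $sections, $current) {
    $html = "<ul>\n";
    foreach ($sections as $name => $url) {
        if ($name === $current) {
            continue; // no need to link to the page we are already on
        }
        $html .= '<li><a href="' . htmlspecialchars($url) . '">'
               . htmlspecialchars(ucfirst($name)) . "</a></li>\n";
    }
    return $html . "</ul>\n";
}

echo renderSectionLinks($sections, "products");
```

Keeping the section list in one array also means adding a new section to the site is a one-line change.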

Using FlashVars

Now that's all well and good, but how do we tell Flash which page to load once the customer follows the link to our site from the search engine? Let's say the search engine indexes our product page at the address http://www.bananas.com/products.html. When the user clicks the link within their search results, the page they get loads up the Flash movie. Flash movies tend to be rather low in the intelligence department, and there's no way it's going to know which section to flip to unless we tell it.

To tell Flash this sort of information, we have to use something called FlashVars. FlashVars is a parameter of the object/embed tags that we use to stick the movie into the HTML page that contains it.
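FlashVars takes its values in the same name=value&name=value form as a URL query string, so PHP's http_build_query() can assemble it. The following is only a sketch: the function name buildEmbedTag() is made up, and a real page would also carry the matching <object> tag and the other embed attributes.

```php
<?php
// Builds a bare-bones <embed> tag whose FlashVars attribute carries $vars.
function buildEmbedTag($movie, array $vars) {
    // URL-encode the variables and join them with "&".
    $flashVars = http_build_query($vars, "", "&");
    // Escape both values for safe use inside HTML attributes.
    return '<embed src="' . htmlspecialchars($movie) . '"'
         . ' FlashVars="' . htmlspecialchars($flashVars) . '"'
         . ' type="application/x-shockwave-flash"></embed>';
}

echo buildEmbedTag("index.swf", array("section" => "products")), "\n";
```

Building the tag from an array keeps the encoding in one place if you later pass more than one variable to the movie.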

But where do we grab the value to pass to FlashVars? Well, we could hard code each page so that our Flash embed code reads like this:


 <?php
 /*
 detectBrowser() code omitted for simplicity
 */
 $user_agent = $_SERVER["HTTP_USER_AGENT"];
 $isBrowser = detectBrowser($user_agent);
 if ($isBrowser == "Bot") {
     mysql_connect("hostname", "username", "password");
     mysql_select_db("bananasINC");
     $result = mysql_query("SELECT * FROM products");
     // and some code to render the results in html or xml
 } else {
     // leave PHP mode so the embed markup is sent as plain HTML
 ?>
     <EMBED src="index.swf" FlashVars="section=products">
     </EMBED>
     <!-- etc -->
 <?php
 }
 ?>

 

All we've done is add the parameter FlashVars to the <embed> tag. You should also add it to the <object> tag, but I've omitted that code for the sake of brevity. This parameter is passed to our Flash movie, thus taking potential customers straight to the content they want.

This method has a great advantage in that the potential customer can bookmark that page and always return to the desired section of our Flash movie. The big disadvantage? It's a whole lot more work for us as developers to put this code on every page we wish to be individually indexed. The simpler way to do this is to have one page - the index.php of bananas.com - be the only page that has the Flash movie embedded in it. Every other page, instead of showing the Flash movie to browsers, redirects the user to index.php and passes the name of the page as a GET parameter.

For example, here is the final mockup of our products page:


 <?php
 /*
 detectBrowser() code omitted for simplicity
 */
 $user_agent = $_SERVER["HTTP_USER_AGENT"];
 $isBrowser = detectBrowser($user_agent);
 if ($isBrowser == "Bot") {
     mysql_connect("hostname", "username", "password");
     mysql_select_db("bananasINC");
     $result = mysql_query("SELECT * FROM products");
     // and some code to render the results in html or xml
 } else {
     // send human visitors to the main page, passing the section name along
     $URL = "http://www.bananas.com/?section=products";
     header("Location: $URL");
     exit; // stop here so nothing else is sent after the redirect
 }
 ?>

 

This will display all of our database-culled data to a robot and redirect the user to the main page while passing the relevant section of the Flash movie as a variable appended to the URL. I've used GET since it has the advantage of exposing the variable in the URL for bookmarking. Naturally there are circumstances where this exposure would not be wanted, or the variable data would be too long for GET. A Location redirect cannot carry POST data, so in those cases you would need to pass the value some other way, such as a session variable.
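If several pages share the redirect, the URL can be built by a small helper that only accepts known section names. This is a sketch under assumptions: the function name buildRedirectUrl() and the allowed list are illustrative, not part of the original pages.

```php
<?php
// Returns the redirect URL for a section, or the bare home URL if the
// section name is not in the allowed list.
function buildRedirectUrl($base, $section, array $allowed) {
    if (!in_array($section, $allowed, true)) {
        return $base; // unknown section: fall back to the home page
    }
    // urlencode() keeps odd characters from corrupting the query string.
    return $base . "?section=" . urlencode($section);
}

$allowed = array("home", "about", "products");
$url = buildRedirectUrl("http://www.bananas.com/", "products", $allowed);
echo $url, "\n";

// On a real page you would follow this with:
// header("Location: " . $url);
// exit;
```

Checking the name against a fixed list means a mistyped or tampered section name simply lands the visitor on the home page.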

Now our index.php page code should look like this:


 <?php
 /*
 detectBrowser() code omitted for simplicity
 */
 
 // grab the GET string that was passed by the redirect code on products.html;
 // escape it because it is written into the HTML below
 $section = isset($_GET['section']) ? htmlspecialchars($_GET['section']) : "";
 $user_agent = $_SERVER["HTTP_USER_AGENT"];
 $isBrowser = detectBrowser($user_agent);
 if ($isBrowser == "Bot") {
     mysql_connect("hostname", "username", "password");
     mysql_select_db("bananasINC");
     $result = mysql_query("SELECT * FROM home");
     // and some code to render the results in html or xml
 } else {
     // here is some more complete embed code;
     // leave PHP mode so the markup is sent as plain HTML
 ?>
 <OBJECT classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
  codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,29,0"
  ID=sign_on WIDTH=100% HEIGHT=100%>
   <PARAM NAME=movie VALUE="index.swf">
   <!-- add the param FlashVars and set the value equal to a variable named
        section with a value of the PHP variable set above. This variable is
        now accessible from inside Flash. -->
   <PARAM NAME="FlashVars" VALUE="section=<?php echo $section; ?>">
   <PARAM NAME=quality VALUE=high>
   <PARAM NAME=menu VALUE=false>
   <PARAM NAME=bgcolor VALUE=#000000>
   <!-- add the same FlashVars param to the embed tag for cross browser compatibility -->
   <EMBED src="index.swf" FlashVars="section=<?php echo $section; ?>" quality=high
    menu=false bgcolor=#000000 WIDTH=100% HEIGHT=100%
    TYPE="application/x-shockwave-flash"
    PLUGINSPAGE="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"></EMBED>
 </OBJECT>
 <?php
 }
 ?>

 

This bit inside the <embed> and <object> code utilizes the FlashVars parameter. It creates a variable named "section" inside our Flash movie and sets its value to "products". Now we just need to make our Flash movie check for this variable and behave accordingly. The easiest thing to do is put some code on the first frame of the movie that reads something like this:

 if (section == undefined) {
     // continue on as usual
 } else {
     display(section);
 }

This code checks to see if "section" got tagged with an assigned value - is our user being redirected here, or is this a direct request? If "section" has no value, we take the user to the "Home" area of the movie. If there is a value associated with "section", we pass it to a function called display(). The exact code for the display function is up to you, but presumably it would be similar to whatever code you use to switch between the sections in the movie.

Peel Slowly and See

And there you have it! Our potential customer goes straight to the mouth-watering photos of super-tasty bananas and begins to drool excitedly. The best part is that our PHP workaround is non-intrusive. It's unlikely most users will ever know what happened. As a bonus, they can still bookmark the page with the GET query so they can return whenever they like and get back to the exact place they left off.

Applying PHP and Flash solves our client's original demand for search engine indexing, and it also makes the site more usable.

There are other methods one can use to do this. To see a different technique in action, try clicking this link which leads to a page within ultrashock.com's site. The page will redirect you in a manner similar to the one we've worked out in this article. The actual code ultrashock.com uses is a little different, but it should help you visualize other ways of arriving at the same goal.

Some usability folks might argue that the momentary pause during a redirect is disorienting. Perhaps, but I believe the negative effect of such a small distraction is outweighed by the enormous potential gain.

Should you choose to take this idea out into the wild and use it the next time a client really, really, really wants a Flash site, keep a few things in mind. First, there have been reports of some strange behavior with FlashVars and gotoAndPlay() methods when such code is put in the first frame of a movie. In such a case, it's probably best to put it on the second or third frame and leave the first one or two frames blank. Also, using this technique effectively requires sitting down and planning out your data structure carefully to make sure that everything corresponds nicely between Flash and your database.

And, finally, please don't sell bananas over the internet.

  • This page was last modified 11:09, 17 November 2009.