ASP 101 - Active Server Pages 101 - Web01
Extending Your Page Names With ASP.NET

by Wayne Berry

Prologue

Back in November of 2000, I wrote an article for ASP 101 called "Extending Your Page Names". Based on the number of people who have contacted me with questions and praise, it was one of the most successful articles I have ever written. Since then we have seen the launch of Windows 2003, IIS 6.0, ASP.NET, and the rise of Google. So I felt it was time to update the article.

Introduction

Dynamically generated pages are the only way to have a truly big site; however, dynamic page names are not very user friendly or search engine friendly. For example, having a page called:

product.asp?Id=4

is not as memorable as:

/appliances/dishwashers/kenmore/Model3809.htm

This article shows how you can get a static-looking page name and dynamic pages at the same time.

Search Engines

There is another benefit to having well-named pages: search engines are able to traverse them. Most search engines, like Alta Vista and Yahoo, do not traverse pages with question marks in their URLs, since they are afraid of entering a never-ending traverse. Once your dynamic page names look like static pages, search engines can categorize them, driving more traffic to your site.

Another benefit is that some search engines traverse the URL of the page looking for keywords, and rank these keywords with more importance than the text in the HTML. For this reason it is good to have keywords in your URL that match the page description.

Minimum Requirements

In order to get well-named pages, you will need Windows 2000/IIS 5.0 or Windows 2003/IIS 6.0 running either ASP.NET or ASP. If you are running ASP, please see the earlier article for the code sample needed. Two enhancements introduced in IIS 5.0 make great page names possible: Custom Error Pages that use Server.Transfer, and the ability to call Server.Transfer from your own Active Server Pages/ASP.NET pages. Custom Error Pages were available in IIS 4.0, but they used Response.Redirect, which will not work here because search engines do not follow redirects.

Overview

As the site programmer, you link to pages that don't exist, using well-named URLs. You then tell IIS that you want an ASP.NET page (404.aspx) to handle all the 404s that come to the site. Inside this ASP.NET page, you convert the well-named URL back into the real dynamic URL and do a Server.Transfer to execute that page and return it to the user's browser.

Getting Started

If you open a browser and type in:

http://www.myserver.com/appliances/dishwashers/kenmore/Model3809.htm

(replacing myserver.com with your web site name), the server will return a 404. The first thing to do is to have all your 404s handled by a single .aspx page. You can do this by using the Custom Error Page feature of IIS 5.0/6.0. To turn on custom error pages, follow these steps:

  1. Open IIS Manager in MMC.
  2. Right-click on the web site node and choose Properties.
  3. Click on the Custom Errors tab.
  4. Scroll down until you see the HTTP Error -- 404.
  5. Double-click on 404 to open the "Error Mapping Properties" dialog.
  6. Change the Message Type to URL.
  7. For the URL, enter /404.aspx
  8. Click OK and then OK again.

Now all your 404 errors will be handled by 404.aspx. The nice thing about IIS is that when it calls 404.aspx, it will send the page name that caused the error as a parameter in the query string.
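
Before relying on that query string, it helps to know its shape. The sketch below assumes IIS 5.0/6.0 hands the page a value like "404;http://www.myserver.com:80/appliances/dishwashers/kenmore/Model3809.htm" (status code, a semicolon, then the full URL). That shape is an assumption worth verifying on your own server, and NotFoundQuery and ExtractPath are hypothetical names, not part of ASP.NET.

```csharp
using System;

public static class NotFoundQuery
{
    // Returns the path that triggered the 404, e.g. "/a/b/Model3809.htm".
    // Assumes the "404;http://host:port/path" shape; any value that does
    // not match that shape is returned unchanged.
    public static string ExtractPath(string queryString)
    {
        if (string.IsNullOrEmpty(queryString))
        {
            return "";
        }

        // Drop the "404;" prefix when it is present.
        int separator = queryString.IndexOf(';');
        string candidate = separator >= 0
            ? queryString.Substring(separator + 1)
            : queryString;

        // Keep only the path portion of an absolute http/https URL.
        Uri uri;
        if (Uri.TryCreate(candidate, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
        {
            return uri.AbsolutePath;
        }
        return candidate;
    }
}
```

Inside 404.aspx you would call it as NotFoundQuery.ExtractPath(Request.ServerVariables["QUERY_STRING"]). The path comes back still percent-encoded, so decoding remains a separate step.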

404.aspx

Now create a 404.aspx page to handle your errors. The first thing that you need to do is get the name of the page that had the 404. This line of code will get the page name from the query string:

// C#: Get the Page Name
strQ = Request.ServerVariables["QUERY_STRING"].ToString();

So what do we need from strQ? In the example above, it contains the requested path: /appliances/dishwashers/kenmore/Model3809.htm. All we really need from it is Model3809, since this could be the unique key in the product database. The following lines of code take the model number and find the product id.

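// C#
// Needs: using System.Configuration; using System.Data;
//        using System.Data.SqlClient; using System.Text.RegularExpressions;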
string   strParentPageName="";
string   strPageName="";
string   strQ="";
int      nIndex=0;
int      nProductId = 0;
string   strSQL = "";
strQ = Request.ServerVariables["QUERY_STRING"].ToString();
// Decode the HTML
// Strips Out Trash that the URL creators might
// Have Entered
strQ = Regex.Replace(strQ,"[_]"," ");
strQ = Regex.Replace(strQ," "," ");
strQ = Regex.Replace(strQ,@"\+"," ");
strQ = Regex.Replace(strQ,"%20"," ");
strQ = Regex.Replace(strQ,"%C3%A9","é");
strQ = Regex.Replace(strQ,"%C3%A1","á");
strQ = Regex.Replace(strQ,"%27","'");
strQ = Regex.Replace(strQ,"%5C","\\");
// Strip Ending Slash If There Is One
if (strQ.Length>0 && strQ[strQ.Length-1]=='/')
{
    strQ = strQ.Substring(0,strQ.Length-1);
}
// Strip The Page Name
nIndex = strQ.LastIndexOf("/");
if (nIndex>0)
{
    strPageName = strQ.Substring(nIndex+1,(strQ.Length-nIndex)-1);
    strQ = strQ.Substring(0,nIndex);
}
// Transfer To the Home Page If There Appears
// To Be a Problem With the URL, maybe a real 404?
if (strPageName.Length<4)
{
   Server.Transfer("/default.aspx");
}
// Trim Off .htm From Location Name
if (strPageName.EndsWith(".htm", StringComparison.Ordinal))
{
	strPageName = strPageName.Substring(0,strPageName.Length-4);
}
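// Now We Have The Name of the Page Decode Some More
strPageName = Regex.Replace(strPageName,"%2F","/");
strPageName = Regex.Replace(strPageName,"%2f","/");
// Lookup the Product Id For This Page Name In The Product Database.
// Product/Product_Model is the table and column that CreateProductURL
// (later in this article) reads when it builds the URL.
// The name is passed as a parameter rather than concatenated into the SQL,
// so quotes in the URL cannot alter the query.
strSQL = "SELECT Product_Id FROM Product WHERE Product_Model = @PageName";
SqlConnection myConnection = new
	SqlConnection(ConfigurationSettings.AppSettings["ConnectionString"]);
SqlCommand myCommand = new SqlCommand(strSQL, myConnection);
// The length of 255 is an assumption; match it to your column definition
myCommand.Parameters.Add("@PageName", SqlDbType.NVarChar, 255).Value = strPageName;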
// Execute the command
myConnection.Open();
SqlDataReader result = myCommand.ExecuteReader(CommandBehavior.CloseConnection);
// Not IN The Product Database, Transfer To the Home Page
if (!result.Read())
{
 Server.Transfer("/default.aspx");
}
nProductId = Int32.Parse(result["Product_Id"].ToString());
result.Close();
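
The parsing half of the listing above can also be written more compactly. This is only a sketch, not a replacement for the original code: PageNameParser and GetPageName are hypothetical names, and it assumes the same conventions as the listing ("_" and "+" stand for spaces, and the page name ends in .htm). It leans on Uri.UnescapeDataString to decode every percent-escape in one call, instead of one Regex.Replace per character.

```csharp
using System;

public static class PageNameParser
{
    // Returns the decoded page name without a trailing ".htm",
    // or "" when the path has no usable last segment.
    public static string GetPageName(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            return "";
        }

        // Strip any ending slashes, then keep what follows the last "/".
        string trimmed = path.TrimEnd('/');
        int lastSlash = trimmed.LastIndexOf('/');
        string segment = trimmed.Substring(lastSlash + 1);

        // Trim off .htm from the page name.
        if (segment.EndsWith(".htm", StringComparison.OrdinalIgnoreCase))
        {
            segment = segment.Substring(0, segment.Length - 4);
        }

        // Same conventions as the listing above: "_" and "+" mean a space.
        segment = segment.Replace('_', ' ').Replace('+', ' ');

        // Decodes %20, %27, %C3%A9 and the other percent-escapes at once.
        return Uri.UnescapeDataString(segment);
    }
}
```

For /appliances/dishwashers/kenmore/Model3809.htm this yields Model3809, which you can then hand to the database lookup.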

Now that we have the Product Id we need to store it before transferring to the correct .aspx page.

In ASP we can't pass the Product Id in the query string via the Server.Transfer (this is an IIS restriction). So, we pass it via the Session Object. However, a major difference in ASP.NET is that you can pass the information in a query string with a Server.Transfer.

Server.Transfer("/product.aspx?Product_Id=" + nProductId);

When you try this, your address bar in the browser will say:

http://www.myserver.com/appliances/dishwashers/kenmore/Model3809.htm

and because of the Server.Transfer, the URL in the browser's address bar doesn't change, and the browser doesn't have to perform another round trip, unlike Response.Redirect.

Also notice that the directories do not exist at all. In fact, in this case it doesn't matter what the rest of the URL says, apart from the server name and the page name. For instance, all these URLs go to the same page:

http://www.myserver.com/Model3809.htm
http://www.myserver.com/trucks/ford/Model3809.htm

So why put in the directories? The directories will give you higher search engine placement. Because some search engines treat the words in the URL as stronger keywords than the words in the title or body of the HTML, directory names are very important.

Calling Pages that Don't Exist

We have covered the technology to convert URLs that don't exist to dynamic URLs. However, in order to get the search engine to traverse those pages, you need to link to the URLs that don't exist. In other words, the only way the search engine is going to find your Model3809.htm page is if you link to it.

When you linked to this page before, all you had to do was use the Product Id like this: "product.asp?Id=4". So let's take that Product Id and create a function that returns the correct URL.

// C#
public String CreateProductURL(int lProductId)
{
String strURL = "";
String strSQL = "SELECT Product_Model FROM Product WHERE Product_Id =" + lProductId.ToString();
SqlConnection myConnection = new SqlConnection(ConfigurationSettings.AppSettings["ConnectionString"]);
SqlCommand myCommand = new SqlCommand(strSQL, myConnection);
// Execute the command
myConnection.Open();
SqlDataReader result = myCommand.ExecuteReader(CommandBehavior.CloseConnection);
if (!result.Read())
{
  // No Product With This ID, Alert the Page Creator
  throw new Exception("Invalid Product Id: " + lProductId);
}
strURL = "/" + result["Product_Model"].ToString() + ".htm";
// Closing the reader also closes the connection (CommandBehavior.CloseConnection)
result.Close();
return(strURL);
}

Now when you want to add a URL you do it like this:

<A HREF="<%= CreateProductURL(4) %>">DishWasher</A>

Note: In this example we assume you don't know the model name and you have to go back to the database -- in the real world this might not be the case. If you know the model name then you can create the URL without making another call to the database.
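
If you already know the model name and the keyword directories you want, building the URL needs no database call at all. The sketch below is one way to do it under a few assumptions of mine: ProductUrlBuilder and Build are hypothetical names, directories are lowercased to match the example URLs in this article, and spaces are written as "_" because 404.aspx turns "_" back into a space.

```csharp
using System;
using System.Text;

public static class ProductUrlBuilder
{
    // Builds "/dir1/dir2/.../Model.htm" from keyword directories and a model name.
    public static string Build(string[] directories, string model)
    {
        if (string.IsNullOrEmpty(model))
        {
            throw new ArgumentException("A model name is required.", "model");
        }

        StringBuilder url = new StringBuilder();
        foreach (string directory in directories ?? new string[0])
        {
            // Skip empty entries so the URL never contains "//".
            if (!string.IsNullOrEmpty(directory))
            {
                url.Append('/').Append(Encode(directory.ToLowerInvariant()));
            }
        }
        url.Append('/').Append(Encode(model)).Append(".htm");
        return url.ToString();
    }

    // Spaces become "_" (which 404.aspx turns back into spaces);
    // anything else unsafe in a URL segment is percent-encoded.
    private static string Encode(string segment)
    {
        return Uri.EscapeDataString(segment.Trim().Replace(' ', '_'));
    }
}
```

One caveat with this design: a model name that contains a real underscore will not survive the round trip, because 404.aspx cannot tell it apart from an encoded space.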

Performance

There are some obvious performance issues associated with this technique. First, it might require an extra database call to create the well-named URLs that don't exist. Secondly, it will always require an extra database call to figure out the correct URL from the 404 URL. Finally, the Server.Transfer is expensive.
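
One way to soften the second cost is to remember lookups you have already made. This is a sketch rather than something from the original technique: ProductIdCache is a hypothetical class, the lookup delegate stands in for the database query in 404.aspx, and ConcurrentDictionary requires .NET 4.0 or later.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class ProductIdCache
{
    // Page names are compared without regard to case.
    private readonly ConcurrentDictionary<string, int> cache =
        new ConcurrentDictionary<string, int>(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, int> lookup;

    // lookup is whatever performs the real database query for a page name.
    public ProductIdCache(Func<string, int> lookup)
    {
        if (lookup == null)
        {
            throw new ArgumentNullException("lookup");
        }
        this.lookup = lookup;
    }

    // Returns the cached id; the lookup runs only when the name is not cached yet.
    // Under heavy concurrency the lookup may run more than once for the same name.
    public int GetProductId(string pageName)
    {
        return cache.GetOrAdd(pageName, lookup);
    }
}
```

Keep a single instance for the whole application (a static field, for example). The cache never expires entries, so a renamed or deleted product stays cached until the application restarts.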

An Update

Since I wrote the original article, my brother (Glen Berry) pointed out a very clever way to avoid the extra database call in 404.aspx: simply put the product identifier in the URL that you create. For example:

Instead of this:

/appliances/dishwashers/kenmore/Model3809.htm

You do this:

/appliances/dishwashers/kenmore/453422/Model3809.htm

Your 404.aspx page would then parse out the last directory in the path and get 453422. In the example 404.aspx code above, nProductId would equal 453422. This avoids the database call and makes the page faster. It makes the URL less readable, but still gives you good search engine placement.
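
Parsing the identifier out of the path might look like the sketch below. ProductIdParser and TryGetProductId are hypothetical names, and the code assumes the id is always the directory directly before the page name and consists only of digits.

```csharp
using System;
using System.Globalization;

public static class ProductIdParser
{
    // Reads the numeric directory that sits directly before the page name,
    // e.g. 453422 in "/appliances/dishwashers/kenmore/453422/Model3809.htm".
    public static bool TryGetProductId(string path, out int productId)
    {
        productId = 0;
        if (string.IsNullOrEmpty(path))
        {
            return false;
        }

        string[] segments = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        if (segments.Length < 2)
        {
            return false; // no directory in front of the page name
        }

        // NumberStyles.None accepts digits only: no sign, spaces or separators.
        int parsed;
        string candidate = segments[segments.Length - 2];
        if (int.TryParse(candidate, NumberStyles.None, CultureInfo.InvariantCulture, out parsed)
            && parsed > 0)
        {
            productId = parsed;
            return true;
        }
        return false;
    }
}
```

When the method returns false, 404.aspx can fall back to the database lookup or transfer to the home page, as the earlier code does for URLs it cannot resolve.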

Example

An example of this can be seen at http://www.kulshan.com. Most of the leaf pages on Kulshan.com, like the individual restaurant reviews, use this technique to get better search engine placement.

Summary

Creating URLs that you can market and that work with search engines is fairly easy if you don't have to create the directory structure and files that those URLs represent. Using IIS 5.0/6.0 Custom Error Page technology to handle the 404s can yield great page names.

XCache Technologies

Founded in 1996 by Wayne and Dina Berry, XCache Technologies is a lead innovator of Web caching software and Internet performance solutions. The company's flagship products XCache, XCompress and XTune provide Web developers and site owners with robust performance solutions that significantly improve site experiences for their end users. XCache Technologies is headquartered in Bellingham, Washington and can be contacted via their website: http://www.xcache.com.
