It would be nice to think that once we've set up our site, we can sit back and just keep tweaking the content and adding the occasional new item on demand, while it all keeps working transparently in the background. This kind of ideal world exists only in textbooks. We're only too aware of how easy it is for your site to get 'broken' by your own or other people's actions. Even the biggest sites, which have large development teams and no obvious shortage of resources, suffer broken links.
For example, if your site links to another site outside your direct control, you may suddenly find that those links are broken. The other site may have moved some pages around, changed its main URL, or just gone out of business altogether. And they didn't have the decency to let you know! But then again, do you know who provides links to your site? If you change the URL of a page on your site, do you email all the other sites that link to it to tell them?
In this chapter we'll tackle the issues that arise when you provide links to other sites, and they provide links to you. And of course we also need to think about how we ensure that we don't break any links that are wholly within our own site when we move pages around. We'll also look at how we can collect opinions from our visitors about our site. In particular we'll consider:
Ways of preventing errors and broken links appearing on your site
How we can create custom error pages to catch broken links or other errors
How we can log inter-site navigation and other errors, when they do occur
Ways of checking that other sites we provide links to are still available
How we can provide feedback and collect opinions from our visitors
We'll start with a look at how we can try to prevent errors or broken links appearing on our site in the first place.
Designing to Prevent Errors
No matter how well you plan and manage the development of your Web site, you are going to have at least a few errors or broken links appearing at some stage. It might not always be your fault. The Web depends on an unimaginably complex mass of inter-site and inter-page links to make it what it is. Maintaining and updating your share of these links is often the biggest headache of all for the site administrator. In this part of the chapter, we'll look at how errors and broken links can arise, and see some basic procedures that will help to minimize their effects.
A Typical Scenario
Here's a typical scenario. A potential visitor to your site is looking for information on, say, reverse boost accelerator flanges. They go to a search engine and enter the criteria. Back comes a list of suppliers, with your site sitting proudly at the top of the list. They click the link and get this:
OK, so it might be that you have stopped making reverse boost accelerator flanges. However, it might be that the page in the search engine's list referred to purple ones, and you now only make green ones. Even worse, it might be that you just changed the name or location of the page. Whichever is the case, the result is the same. The potential visitor (and customer) will buy their flanges from someone else.
A Better Solution
Here's an alternative scenario. Someone goes to a search engine and finds a link that points to a page on our Web-Developer site that no longer exists. Instead of the impersonal 404 Not Found error message, this is what they see:
As well as an acknowledgement that our site still exists, the visitor gets a helpful message and (more importantly) a couple of links to follow. This is crucial because, having attracted them to our site, we want to keep them there. From this page, they can go to our Home page or to a map of our site. You'll see how we implement custom error pages like this later in the chapter.
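To make the idea concrete before we reach the real implementation, here is a minimal sketch of a custom error page written with Python's standard library rather than the server technology used in this chapter. The URLs in `HOME_URL` and `SITE_MAP_URL`, the `KNOWN_PAGES` table, and the page wording are all assumptions made for illustration; they are not taken from the Web-Developer site.

```python
from html import escape
from http.server import BaseHTTPRequestHandler

# Hypothetical URLs, assumed for this sketch only.
HOME_URL = "/"
SITE_MAP_URL = "/sitemap.html"


def render_not_found_page(requested_path: str) -> str:
    """Build a friendly error page that offers the visitor somewhere to go."""
    safe_path = escape(requested_path)  # never echo raw request data into HTML
    return (
        "<html><head><title>Page not found</title></head><body>"
        "<h1>Sorry, we could not find that page</h1>"
        f"<p>The address <code>{safe_path}</code> does not exist on this site.</p>"
        f'<p>Try our <a href="{HOME_URL}">Home page</a> or the '
        f'<a href="{SITE_MAP_URL}">site map</a>.</p>'
        "</body></html>"
    )


class CustomErrorHandler(BaseHTTPRequestHandler):
    """Serve known pages, and the custom error page for everything else."""

    # Stand-in for the real site content (an assumption for this sketch).
    KNOWN_PAGES = {"/": "<html><body><h1>Home</h1></body></html>"}

    def do_GET(self):
        body = self.KNOWN_PAGES.get(self.path)
        status = 200
        if body is None:
            # Keep the 404 status so browsers and search engines still
            # know the resource is missing, but send our own page body.
            status = 404
            body = render_not_found_page(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
```

To try it, pass `CustomErrorHandler` to `http.server.HTTPServer` and request any address other than `/`. The point of the design is that the status code stays 404 while the page body becomes something helpful.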
Provide a Site Map
Most Web users have got used to the fact that pages (and sites) move around, and they accept this. However, once they get to our site, particularly if they are looking for something specific, we need to make things easy to find. The common answer is a site map page. This can be as simple as a list of pages, or as complex as a clickable graphical representation of the site.
Our Web-Developer site uses a page that is mid-way between these styles. It provides a description of each section of the site, including the kind of content it contains, followed by links to the main pages within each section. It's certainly more attractive than just a list of links, and hopefully easier to use as well:
Missing Images and Broken Links
Keeping track of links between pages can be hard enough, but keeping track of image file links can be even harder. Placing all of your site's images in one folder helps, because at least then they all have the same path. However, if the image isn't there, getting the right path doesn't help. There isn't much that looks worse than a page with missing images:
To make matters worse, the page you see in the previous screenshot is even more broken than it appears. The first and third links both have incorrect URLs, and so clicking on them results in an error. The only redeeming factor is that they do bring up our custom error page, so the user can go to the site map or home page to find the resource they want.
Recording the Errors
While it's generally easy to find missing image errors, because they are so visible in the page, they can arise without warning. If you change a graphic for one page, and delete the original one, do you remember to check for any other pages that use the old graphic? You might not visit that page for a while to find the 'missing image' symbol. Even worse, broken links to other pages are not visible when you just load a page. It's only when you click on a link that you find the error.
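One way to catch these problems before a visitor does is to scan each page for links and images whose targets no longer exist on disk. The following is only a sketch under stated assumptions: it uses Python's standard library, it checks local files only (external links are skipped), and the function name `find_broken_local_links` and its `site_root` parameter are our own inventions for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> and <img src="..."> attributes."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "img": "src"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.targets.append(value)


def find_broken_local_links(page_file: Path, site_root: Path) -> list[str]:
    """Return the link and image targets in page_file that are missing on disk."""
    collector = LinkCollector()
    collector.feed(page_file.read_text(encoding="utf-8"))
    broken = []
    for target in collector.targets:
        parsed = urlparse(target)
        # Skip external links (http:, mailto:, ...) and pure "#fragment" links.
        if parsed.scheme or parsed.netloc or not parsed.path:
            continue
        path = unquote(parsed.path)
        if path.startswith("/"):
            # Site-relative link: resolve against the site's root folder.
            candidate = site_root / path.lstrip("/")
        else:
            # Page-relative link: resolve against the page's own folder.
            candidate = page_file.parent / path
        if not candidate.exists():
            broken.append(target)
    return broken
```

Running a check like this over every page after deleting or renaming a file would flag the forgotten references straight away. It does not replace logging real errors, because it cannot see links that point to us from other sites.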
The 404 Not Found HTTP Header
This is where the custom error page you saw earlier in this chapter is so useful. When any broken link is activated, whether it's a link to another page or just a link to an image within a page, Internet Information Server loads our custom error page.
Notice that the error page (either the standard version or our custom one) is still sent back to the browser when the link is to an image on the page, rather than a link to another page. The viewer doesn't see the error page, however, because the Web server accompanies it with a 404 Not Found header in the response (see Chapter 2 for a discussion of HTTP headers). The browser interprets this header and, because it knows that the request is for an image, displays the 'missing image' icon instead.
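To see how the error page and the 404 indication travel together, here is a small sketch that assembles such a response by hand. Strictly speaking, the `404 Not Found` text is carried in the status line that precedes the headers. The function name `build_not_found_response` is an assumption for illustration, and a real Web server builds this for us.

```python
from http import HTTPStatus


def build_not_found_response(body_html: str) -> bytes:
    """Return the raw bytes of an HTTP response that carries an error page."""
    body = body_html.encode("utf-8")
    status = HTTPStatus.NOT_FOUND  # value 404, phrase "Not Found"
    head = (
        f"HTTP/1.1 {status.value} {status.phrase}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # a blank line separates the headers from the body
    )
    return head.encode("ascii") + body
```

The body is ordinary HTML whatever kind of resource was requested. It is the 404 in the first line that tells the browser the requested resource is missing.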
So our custom error page is still loaded by the Web server and sent to the browser for every broken link, be it a missing page, a missing image, or any other missing resource that the browser requests, perhaps a Java applet or an ActiveX control. Inside our custom error page is code that records the fact that there was an error, and the reason why the error occurred, in a database on our server. The next screenshot shows the page that our administrators can view to monitor for errors. This is the result after loading the Articles page with its two missing graphics you saw earlier, and after clicking on the two broken links:
You can see the entries for the missing graphics, which appear twice because we went back to the page after getting the first 404 Not Found error, and the entries for the broken page links. Later in this chapter we'll look at the whole process of logging broken links, and the way we can display and manage the results. You'll see how the pages shown here are created.
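As a preview of the idea, and not of the implementation we build later, the following sketch records each broken link in a database and summarizes the results for an administrator. It assumes Python's built-in `sqlite3` module as a stand-in for the real database. The table name `broken_links`, its columns, and the three function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def open_error_log(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the log database, creating the (hypothetical) table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS broken_links (
               id            INTEGER PRIMARY KEY AUTOINCREMENT,
               logged_at     TEXT NOT NULL,
               requested_url TEXT NOT NULL,
               referrer      TEXT,
               reason        TEXT NOT NULL
           )"""
    )
    return conn


def log_broken_link(conn, requested_url, referrer=None, reason="404 Not Found"):
    """Record one failed request; call this from the custom error page."""
    conn.execute(
        "INSERT INTO broken_links (logged_at, requested_url, referrer, reason) "
        "VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), requested_url, referrer, reason),
    )
    conn.commit()


def summarize_errors(conn):
    """Return (requested_url, hit_count) rows, most frequent first."""
    return conn.execute(
        "SELECT requested_url, COUNT(*) FROM broken_links "
        "GROUP BY requested_url "
        "ORDER BY COUNT(*) DESC, requested_url"
    ).fetchall()
```

Every failed request becomes its own row, which is why revisiting a page with a missing graphic produces a second entry for it. Storing the referrer, where the browser supplies one, shows which page holds the broken link.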