I was recently working with an old classic ASP file manipulation system when I started
getting some odd errors. I'd been using the same scripts for years and knew that nothing
in the application had changed, but I kept getting odd characters popping up
where I never had before.
This particular application works as follows. It retrieves a data file from
a Web site, does some manipulation of the file, and then sends an email using
the information harvested from the retrieved file.
Since I knew nothing in the application had changed, I decided to take a closer
look at the data file. At first glance everything looked fine. I requested the
file in my browser and couldn't detect any problems. I then modified the script
so it would save temporary files as it processed the data file to try to
find out where in the process the extra characters were being introduced. Still
no luck. When I opened these temp files in Notepad they looked fine as well.
I was starting to get really frustrated. I started thinking it had to be the
mail server, but a few tests later I had ruled that out and was back at square one.
I finally decided to see what would happen if I started with an empty data file.
So I fired up Visual Studio, published the absolute minimum data file that
would still process, and launched the application. It pulled down the file,
processed it, wrote out the temporary debugging files, and sent the email.
When the message arrived in my Inbox, I found the same strange characters.
It wasn't until I went back to look at the temp files that I found the problem.
Because I'd used a minimal source file, a couple of the temporary files the script created
should have been empty, but when I checked their properties I found that one of them was
3 bytes in size. When I opened the file, it certainly appeared empty, but there was
obviously something I wasn't seeing.
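After a few minutes of Web searching I came across the answer: Unicode files, or more
precisely the byte-order mark (BOM) that some editors write at the start of them. For a
UTF-8 file the BOM is exactly 3 bytes (EF BB BF), which matched my "empty" 3-byte file, and
those bytes typically show up as the characters ï»¿ when they are read as Windows-1252 text.
Visual Studio, Notepad, Web servers, and Web browsers all handle the BOM fine, but
classic ASP doesn't particularly like it in some situations.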
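
For anyone who wants to check a suspect file without relying on an editor that hides
the BOM, here is a minimal Python sketch. The function names are my own, chosen for
illustration, and it only looks for the UTF-8 BOM, not the UTF-16 or UTF-32 variants.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 byte-order mark."""
    # codecs.BOM_UTF8 is the 3-byte sequence EF BB BF.
    return data.startswith(codecs.BOM_UTF8)


def leading_bytes_hex(data: bytes, count: int = 3) -> str:
    """Return the first `count` bytes as space-separated hex, for eyeballing."""
    return " ".join(f"{byte:02X}" for byte in data[:count])
```

To use it on a real file, read the file in binary mode (for example
`open(path, "rb").read()`) and pass the result to either function.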
Tracing things back, the problem actually began when the data files started being created with
Visual Studio 2008. As I was doing a little more research, I discovered that some people
have run into similar problems when dealing with XML files created in Visual Studio.
If you want to see what I'm talking about first hand, try creating an empty
text or XML file in your preferred flavor of Visual Studio 2008.
Save the file and then check its properties from Windows Explorer. You'll find that the
file size is actually 3 bytes when you'd expect it to be zero.
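
If you don't have Visual Studio handy, the same effect can be reproduced with a few
lines of Python. This is only a sketch of what I assume the editor is doing (writing a
UTF-8 BOM ahead of the empty content); the helper name and the file name are made up
for the example.

```python
import os
import tempfile


def write_empty_utf8_bom_file(path: str) -> int:
    """Write an 'empty' UTF-8 file that carries a BOM and return its size in bytes."""
    with open(path, "wb") as handle:
        # Encoding an empty string with "utf-8-sig" yields just the BOM.
        handle.write("".encode("utf-8-sig"))
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        size = write_empty_utf8_bom_file(os.path.join(folder, "empty.txt"))
        print(size)
```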
To get rid of the extra bytes, all you need to do is save the file with a different encoding.
To do this, choose "File -> Save As...". In the "Save File As"
dialog box, you'll notice that the
"Save" button has an odd arrow on it. If you click the arrow, you'll get the option
to "Save with Encoding..." and choose the encoding you prefer. For those of you who
aren't sure, the encoding that probably offers the best backwards compatibility (at least
for US users) is "Western European (Windows) - Codepage 1252".
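
If the files come from someone else and you can't control how they are saved, the same
conversion can be done in code before the data is processed. The following is a rough
Python sketch rather than what my ASP application does: the function name is
hypothetical, it assumes the input really is UTF-8, and it will raise an error if the
text contains characters that code page 1252 can't represent.

```python
import codecs


def utf8_bom_to_cp1252(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM, if present, and re-encode the text as Windows-1252."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    text = data.decode("utf-8")
    # Raises UnicodeEncodeError for characters outside code page 1252.
    return text.encode("cp1252")
```

Run against the 3-byte "empty" file described above, this returns zero bytes, which is
what the file should have contained in the first place.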