ASP 101 - Active Server Pages 101 - Web06

Two Classes to Improve File System Access in .NET

by Christopher Surfleet

Introduction

I have felt for a while that although the classes available in the System.IO namespace did an excellent job allowing you to edit and process single files, there has been a significant lack of anything to help you deal with groups of files or directories. In an attempt to address this issue I have created two classes, FileInfoCollection and DirectoryInfoCollection, which allow access to files and directories using several powerful methods.

What Do They Do That I Can't Do Already?

Currently, the only way to deal with a collection of FileInfo or DirectoryInfo objects is to simply create an array of them. This works fine for simple tasks, but I wanted more control over these collections. Our two new classes will allow:

  1. Files or folders from several directories to be easily dealt with together.
  2. Sorting by several standard properties, e.g. Name, Path, File Size, or using a custom sorter class.
  3. Several ways of searching for files/directories within the collection, including a method to search by Regular Expression.

From this point on, in order to keep things simple, I will be discussing the FileInfoCollection only. The DirectoryInfoCollection is very similar in structure and function and so, instead of repeating all the explanations twice, I'll focus on the FileInfoCollection and have simply included the DirectoryInfoCollection as source code for you to take a look at.

Creating Powerful Collection Classes - Extending CollectionBase

When you want to create a collection class (i.e. a class which contains several items of another class), you will want to inherit from a class in the System.Collections namespace. In almost all cases, that class will be CollectionBase.

The MSDN .NET documentation tells us that CollectionBase provides the abstract (MustInherit in Visual Basic) base class for a strongly typed collection. That doesn't really help much does it? To explain a little, abstract means that you cannot instantiate the class directly, so the code in Listing 1 would not work:

Listing 1
// Compile error: CollectionBase is abstract, so it cannot be instantiated
CollectionBase objCollection = new CollectionBase();

The only way to use an abstract class is to inherit from it. If you have not encountered inheritance before then you'll probably find it helpful to take a look at this article on MSDN: Inheritance from a Base Class in Microsoft .NET before continuing.

Inheriting from CollectionBase immediately gives your class loads of useful properties and methods, such as the Count property, which returns the number of items in the collection, and the Clear() method, which removes all items from the collection. The most useful item in CollectionBase, however, is the protected property InnerList, which allows us to implement most of the standard functionality of a collection such as adding and removing objects. (Protected means that only a class that inherits from CollectionBase can use the property, so people will not be able to use FileInfoCollection.InnerList in their code but we can still use it within our class.)

Listing 2 shows the basic layout of our class.

Listing 2
using System;
using System.Collections;
using System.IO;
using System.Text.RegularExpressions;
namespace CustomControls.IO
{
	public class FileInfoCollection : CollectionBase
	{
		public FileInfoCollection() { /* do nothing */ }
	}
}

Pretty standard stuff so far. The next thing we need to do is add methods for adding and removing items from the FileInfoCollection. This can be accomplished easily using the methods within InnerList. All we have to do is ensure that we are accepting the right kind of object, so that nobody can drop a string or a hyperlink into the collection. The code in Listing 3 shows the methods that should be added to your class.

Listing 3
public void Add(FileInfo Item)
{
	this.InnerList.Add(Item);
}
public void Remove(FileInfo Item)
{
	this.InnerList.Remove(Item);
}
public void AddRange(FileInfo[] Items)
{
	this.InnerList.AddRange(Items);
}
public void AddRange(FileInfoCollection Items)
{
	this.InnerList.AddRange(Items);
}

As you can see, we are simply using methods in InnerList to do the real work. The InnerList's Add and Remove methods take an object, and since FileInfo inherits from object (as does everything else in .NET) we can pass a FileInfo object to these methods. All our methods have to do is ensure that they only accept FileInfo objects, and declaring the parameter as FileInfo does exactly that: the compiler will reject anything else. Easy!

The InnerList's AddRange method takes any object that implements the ICollection interface and adds all the objects within it to our collection. Both a FileInfo array and our own FileInfoCollection implement this interface (CollectionBase implements ICollection, and so we automatically implement it too). This means that we can create two methods to allow either type of object to be added into our collection. Using these methods you can put all the FileInfo objects found in two or more directories into a single FileInfoCollection and work with them as one list, which is useful if you want something like an A-Z list of all documents anywhere within a folder structure. Listing 4 shows how these methods can be used.

Listing 4
FileInfoCollection objFiles = new FileInfoCollection();
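// The @ prefix makes these verbatim strings, so the backslash is not
// treated as the start of an escape sequence
DirectoryInfo objFirstDir = new DirectoryInfo(@"C:\myfirstfolder");
DirectoryInfo objSecondDir = new DirectoryInfo(@"C:\mysecondfolder");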
// Add all the files in the directories into our collection
objFiles.AddRange(objFirstDir.GetFiles());
objFiles.AddRange(objSecondDir.GetFiles());
// If we had another FileInfoCollection with some files in it,
// we could add these to the end of objFiles
objFiles.AddRange(SOME_OTHER_FileInfoCollection);
// We now have all the files from two Directories and another
// FileInfoCollection in one object!

At this point I have also created an additional constructor for the class, which takes a FileInfo array and simply passes it into our AddRange method.

Listing 5
public FileInfoCollection(FileInfo[] Items)
{
	this.AddRange(Items);
}

This allows us to quickly create a FileInfoCollection with the contents of a directory, as shown in Listing 6.

Listing 6
DirectoryInfo objDir = new DirectoryInfo(@"C:\myfolder");
FileInfoCollection objFiles = new FileInfoCollection(objDir.GetFiles());

We've done almost all the boring stuff now, with just one property left to do. So far we can add and remove FileInfo objects from the collection in a variety of ways. However, we can't read any of these objects! That is easily fixed with a single property.

Listing 7
public FileInfo this[int Index]
{
	get { return (FileInfo)this.InnerList[Index]; }
	set { this.InnerList[Index] = value; }
}

This may look a little strange to some of you, but it's really pretty easy. This property is an indexer: it allows us to grab a FileInfo object from a specified location within the collection using square brackets, like you would with an array or any other collection. For this reason the property does not have a name; the first line is simply saying "whenever somebody indexes into the object and passes an int, do this". We then retrieve or assign the corresponding FileInfo object in InnerList.
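As a quick illustration of the indexer, here is a minimal sketch that pairs it with the inherited Count property to walk the collection by position. The trimmed-down FileInfoCollection is repeated only so the snippet compiles by itself; IndexerDemo and NamesInOrder are names made up for this example and are not part of the download.

```csharp
using System;
using System.Collections;
using System.IO;

// Trimmed-down copy of the article's class: just Add and the indexer.
public class FileInfoCollection : CollectionBase
{
    public void Add(FileInfo item) { this.InnerList.Add(item); }

    public FileInfo this[int index]
    {
        get { return (FileInfo)this.InnerList[index]; }
        set { this.InnerList[index] = value; }
    }
}

// Hypothetical helper, written only for this illustration.
public static class IndexerDemo
{
    // Returns the file names in collection order, read through the indexer.
    public static string[] NamesInOrder(FileInfoCollection files)
    {
        string[] names = new string[files.Count];
        for (int i = 0; i < files.Count; i++)
        {
            names[i] = files[i].Name;   // files[i] calls the indexer's get
        }
        return names;
    }
}
```

Assigning works the same way: files[0] = new FileInfo("c.txt") goes through the indexer's set and replaces the first item.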

Sorting FileInfoCollection

Now we can get on with the fun bits! How often have you been asked to display a list of files in an awkward order? By file name, by file size, by extension, or by the date that's embedded right in the middle of the file name (when people are feeling REALLY annoying). Well, you are about to write a few lines of code that will make all these troubles disappear! OK, not quite, but I guarantee it will save you loads of time in the long run.

At this point we need to take a minute to understand how sorting works. There are several ways that values can be sorted, which all have their strengths and weaknesses. The Sort method of InnerList (which is an ArrayList) is documented, for the .NET Framework versions this article targets, as using the QuickSort algorithm. Whichever algorithm is used, though, a sort only ever needs to ask one question over and over: "is this object 'larger' than that one?" The easiest way to see that question at work is a simple exchange sort in the style of a Bubble Sort. It works by comparing each object to the one after it. If it is 'larger' than the next object, the two are swapped and the sort starts again from the beginning. If it is 'smaller' than the object after it, it is left where it is, and the sort moves on to the next object. In this way, the 'smallest' value ends up at the start, while the 'largest' value is at the end. Confused? Here's an example:

Assume that we have an array of integers which looks like this:

34, 6, 12, 2

The sort starts at the beginning and asks "Is 34 larger than 6?", it is, so the array is rearranged to look like this:

6, 34, 12, 2

The sort goes back to the start and asks "Is 6 larger than 34?", it isn't, so the array stays the same and we move onto the next number - 34. The sort then asks "Is 34 larger than 12?", it is, so the array is rearranged again:

6, 12, 34, 2

Because the array has been rearranged the sort starts from the beginning again. "Is 6 larger than 12?" - no. "Is 12 larger than 34?" - no. "Is 34 larger than 2?" - yes, so rearrange and start again:

6, 12, 2, 34

"Is 6 larger than 12?" - no. "Is 12 larger than 2?" - yes rearrange and re-start:

6, 2, 12, 34

"Is 6 larger than 2?" - yes, so swap them and restart:

2, 6, 12, 34

This time the answer to every question is "no", and this tells the sort that it is complete. I should note here that a sort like this is one of the least efficient ways of sorting values and should not be used for lists of hundreds of items. It is shown here only because it makes the role of the comparison easy to see.
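For the curious, the walkthrough above can be written out in a few lines. This is only a sketch of the hand-worked "swap and restart" procedure, not code from the framework, and the names ExchangeSortDemo and Sort are made up for this illustration.

```csharp
using System;

// Hypothetical demo class, written only to mirror the walkthrough.
public static class ExchangeSortDemo
{
    // Sorts the array in place using the "swap and restart" procedure
    // and returns the number of swaps that were needed.
    public static int Sort(int[] values)
    {
        int swaps = 0;
        int i = 0;
        while (i < values.Length - 1)
        {
            if (values[i] > values[i + 1])
            {
                // "Is this larger than the next one?" - yes, so swap them
                int temp = values[i];
                values[i] = values[i + 1];
                values[i + 1] = temp;
                swaps++;
                i = 0;      // something moved, so start again from the beginning
            }
            else
            {
                i++;        // already in order, move on to the next pair
            }
        }
        return swaps;
    }
}
```

Calling ExchangeSortDemo.Sort on the array { 34, 6, 12, 2 } performs the same five swaps as the worked example and leaves the array as 2, 6, 12, 34.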

Now don't worry, you don't have to write code to do all that. Luckily the InnerList that CollectionBase gives us does all the hard work for you! All it needs to be told is how to know if one object is larger or smaller than another. It's pretty easy to know that 5 comes after 2, or that Bob is before Phil, but how do we know which file comes after another? Files could be ordered by a few things - Name, File Size etc., so we need to write some code to show InnerList which file comes after another. Here's the code to sort files by Name, which should be placed in your class.

Listing 8
internal class FileInfoNameComparer : IComparer
{
	public int Compare(object x, object y)
	{
		FileInfo objX = (FileInfo)x;
		FileInfo objY = (FileInfo)y;
		return objX.Name.CompareTo(objY.Name);
	}
}
public void SortByName()
{
	this.InnerList.Sort(new FileInfoNameComparer());
}

We have created an internal class named FileInfoNameComparer, which implements the IComparer interface. The IComparer interface forces us to have a method called Compare, which takes two objects and returns an int. This method simply compares the two objects, and the sign of the returned int tells us which is larger:

  • less than 0 = y is larger
  • 0 = both objects are equal
  • greater than 0 = x is larger

The code inside the method is really simple: we cast each object to a FileInfo and then use the CompareTo method of the Name string to decide which is larger. You will find that many objects expose CompareTo. For strings, this method compares one string to another and returns a negative number, 0, or a positive number in the same way as our Compare method. Because of this, we can just return the result of the CompareTo method. (For more information on the CompareTo method, Google for IComparable.) 99% of the time you create an IComparer, this is almost the exact code you will need to write. Note that an InvalidCastException will be thrown if anything but FileInfo objects are passed to this method!

All that's left now is to write the method that actually does the sort - SortByName(). Once again, all this method does is call a method from InnerList - Sort. This sort method takes an IComparer object to tell it how to sort the objects. We simply pass in a new instance of a FileInfoNameComparer and we're done. This may seem complicated at first glance, but take a look back at the code - we just sorted all our objects with just 13 lines of code! Once you have done this, it will take you all of 30 seconds to add sorting for any of your other properties.
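To show how little changes from one property to the next, here is a sketch of sorting by file size, together with a pass-through Sort method for a custom comparer and a Reverse method. This is written for this explanation under the assumption that it follows the same pattern as SortByName; the versions in the downloadable source may differ. Note that FileInfo.Length reads from the disk, so the files must actually exist.

```csharp
using System;
using System.Collections;
using System.IO;

// Trimmed-down copy of the article's class with size-based sorting added.
public class FileInfoCollection : CollectionBase
{
    public void Add(FileInfo item) { this.InnerList.Add(item); }

    public FileInfo this[int index]
    {
        get { return (FileInfo)this.InnerList[index]; }
    }

    // Same shape as FileInfoNameComparer, but compares sizes in bytes.
    internal class FileInfoLengthComparer : IComparer
    {
        public int Compare(object x, object y)
        {
            FileInfo objX = (FileInfo)x;
            FileInfo objY = (FileInfo)y;
            return objX.Length.CompareTo(objY.Length);
        }
    }

    // Smallest file first.
    public void SortByLength()
    {
        this.InnerList.Sort(new FileInfoLengthComparer());
    }

    // Lets callers pass in their own IComparer (a "custom sorter class").
    public void Sort(IComparer comparer)
    {
        this.InnerList.Sort(comparer);
    }

    // Flips the current order, e.g. to get largest first after SortByLength().
    public void Reverse()
    {
        this.InnerList.Reverse();
    }
}
```

Only the class name and the line inside Compare that picks the property had to change, which is why adding further sort methods is so quick.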

Searching

All that now remains is to write code for searching through our collection. First up is a Contains function, to search for a FileInfo with a certain FullName property. This simply loops through all the FileInfo objects in the collection, and returns true if it finds a match. If it reaches the end without finding a match, then it returns false. Note that the == comparison is case-sensitive, so the path you pass in must match the FullName exactly, including its capitalization.

Listing 9
public bool Contains(string FilePath)
{
	foreach (FileInfo objFile in this)
	{
		if (objFile.FullName == FilePath)
		{
			return true;
		}
	}
	return false;
}

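Next we have another overload of the Contains method, this time taking a FileInfo object, along with an IndexOf method. Both simply pass their parameter through to the method of the same name within InnerList. Bear in mind that FileInfo does not override Equals, so these two methods look for the exact same object instance rather than another FileInfo that happens to point at the same path: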

Listing 10
public bool Contains(FileInfo File)
{
	return this.InnerList.Contains(File);
}
public int IndexOf(FileInfo File)
{
	return this.InnerList.IndexOf(File);
}
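The difference between the two Contains overloads is easy to miss, so here is a small sketch of it. It relies on the fact that InnerList.Contains falls back to Object.Equals, which for FileInfo means reference equality. ContainsDemo and Run are names made up for this illustration, and the file report.pdf does not need to exist.

```csharp
using System;
using System.Collections;
using System.IO;

// Trimmed-down copy of the article's class with both Contains overloads.
public class FileInfoCollection : CollectionBase
{
    public void Add(FileInfo item) { this.InnerList.Add(item); }

    // Matches on the full path string.
    public bool Contains(string filePath)
    {
        foreach (FileInfo objFile in this)
        {
            if (objFile.FullName == filePath) { return true; }
        }
        return false;
    }

    // Matches on object identity, because FileInfo does not override Equals.
    public bool Contains(FileInfo file)
    {
        return this.InnerList.Contains(file);
    }
}

// Hypothetical demo class, written only for this illustration.
public static class ContainsDemo
{
    // Returns { found same instance, found second instance, found by path }.
    public static bool[] Run()
    {
        string path = Path.Combine(Path.GetTempPath(), "report.pdf");
        FileInfo original = new FileInfo(path);
        FileInfo lookalike = new FileInfo(path);   // same path, different object

        FileInfoCollection files = new FileInfoCollection();
        files.Add(original);

        return new bool[]
        {
            files.Contains(original),            // the instance we added
            files.Contains(lookalike),           // a different instance
            files.Contains(lookalike.FullName)   // compare by path instead
        };
    }
}
```

Run returns true, false, true: the second FileInfo is not found even though it points at the same file, so use the string overload when all you have is a path.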

Finally, we get to the last method! This one is one of my favorites - FilterByName. FilterByName takes a regular expression, and returns a FileInfoCollection containing all the FileInfo objects that match the expression. Say you have the following files in a collection:

  • Press-Release-2004-08-01.pdf
  • Minute-2004-08-12.pdf
  • Minute-2004-08-10.pdf
  • Minute-2004-06-01.pdf
  • Minute-2003-12-15.pdf
  • Timetable-2004-08-10.xls
  • Timetable-2004-06-01.xls

With this function you could find all items from 2004. You can find all minutes from August. You can find all pdf files after July 2004. In short, you can run any query that can be expressed as a pattern over the file name. For more information about regular expressions, check out the MSDN reference (Regular Expression Language Elements). Now, let's see how easy it is to implement:

Listing 11
public FileInfoCollection FilterByName(Regex Match)
{
	FileInfoCollection objReturn = new FileInfoCollection();
	foreach (FileInfo objItem in this)
	{
		if (Match.IsMatch(objItem.Name))
		{
			objReturn.Add(objItem);
		}
	}
	return objReturn;
}

It's seriously that simple. We take in a regular expression to match a file name on, then create a new FileInfoCollection, objReturn, to put all matching FileInfo objects into. We loop through all the items in the collection and use the IsMatch method of the regular expression to determine whether to put this FileInfo into objReturn. When we've checked each item in the collection we return objReturn.
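To make the three queries mentioned earlier concrete, here is a sketch of patterns that would answer them for the sample file list. The patterns and the FilterDemo class are written for this illustration and assume the Name-yyyy-mm-dd.ext naming shown in the list; the "after July" pattern only covers August to December 2004, so later years would need extra alternatives.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Text.RegularExpressions;

// Trimmed-down copy of the article's class: just Add and FilterByName.
public class FileInfoCollection : CollectionBase
{
    public void Add(FileInfo item) { this.InnerList.Add(item); }

    public FileInfoCollection FilterByName(Regex match)
    {
        FileInfoCollection objReturn = new FileInfoCollection();
        foreach (FileInfo objItem in this)
        {
            if (match.IsMatch(objItem.Name)) { objReturn.Add(objItem); }
        }
        return objReturn;
    }
}

// Hypothetical demo class holding example patterns for the sample list.
public static class FilterDemo
{
    // Any file dated in 2004.
    public static readonly Regex From2004 =
        new Regex(@"-2004-\d{2}-\d{2}\.");

    // Minutes dated in August of any year.
    public static readonly Regex AugustMinutes =
        new Regex(@"^Minute-\d{4}-08-");

    // PDF files dated August to December 2004.
    public static readonly Regex PdfAfterJuly2004 =
        new Regex(@"-2004-(0[89]|1[0-2])-\d{2}\.pdf$");

    // Builds a collection from the sample names; the files need not exist
    // because only the Name property is read.
    public static FileInfoCollection SampleFiles()
    {
        string[] names =
        {
            "Press-Release-2004-08-01.pdf", "Minute-2004-08-12.pdf",
            "Minute-2004-08-10.pdf", "Minute-2004-06-01.pdf",
            "Minute-2003-12-15.pdf", "Timetable-2004-08-10.xls",
            "Timetable-2004-06-01.xls"
        };
        FileInfoCollection files = new FileInfoCollection();
        foreach (string name in names) { files.Add(new FileInfo(name)); }
        return files;
    }
}
```

Against the sample list, From2004 keeps six files, AugustMinutes keeps two, and PdfAfterJuly2004 keeps three. Because FilterByName returns another FileInfoCollection, filters can also be chained one after another.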

You now have a powerful collection class to help you deal with the file system. I'll finish with an example showing how you could use this class in your applications:

Listing 12
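// Grab all the files from a directory.
// DirectoryInfo.GetFiles() returns the FileInfo[] our constructor expects
// (Directory.GetFiles() would return a string[] of paths instead).
DirectoryInfo objDir = new DirectoryInfo(Server.MapPath("/MyFiles"));
FileInfoCollection objFiles = new FileInfoCollection(objDir.GetFiles());
// Filter to only get files whose names contain 2004 and have the .pdf extension.
objFiles = objFiles.FilterByName(new Regex(".+2004.+\\.pdf$"));
// Sort the files by size, largest first.
// SortByLength() and Reverse() are not shown in the listings above; they are
// assumed to follow the same pattern as SortByName() and to wrap InnerList.Reverse().
objFiles.SortByLength();
objFiles.Reverse();
// If we had a DataGrid on our page, we could display the documents on screen
MyDataGrid.DataSource = objFiles;
MyDataGrid.DataBind();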

Enjoy!

Download

The sample code that accompanies this article can be downloaded in zip file format from here: NetFileSystemClasses.zip (2.8 KB).

