ASP 101 - Active Server Pages 101 - Web01
The Place ASP Developers Go!

Please visit our partners


Windows Technology Windows Technology
15 Seconds
4GuysFromRolla.com
ASP 101
ASP Wire
VB Forums
VB Wire
WinDrivers.com
internet.commerce internet.commerce
Partners & Affiliates














ASP 101 is an
internet.com site
ASP 101 is an internet.com site
IT
Developer
Internet News
Small Business
Personal Technology

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers

ASP 101 News Flash ASP 101 News Flash



 Top ASP 101 Stories Top ASP 101 Stories
The Top 10 ASP Links @ Microsoft.com
What is Adovbs.inc and Why Do I Need It?
An Overview of ASP.NET

QUICK TIP:
Convert Colors Between RGB and Hex
Show All Tips >>
ASP 101 RSS Feed ASP 101 Updates


The Wonderful World of Data Shaping

The Wonderful World of Data Shaping

by Chris Payne

Introduction

Data shaping is a little known feature of ADO 2.0 and above that allows you to create hierarchical recordsets, which can give you tremendous power over manipulating and displaying data. Unfortunately, data shaping has taken a back seat for many developers.

Data shaping lets you create parent-child relationships between recordsets, i.e. recordsets within recordsets, all with a single ADO object. This is a dream for a web developer, because it eliminates messy joins, complicated filtering, and the need for spaghetti-code in presentation logic. This in turn reduces the amount of traffic crossing a network and reduces overhead when interfacing with other tools, such as XML.

This article will discuss the theories and applications behind data shaping, its raison d^etre, with beginning to advanced examples, and assumes that you have a good basic knowledge of database methodology and SQL syntax.

What is Database Hierarchy?

Imagine a set of tables that hold information on a product database. Some items beg to be related in a parent-child relationship, for instance, manufacturers and products. Each manufacturer can have several products, or children. Since each product can only belong to one manufacturer; it will have only one parent. This is the idea behind relational databases, something which has been in use for quite some time.

In practice, however, things aren't so easy. We can set all kinds of foreign key relationships to enforce this bond between tables, but those relationships don't do much for you except to make sure you don't input random information. We need a way to take this parent-child relationship to the next step, and take it out of theory and put it into practice.

Why Hierarchy?

Suppose you had three tables for a product database: a manufacturer table, a product table, and transaction table, described below (note that these tables are simple and solely for demonstration):

tblManufacturer
Column Name Description
MID Unique identifier for manufacturers
Name Manufacturer name
Address Manufacturer's address
tblProducts
Column Name Description
PID Unique identifier for products
MID Foreign key to tblManufacturer
Name Product name
Instock_Flag Out of stock flag
tblTransaction
Column Name Description
TID Unique identifier for transactions
MID Foreign key to tblManufacturer
PID Foreign key to tblProducts
Type Transaction Type

These tables would hold information about manufacturers, the products each manufacturer sells, and the individual transactions associated with each, for example, a sell or buy. Both tblTransaction and tblProducts could have a many-to-one relationship with tblManufacturer.

If you wanted to display this information to a user, you might want to relate the information somehow. Typically, this could be done with SQL join statements:

SELECT tblProducts.Name, tblManufacturer.Name, Type FROM tblManufacturer JOIN tblProducts ON tblManufacturer.MID = tblProducts.MID JOIN tblTransaction ON tblManufacturer.MID = tblTransaction.MID AND tblProducts.PID = tblTransaction.PID

Or simply:

SELECT tblProducts.Name, tblManufacturer.PID, tblManufacturer.Name, Type FROM tblManufacturer, tblProducts, tblTransaction WHERE tblManufacturer.MID = tblProducts.MID AND tblManufacturer.MID = tblTransaction.MID AND tblProducts.PID = tblTransaction.PID

This would display something like the following:

tblProducts.Name tblManufacturer.Name tblManufacturer.PID Type
Rubber band ACME Rubber Band Co 1 Buy
Big Rubber band ACME Rubber Band Co 1 Sell
Pencil Northeast Pencil Inc 2 Sell
Sunglasses Shades Inc 3 Sell
... ... ... ...

As you can see, you get a pretty flat recordset in return, with potentially a lot of duplicate information. While this may be appropriate for some applications, it doesn't give you very much flexibility - you can't display the information in any other way than what's displayed above, and updating or inserting any information would be a chore. Often times, an ASP application needs more flexibility, especially in the display of information. This type of recordset would benefit from a way to easily relate the different tables and fields.

The Power of Hierarchy

Using data shaping, you could manipulate the above information in all sorts of wonderful ways. As discussed above, we'll relate manufacturers and products to each other as parent and child (we'll discuss transactions a bit later). Here's the syntax to create a basic data shape:

SHAPE {parent-command} [[AS] table-alias]
APPEND {child-command}
RELATE(parent-column TO child-column)

The parent and child-commands can be any valid SQL clause that returns a recordset, for instance, a SELECT statement. Note however, that SQL is not required - you can use any query language you wish. Let's try an example:

SHAPE { SELECT MID, Name, Address FROM tblManufacturer }
APPEND ({ SELECT Name, Instock_flag FROM tblProducts } AS Products
RELATE MID to MID )

This query produces the following results:

MID Name Address Products
1 ACME Rubber Band Co Dayton, OH (reference to Products chapter)
2 Northeast Pencil Inc Boston, MA (reference to Products chapter)
3 Shades Inc Boise, ID (reference to Products chapter)
... ... ... ...

Appending a child recordset onto our first parent recordset produces a new field in the parent recordset, called Products. This field contains references to pre-filtered recordsets. Each reference contains only the products that belong to the specified manufacturer!

Also, notice the use of the word chapter. Child recordsets are called chapters to the shaping provider.

Here is some code in VBScript that presents the information to the user.

<%
Dim strSQL, rst, strConnectionString, rstChild
strConnectionString = "Provider=MSDataShape;dsn=MyDSN;database=MyDatabase"
strSQL = "SHAPE { SELECT MID, Name, Address FROM tblManufacturer } " & _
    "APPEND ({ SELECT Name, Instock_flag FROM tblProducts } " & _
    "AS Products RELATE MID to MID )"
Set rst = Server.CreateObject("ADODB.Recordset")
'open recordset
rst.Open strSQL, strConnectionString
Do While Not rst.EOF
    Response.Write(rst("MID") & ", ")
    Response.Write(rst("Name") & "<br> ")
    Response.Write("Products:")
    'set object to child recordset and iterate through
    Set rstChild = rst("Products").Value
    If Not rstChild.EOF Then
        Do While Not rstChild.EOF
            Response.Write("   " & rstChild("Name"))
            If rstChild("InStock_Flag") Then
                Response.Write("In stock")
            Else
                Response.Write("Out of stock")
            End If
            rstChild.MoveNext
        Loop
        rst.MoveNext
    End If
Loop
rst.Close
Set rst = Nothing
%>

This code would produce the following output:

1, Acme Rubber Band Co
	Products:
		Rubber band, In stock
		Big Rubber band, In stock
		Small Rubber band, Out of stock
		...
2, Northeast Pencil Inc
	Products:
		Pencil, Out of stock
		Yellow pencil, In stock
		...
3, Shades, Inc
	Products:
		Sunglasses, Out of stock
		Shades, In stock
		...

We set an object equal to the reference column in the parent recordset, and iterate through that as we would a normal recordset object. The results returned are only applicable to the current parent item (i.e. only products belonging to the specific manufacturer are returned). This eliminates the need to create multiple recordset objects and form sub queries for each one.

Also note the extra clause in the connection string, "Provider=MSDataShape". The provider defaults to "MSDASQL," which normally supplies the necessary data, but is unable to interpret SHAPE commands. Therefore we must tell the ADO objects to use the MSDataShape provider instead -- not specifying this will result in an error.

As you can see, this is an efficient, well formed method of displaying data -- we don't have duplicate information, everything is already grouped properly, and we don't have a headache.

Complex Enough For Ya?

Before we move onto more complex examples, let's take a closer look at data shaping.

In theory, a shaped recordset may contain the following types of fields:

  • Regular data (just like a regular recordset)
  • Pointers to other recordsets (children or chapters)
  • Calculations on a row of the recordset
  • Calculations over a column of the recordset
  • Empty columns (in case you need to keep track of some temporary information while you work with a recordset)

The most important of these is probably the pointers to other recordsets -- the true power of data shaping.

The best thing about data shaping is that you aren't limited to one child, or even one level of children. You can have multiple children, grand children, great grandchildren, etc. When you get into large databases, this feature comes in handy. They syntax for adding multiple children is as follows:

SHAPE {parent-command} [[AS] table-alias]
APPEND {child-command}
RELATE(parent-column TO child-column) {,
({child SQL statement} [[AS] alias]
RELATE parent column TO child column) [[AS] alias]}

An actual query would look like this:

SHAPE {SELECT MID, Name, Address FROM tblManufacturer}
APPEND
({SELECT Name, Instock_Flag FROM tblProducts} AS Products RELATE MID TO MID),
({SELECT MID, PID, Type FROM tblTransaction} AS Transaction RELATE MID TO MID)

By adding another loop to your ASP script, you can now list all transactions for manufacturers as well.

1, Acme Rubber Band Co
	Products:
		Rubber band, In stock
		...
	Transactions:
		Sell
		...
2, Northeast Pencil Inc
	Products:
		Pencil, Out of stock
		...
	Transactions:
		Sell
		...

This is definitely very useful, but wouldn't it make more sense to relate the transactions to products as well? Enter grandchildren:

SHAPE {parent-command} [[AS] table-alias]
APPEND (
( SHAPE {child SQL statement} [[AS] alias]
APPEND ( {grandchild SQL statement } [[AS] alias]
RELATE child column TO grandchild column) [[AS] alias] )
RELATE parent column TO child column) [[AS] alias]

As you can see, you simply embed another shape statement within the append statement of the first shape command. The output would look like:

1, Acme Rubber Band Co
	Products:
		Rubber band, In stock
			Transactions:
				1, Buy
				2, Sell
				...
		Big Rubber band, In stock
			Transactions:
				1, Sell
				2, Sell
				...

Now we have products sorted by manufacturers, and transactions sorted by manufacturers and products. Of course, if we stored more pertinent information in our databases, you would be able to easily see some very interesting data, such as buy/sell trends, but these tables are simply for demonstration purposes.

Note, however, that data shaping isn't only useful for one-to-many relationships, such as the ones we've been describing above, but that they can also be very helpful in one-to-one relationships that you would normally handle with SQL JOINS. If we stored the manufacturers' address in another table, for instance, it would be easy to simply append the address table to the manufacturers' as a child. The display of such data might also be more logical. Just another way to really put into use the concepts of normalized and relational databases.

1, Acme Rubber Band Co
	Products:
		Rubber band, In stock
			Transactions:
				1, Buy
	Address:
		Dayton, OH

You can easily see how complex and wonderful data shaping can get with relatively simple commands.

Computation

Using a shape compute command changes things a bit. First, I'll outline the syntax:

SHAPE {child-command} [[AS] table-alias]
COMPUTE aggregate-command-field-list
[BY grp-field-list]

This seems counter-intuitive from what we just learned. What happens is that the compute statement generates an implied parent recordset -- you don't have to explicitly specify one yourself. For instance:

SHAPE { SELECT MID, Name, Address FROM tblManufacturer } AS Manufacturer
COMPUTE Manufacturer BY Name

This statement would produce the following parent recordset:

Manufacturer Name
(reference to Man. chapter) ACME Rubber Band Co
(reference to Man. chapter) Northeast Pencil Inc
(reference to Man. chapter) Shades Inc
.... ...

In essence, we are creating a parent recordset with references to children that are built from the SQL Select statement we issued, and grouping each item by manufacturer name. The child recordset for ACME Rubber Band Co would look as follows:

MID Name Address
1 ACME Rubber Band Co Dayton, OH

This is exactly what is produced by the SQL statement we specified, filtered by the parent recordset Name. Pretty neat huh?

What exactly is the application for this? Why don't we just use a normal shape command? For one thing, the COMPUTE statement and underlying data here are very simple. Once these start getting even mildly complex, this statement can help tremendously. Secondly, it can help you group data in ways that you normally wouldn't think of. Once we start nesting SHAPE commands (i.e. creating grand children), you can organize the hierarchy by any level in the hierarchy. Meaning the top recordset, or parent, doesn't necessarily have to be the top most select statement.

A Search Engine

Imagine a search engine, where we can search for manufacturers. But, the user would rather search for the type of industry a manufacturer is in, before choosing from a list of manufacturers. How do we group this in a logical fashion? Take a look at the following statement, which assumes we have a new "Industry" table:

SHAPE (
SHAPE { SELECT MID, IID, Industry FROM tblIndustry }
APPEND (
{ SELECT MID, Manufacturer FROM tblManufacturer WHERE (Manufacturer LIKE '%" & searchstring & "%') } AS Manufacturer
RELATE MID to MID ) ) AS RS
COMPUTE RS BY Industry

Suppose the user types in "band" as the search query. The returned display could look like:

Rubber
	ACME Rubber Band Co
	Asian Rubber Bands
Clothing
	Clothes Band, Inc
	HatBand Co
Entertainment
	I-Band Entertainment
	...

Or simply:

Rubber
Clothing
Entertainment
...

We have returned the matching manufacturers and industries. The user may wonder how you gave him information that has nothing to do with the search term he supplied. Magic at its finest follks.

Conclusion

We can easily see how data shaping provides a powerful mechanism for logical manipulation of relational information. And we've barely touched on it in this article. Make sure to check out the resources below to get you started, but the best way to explore data shaping's potential is by jumping in and applying it to many (if not all) of your current database applications. Many times you'll see that data shaping is the way to go.

Resources

Check out these links for further resources on data shaping syntax and methodology:


Home |  News |  Samples |  Articles |  Lessons |  Resources |  Forum |  Links |  Search |  Feedback

Internet.com
The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers