In the previous chapters, we've been looking at ADO from an application-neutral and language-neutral perspective. We've taken advantage of the fact that the basic principles of using ADO are the same, irrespective of the type of application you are building or the programming language you use to build it. This is intentional, and is one of the core concepts of ADO itself. You can use it in a whole range of ways, with any language that can instantiate and use COM objects.
However, there is one specific application of ADO that is a core part of the technology. Remote Data Access is intended to broaden the appeal of ADO in scenarios where the client and server are disconnected, in the sense that there is no permanent physical connection between them. This is very much the case with Internet-based applications where, until the new HTTP proposals are universally supported, the client and server cannot maintain state between connections. Each request from a single client is entirely separate from all other requests before and after the current one, and the server and client must create state artificially.
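To make the idea of creating state artificially a little more concrete, here is a minimal sketch in Python. It is not ADO or RDS code, and the SessionStore class and its token scheme are hypothetical names invented for this illustration. The server hands out an opaque token on the first request, and the client must send it back with every later request so that the server can recognize it:

```python
import secrets


class SessionStore:
    """Hypothetical server-side store mapping opaque tokens to client state."""

    def __init__(self):
        self._sessions = {}

    def handle_request(self, token=None):
        # Each request arrives with no built-in link to earlier ones, so a
        # missing or unknown token means the client is treated as new.
        if token not in self._sessions:
            token = secrets.token_hex(16)
            self._sessions[token] = {"hits": 0}
        self._sessions[token]["hits"] += 1
        return token, self._sessions[token]["hits"]


store = SessionStore()
token, hits = store.handle_request()        # first request: no token yet
token, hits = store.handle_request(token)   # later request: token sent back
```

Without the token, the second call would look to the server like a request from a brand new client.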
In this chapter, we'll look at what we mean by disconnected applications and state in more detail. We'll also see why one particular aspect of Remote Data Access, the Microsoft Remote Data Service (RDS), is such an important new technology. This chapter covers:
An overview of what remote data access is all about
A look at the different kinds of remote data access technologies
How we can implement remote data access in a Web page
How we can bind data to HTML controls in a Web page
Ways of creating remote recordsets directly using RDS and ADO
To begin, we look at the background and future of Remote Data Access as a whole, the current implementation of XML data remoting, and RDS. It's important to understand the concepts before we go on to look at its use in more detail.
What Is Remote Data Access?
Remote Data Access provides an opportunity to create applications that will appear to a client over the Internet (and more particularly over the Web) to work in much the same way as traditional client-server applications do over a local area network (LAN). In this section of the chapter, we'll look at why this is more difficult to achieve than you might at first assume, and how we can get round the problems.
Applications That Work Over the Internet
In a traditional LAN-based environment, a client logs on to a server by identifying itself to that server. The server then authenticates the client using their username and password (or through another method such as a smart-card or fingerprint recognition). However it's done, the result is that the server identifies the client as a valid user, and can then maintain a connection between itself and that client over the network. When data is sent to the server from the client, the server can automatically identify that particular client, and then send the appropriate data back to it.
This connection is effectively a permanent link between client and server during the time that the client is logged on. It provides state, in that the client and server can identify each other through the protocols supported by the network that connects them.
Once an application is moved to the Internet, or more specifically to the Web, this process fails. The HTTP protocol is
The Windows DNA Architecture
Just maintaining state is not enough, however. This book is about working with data from a data source, such as a relational database, and our data management applications may have to work over the Internet. ADO is specifically designed to be a part of the Windows Distributed interNet Applications (DNA) architecture. This is a Microsoft design methodology aimed at creating applications that will work in a disconnected and stateless environment like the Internet.
In this book, we aren't specifically covering DNA, or the provision of state in distributed applications. For more information about these topics, look out for Professional ASP 2.0 Programming (ISBN 1-861001-26-6) and Professional MTS/MSMQ Programming (ISBN 1-861001-46-0) from Wrox.
So, we've discovered that building Web-based applications that handle data requires different techniques to traditional client-server applications on a LAN. Remote Data Access provides ways to recreate the LAN-based environment out on the Internet, in line with the DNA methodology. We'll look at the techniques it provides in more detail next.
Disconnected Data Management
The following diagram shows three data management scenarios. The first is the traditional client-server LAN-based environment, but here we are using ADO to provide the connection between the application (which carries out the processing tasks on the data) and the data store itself. In this case, the application and the user interface communicate directly. They could both be client-based, or the application components could be distributed between the client and the server. However, the point is that the various parts of the application communicate directly using the LAN. This automatically provides state:
In the second section of the diagram, we move the user interface for the application out across the Internet. By using a Web browser or a custom application, we can communicate with a server-based application using the HTTP protocol of the Web. The application might be implemented using Active Server Pages or custom components, or it might be a more traditional CGI or ISAPI application written in any of a range of languages. This is the common scenario for Web-based data management applications today. In effect, all we've done is introduce a remote-control feature that allows users on the 'Net to work with the application on the server.
Active Server Pages (ASP), the Common Gateway Interface (CGI), and the Internet Server Application Programming Interface (ISAPI) are all ways to interface with the Web server and provide a server-based application that can communicate with a browser or other remote Web-based interface.
The third section of the previous diagram shows how Remote Data Access can extend the application itself across the Internet, rather than just the interface. What it does is move the data manipulation tasks, and hence the data itself, out to the client. Now, ADO communicates with the back-end data store across the 'Net. We've effectively moved the entire application out to the client, leaving only the data store on the server.
This is very much the way that some traditional client-server data management applications work. When you open a form to work with data in, say, Microsoft Access, a copy of the data is transferred across the network to your machine from the server. You then manipulate it using your client-side application, and save the changes back to the server again over the network. Remote Data Access allows a very similar technique to be used when the network is the Internet and the client is a Web browser or custom application.
The Pros and Cons of Remote Data Access
So, Remote Data Access provides us with an opportunity to extend our data management applications over the Internet. This overcomes a major problem with the currently popular scenario (the second one in our earlier diagram), where the remote interface must pass requests across the network to the server each time a different task is carried out on the data, or when a different view of the data is required. Remote Data Access can help to avoid the heavy network traffic that arises from even the most simple data manipulation tasks.
For example, suppose a remote user is viewing a list of clients from a database and wishes to sort the list in a different order. The traditional method of using ADO with ASP or a custom application on the server means that the entire recordset must be transferred over the Web for each sort. Using Remote Data Access, however, the recordset is cached locally and can be sorted, filtered, and viewed in a range of ways with no further requests to the server and no network traffic.
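The difference in network traffic can be modeled in a few lines of Python. This is only a conceptual sketch: FakeServer and CachedRecordset are hypothetical stand-ins, not RDS objects, and the client data is invented. The point is that the server is contacted once, and every later sort or filter works on the local copy:

```python
class FakeServer:
    """Stands in for the remote data store and counts round-trips."""

    def __init__(self, rows):
        self._rows = rows
        self.requests = 0

    def fetch_all(self):
        self.requests += 1
        return [dict(r) for r in self._rows]   # copy, as if sent over the wire


class CachedRecordset:
    """Hypothetical client-side cache: one fetch, then purely local work."""

    def __init__(self, server):
        self._rows = server.fetch_all()

    def sort(self, field):
        return sorted(self._rows, key=lambda r: r[field])

    def filter(self, field, value):
        return [r for r in self._rows if r[field] == value]


server = FakeServer([
    {"name": "Smith", "city": "Leeds"},
    {"name": "Adams", "city": "York"},
    {"name": "Jones", "city": "Leeds"},
])
clients = CachedRecordset(server)
by_name = clients.sort("name")
by_city = clients.sort("city")
in_leeds = clients.filter("city", "Leeds")
```

After two sorts and a filter, server.requests is still 1. In the server-based scenario, each of those three views would have meant another trip across the network.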
However, this has to be balanced with the requirements of the application. The customer list may contain several thousands of records. If the client is searching for a single record, it makes more sense to use the traditional technique of building a recordset containing just this record on the server with ADO and sending it to the client as a pure HTML page, rather than sending the whole set of records over the Internet. In many applications, you will need to consider a mixture of techniques, depending on the way the data will be used.
One of the factors that will affect the decision you make is the level of support for RDS in the various browsers. At present, only Internet Explorer 4 and above fully support RDS.
Using ADO on the Client
So far, this may seem just an interesting extension of the connectivity technologies provided by ADO. However, it gives us an opportunity to build more interactive and responsive applications than the more usual Web-based data handling scenarios allow. As well as reducing the number of round-trips to the server, it allows us to access data programmatically while it is cached on the client.
The following diagram shows the impact of this in more detail. What we've achieved is to move the ADO programming interface from the server out onto the client, while leaving the original OLE-DB data provider interface on the server. In effect what this means is that, instead of creating the recordset with ADO and working with it on the server, we can now create it on the client as a cached local recordset and work with it there:
So, your ADO programming skills aren't wasted when you start using Remote Data Access methods. With a few minor exceptions (mainly due to the fact that the recordset is disconnected from the server), ADO techniques work exactly the same way on the client as they do on the server. The main difference is that the operations affect only the client recordset. As we'll see, however, Remote Data Service (RDS) allows these updates to be flushed back to the source data store, by submitting all the changes in one operation.
Remote Data Access Technologies
The technologies that implement remote data access are still maturing, and change with almost monotonous regularity. The topic is also made more complex by the fact that its very nature involves two separate platforms: your data server and the client browser or application. In fact, the whole concept of remote data access in the Microsoft world is gradually evolving into two separate areas, the Remote Data Service (RDS) and XML Data Remoting.
We'll look at the concepts and the basic implementation of both of these in this section of the chapter. They are too complex to cover in their entirety, and this is not the topic at which the book is aimed. However, it is easier to appreciate the future directions of remote data access when you understand the basic issues.
The Remote Data Service (RDS)
Having looked earlier at an overview of what remote data access is designed to achieve, we'll take a more detailed view of how it's implemented in RDS. The major limitation is that, once we move the 'working parts' out from the server to the client, we rely on a far more specific environment being in place there. What this means at present is that only custom Windows applications, or the Microsoft Internet Explorer 4 (or higher) browser, can be used as the client with RDS.
Things are also made more complex by the fact that there are two versions of RDS in common use. The version that is supplied with Internet Explorer 4 is V1.5, while Internet Explorer 5 and Visual Studio 6 contain V2.0. To get round this problem, the V2.0 Data Access Software Development Kit (available from http://www.microsoft.com/data/mdac2.htm) provides tools that you can use to distribute the V2.0 client-side component files automatically to users who access your data via IE4.
Inside Remote Data Service
RDS takes advantage of several COM-based objects that are distributed between the client and the server. The version 2.0 client components are included with Internet Explorer 5, and with the latest releases of the client-side development environments like Microsoft Visual InterDev, Visual Basic and Visual C++. The server-side objects will also be provided with Internet Information Server through Windows NT5. However, at the time of writing, the NT4 Option Pack only contained version 1.5 of the components, and you need to download the version 2.0 components from
The RDS Object Structure
The following diagram shows how the main objects used by RDS are distributed between the client and the server. On the server, a connection is made to the data store through an OLE-DB provider or a combined ODBC/OLE-DB driver (which allows a connection to be made to a data store that only provides an ODBC interface).
The provider or driver supplies data to either the special RDS DataFactory object or to a custom component running on the server. The DataFactory object provides a default method of marshalling the data into a form suitable for transmission across the Internet.
The data is then passed to the Web server software and out across the Internet to the RDS DataSpace object on the client. This object implements the connection to the DataFactory object or custom component through a proxy and stub (the normal COM/DCOM communication method), handles marshalling of the data across the network, and recreates it as a recordset on the client.
Once received by the client, the recordset is cached locally and passed to the client-side DataControl object. Finally, if required, the Data Binding Agent object takes the data and binds it to controls on the client. This last step is usually specific to a Web page, where the client is a browser or a custom application that hosts an instance of the browser interface. The bound controls on the page then reflect the data content from the recordset. This is not mandatory, as script or other objects within the page can use the data directly, rather than binding it to controls in the page:
If the bound controls allow updating of the data (i.e. they are HTML INPUT, SELECT, TEXTAREA, etc. controls), the updates are stored in the locally cached recordset on the client. The DataControl object can then pass the updates back to the server on command, and the DataFactory object (or custom component) can update the source data.
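The marshalling step at the heart of this structure can be pictured with a small Python sketch. It is a conceptual model only. The two function names are hypothetical, and JSON simply stands in for whatever wire format RDS actually uses, which this chapter does not describe. One function plays the server-side role of packing a recordset for transmission, and the other plays the client-side role of recreating it:

```python
import json


def marshal_recordset(fields, rows):
    """Server side: pack field names and rows into bytes for transmission."""
    return json.dumps({"fields": fields, "rows": rows}).encode("utf-8")


def unmarshal_recordset(payload):
    """Client side: rebuild a list of field-name -> value records."""
    data = json.loads(payload.decode("utf-8"))
    return [dict(zip(data["fields"], row)) for row in data["rows"]]


# Sample data invented for the illustration.
payload = marshal_recordset(["id", "name"], [[1, "Adams"], [2, "Jones"]])
recordset = unmarshal_recordset(payload)
```

Once the recordset has been rebuilt on the client, it can be bound to controls or used directly from script, as described above.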
Updating the Source Data with RDS
RDS allows the source data to be updated on the client using the same kind of approach as in a traditional LAN-based application. The user can make multiple changes to the recordset that is cached locally on their machine, then submit all the changes back to the server where the source data is updated with all the changes in one operation.
The RDS data source object provides a method, SubmitChanges, which marshals the changed records into a recordset, and sends it back to the DataFactory object on the server. This automatically updates the original source data with the changes. However, this remote access capability introduces some major concerns with regard to data security.
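The batch-update pattern can be modeled in Python as follows. This is a hedged sketch of the idea rather than of the RDS implementation. The DisconnectedRecordset class, its submit_changes method, and the send callback are all hypothetical. Edits touch only the local cache, and a single call then ships just the changed records:

```python
class DisconnectedRecordset:
    """Hypothetical client-side recordset that tracks pending edits by key."""

    def __init__(self, rows, key):
        self._rows = {r[key]: dict(r) for r in rows}
        self._changed = set()

    def update(self, key_value, field, value):
        # Edits affect only the local cache until submit_changes() is called.
        self._rows[key_value][field] = value
        self._changed.add(key_value)

    def submit_changes(self, send):
        # Send only the changed records, all in one call to the server.
        batch = [self._rows[k] for k in sorted(self._changed)]
        if batch:
            send(batch)
        self._changed.clear()
        return len(batch)


sent_batches = []   # stands in for the network: one entry per round-trip
rs = DisconnectedRecordset(
    [{"id": 1, "city": "Leeds"}, {"id": 2, "city": "York"}, {"id": 3, "city": "Bath"}],
    key="id",
)
rs.update(1, "city", "Hull")
rs.update(3, "city", "Bristol")
count = rs.submit_changes(sent_batches.append)
```

Two records were edited, but only one batch crosses the network, and the unchanged record is not sent at all.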
Data Security with RDS
When a data management application is entirely server-based, the client only sends instructions on how the data should be manipulated. It never sees the data source directly, or the connection information. This is all hidden inside an ASP page, a custom component, or whichever other technique you use to communicate with the data source from your server-based application.
With RDS, however, the client is effectively communicating directly with the data source. This means that the data source connection information is moved from the server to the client. The instance of ADO running on the client can connect to the data store and manipulate the source data directly. The result is that you reveal the connection information to the user, and may risk them using this information to access the data store in ways that were not intended.
It's imperative, therefore, to use the capabilities of your data store to limit access to the contents. The usual methods are to set relevant permissions for users for each data source and each object within that data source (i.e. for each table, field, view, procedure, etc.). The techniques for doing so depend on the data store you use, and are outside the scope of this book.
If you use SQL Server or Oracle integrated security with RDS in an Intranet scenario, no connection username and password need be revealed to the client. You can also use custom components in place of the default DataFactory object to limit client access to a data store.
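One way to picture how a custom component limits access is a server-side object that accepts only the names of pre-approved queries, never raw SQL or connection details. The Python sketch below illustrates that design under stated assumptions. CustomerComponent, the query name, and the table layout are all hypothetical, and an in-memory SQLite database stands in for the real data store:

```python
import sqlite3


class CustomerComponent:
    """Hypothetical server-side component: clients name a query, never SQL."""

    _QUERIES = {
        "customers_by_city":
            "SELECT id, name FROM customers WHERE city = ? ORDER BY id",
    }

    def __init__(self, connection):
        # The connection details stay on the server, inside the component.
        self._conn = connection

    def run(self, query_name, *params):
        if query_name not in self._QUERIES:
            raise PermissionError(f"query not allowed: {query_name}")
        return self._conn.execute(self._QUERIES[query_name], params).fetchall()


conn = sqlite3.connect(":memory:")   # in-memory database, for illustration only
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Adams", "York"), (2, "Jones", "Leeds"), (3, "Smith", "Leeds")],
)
component = CustomerComponent(conn)
leeds = component.run("customers_by_city", "Leeds")
```

Because the client never sees the connection string or the SQL, it cannot use either to reach parts of the data store that the component does not expose.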
The DataFactory Handler
In version 2.0 of RDS, some new capabilities have been added to the DataFactory object to control the access that clients have to the source data. A text INI file named msdfmap.ini is placed in your Winnt (or Windows) folder when version 2.0 of the Microsoft Data Access Components (MDAC) is installed. This defines the settings that are applied to the DataFactory when it is instantiated by a client connection.
The default settings depend on the option you choose when you install the components, and can be set to provide no access at all until you edit the file. Alternatively, you can set it up to allow free access to all connections, and then edit it once installation is complete and you are ready to deploy your application.
msdfmap.ini is the configuration file for the default DataFactory handler, named MSDFMAP.Handler. This is used unless you specify a different custom handler, which is done by setting the Handler property of the DataControl object or Recordset object. The RDS documentation contains details of how this can be done.
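As a rough illustration of the kind of per-connection settings such a file can hold, the sketch below parses an INI fragment with Python's standard configparser module. The section and key names in the fragment are assumptions written from memory of the msdfmap.ini layout, and the access_for helper is hypothetical. Check the file installed on your own server and the RDS documentation for the authoritative format:

```python
import configparser

# Assumed layout: a default section that denies access, plus one named
# connection that is allowed to read. Names here are illustrative only.
SAMPLE = """
[connect default]
Access=NoAccess

[connect CustomerDatabase]
Access=ReadOnly
"""


def access_for(config, name):
    """Return the access level for a connection name, else the default."""
    section = f"connect {name}"
    if not config.has_section(section):
        section = "connect default"
    return config.get(section, "Access")


config = configparser.ConfigParser()
config.read_string(SAMPLE)
known = access_for(config, "CustomerDatabase")
unknown = access_for(config, "SomethingElse")
```

The fallback to a restrictive default mirrors the installation option described above, where no access is granted until you edit the file.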
The Future - XML Data Remoting
While passing data in the form of marshaled recordsets across the Web using RDS is a neat trick, the future of data transport lies in Extensible Markup Language (XML). If we are going to realize the dreams of truly universal data access across disparate platforms and operating systems, we have to use a data format that is independent of the platform itself. It would also be nice if the standard was set and maintained by an independent industry body, as it is then more likely to have universal support. Both of these conditions rule out RDS.
Instead, XML, a text-based format for describing data, is moving more and more into the mainstream. Microsoft provide several data source objects that can handle XML-formatted data with Internet Explorer 4 and 5. Netscape have also announced that the next version of Navigator will provide support for XML in line with evolving standards. In the meantime, the World Wide Web Consortium (W3C) have ratified version 1.0 of XML, and are working on several other proposed standards to describe the way data can be defined within XML.
Microsoft Internet Explorer 5 and XML
At the time of writing, Internet Explorer 5 was still in beta, but already supports XML in many useful ways. The browser itself is an XML data source object, and can expose XML-formatted data directly to code in the page, or to the Data Binding Agent that binds the data to HTML controls in the page. The current implementation revolves around the new HTML element <XML>.
<XML> is an HTML element, not an XML element. It defines a section in an HTML document that contains XML, and is not itself part of that XML data.
The <XML> element can be used to create a data island, where the XML is embedded in the HTML page. This is an example of an inline XML data island:
.. HTML code here ..
<XML ID="mydata">
  .. the XML formatted data goes here ..
  .. this is an XML data island within a HTML page ..
</XML>
.. more HTML code here ..
If the XML data is in a separate file, we can use a SRC attribute within the <XML> tag to link to it, as in the next section of code:
.. HTML code here ..
<XML ID="mydata" SRC="/data/myxmldata.xml"></XML>
.. more HTML code here ..
Once the page has loaded, together with the content of any linked XML files, the data described by XML is exposed to ADO as a recordset. This can be used on the client in almost exactly the same way as the recordset exposed by the RDS DataControl object that we looked at earlier. However, XML data can also be manipulated using Extensible Stylesheet Language (XSL). This is another of the W3C standards currently under development, and it is implemented in Internet Explorer 5.
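The idea of treating XML-described data as a recordset can be shown outside the browser with a short Python sketch. It uses the standard xml.etree.ElementTree module. The element names in the sample data and the xml_to_rows helper are hypothetical, and the code models the concept only, not how Internet Explorer exposes a data island:

```python
import xml.etree.ElementTree as ET

# Sample data invented for the illustration; a real data island could use
# any element names.
XML_DATA = """
<customers>
  <customer><name>Adams</name><city>York</city></customer>
  <customer><name>Jones</name><city>Leeds</city></customer>
</customers>
"""


def xml_to_rows(xml_text):
    """Turn each child of the root into a dict of element name -> text."""
    root = ET.fromstring(xml_text.strip())
    return [{field.tag: field.text for field in record} for record in root]


rows = xml_to_rows(XML_DATA)
```

Each repeated child element becomes one record, and each of its own children becomes a field, which is the same shape as the rows and fields of a recordset.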
For more information about XML and XSL, look out for the Wrox Press books Professional XML Applications (ISBN 1-861001-52-5) and IE5 XML and XSL Programmer's Reference (ISBN 1-861001-57-6). You can also find more information about XML support in Internet Explorer, and some useful tools and examples, from the Microsoft XML site at