IIS 6.0 WebDAV and Compound Document Format Files

We recently ran into a situation where a customer thought that they were seeing file corruption when they were transferring files from a Windows 7 client to their IIS 6.0 server using WebDAV. More specifically, the file sizes were increasing for several specific file types, and for obvious reasons the checksums for these files would not match for verification. Needless to say this situation caused a great deal of alarm on the WebDAV team when we heard about it - file corruption issues are simply unacceptable.

To alleviate any fears, I should tell you right up front that no corruption was actually taking place, and the increase in file size was easily explained once we discovered what was really going on. All of that being said, I thought that a detailed explanation of the scenario would make a great blog entry in case anyone else runs into the situation.

The Customer Scenario

First of all, the customer was copying installation files using a batch file over WebDAV; more specifically the batch file was copying a collection of MSI and MST files. After the batch file copied the files to the destination server it would call the command-line comp utility to compare the files. Each MSI and MST file that was copied would increase by a small number of bytes so the comparison would fail. The customer computed checksums for the files to troubleshoot the issue and found that the checksums for the files on the source and destination did not match. Armed with this knowledge the customer contacted Microsoft for support, and eventually I got involved and I explained what the situation was.

The Architecture Explanation

Windows has a type of file format called a Compound Document, and many Windows applications make use of this file format. For example, several Microsoft Office file formats prior to Office 2007 used a compound document format to store information.

A compound document file is somewhat analogous to a file-based database, or in some situations like a mini file system that is hosted inside another file system. In the case of an MST or MSI file these are both true: MST and MSI files store information in various database-style tables with rows and columns, and they also store files for installation.

With that in mind, here's a behind-the-scenes view of WebDAV in IIS 6.0:

The WebDAV protocol extension allows you to store information in "properties", and copying files over the WebDAV redirector stores several properties about a file when it sends the file to the server. If you were to examine a protocol trace for the WebDAV traffic between a Windows 7 client and an IIS server, you will see the PUT command for the document followed by several PROPPATCH commands for the properties.

IIS needs a way to store the properties for a file in a way where they will remain associated with the file in question, so the big question is - where do you store properties?

In IIS 7 we have a simple property provider that stores the properties in a file named "properties.dav," but for IIS 5.0 and IIS 6.0 WebDAV code we chose to write the properties in the compound document file format because there are lots of APIs for doing so. Here's the way that it works in IIS 5 and IIS 6.0:

  • If the file is already in the compound document file format, IIS simply adds the WebDAV properties to the existing file. This data will not be used by the application that created the file - it will only be used by WebDAV. This is exactly what the customer was seeing - the file size was increasing because the WebDAV properties were being added to the compound document.
  • For other files, WebDAV stores a compound document in an NTFS alternate data stream that is attached to the file. You will never see this additional data from any directory listing, and the file size doesn't change because it's in an alternate data stream.

So believe it or not, no harm is done by modifying a compound document file to store the WebDAV properties. Each application that wants to pull information from a compound document file simply asks for the data that it wants, so adding additional data to a compound document file in this scenario was essentially expected behavior. I know that this may seem counter-intuitive, but it's actually by design. ;-]

The Resolution

Once I was able to explain what was actually taking place, the customer was able to verify that their MST and MSI files still worked exactly as expected. Once again, no harm was done by adding the WebDAV properties to the compound document files.

You needn't take my word for this, you can easily verify this yourself. Here's a simple test: Word 2003 documents (*.DOC not *.DOCX) are in the compound document file format. So if you were to create a Word 2003 document and then copy that document to an IIS 6.0 server over WebDAV, you'll notice that the file size increases by several bytes. That being said, if you open the document in Word, you will see no corruption - the file contains the same data that you had originally entered.

I hope this helps. ;-]

Life after FPSE (Part 2)

Following up on my last blog post, today's blog post will discuss some of the highlights and pitfalls that I have seen while transitioning from using the FrontPage Server Extensions to publish web sites to WebDAV. It should be noted, of course, that FTP still works everywhere - e.g. Expression Web, FrontPage, Visual Studio, etc. As the Program Manager for both WebDAV and FTP in IIS I can honestly say that I love both technologies, but I'm understandably biased. <grin> That said, I'm quite partial to publishing over HTTP whenever possible, and Windows makes it easy to do because Windows ships with a built-in WebDAV redirector that enables you to map a drive to a web site that is using WebDAV.

To set the mood for today's blog, let's have a little fun at FPSE's expense...

No-FPSE

A Few Notes about Migrating FPSE Web Sites to WebDAV and Backwards Compatibility

To start things off, I wrote a detailed walkthrough with instructions regarding how to migrate a site that is using FPSE to WebDAV that is located at the following URL:

Migrating FPSE Sites to WebDAV
http://go.microsoft.com/fwlink/?LinkId=108347

I wrote that walkthrough from the point-of-view that you might want to preserve the FPSE-related metadata in order to open your web site using a tool like Visual Studio or FrontPage. Neither of these tools have native WebDAV support, so you have to map a drive to a WebDAV-enabled web site in order to use those tools, and the instructions in that walkthrough will lead you through the steps to make the FrontPage-related metadata available to those applications over WebDAV.

The part of that walkthrough that makes backwards compatibility work is where I discuss adding settings for the IIS 7 Request Filtering feature so that FPSE-related metadata files are blocked from normal HTTP requests, but still available to WebDAV. (These metadata settings are all kept in the folders with names like _vti_cnf, _vti_pvt, etc.)

It should be noted, however, that if you are not interested in backwards compatibility, the steps are much simpler. In Step 1 of the walkthrough, you would choose "Full Uninstall" as the removal option, and all of your _vti_nnn folders will be deleted. If you've already removed FPSE from a web site and you chose the "Uninstall" option, you can remove the _vti_nnn folders from your site by saving the following batch file as "_vti_rmv.cmd" in the root folder of you web site and then running it:

dir /ad /b /s _vti_???>_vti_rmv.txt

for /f "delims=;" %%i in (_vti_rmv.txt) do rd /q /s "%%i"

del _vti_rmv.txt

It's worth noting, of course, that this batch file can be pretty disastrous if run in the wrong web site, as FPSE will no longer be able to access any of the metadata that defined your web site. Any content stored in folders like _private, fpdb, _overlay, etc., will all be preserved.

Getting to Know the WebDAV Redirector

Windows Vista and Windows Server 2008 both ship a first-class director, making it easy to use WebDAV sites across the Internet as though they were local shares. Using the WebDAV director is as intuitive as mapping a drive to any UNC share, you just specify the drive letter and the destination URL:

MapDrive

If you prefer, you can also use the command-line to map a drive to a WebDAV site:

net use * http://www.example.com/

Enter the user name for 'www.example.com': msbob

Enter the password for www.example.com: ******

Drive Z: is now connected to http://www.example.com/.

The command completed successfully.

Rather than repeat myself any more than necessary, I wrote the following walkthrough for anyone that plans on using the WebDAV redirector:

That walkthrough discusses how to install the redirector if necessary, how to map drives to WebDAV sites, and how to troubleshoot any problems that you might see.

Microsoft Expression Web - Using a WebDAV-enabled Editor

One of my favorite publishing features in Expression Web is that it has native WebDAV support built-in, so it doesn't have a dependency on the WebDAV redirector in order to work with a WebDAV-enabled web site. If you're currently using Expression Web to open a web site using FPSE, the change to WebDAV should be fairly seamless. If you're currently using FrontPage, the Expression Web team has put together a whitepaper that describes the differences between FrontPage and Expression Web, which is available from the following link:

That being said, when opening a WebDAV web site in Expression Web, you simply enter the HTTP URL the same way that you would if you were opening a site using FPSE:

Expression-Web-Open-Site

When you first open a web site using WebDAV, Expression Web will prompt you whether to edit the web site live, or edit locally and publish your changes later:

Expression-Web-Edit-Live

Once your live web site is opened, the WebDAV editing experience is what you would have expected from using FPSE:

Expression-Web-Edit-Page

Summary

So in closing, I've presented a few things to consider when working with WebDAV instead of FPSE. Using the WebDAV redirector makes working with WebDAV sites as easy as working with network shares, and using Expression Web is by far the easiest way to edit WebDAV sites.

WebDAV Module for Windows Server 2008 GoLive Beta is released

Earlier today the IIS product team released the GoLive beta version of the new WebDAV extension module for IIS 7. (This version is currently available for Windows Server 2008 only.)

Listed below are the links for the download pages for each of the individual installation packages:

We've loaded this version with many great new features such as:

  • Integration with IIS 7.0: The new WebDAV extension module is fully integrated with the new IIS 7.0 administration interface and configuration store.
  • Per-site Configuration: WebDAV can be enabled at the site-level on IIS 7.0, which differed from IIS 6.0 where WebDAV was enabled at the server-level through a Web Service Extension.
  • Per-URL Security: WebDAV-specific security is implemented through WebDAV authoring rules that are configured on a per-URL basis.

Here are a couple of screenshots of the new WebDAV UI in action:

WebDAV UI WebDAV Authoring Rules

Additional documentation about installing and using this version of WebDAV can be found at the following URL:

Installing and Configuring WebDAV on IIS 7.0:
http://go.microsoft.com/fwlink/?LinkId=105146

While this release is a beta version and not technically supported, feedback about this release and requests for information can be posted to the following web forum:

IIS7 - Publishing:
http://forums.iis.net/1045.aspx

I would be remiss if I did not mention that special thanks go to:

  • Keith – for building it
  • Eok, Sriram, Ciprian – for testing it
  • Gurpreet, Brian, Reagan – for making it look pretty
  • Vijay, Will, Taylor – for helping keep everything on track ;-]

Thanks!


Note: This blog was originally posted at http://blogs.msdn.com/robert_mcmurray/