POST OF THE DAY
Network Appliance, Inc.
The Uniqueness of WAFL

By KingPhilip
November 26, 2001

Posts selected for this feature rarely stand alone. They are usually a part of an ongoing thread, and are out of context when presented here. The material should be read in that light.

In response to the comment from an earlier post:

I don't think it is possible to overstate the uniqueness of WAFL

TeefersUK asks:

Then please do. Please tell me the uniqueness of WAFL. Please tell me the features it offers that no other file system does. Please.

WARNING: This post ballooned way beyond its original intent, but you asked...

That's a tall order, because there's so much to say. The best starting point is NetApp's own white papers for a detailed discussion. They are somewhat dated, e.g. one calls a file system capacity of tens of gigabytes "large," but still relevant. But prior to including the links, here's a basic summary:

1. WAFL is fast
Most file systems require two head seeks per write operation, because the file content resides in an area on the disk separate from the file metadata. Thus, in a traditional file system, in order to update a file, the heads must first seek to the metadata area to update things like the "time modified" and the pointers to the data blocks, then seek to the data area and write the actual content. With WAFL, the file's metadata is simply written in a small file in the same stripe as the file's data. One head seek per write. Advantage: WAFL

In addition, the filer aggregates multiple write requests in its non-volatile transaction log, acknowledging each request as soon as it is committed to the log, which gives the client a very fast response. The log is periodically flushed to the disks as described above, yielding multiple file writes per single head seek. Advantage: filer
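To make the seek arithmetic concrete, here is a toy sketch of the idea. It is not NetApp's code; the class and function names (`LoggedFiler`, `traditional_seeks`) are made up for illustration, and a "seek" is just a counter. It assumes the simple model described above: two seeks per write for a traditional file system, one seek per flush of the log for the filer.

```python
class LoggedFiler:
    """Toy model: acknowledge writes from a log, flush them in one batch."""

    def __init__(self):
        self.log = []    # stands in for the non-volatile transaction log
        self.disk = {}   # file name -> contents, stands in for the drives
        self.seeks = 0   # head seeks spent so far

    def write(self, name, data):
        # The client gets its answer as soon as the request is in the log.
        self.log.append((name, data))
        return "ack"

    def flush(self):
        # Data and metadata for the whole batch go out together: one seek.
        if not self.log:
            return
        self.seeks += 1
        for name, data in self.log:
            self.disk[name] = data
        self.log.clear()


def traditional_seeks(n_writes):
    # Assumed model from the text: one seek for metadata, one for data.
    return 2 * n_writes


filer = LoggedFiler()
for i in range(5):
    filer.write(f"file{i}", b"payload")
filer.flush()
print(filer.seeks, traditional_seeks(5))  # 1 10
```

Real drives and real workloads are messier than a counter, of course, but the ratio is the point.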

2. WAFL includes parity RAID protection without the performance penalty normally associated with parity RAID
There are a few RAID configurations in use today. The most widely used are RAID 0, 1, 0+1, and a couple of variants of parity RAID. RAID 1 simply mirrors the data written to a drive to a second drive. If one drive fails, the other continues service. RAID 0 stripes the data across many drives, and provides performance and capacity superior to that of a single drive, but there is no redundancy, so the loss of a drive causes the loss of all of the data in the drive set. This is clearly not appropriate for the majority of enterprise class applications. For this reason, a popular approach is to mirror this striped set of drives. In other words, for each set of drives, install another identical set, and write/read to/from both sets in parallel. This is called RAID 0+1. It yields the speed benefits of RAID 0 on writes, and even better performance on reads, since the first set to come up with the requested data delivers it. The downside to RAID 1 and 0+1 is cost, in terms of procurement, footprint, power consumption and required cooling capacity.

Then there is parity RAID, which writes data to a set of drives in a manner similar to RAID 0, but includes redundant information, parity, calculated on the basis of the data in the stripe. If a drive in the set fails, the data now missing can be calculated based on the data on the remaining drives. Some parity RAID designs concentrate all of the redundant information on a single drive of the set. Others, most notably RAID 5, stagger the parity information across all of the drives in the set. The level of protection is the same for all parity RAID designs, as is the capacity overhead.
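For the curious, the "calculated based on the data on the remaining drives" part is just XOR. Below is a minimal sketch, assuming single-parity RAID with one block per drive in the stripe; the function names are mine, and the block contents are arbitrary sample bytes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe across three data drives, plus the parity computed from it.
stripe = [b"WAFL", b"RAID", b"NFS!"]
parity = xor_blocks(stripe)

# Pretend the second drive died: rebuild its block from the other two.
recovered = reconstruct([stripe[0], stripe[2]], parity)
print(recovered)  # b'RAID'
```

It works because XOR-ing a value in twice cancels it out, so XOR-ing the survivors with the parity leaves only the missing block. It also shows why the capacity overhead is one drive's worth per set, wherever the parity happens to live.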

By far the most common implementation of parity RAID is RAID 5, by means of a hardware RAID controller. Unfortunately, the RAID controller, although capable of generating parity information very quickly, is burdened with inserting the parity information onto the drives, interleaving it with the actual data to be written. The RAID controller knows nothing of the file system layout, so the insertion of the parity data interferes with the disk layout as determined by the file system. This causes even more head seeks, and performance suffers. The RAID controller is rather complicated, and has historically had an annoying tendency to fail more often than other solid-state components. This sort of defeats the purpose of RAID, but it's better than not using RAID.

WAFL includes parity RAID in the file system itself. It is calculated by the filer's CPU and included in the same stripe as the rest of the data. Net result, 1 seek writes it all. It provides the speed of RAID 0 without the overhead of RAID 1 or 0+1. The expensive and flaky hardware RAID controller is also eliminated. Advantage: WAFL

3. WAFL grows on-the-fly, with no performance degradation
Historically, to grow a parity RAID based file system, one had to back up all of the data to some other storage medium, reconfigure the RAID controller with more or larger capacity drives, then restore the data. This scheme has cost many system admins innumerable weekends and, what's worse, imposes a period where the data is unavailable, usually several hours in duration. That's clearly unacceptable in today's global business. So, some vendors have devised a method whereby the RAID set can grow on the fly. This is great for availability, but there are two serious flaws. First, the parity must be reconstructed and the data shuffled around on the drives, and that imposes degraded performance for an extended period. Worse is the fact that during this shuffle, the RAID set is vulnerable. Should a drive fail before the reallocation completes, get out your backup tapes. (You did do a full backup just before you started, right?)

WAFL has none of these problems. Hot plug in drives, and assign them to a volume while your applications are running. Your file system capacity grows instantaneously. No downtime, no degraded performance. Everyone is happy: system admins, the users, their management and the company's customers. Advantage: WAFL

4. WAFL restarts quickly due to its journaling design
Each time the stripe is written to the drives, the drives contain a consistent image. If ever the filer should go down, there is no need to perform a consistency check à la chkdsk. Some other file systems provide this feature, but they typically come from third party vendors, are expensive, and can be tricky to manage. Make sure you have the right operating system and disk firmware patches installed before venturing forth. Also, make sure the patches you install don't conflict with each other or any of the application programs you are running. It gets to be messy to manage. For the majority of systems, however, a power failure will cause a very lengthy consistency check when power returns. During this time, you don't have access to your data.

With WAFL, a consistency check isn't required, as every write to disk makes the file system completely consistent to that point in time.

5. WAFL supports Unicode
Unicode is a means by which the system uses multiple bytes per character in the metadata section. It is thus possible to create filenames using characters that extend beyond the 26 letters in the English alphabet. Not a big deal north of the Rio Grande, but much appreciated once you install a system overseas.

6. WAFL supports both Unix and Windows permissions simultaneously, in the same file system.
This is a very cool feature. It enables single instance file sharing across operating systems. For example, a Unix engineering application generates some data. The Windows desktop can then read the same file, put it into an Excel spreadsheet, and generate the business graphics the boss wants to see. This can't be done in a SAN environment without first copying the file with the data from one environment to the other. This may be OK in the occasional instance, but when there are many files to copy, and the source files might be changing over time, you need to manage the copy process. You also need additional disk space for the replica. The process can get messy. Why not just read *the* file? You can, with NAS, and WAFL provides the permissions needed for native Unix and Windows support. A SAN enables sharing of disks, and helps the backup process. A NAS enables the sharing of information, as well as the bit about the backups.

7. WAFL has native support for usage quotas
Usually, if corporate policy dictates the imposition of disk quotas, you need to buy a third party quota manager, introducing another ball to keep in the air, another layer of complexity for your server to handle. Not so with WAFL: it's built in. Just turn it on. Quotas can be set on a user basis, by group, or by directory size. Turn them on or off, or modify their definitions on the fly.

8. WAFL has Snapshots
This feature is probably the most coveted of all. Snapshots' efficiency of disk usage and ease of use, not to mention their zero impact on performance, are unparalleled. A Snapshot is a point-in-time image of a volume, and it takes only a few seconds to create: typically 1 to 5 seconds, regardless of the size of the volume, or the level of activity on the filer. Even after you've seen it happen, you'll be shaking your head in amazement. So you overwrote your file, or inadvertently deleted it? No problem. Just go to the Snapshot subdirectory (every folder has one), find the file as it was an hour or two ago, or maybe a day or two ago, and copy it back. WAFL supports up to 31 unique point-in-time images of the file system. Usually, the schedule is set up to enable a user to go back a week or two. The overhead is only the changed blocks, so a 15% to 20% reserve is usually generous. The snapshot reserve can be shifted up or down on the fly, and it happens instantaneously.

Other file systems that implement block level Snapshot support suffer from a major drawback. When such a file system needs to modify a file that is "captured" in a snapshot, it must first copy the blocks it is about to modify to a pre-configured and pre-allocated area. Only after the block has been copied can the new content be deposited as needed. The result: two blocks need to be written, in different places, for each disk block that needs to be modified. Major performance impact. Not so with WAFL.
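Here is a toy model of why a snapshot can be cheap when the file system never overwrites a block in place. This is an illustration under my own assumptions, not NetApp's implementation: the `Volume` class and the snapshot name `hourly.0` are invented for the example, a volume is reduced to a map from logical block numbers to physical blocks, and old blocks are never freed. In particular, this sketch copies the whole block map to take a snapshot, which a real file system would avoid.

```python
class Volume:
    """Toy write-anywhere volume: new data always lands in a fresh block."""

    def __init__(self):
        self.blocks = {}     # physical block id -> data
        self.active = {}     # logical block number -> physical block id
        self.snapshots = {}  # snapshot name -> frozen copy of the block map
        self.next_id = 0

    def write(self, logical, data):
        # One write per modified block: the old block is left untouched,
        # so any snapshot that points at it still sees the old contents.
        self.blocks[self.next_id] = data
        self.active[logical] = self.next_id
        self.next_id += 1

    def snapshot(self, name):
        # Freeze the current pointers; no data blocks are copied.
        self.snapshots[name] = dict(self.active)

    def read(self, logical, snapshot=None):
        block_map = self.active if snapshot is None else self.snapshots[snapshot]
        return self.blocks[block_map[logical]]


vol = Volume()
vol.write(0, b"first draft")
vol.snapshot("hourly.0")
vol.write(0, b"oops, overwritten")

print(vol.read(0))              # b'oops, overwritten'
print(vol.read(0, "hourly.0"))  # b'first draft'
```

Note that the overwrite cost exactly one block write, and the space overhead of the snapshot is just the one old block that is still being held.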

A big benefit of the snapshot is the ability to perform consistent backups from the filer while applications are running. There is no need to worry about files changing as they are being copied to tape, which is a sure means of creating an inconsistent backup. The snapshot is read-only, and completely static. The result: completely consistent backups, every time.

9. WAFL enables instantaneous recovery of entire file systems
So your system picked up the virus du jour, and it has permeated the file system. Select a Snapshot from before the infection, and promote it to the current file system. The effect is like going back in time. After the restore operation, anything that happened on the file system after the selected snapshot was taken is gone. It takes a few seconds to complete, regardless of the size of the file system.

10. WAFL enables efficient replication
WAFL can replicate a volume to a remote site. It does so by keeping track of the state of affairs at the remote site, and then updating only the disk blocks that have changed since the last transfer. This is known as asynchronous mirroring. This implies that your local applications run at full speed, without having to wait for each write to be committed at the remote end. It also means that there is no distance limitation between the source and the target systems, and that you can make do with less expensive, lower bandwidth connections. Zero performance degradation is a big advantage. Continuation of service in the event of a failure of the inter-system connection is also a plus.
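The "only the blocks that have changed" idea can be sketched in a few lines. This is a simplified illustration, not the actual replication protocol: I'm assuming each side can be described as a plain map from block number to contents, and `changed_blocks` and `apply_delta` are hypothetical helper names.

```python
def changed_blocks(previous, current):
    """Compare two block maps; return what must be sent and what must be dropped."""
    to_send = {n: data for n, data in current.items() if previous.get(n) != data}
    to_delete = [n for n in previous if n not in current]
    return to_send, to_delete


def apply_delta(mirror, to_send, to_delete):
    """Bring the remote copy up to date using only the delta."""
    mirror.update(to_send)
    for n in to_delete:
        mirror.pop(n, None)


# State the remote site already has, and the source as it looks now.
last_sent = {0: b"alpha", 1: b"beta", 2: b"gamma"}
source_now = {0: b"alpha", 1: b"BETA", 3: b"delta"}

mirror = dict(last_sent)
to_send, to_delete = changed_blocks(last_sent, source_now)
apply_delta(mirror, to_send, to_delete)

print(sorted(to_send))       # [1, 3]
print(mirror == source_now)  # True
```

Block 0 never crosses the wire, which is why a modest link can keep up, and the source never waits on the remote end while the delta is in flight.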

Some shops require synchronous replication. It imposes a performance cost on two fronts: high volume write activity requires a high bandwidth connection (expensive), and the farther away the target system is located, the slower your overall system becomes due to the propagation delay across a longer connection. Another disadvantage is that in the event of a loss of the connection, synchronous replication (and the application that requires it) comes to a halt. WAFL cannot do synchronous mirroring, but many shops are happy with a short time lag between the active file system and the mirror.
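A back-of-the-envelope calculation shows how quickly distance bites. The figures here are my assumptions, not from NetApp: light in optical fiber covers roughly 200 km per millisecond, and the sketch ignores switching, protocol and disk time, so the real numbers can only be worse.

```python
FIBER_KM_PER_MS = 200.0  # rough speed of light in optical fiber (assumption)


def round_trip_ms(distance_km):
    """Minimum time for a write to reach the remote site and be acknowledged."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_serial_writes_per_second(distance_km):
    """Upper bound when each write must wait for the previous acknowledgement."""
    return 1000.0 / round_trip_ms(distance_km)


for km in (10, 100, 1000):
    print(km, round_trip_ms(km), max_serial_writes_per_second(km))
```

At 1000 km, the wire alone adds 10 ms to every synchronous write, which caps a strictly serial writer at 100 writes per second before the disks have done anything at all.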

All of these features in a simple, easy to manage package make for a unique offering. So, as promised, here are links to three useful white papers. I'd read them in the order listed, but the real meat of WAFL design is described in 3002.

http://www.netapp.com/tech_library/3014.html
http://www.netapp.com/tech_library/3002.html
http://www.netapp.com/tech_library/3001.html

"How does this guy know all this stuff?" you may ask. I've worked with NetApp filers for about 4 years. I'm a propeller head, and like to understand how things work, so I've read most of the white papers and other publications relating to the product. Do I like them? You bet I do. I'm also very long on the stock, to a great extent for the reasons listed above.

Philip
