A Completely Technology-agnostic Format

Perfecting data accessibility with the index

LTFS 2 Technology: Segment 4


Now I know you've talked a lot about the importance of the index and how the index allows for granular rollback, but it does raise an interesting question regarding the reliability of LTFS and, of course, the reliability of the index. Can you tell me more about what you did architecturally to ensure that the data was available and reliable and also, specifically, that the index could always be accessed, or was made as highly available as possible, perhaps, is the way to put it?

Sure. So LTFS as a format is completely independent of today's linear tape media, and that may sound like a strange thing given that there is “LT” in the acronym and that we are focused on the LTO tape format. But when we were writing the format specification, we intentionally avoided using the word "tape" except as an expansion of the file system name. The reason is that we wanted the format description to be technology agnostic and potentially applicable to other storage media that are currently in development at various labs around the world and will be coming over the years if they prove successful.

The only characteristics of the underlying medium that we require, from the format specification point of view, are that the media is linear and that we can partition the media into two areas that can be written to independently. Based on those characteristics, we have defined, in the software implementation and in some rules in the format specification, the order of operations necessary to safely update data on LTFS volumes. In particular, this is about how you can safely update the index records, because the index record is the key to being able to access the file essence that is stored on the data tape. If your index is unreliable, corrupted, or untrustworthy, it can point you into completely different areas of the file essence data than were originally written.

So we spent a lot of time getting the index update semantics right. And this is the way that the index update is performed. One of the partitions that we create on the media is referred to as the index partition. This partition exists primarily to hold a copy of the current index. In addition to that copy, the last record written to the data partition is always a copy of the current index as well. So this means that the cartridge has one copy of the index at, or relatively close to, the beginning of the index partition, and a second copy on the data partition in case, for any reason, the copy stored on the index partition is unreadable.
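The update ordering described here can be sketched as a toy model. This is illustrative only, not the real LTFS on-tape layout or API; the `Volume` class, field names, and record shapes are all assumptions made for the example.

```python
# Hypothetical sketch of the LTFS index-update ordering described above.
# Names (Volume, sync_index, etc.) are illustrative, not a real LTFS API.

class Volume:
    """Toy model of a two-partition LTFS volume."""
    def __init__(self):
        self.data_partition = []      # records appended in order
        self.index_partition = None   # holds one overwritable index copy
        self.generation = 0

    def sync_index(self, index):
        """Safe update order: append the new index to the data partition
        FIRST, so a durable copy exists before the old copy on the index
        partition is overwritten."""
        self.generation += 1
        record = ("index", self.generation, index)
        self.data_partition.append(record)   # step 1: durable copy
        self.index_partition = record        # step 2: overwrite old copy

vol = Volume()
vol.data_partition.append(("file", "A"))
vol.sync_index({"A": 0})
assert vol.data_partition[-1][0] == "index"        # last record is an index
assert vol.index_partition == vol.data_partition[-1]
```

The key property the sketch shows is that there is never a moment with zero valid index copies: the data-partition copy lands before the index-partition copy is touched.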

All of the other indexes written to the data partition from previous edits exist in place, interleaved with the existing file essence blocks. There is a pointer from the current index stored on the index partition to the location of the current index stored on the data partition, and then that copy points to the next previous index on the data partition, and so on down the line back to the point in time when the cartridge was completely empty. So we have two copies of the current index, and the only time we ever overwrite an index is on the index partition, when we already have a safe copy stored on the data partition.
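That chain of back pointers is effectively a linked list running backwards through the data partition. A minimal sketch, with made-up record dictionaries rather than the real on-tape encoding:

```python
# Hypothetical sketch of the back-pointer chain of indexes on the data
# partition: each index records the position of the previous one, so you
# can walk from the current index back to the empty cartridge.

def index_history(data_partition, current_pos):
    """Yield index positions from newest to oldest by following the
    back pointer stored in each index record."""
    pos = current_pos
    while pos is not None:
        record = data_partition[pos]
        assert record["type"] == "index"
        yield pos
        pos = record["prev_index"]   # None once we reach the first index

# Three generations of edits; each index points at its predecessor.
data = [
    {"type": "file", "name": "A"},
    {"type": "index", "prev_index": None},
    {"type": "file", "name": "B"},
    {"type": "index", "prev_index": 1},
    {"type": "index", "prev_index": 3},
]
assert list(index_history(data, 4)) == [4, 3, 1]
```

Walking this chain is what makes the granular rollback mentioned earlier possible: any older index in the list still describes a complete, consistent point-in-time view of the volume.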

In addition, the LTO cartridges include a small amount of flash storage, on the order of a couple of kilobytes, that is used to keep track of cartridge state according to the LTO specification. We have added a couple of additional data fields into this flash storage, and those fields are updated when we lay down indexes to each partition. The flash built into the LTO cartridge tells us which of the two potential current indexes is most current, and when there is some odd situation we can read off both indexes and compare them to determine which is the most current. In terms of data reliability, this really comes down to the reliability of the LTO specification and the LTO cartridge and drive mechanisms. One of the key things about LTO tape technology is that the head that performs the reading and writing to the tape media is assembled somewhat like a sandwich held vertically.
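The tie-breaking between the two index copies can be illustrated with per-partition generation counters. The field names below are assumptions for the sketch, not the actual attribute names stored in the cartridge flash:

```python
# Hypothetical sketch of using generation fields kept in the cartridge's
# flash memory to decide which partition holds the most current index;
# the field names here are illustrative, not the real LTO attributes.

def most_current_index(mam):
    """Compare the generation recorded for each partition's index.
    The higher generation wins; on a tie, both copies are equally
    current and the index partition copy is preferred (it is faster
    to reach at the front of the tape)."""
    if mam["data_partition_generation"] > mam["index_partition_generation"]:
        return "data"
    return "index"

# After an interruption between the two index writes, the data partition
# may hold a newer index than the index partition.
assert most_current_index({"index_partition_generation": 7,
                           "data_partition_generation": 8}) == "data"
assert most_current_index({"index_partition_generation": 8,
                           "data_partition_generation": 8}) == "index"
```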

So if you think of a peanut butter sandwich and you hold it up vertically so that you have the risk of peanut butter dribbling down onto your keyboard, the peanut butter would constitute the write head in an LTO drive and the bread pieces on either side constitute the read heads. What this means is that there is always a read head behind the write head when the tape is moving.

When you are writing data to an LTO tape with any LTO technology generation, the tape drive is always reading the data it has just written so that it can compare what it has just read to what it expected to write. If there is a miscompare when the data gets laid down, the tape drive will mark that area of the tape as bad, move along, and rewrite the data. So you have very firm guarantees that the data you intended to write to the tape actually landed on the tape. Then there's the question of the reliability of the tape media itself. In general we have a pretty good understanding of tape media, and we know that we can read data from tapes written in the 1960s because we do it from time to time. The difficulty of reading those tapes is not the reliability of the data storage on the media; the difficulty is finding working tape drives that are compatible with the cartridges. Fortunately, with the LTO consortium largely standardizing tape drives and providing backwards compatibility baked into the specification, we can potentially avoid that historical problem.
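The read-while-write loop the drive performs can be sketched as follows. This is a simplified simulation of the behavior described above, not drive firmware; the function and its parameters are made up for illustration:

```python
# Hypothetical sketch of read-while-write verification: the read head
# trailing the write head re-reads each block, and on a miscompare the
# drive marks the bad region and rewrites the block further along.

def write_with_verify(blocks, tape, bad_spots):
    """Write each block in order; if the immediate read-back miscompares
    (simulated here by positions listed in `bad_spots`), mark the spot
    bad and retry the same block at the next position."""
    pos = 0
    for block in blocks:
        while pos in bad_spots:         # read-back miscompare
            tape[pos] = "BAD"           # mark region as bad, move on
            pos += 1
        tape[pos] = block               # read-back matches: data landed
        pos += 1
    return tape

tape = {}
write_with_verify(["b0", "b1", "b2"], tape, bad_spots={1})
assert tape == {0: "b0", 1: "BAD", 2: "b1", 3: "b2"}
```

The net effect is the guarantee the speaker describes: every block the host asked to write ends up readable somewhere on the tape, with bad regions transparently skipped.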

Yes, there's no doubt that the standardization of LTO, you raise a great point, has really helped change some of that. I remember back in the day there were many competing tape formats which used similar recording technology but all kinds of different formats and even different tape heads, and it created a lot of complexity. The widespread adoption and commonality of LTO has really enabled a change in how we do things and, I think, has also made LTFS so much more relevant, because pretty much everyone has LTO these days. It has the largest market share, I think, of the tape formats we see today.

Yeah, there are two parts to that old problem you just described. One is the physical media and the tape drive itself, and I completely agree that LTO seems to have ameliorated that problem. The other is the data format on the tape: with proprietary tape formats, the indexes were stored off the cartridge itself, or possibly stuck to the cartridge by way of a hand-written note saying, "File A starts at offset 10 on the tape." That is really the way tape used to be, and it is actually the way other approaches to working with tape effectively still work today.

Before LTFS, your backup software or your tape management software was doing the electronic equivalent of writing out a sticky note saying, "File A starts at this offset on the cartridge and is possibly owned by this user. File B starts at this other offset." That is why, if you lose the database in your tape management software, all of your tapes are garbage: there is no way to identify where one file starts and another file ends. LTFS is really, in the software space, trying to address that problem by getting the information you need to interpret what's on the tape stored on the tape itself, and by standardizing the format and describing it to the world in an open document. Thirty years from now, we have a good chance of still being able to read data from the tapes mechanically, using tape drives, but we would still have the concern of how we know whether the first block of the tape is a file, or an index, or just some sort of empty space that doesn't have any real data. That is what the format specification is intended to achieve, which is why I keep emphasizing that LTFS the software is an implementation of the format specification, but the format specification is what really brings the value to the industry in terms of interoperability and long-term data storage.

The Speakers:

Michael Richmond
Jay Livens