A Solid Foundation for Endless Possibilities
Providing software that blurs boundaries between tapes
LTFS 3 Future: Segment 2
I think the idea of using LTFS as a more traditional storage medium is an interesting one. As I envision it, though, one area where the spec might be extended to make that use case even more effective is allowing it to present multiple tapes as one virtual volume. In the traditional LTFS spec, as we talked about in the technology section, each tape is essentially a different volume. Would you anticipate that as a future development area of the spec, or is that something individual vendors may develop? How do you envision that happening over time?
That's an interesting question, Jay. There is certainly some existing development work that has been done by vendors, including IBM, to build library-focused implementations of the LTFS software to attain exactly the goal that you're talking about: utilizing the LTFS format as a self-describing format within a single cartridge, but providing software that blurs the boundaries between tapes, so that from a user perspective, you don't need to think about individual tapes. You can just store your files in the appropriate storage device that shows up on your desktop.
This is work that's been done, as I say, by a couple of vendors. Specifically, I'm familiar with IBM's work, where we implemented LTFS Library Edition, which supports exactly this mode of operation. Ongoing development at IBM has been building out an enterprise edition to support much larger scale libraries. That library support was developed as a proprietary offering utilizing the common LTFS format, as a way for IBM to add value and bring products to market that make IBM shareholders happy, hopefully.
In terms of integration with the LTFS specification, there's certainly the potential to extend the specification in ways that will define the right frameworks to go into LTFS, so that different vendors can plug their library support into the standard software. But largely this is a software implementation issue, rather than an LTFS format specification issue, because there's nothing about the format and the data layout on an individual tape that needs to change to support library functionality. The specification itself is probably not going to change, but there may be optional extensions to the specification that describe APIs and implementation-level details that are specific to libraries.
In an earlier podcast, I mentioned that we had built some version numbers into the specification document, and described how the version number will change depending on the types of changes in future LTFS specification documents. Part of the motivation for introducing this versioning was that we had encountered a single instance of a data item that we had not included in an earlier version of the specification, and that we needed to include to allow easy identification of a file within multiple cartridges.
We had previously just calculated this value when a cartridge was mounted. When we started doing the library development, we realized that this previously dynamically calculated value needed to be persisted as part of the on-tape format, so that the value would be persistent across cartridge mounts. During that phase we spent a lot of time working through the possible data requirements for both small and large scale libraries and made sure that we baked the necessary values into the format spec at the time.
You got me really thinking with that answer. One of the things that came to mind, as you went through it, is the acknowledgment that perhaps the LTFS spec, as we know it, is really robust and mature as it is. Perhaps the other innovations that might be thought of, such as a way to pool multiple cartridges into one virtual volume, or a way to support robotics, are functions that are developed by third parties. The core of how we write to tape uses LTFS, and the other things that people might want to create are perhaps things that we rely upon other vendors for, rather than things that come in the LTFS spec, because the LTFS spec is a lower-level thing. If you think of the layered idea behind the OSI network model, LTFS is a very low-level layer. Those other features that people might want might come in a different layer that sits on top of LTFS. Is that what you're thinking?
That is exactly correct. This really goes back to something that we talked about in an earlier podcast about LTFS being two things: a format specification, and a software implementation that conforms to that format specification. The format specification describes how the data is laid out on the tape, and how that data should be changed to add files to or delete files from the cartridge. The format specification does not dictate how those changes are implemented in the software. We have an open-source implementation of LTFS that's freely available, which is one possible embodiment of software that can manipulate this on-tape format.
You're exactly right that future innovations and extensions to the behavior of LTFS can be embodied in either an extension of the existing software implementation, or in new software implementations that may not even exist yet. The format specification is largely complete, or at least as complete as we could envision it by reasoning about the possible future demands. There has been some recent investigation by the LTO tape working group that's working on maintaining the LTFS specification, to add some small extensions that provide mechanisms to span a file across multiple tapes. This is really a naming convention that should be used when a file would not fit on a single cartridge.
Then I was reading about some extra extended attribute storage that would allow Unix permissions to be recorded on the LTFS cartridge alongside the existing LTFS permission systems. In developing the spec, we explicitly built the notion of extended attributes into LTFS. An extended attribute is a key-value pair, where the value is arbitrary data up to 4 kilobytes in size, and the name is largely arbitrary. These key-value pairs can be attached to any folder or file in the LTFS volume.
We have reserved part of the extended attribute namespace: any name that starts with LTFS is reserved for use by the LTFS specification going forward. This allows us to introduce new data fields into the file system without necessarily having to change the format in a way that breaks backwards compatibility. Users can also add these extended attributes to any file that they store, which means that a system that reads and writes LTFS volumes could potentially use these extended attributes to store any kind of metadata about the files that they choose to record.
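To make the two rules just described concrete (a size cap on the value and a reserved name prefix), here is a minimal Python sketch. It is an illustrative in-memory model, not the LTFS on-tape format or any real LTFS API. The names `Node`, `is_reserved`, and `check_user_attribute` are hypothetical, and it assumes that "4 kilobytes" means 4096 bytes and that the reserved prefix is matched without regard to case.

```python
# Illustrative model only; not the LTFS on-tape format or any real LTFS API.
MAX_VALUE_BYTES = 4 * 1024   # assumption: "4 kilobytes" read as 4096 bytes
RESERVED_PREFIX = "ltfs"     # assumption: prefix compared case-insensitively


def is_reserved(name: str) -> bool:
    """True if the name falls in the namespace reserved for the specification."""
    return name.lower().startswith(RESERVED_PREFIX)


def check_user_attribute(name: str, value: bytes) -> None:
    """Raise ValueError if an application-defined attribute breaks either rule."""
    if not name:
        raise ValueError("attribute name must not be empty")
    if is_reserved(name):
        raise ValueError(f"{name!r} is reserved for the LTFS specification")
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(
            f"value is {len(value)} bytes; limit is {MAX_VALUE_BYTES}"
        )


class Node:
    """A file or folder carrying a dictionary of extended attributes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.xattrs = {}  # maps attribute name (str) to value (bytes)

    def set_xattr(self, name: str, value: bytes) -> None:
        """Attach a key-value pair after checking the two rules above."""
        check_user_attribute(name, value)
        self.xattrs[name] = value
```

An application layered on top of LTFS could run a check like this before writing its own metadata, so that it stays out of the reserved namespace.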
You can imagine that a backup system might tag every file for a given backup with the date and the time that the backup was created, and maybe the email address of the owner of the backup. These things could be tagged as extended attributes on all of the files that are written during a particular backup session. There are certainly other ways that extended attributes could be used that I can't envision, but that would be helpful for system integrators and software developers who are trying to enhance the functionality they can build on top of LTFS without requiring the underlying format to be modified.
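The backup-tagging idea can be sketched in a few lines of Python. This is a sketch under stated assumptions rather than how any particular backup product works: the attribute names `user.backup.created` and `user.backup.owner` are hypothetical, and the function that actually writes an attribute is passed in as `set_attr` so that the example does not depend on a mounted LTFS volume.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable

# Hypothetical attribute names; nothing in the discussion defines them.
CREATED_KEY = "user.backup.created"
OWNER_KEY = "user.backup.owner"


def backup_tags(created: datetime, owner_email: str) -> Dict[str, bytes]:
    """Build the key-value pairs shared by every file in one backup session."""
    return {
        # Store the timestamp in UTC, ISO 8601 form, as bytes.
        CREATED_KEY: created.astimezone(timezone.utc).isoformat().encode("utf-8"),
        OWNER_KEY: owner_email.encode("utf-8"),
    }


def tag_files(
    paths: Iterable[str],
    tags: Dict[str, bytes],
    set_attr: Callable[[str, str, bytes], None],
) -> int:
    """Apply every tag to every path; return the number of files tagged."""
    count = 0
    for path in paths:
        for name, value in tags.items():
            set_attr(path, name, value)
        count += 1
    return count
```

On Linux, Python's `os.setxattr(path, attribute, value)` has a compatible signature and could be passed as `set_attr`. Whether a given LTFS mount accepts these particular names depends on the implementation in use.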