Files-11, also known as on-disk structure, is the file system used by Hewlett-Packard's OpenVMS operating system, and also (in a simpler form) by the older RSX-11. It is a hierarchical file system, with support for access control lists, record-oriented I/O, remote network access, and file versioning.
Files-11 is similar to, but significantly more advanced than, the filesystems used in previous Digital Equipment Corporation operating systems such as TOPS-20 and RSTS/E. It is also a clear predecessor of NTFS. Many of the concepts used in Files-11 appear in NTFS.
History
The native OpenVMS file system is descended from the file systems of older DEC operating systems, and is similar to them in many ways. A major difference is the layout of directories. The older file systems all provided some form of rudimentary non-hierarchical directory structure, typically based on assigning one directory per user account. Under RSTS/E, each user account was represented by two numbers, a [project, programmer] pair, and had one associated directory. Special system files, such as program executables and the OS itself, were stored in the directory of a reserved system account.
While this was suitable for PDP-11 systems, which possessed limited permanent storage capacity, VAX systems with much larger hard drives required a more flexible method of file storage, in particular a hierarchical directory layout, the most notable improvement in ODS-2.
Overview
"Files-11" is the general term for five separate filesystems, known as on-disk structure (ODS) levels 1 through 5.
ODS-1 is the flat filesystem used by the RSX-11 OS, supported by older VMS systems for RSX compatibility, but never used to support VMS itself; it has been largely superseded by ODS-2 and ODS-5.
ODS-2 is the standard VMS filesystem, and remains the most common filesystem for system disks (the disk on which the operating system is installed).
Although seldom referred to by their ODS level designations, ODS-3 and ODS-4 are the Files-11 support for the CD-ROM ISO 9660 and High Sierra filesystems, respectively.
ODS-5 is an extended version of ODS-2 available on Alpha and IA-64 platforms which adds support for case-preserving filenames with non-ASCII characters and improvements to the hierarchical directory support. It was originally intended for file serving to Microsoft Windows or other non-VMS systems as part of the "NT affinity" project, but is also used on user disks and Internet servers.
Directory layout
All files and directories in a Files-11 filesystem are contained inside one or more parent directories, and eventually under the root directory, the master file directory (see below). The filesystem is therefore organised in a tree-like structure.
In this example (see right), File 2 has a directory entry under both Dir 2 and Dir 3; it is "in" both directories simultaneously. Even if removed from one, it would still exist in the other directory until removed from there also. This is similar to the concept of hard links in UNIX, although care must be taken that the file is not actually deleted on disks that are not set up for hard links (hard links are available only on ODS-5 disks, and then only if the disk has them enabled).
Disk organization and naming
An operational VMS system has access to one or more online disks, each of which contains a complete, independent filesystem. These are either local storage or, in the case of a cluster, storage shared with remote systems.
In an OpenVMS cluster configuration, non-private disks are shared between all nodes in the cluster (see figure 1). In this configuration, the two system disks are accessible to both nodes via the network, but the private disk is not shared: it is mounted for use only by a particular user or process on that machine. Access to files across a cluster is managed by the OpenVMS Distributed Lock Manager, an integral part of the filesystem.
Multiple disks can be combined to form a single large logical disk, or volume set. Disks can also be automatically replicated into shadow sets for data security or faster read performance.
A disk is identified by either its physical name or (more often) by a user-defined logical name. For example, the boot device (system disk) may have the physical name $3$DKA100, but it is generally referred to by the logical name SYS$SYSDEVICE.
Filesystems on each disk (with the exception of ODS-1) are hierarchical. A fully specified filename consists of a nodename, a username and password, a device name, directory, filename, file type, and a version number, in the format:
NODE"accountname password"::device:[directory.subdirectory]filename.type;ver
For example, [DIR1.DIR2.DIR3]FILE.EXT refers to the latest version of FILE.EXT, on the current default disk, in directory [DIR1.DIR2.DIR3]. DIR1 is a subdirectory of the master file directory (MFD), or root directory, and DIR2 is a subdirectory of DIR1. A disk's MFD is identified by [000000].
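The format above can be illustrated with a small parser. This is a minimal Python sketch, not the parsing logic VMS itself uses: the `FileSpec` type, the `parse_filespec` function and the regular expression are all hypothetical names introduced here, and the pattern only handles the simple ODS-2 style shown in the text (no caret escapes, wildcards or relative directory syntax).

```python
import re
from typing import List, NamedTuple, Optional


class FileSpec(NamedTuple):
    """Parts of a file specification; None means the part was omitted."""
    node: Optional[str]
    access: Optional[str]           # the quoted "accountname password" string
    device: Optional[str]
    directory: Optional[List[str]]  # e.g. ["DIR1", "DIR2", "DIR3"]
    name: Optional[str]
    type: Optional[str]
    version: Optional[int]


# Illustrative pattern for NODE"access"::device:[dir.subdir]name.type;ver
_SPEC = re.compile(
    r'''^
    (?:(?P<node>[^":\[\];]+?)(?:"(?P<access>[^"]*)")?::)?
    (?:(?P<device>[^:\[\];]+):)?
    (?:\[(?P<directory>[^\]]*)\])?
    (?P<name>[^.;\[\]:]*)
    (?:\.(?P<type>[^.;]*))?
    (?:;(?P<version>-?\d+))?
    $''',
    re.VERBOSE,
)


def parse_filespec(spec: str) -> FileSpec:
    """Split a file specification into its parts without applying defaults."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError(f"not a recognisable file specification: {spec!r}")
    d = m.groupdict()
    directory = d["directory"].split(".") if d["directory"] is not None else None
    version = int(d["version"]) if d["version"] is not None else None
    return FileSpec(d["node"], d["access"], d["device"], directory,
                    d["name"] or None, d["type"], version)


partial = parse_filespec("[DIR1.DIR2.DIR3]FILE.EXT")
print(partial.directory, partial.name, partial.type)
```

Parts that come back as `None` are the ones a real system would fill in from the default file specification.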
Most parts of the filename can be omitted, in which case they are taken from the current default file specification. The default file specification replaces the concept of "current directory" in other operating systems by providing a set of defaults for node, device name and directory. All processes have a default file specification which includes disk name and directory, and most VMS filesystem routines accept a default file specification which can also include the file type; the TYPE command, for example, defaults to ".LIS" as the file type, so the command TYPE F, with no extension, attempts to open the file F.LIS.
Every file has a version number, which defaults to 1 if no other versions of the same filename are present (otherwise one higher than the greatest version). Every time a file is saved, rather than overwriting the existing version, a new file with the same name but an incremented version number is created. Old versions can be deleted explicitly, with the DELETE or the PURGE command, or optionally, older versions of a file can be deleted automatically when the file's version limit is reached (set by SET FILE/VERSION_LIMIT). Old versions are thus not overwritten, but are kept on disk and may be retrieved at any time. The architectural limit on version numbers is 32767. The versioning behavior is easily overridden if it is unwanted. In particular, files which are directly updated, such as databases, do not create new versions unless explicitly programmed.
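The version-numbering rule can be modelled in a few lines. The following Python sketch is only an illustration of the behaviour described above; `next_version` and `apply_version_limit` are hypothetical helper names, and the assumption that a version limit keeps the highest-numbered versions is a simplification of how the limit is enforced on a real system.

```python
MAX_VERSION = 32767  # architectural limit on version numbers, as stated above


def next_version(existing_versions):
    """Version number a newly saved file would receive."""
    new = max(existing_versions, default=0) + 1
    if new > MAX_VERSION:
        raise OverflowError("version number limit reached")
    return new


def apply_version_limit(existing_versions, limit):
    """Return (kept, deleted), assuming the newest `limit` versions are kept."""
    ordered = sorted(existing_versions, reverse=True)
    return ordered[:limit], ordered[limit:]


versions = [1, 2, 5]
versions.append(next_version(versions))
kept, deleted = apply_version_limit(versions, limit=2)
print(kept, deleted)
```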
ODS-2 is limited to eight levels of subdirectories, and only uppercase, alphanumeric names (plus the underscore, dash, and dollar sign) up to 39.39 characters (39 for the filename and another 39 for the extension). ODS-5 expands the character set to lowercase letters and most other printable ASCII characters, as well as ISO Latin-1 and Unicode characters, increases the maximum filename length and allows unlimited levels of subdirectories. When constructing a pathname for an ODS-5 file which uses characters not allowed under ODS-2, a special "^" syntax is used to preserve backwards compatibility; the file "file.tar.gz;1" on an ODS-5 disk, for example, would be referred to as "file^.tar.gz": the file's name is "file.tar", and the extension is ".gz".
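As a rough illustration of these naming rules, the Python sketch below checks a name against the ODS-2 limits quoted above and reproduces the "^" escaping of the single example given. Both function names are hypothetical, and `caret_escape_dots` handles only extra dots in the name; the full ODS-5 escape syntax covers other characters that this sketch does not attempt.

```python
import re

# ODS-2 as described above: uppercase letters, digits, underscore, dash and
# dollar sign, with at most 39 characters in each of name and extension.
_ODS2_PART = re.compile(r"[A-Z0-9_$-]{0,39}")


def is_valid_ods2_name(name: str, ext: str) -> bool:
    """True if both the filename and the extension fit the ODS-2 rules."""
    return (_ODS2_PART.fullmatch(name) is not None
            and _ODS2_PART.fullmatch(ext) is not None)


def caret_escape_dots(filename: str) -> str:
    """Escape every dot except the last one, which separates the extension."""
    stem, sep, ext = filename.rpartition(".")
    if not sep:
        return filename
    return stem.replace(".", "^.") + "." + ext


print(is_valid_ods2_name("FILE", "EXT"))
print(caret_escape_dots("file.tar.gz"))
```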
File security: protection and ACLs
VMS file security is defined by two mechanisms, UIC-based access control and ACL-based access control. UIC access control is based on the owner of the file and the UIC, or user, accessing the file. Access is determined by four groups of permissions:
- System
- Owner
- Group
- World
And four permission bits:
- Read
- Write
- Execute
- Delete
The "system" access applies to any user whose UIC group code is less than or equal to the SYSGEN parameter MAXSYSGROUP (typically 8, or 10 octal), for example the SYSTEM user; "owner" and "group" apply to the owner of the file and that user's user group, and "world" applies to any other user. There is also a fifth permission bit, "Control", which is used to determine access to change file metadata such as protection. This bit cannot be set explicitly; it is always set for System and Owner, and never for Group or World.
UIC-based access control is also affected by four system privileges, which allow users holding them to override access controls:
- BYPASS: user implicitly has RWED access to all files, regardless of file protection;
- READALL: user implicitly has R access to all files;
- SYSPRV: user may access files based on System protection;
- GRPPRV: user may access files based on System protection if their UIC group matches the file's group.
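The interaction between the protection categories and the four privileges can be sketched as follows. This Python model is a simplification under stated assumptions: UICs are represented as (group, member) tuples, `FileProtection`, `categories` and `may_access` are hypothetical names, and the sketch assumes access is granted when any category the user falls into allows it. ACLs and the Control bit are ignored.

```python
from dataclasses import dataclass

MAXSYSGROUP = 0o10  # typical value quoted above (8, or 10 octal)


@dataclass
class FileProtection:
    owner_uic: tuple      # (group, member) of the file's owner
    system: str = "RWED"  # permission letters granted to each category
    owner: str = "RWED"
    group: str = "RE"
    world: str = ""


def categories(user_uic, prot, privileges=frozenset()):
    """Protection categories a user qualifies for (assumed to be cumulative)."""
    cats = {"world"}
    same_group = user_uic[0] == prot.owner_uic[0]
    if same_group:
        cats.add("group")
    if user_uic == prot.owner_uic:
        cats.add("owner")
    if user_uic[0] <= MAXSYSGROUP or "SYSPRV" in privileges:
        cats.add("system")
    if "GRPPRV" in privileges and same_group:
        cats.add("system")  # GRPPRV: System protection within the same group
    return cats


def may_access(user_uic, prot, access, privileges=frozenset()):
    """Check one access letter ('R', 'W', 'E' or 'D') for a user."""
    if access not in ("R", "W", "E", "D"):
        raise ValueError("access must be one of R, W, E, D")
    if "BYPASS" in privileges:
        return True
    if access == "R" and "READALL" in privileges:
        return True
    return any(access in getattr(prot, cat)
               for cat in categories(user_uic, prot, privileges))


prot = FileProtection(owner_uic=(0o200, 0o1))
print(may_access((0o200, 0o2), prot, "R"), may_access((0o200, 0o2), prot, "W"))
```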
ACLs allow additional privileges to be assigned on a user- or group-specific basis; for example, a web server's UIC could be granted read access to all files in a particular directory. ACLs can be marked as inherited, where a directory file's ACL applies to all files underneath it. ACLs are modified using the EDIT/ACL command, and take the form of identifier/access pairs. For example, the ACL entry
(IDENTIFIER=HTTP$SERVER,ACCESS=READ+EXECUTE)
would allow the user HTTP$SERVER to read and execute the file.
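The identifier/access structure of such an entry can be shown with a tiny parser. This Python sketch handles only the simple form of the example above; `parse_ace` is a hypothetical helper, and real ACL entries can carry further fields and options that it does not model.

```python
def parse_ace(entry: str) -> dict:
    """Parse an entry of the form (IDENTIFIER=name,ACCESS=type+type)."""
    body = entry.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError(f"entry must be parenthesised: {entry!r}")
    fields = {}
    for part in body[1:-1].split(","):
        key, _, value = part.partition("=")
        fields[key.strip().upper()] = value.strip()
    return {
        "identifier": fields["IDENTIFIER"],
        # Access types are joined with '+', e.g. READ+EXECUTE
        "access": set(fields["ACCESS"].upper().split("+")),
    }


ace = parse_ace("(IDENTIFIER=HTTP$SERVER,ACCESS=READ+EXECUTE)")
print(ace["identifier"], sorted(ace["access"]))
```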
Logical names
A logical name is a system variable which may reference a disk, directory or file, or contain other program-specific information. For example, the logical SYS$SYSDEVICE contains the system's boot device. A logical name normally refers to a single directory or disk, e.g. SYS$LOGIN:, which is the user's login (home) directory (or directories); these logicals cannot be used as true disk names, so SYS$LOGIN:[DIR]FILE is not a valid file specification. However, concealed logical names, defined by DEFINE/TRANSLATION=CONCEALED, can be used in that way; these rooted directories are defined with a trailing "." on the directory specification, hence
$ DEFINE/TRANS=CONCEAL HOME DISK$USERS:[username.]
would allow HOME:[DIR]FILE to be used. More common are simple logicals which point to specific directories associated with some application software, which may be located on any disk and in any directory. Hence the logical ABC_EXE may point to a directory of executable programs for application ABC, and ABC_TEMP may point to a directory of temporary files for that same application; this directory may be on the same disk and in the same directory tree as ABC_EXE, or could be somewhere on another disk (and in a different directory tree).
Logical names do not have a close equivalent in POSIX operating systems. They resemble Unix environment variables, except they are expanded by the filesystem, instead of the command shell or application program. They must be defined before use, so it is common for many logical names to be defined in the system startup command file, as well as user login command files.
The closest non-VMS operating system to support the concept of logical names is AmigaOS, through the ASSIGN command. Indeed, AmigaOS's disk operating system, AmigaDOS, seems to derive much from VMS, implying that TRIPOS (of which AmigaDOS is a port) was itself strongly inspired by VMS. For example, physical device names follow a pattern like DF0: for the first floppy disk, CDROM2: for the third CD-ROM drive, etc. However, since the system can boot from any attached drive, the operating system creates the SYS: assignment to automatically reference the boot device used. Other assignments, LIBS:, PREFS:, C:, S:, et al. are also made, themselves referenced off SYS:. Users are, of course, allowed to create and destroy their own assignments too.
Logical names may reference other logical names (up to a predefined nesting limit of 10), and may contain lists of names to search for an existing filename. Some frequently referenced logical names are:
| logical name | meaning |
| SYS$INPUT | equivalent of standard input, program data source |
| SYS$OUTPUT | equivalent of standard output, program data destination |
| SYS$ERROR | equivalent of standard error, program error message destination |
| SYS$COMMAND | source of batch file (that is, .COM command file) commands |
| TT | the terminal associated with the process |
| SYS$PRINT | the default printer or print queue |
| SYS$LOGIN | home directory for each user |
| SYS$SCRATCH | temporary folder, directory for temporary files |
| SYS$SYSTEM | directory containing most system programs and a few vital data files, such as the system authorization file (accounts and passwords) |
| SYS$SHARE | shared runtime libraries, executables, etc. |
| SYS$LIBRARY | system and added libraries |
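Nested translation and search lists can be sketched with a simple recursive lookup. The Python below is a toy model, not how VMS performs translation: the `translate` function, the flat dictionary used as a logical name table, and the example names DISK$APPS and SRC are all assumptions made for illustration, and real systems consult several tables (process, job, group, system) that are ignored here.

```python
MAX_NESTING = 10  # nesting limit quoted above


def translate(spec, table, depth=0):
    """Expand the leading logical name of `spec`, following nested names.

    `table` maps a logical name to a list of equivalence strings; a list with
    several entries models a search list. Returns every candidate
    specification, in search order.
    """
    head, _, rest = spec.partition(":")
    if head not in table:
        return [spec]  # nothing (more) to translate
    if depth >= MAX_NESTING:
        raise ValueError(f"logical name nesting exceeds {MAX_NESTING} levels")
    candidates = []
    for equivalent in table[head]:
        candidates.extend(translate(equivalent + rest, table, depth + 1))
    return candidates


table = {
    "ABC_EXE": ["DISK$APPS:[ABC.EXE]"],
    "DISK$APPS": ["$3$DKA100:"],
    "SRC": ["DISK$APPS:[V2]", "DISK$APPS:[V1]"],  # a search list
}
print(translate("ABC_EXE:PROG.EXE", table))
print(translate("SRC:MAIN.C", table))
```

A caller looking for an existing file would try each returned candidate in turn and use the first one that exists.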
Record-oriented I/O: Record Management Services
Record Management Services (RMS) is the structured I/O layer of the VMS operating system. RMS provides comprehensive program support for managing structured files, such as record-based and indexed database files. The VMS filesystem, in conjunction with RMS, extends file access past simple byte-streams and allows OS-level support for a variety of rich file types. Each file in the VMS filesystem may be thought of as a database, containing a series of records, each of which has one or more individual fields. A text file, for example, is a list of records (lines) separated by a newline character. RMS is an example of a record-oriented filesystem.
There are four record formats defined by RMS:
- Fixed length - all records in the file have the same length.
- Variable length - records vary in length, and every record is prefixed by a count field giving its length.
- Variable record length with fixed-length control - records vary in length, but are preceded by a fixed-length control block.
- Stream - records vary in length, and every record is separated from the next one by a termination character. A text file is an example of a stream-format file using line feed or carriage return to separate records.
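The difference between three of these formats shows up clearly when splitting a byte buffer into records. The Python sketch below is a generic illustration and not the exact RMS on-disk layout: the function names are hypothetical, and the 2-byte little-endian count used for the variable-length case is an assumption of this example (padding and control blocks are not modelled).

```python
import struct


def read_fixed(data: bytes, record_length: int):
    """Fixed length: cut the buffer into equal-sized records."""
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]


def read_counted(data: bytes):
    """Variable length: each record follows an (assumed) 2-byte count."""
    records, pos = [], 0
    while pos + 2 <= len(data):
        (length,) = struct.unpack_from("<H", data, pos)
        pos += 2
        records.append(data[pos:pos + length])
        pos += length
    return records


def read_stream(data: bytes, terminator: bytes = b"\n"):
    """Stream: records are delimited by a terminator character."""
    parts = data.split(terminator)
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final terminator
    return parts


print(read_fixed(b"AAAABBBB", 4))
print(read_counted(b"\x03\x00abc\x01\x00z"))
print(read_stream(b"one\ntwo\n"))
```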
There are four record access methods, or methods to retrieve extant records from files:
- Sequential Access - starting with a particular record, subsequent records are retrieved in order until the end of the file.
- Relative Record Number Access - records are retrieved via a record number relative to the beginning of the file.
- Record File Address Access - records are retrieved directly by their location in the file (RFA, or Record File Address).
- Indexed Access - records are retrieved via a key, in a form of key-value mapping.
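The four access methods can be contrasted on a toy in-memory file. This Python sketch is purely illustrative: `RecordFile` is a hypothetical class, a record's list position stands in for its RFA, and relative record numbers are assumed to start at 1.

```python
class RecordFile:
    """Toy in-memory file; a record's list position stands in for its RFA."""

    def __init__(self, records, key_of):
        self.records = list(records)
        # Index built once, mapping each record's key to its position.
        self.key_index = {key_of(r): pos for pos, r in enumerate(self.records)}

    def sequential(self, start_pos=0):
        """Sequential access: yield records in order from a starting point."""
        yield from self.records[start_pos:]

    def by_relative_number(self, number):
        """Relative record number access (numbers assumed to start at 1)."""
        if number < 1:
            raise IndexError("relative record numbers start at 1")
        return self.records[number - 1]

    def by_rfa(self, pos):
        """Record file address access: fetch directly by stored location."""
        return self.records[pos]

    def by_key(self, key):
        """Indexed access: look the record up through the key index."""
        return self.records[self.key_index[key]]


f = RecordFile([("ann", 1), ("bob", 2), ("cy", 3)], key_of=lambda r: r[0])
print(list(f.sequential(1)), f.by_relative_number(1), f.by_key("bob"))
```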
Physical layout: the On-Disk Structure
At the disk level, ODS represents the filesystem as an array of blocks, a block being 512 contiguous bytes on one physical disk (volume). Disk blocks are assigned in clusters (originally 3 contiguous blocks, but later increased with larger disk sizes). A file on the disk will ideally be entirely contiguous, i.e. the blocks which contain the file will be sequential, but disk fragmentation will sometimes require the file to be located in discontiguous clusters, in which case the fragments are called 'extents'. Disks may be combined with other disks to form a volume set, with files stored anywhere across that set of disks, but larger disk sizes have reduced the use of volume sets because management of a single physical disk is simpler.
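The effect of cluster-based allocation on disk usage is simple arithmetic. The Python sketch below uses the 512-byte block and the original 3-block cluster from the text; the function names are hypothetical, and the calculation ignores file headers and other metadata.

```python
BLOCK_SIZE = 512  # bytes per block, as stated above


def blocks_needed(n_bytes: int) -> int:
    """Number of 512-byte blocks required to hold n_bytes (rounded up)."""
    return -(-n_bytes // BLOCK_SIZE)


def allocated_blocks(n_bytes: int, cluster_size: int = 3) -> int:
    """Blocks actually reserved when space is handed out in whole clusters."""
    clusters = -(-blocks_needed(n_bytes) // cluster_size)
    return clusters * cluster_size


for size in (1, 513, 2000):
    print(size, blocks_needed(size), allocated_blocks(size))
```

With larger cluster sizes, small files waste proportionally more space, which is the usual trade-off against a smaller allocation bitmap.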
Every file on a Files-11 disk (or volume set) has a unique file identification (FID), composed of three numbers: the file number (NUM), the file sequence number (SEQ), and the relative volume number (RVN). The NUM indicates where in the INDEXF.SYS file (see below) the metadata for the file is located; the SEQ is a generation number which is incremented when the file is deleted and another file is created reusing the same INDEXF.SYS entry (so any dangling references to the old file do not accidentally point to the new one); and the RVN indicates the volume number on which the file is stored when using a volume set.
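The role of the sequence number in catching stale references can be demonstrated with a small model. This Python sketch is an assumption-laden illustration: `FID` and `IndexFile` are hypothetical names, the index is a plain dictionary rather than the real INDEXF.SYS layout, and numbering from 1 is chosen only for the example.

```python
from typing import NamedTuple


class FID(NamedTuple):
    num: int  # slot in the index file
    seq: int  # generation of that slot
    rvn: int  # relative volume number


class IndexFile:
    """Toy index: slot number -> (sequence number, metadata or None if free)."""

    def __init__(self):
        self._entries = {}

    def create(self, metadata, rvn=0):
        # Reuse a freed slot if one exists, bumping its sequence number.
        for num, (seq, meta) in self._entries.items():
            if meta is None:
                self._entries[num] = (seq + 1, metadata)
                return FID(num, seq + 1, rvn)
        num = len(self._entries) + 1
        self._entries[num] = (1, metadata)
        return FID(num, 1, rvn)

    def lookup(self, fid):
        seq, meta = self._entries.get(fid.num, (None, None))
        if meta is None or seq != fid.seq:
            raise KeyError("stale or unknown FID")
        return meta

    def delete(self, fid):
        self.lookup(fid)  # refuse to delete through a stale FID
        self._entries[fid.num] = (fid.seq, None)


index = IndexFile()
old = index.create("first file")
index.delete(old)
new = index.create("second file")
print(old, new)
```

Here `old` and `new` share the same NUM but differ in SEQ, so a directory entry still holding `old` fails the lookup instead of silently resolving to the new file.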
Directories
The structural support of an ODS volume is provided by a directory file, a special file containing a list of file names, file version numbers and their associated FIDs, similar to VSAM catalogs on MVS. At the root of the directory structure is the master file directory (MFD), the root directory which contains (directly or indirectly) every file on the volume.
This diagram shows an example directory containing 3 files, and the way each filename is mapped to the INDEXF.SYS entry (each INDEXF entry contains more information; only the first few items are shown here).
On ODS-2 and later volumes, the layout of directories under the MFD is free-form, subject to a limit on the nesting of directories (8 levels on ODS-2 and unlimited on ODS-5). On multi-volume sets, the MFD is always stored on the first volume, and contains the subdirectories of all volumes.
Note that the filesystem implementation itself does not refer to these files by name, but by their file IDs, which always have the same values. Thus, INDEXF.SYS is always the file with NUM = 1 and SEQ = 1.
The index file contains the most basic information about a Files-11 volume set.
There are two organizations of INDEXF.SYS: the traditional organization, and the organization used on disks with GPT.SYS, which carry GUID Partition Table (GPT) structures.
secondary home blocks, to allow recovery of the volume if it is lost or damaged.
On disks with GPT.SYS, GPT.SYS contains the equivalent of the boot block (known as the Master Boot Record (MBR)), and there is no primary home block. All home blocks present on a GPT-based disk are alternate home blocks. These structures are not included in INDEXF.SYS, and the blocks of the INDEXF.SYS file are unused.