8.3
Encyclopedia
An 8.3 filename is a filename
Filename
The filename is metadata about a file; a string used to uniquely identify a file stored on the file system. Different file systems impose different restrictions on length and allowed characters on filenames.A filename includes one or more of these components:...

 convention used by old versions of DOS
DOS
DOS, short for "Disk Operating System", is an acronym for several closely related operating systems that dominated the IBM PC compatible market between 1981 and 1995, or until about 2000 if one includes the partially DOS-based Microsoft Windows versions 95, 98, and Millennium Edition.Related...

, versions of Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

 prior to Windows 95
Windows 95
Windows 95 is a consumer-oriented graphical user interface-based operating system. It was released on August 24, 1995 by Microsoft, and was a significant progression from the company's previous Windows products...

, and Windows NT 3.51
Windows NT 3.51
Windows NT 3.51 is the third release of Microsoft's Windows NT line of operating systems. It was released on 30 May 1995, nine months after Windows NT 3.5. The release provided two notable feature improvements; firstly NT 3.51 was the first of a short-lived outing of Microsoft Windows on the...

. It is also used in modern Microsoft operating systems as an alternate filename to the long filename
Long filename
Long filenames , are Microsoft's way of implementing filenames longer than the 8.3 filename, or short-filename, naming scheme used in Microsoft DOS in their modern FAT and NTFS filesystems. Because these filenames can be longer than an 8.3 filename, they can be more descriptive...

 for compatibility with legacy
Legacy system
A legacy system is an old method, technology, computer system, or application program that continues to be used, typically because it still functions for the users' needs, even though newer technology or more efficient methods of performing a task are now available...

 programs. The filename convention is limited by the FAT
File Allocation Table
File Allocation Table is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of...

 file system
File system
A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...

. Similar 8.3 file naming schemes have also existed on earlier CP/M
CP/M
CP/M was a mass-market operating system created for Intel 8080/85 based microcomputers by Gary Kildall of Digital Research, Inc...

, Atari
Atari DOS
Atari DOS is the disk operating system used with the Atari 8-bit family of computers. Operating system extensions loaded into memory were required in order for an Atari computer to access a disk drive. These extensions to the operating system added the disk handler and other file management...

, and some Data General
Data General
Data General was one of the first minicomputer firms from the late 1960s. Three of the four founders were former employees of Digital Equipment Corporation. Their first product, the Data General Nova, was a 16-bit minicomputer...

 and Digital Equipment Corporation
Digital Equipment Corporation
Digital Equipment Corporation was a major American company in the computer industry and a leading vendor of computer systems, software and peripherals from the 1960s to the 1990s...

 minicomputer
Minicomputer
A minicomputer is a class of multi-user computers that lies in the middle range of the computing spectrum, in between the largest multi-user systems and the smallest single-user systems...

 operating systems.

Overview

8.3 filenames have at most eight characters, optionally followed by a period
Full stop
A full stop is the punctuation mark commonly placed at the end of sentences. In American English, the term used for this punctuation is period. In the 21st century, it is often also called a dot by young people...

 "." and a filename extension
Filename extension
A filename extension is a suffix to the name of a computer file applied to indicate the encoding of its contents or usage....

 of at most three characters. For files with no extension, the "." if present has no significance (that is "myfile" and "myfile." are equivalent). File and directory
Directory (file systems)
In computing, a folder, directory, catalog, or drawer, is a virtual container originally derived from an earlier Object-oriented programming concept by the same name within a digital file system, in which groups of computer files and other folders can be kept and organized.A typical file system may...

 names are uppercase, although systems that use the 8.3 standard are usually case-insensitive.

VFAT, a variant of FAT with an extended directory format, was introduced in Windows 95
Windows 95
Windows 95 is a consumer-oriented graphical user interface-based operating system. It was released on August 24, 1995 by Microsoft, and was a significant progression from the company's previous Windows products...

 and Windows NT
Windows NT
Windows NT is a family of operating systems produced by Microsoft, the first version of which was released in July 1993. It was a powerful high-level-language-based, processor-independent, multiprocessing, multiuser operating system with features comparable to Unix. It was intended to complement...

 3.5. It allowed mixed-case Unicode
Unicode
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems...

 long filename
Long filename
Long filenames , are Microsoft's way of implementing filenames longer than the 8.3 filename, or short-filename, naming scheme used in Microsoft DOS in their modern FAT and NTFS filesystems. Because these filenames can be longer than an 8.3 filename, they can be more descriptive...

s (LFNs) in addition to classic 8.3 names.

To maintain backward-compatibility with legacy applications (on DOS
DOS
DOS, short for "Disk Operating System", is an acronym for several closely related operating systems that dominated the IBM PC compatible market between 1981 and 1995, or until about 2000 if one includes the partially DOS-based Microsoft Windows versions 95, 98, and Millennium Edition.Related...

 and Windows 3.1), an 8.3 filename is automatically generated for every LFN, through which the file can still be renamed, deleted or opened. The 8.3 filename can be obtained using the Kernel32.dll function GetShortPathName.

Although there is no compulsory algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

 for creating the 8.3 name from an LFN, Windows uses the following convention:
  1. If the LFN is 8.3 uppercase, no LFN will be stored on disk at all.
    • Example: "TEXTFILE.TXT"
  2. If the LFN is 8.3 mixed case, the LFN will store the mixed-case name, while the 8.3 name will be an uppercased version of it.
    • Example: "TextFile.Txt" becomes "TEXTFILE.TXT".
  3. If the filename contains characters not allowed in an 8.3 name (including space which was disallowed by convention though not by the APIs) or either part is too long, the name is stripped of invalid characters such as spaces and extra periods. Other characters such as "+"
    Plus and minus signs
    The plus and minus signs are mathematical symbols used to represent the notions of positive and negative as well as the operations of addition and subtraction. Their use has been extended to many other meanings, more or less analogous...

     are changed to the underscore
    Underscore
    The underscore [ _ ] is a character that originally appeared on the typewriter and was primarily used to underline words...

     "_", and uppercased. The stripped name is then truncated to the first 6 letters of its basename
    Basename
    basename is a standard UNIX computer program. When basename is given a pathname, it will delete any prefix up to the last slash character and return the result...

    , followed by a tilde
    Tilde
    The tilde is a grapheme with several uses. The name of the character comes from Portuguese and Spanish, from the Latin titulus meaning "title" or "superscription", though the term "tilde" has evolved and now has a different meaning in linguistics....

    , followed by a single digit
    Numerical digit
    A digit is a symbol used in combinations to represent numbers in positional numeral systems. The name "digit" comes from the fact that the 10 digits of the hands correspond to the 10 symbols of the common base 10 number system, i.e...

    , followed by a period ".", followed by the first 3 characters of the extension.
    • Example: "TextFile1.Mine.txt" becomes "TEXTFI~1.TXT" (or "TEXTFI~2.TXT", should "TEXTFI~1.TXT" already exist). "ver +1.2.text" becomes "VER_12~1.TEX".
  4. Beginning with Windows 2000, if at least 4 files or folders already exist with the same initial 6 characters in their short names, the stripped LFN is instead truncated to the first 2 letters of the basename (or 1 if the basename has only 1 letter), followed by 4 hexadecimal
    Hexadecimal
    In mathematics and computer science, hexadecimal is a positional numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0–9 to represent values zero to nine, and A, B, C, D, E, F to represent values ten to fifteen...

     digits derived from an undocumented hash of the filename, followed by a tilde, followed by a single digit, followed by a period ".", followed by the first 3 characters of the extension.
    • Example: "TextFile.Mine.txt" becomes "TE021F~1.TXT".


NTFS
NTFS
NTFS is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7....

, a file system used by the Windows NT
Windows NT
Windows NT is a family of operating systems produced by Microsoft, the first version of which was released in July 1993. It was a powerful high-level-language-based, processor-independent, multiprocessing, multiuser operating system with features comparable to Unix. It was intended to complement...

 family, supports LFNs natively, but 8.3 names are still available for legacy applications. This can be optionally disabled to increase performance in situations where large numbers of similarly named files exist in the same folder.

The ISO 9660
ISO 9660
ISO 9660, also referred to as CDFS by some hardware and software providers, is a file system standard published by the International Organization for Standardization for optical disc media....

 file system (mainly used on compact disc
Compact Disc
The Compact Disc is an optical disc used to store digital data. It was originally developed to store and playback sound recordings exclusively, but later expanded to encompass data storage , write-once audio and data storage , rewritable media , Video Compact Discs , Super Video Compact Discs ,...

s) has similar limitations at the most basic Level 1, with the additional restriction that directory names cannot contain extensions and that some characters (notably hyphen
Hyphen
The hyphen is a punctuation mark used to join words and to separate syllables of a single word. The use of hyphens is called hyphenation. The hyphen should not be confused with dashes , which are longer and have different uses, or with the minus sign which is also longer...

s) are not allowed in filenames. Level 2 allows filenames of up to 31 characters, more compatible with Mac OS
Mac OS
Mac OS is a series of graphical user interface-based operating systems developed by Apple Inc. for their Macintosh line of computer systems. The Macintosh user experience is credited with popularizing the graphical user interface...

 filenames.

During the Microsoft antitrust trials
United States v. Microsoft
United States v. Microsoft was a set of civil actions filed against Microsoft Corporation pursuant to the Sherman Antitrust Act of 1890 Section 1 and 2 on May 8, 1998 by the United States Department of Justice and 20 U.S. states. Joel I. Klein was the lead prosecutor...

, the names MICROS~1 and MICROS~2 were humorously used to refer to the companies that might exist after a proposed split of Microsoft.

Compatibility

This legacy technology is used in a wide range of products and devices, as a standard for interchanging information, such as compact flash cards used in cameras. VFAT LFN Long filenames introduced by Windows 95/98/ME retained compatibility. But the VFAT LFN used on NT-based systems (Windows NT/2K/XP) uses a modified 8.3 shortname.

If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice-versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions such as XP. Instead, two bits in byte 0x0c of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support this. This creates a backwards-compatibility problem with older Windows versions (95, 98, ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.

Directory table

A directory table is a special type of file that represents a directory. Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive
Archive bit
The archive bit is a file attribute used by Microsoft operating systems and by OS/2. Typically it's state indicates whether or not the file has been backed up....

, directory, hidden, read-only, system and volume), the date and time of creation, the address of the first cluster of the file/directory's data and finally the size of the file/directory.

Legal characters for DOS filenames include the following:
  • Upper case letters AZ
  • Numbers 09
  • Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the filename, also filenames with spaces in them could not be used on the DOS command line because it lacked a suitable escaping system)
  • ! # $ % & ' - @ ^ _ ` { } ~
  • (FAT-32 only) + , . ; = [ ]
  • Values 128–255


This excludes the following ASCII
ASCII
The American Standard Code for Information Interchange is a character-encoding scheme based on the ordering of the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that use text...

 characters:
  • " * / : < > ? \ |
    Windows/MSDOS has no shell escape character
  • Lower case letters az
    stored as AZ on FAT-12/16
  • Control characters 0–31
  • Value 127 (DEL)


Code 0xE5 as the first byte (see below) makes troubles when Cyrillic KOI8 encoding is used, because it corresponds to Cyrillic capital letter "Е". Some operating systems such as ANDOS
ANDOS
ANDOS is a Russian operating system for Electronika BK-0010, Electronika BK-0011 and Electronika BK-0011M series computers. It was created in 1990 and first released in 1992. Initially it was developed by Alexey Nadezhin and later also by Sergey Kamnev, who joined the project...

 used to automatically change the letter to the similar-looking Latin one.

The DOS filenames are in the OEM character set
Code page
Code page is another term for character encoding. It consists of a table of values that describes the character set for a particular language. The term code page originated from IBM's EBCDIC-based mainframe systems, but many vendors use this term including Microsoft, SAP, and Oracle Corporation...

.

Directory entries, both in the Root Directory Region and in subdirectories, are of the following format:
Byte Offset Length Description
0x00 8 DOS filename (padded with spaces)

The first byte can have the following special values:
0x00 Entry is available and no subsequent entry is in use
0x05 Initial character is actually 0xE5
0x2E 'Dot' entry; either '.' or '..'
0xE5 Entry has been previously erased and is not available. File undelete utilities must replace this character with a regular character as part of the undeletion process.
0x08 3 DOS file extension (padded with spaces)
0x0b 1 File Attributes
The first byte can have the following special values:
Bit Mask Description
0 0x01 Read Only
1 0x02 Hidden
2 0x04 System
3 0x08 Volume Label
4 0x10 Subdirectory
5 0x20 Archive
Archive bit
The archive bit is a file attribute used by Microsoft operating systems and by OS/2. Typically it's state indicates whether or not the file has been backed up....

6 0x40 Device (internal use only, never found on disk)
7 0x80 Unused

An attribute value of 0x0F is used to designate a long filename entry.
0x0c 1 Reserved; two bits are used by NT and later versions to encode case information
0x0d 1 Create time, fine resolution: 10ms units, values from 0 to 199.
0x0e 2 Create time. The hour, minute and second are encoded according to the following bitmap:
Bits Description
15-11 Hours (0-23)
10-5 Minutes (0-59)
4-0 Seconds/2 (0-29)

Note that the seconds is recorded only to a 2 second resolution. Finer resolution for file creation is found at offset 0x0d.
0x10 2 Create date. The year, month and day are encoded according to the following bitmap:
Bits Description
15-9 Year (0 = 1980, 127 = 2107)
8-5 Month (1 = January, 12 = December)
4-0 Day (1 - 31)
0x12 2 Last access date; see offset 0x10 for description.
0x14 2 EA-Index (used by OS/2
OS/2
OS/2 is a computer operating system, initially created by Microsoft and IBM, then later developed by IBM exclusively. The name stands for "Operating System/2," because it was introduced as part of the same generation change release as IBM's "Personal System/2 " line of second-generation personal...

 and NT) in FAT12 and FAT16, High 2 bytes of first cluster number in FAT32
0x16 2 Last modified time; see offset 0x0e for description.
0x18 2 Last modified date; see offset 0x10 for description.
0x1a 2 First cluster in FAT12 and FAT16. Low 2 bytes of first cluster in FAT32.
0x1c 4 File size

How to convert a long filename to a short filename

Sometimes it may be desirable to convert a long filename to a short filename, for example when working with the command prompt. A few simple rules can be followed to attain the correct 8.3 filename.

1. A SFN filename can have at most 8 characters before the dot. If it has more than that, you should write the first 6, then put a tilde ~ as the seventh character and a number (usually 1) as the eighth. The number distinguishes it from other files with both the same first six letters and the same extension.

2. Dots are important and must be used even for folder names (if there is a dot in the folder name). If there are multiple dots in the long file/directory name, only the last one is used. The preceding dots should be ignored. If there are more characters than three after the final dot, only the first three are used.

3. Generally:
  • Any spaces in the filenames should be ignored when converting to SFN.
  • Ignore all dots except the last one. Leave out the other dots, just like the spaces.
  • Commas and square brackets are changed to underscores.
  • Case is not important, upper case and lower case characters are treated equally.


To find out for sure the SFN or 8.3 names of the files in a directory

use: "dir /x" shows the short names if there is one, and the long names.

or : "dir /-n" shows only the short names, in the original DIR listing format.

See also

  • Long filename
    Long filename
    Long filenames , are Microsoft's way of implementing filenames longer than the 8.3 filename, or short-filename, naming scheme used in Microsoft DOS in their modern FAT and NTFS filesystems. Because these filenames can be longer than an 8.3 filename, they can be more descriptive...

  • File Allocation Table
    File Allocation Table
    File Allocation Table is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of...

     - FAT
  • File system
    File system
    A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...

  • filename extension
    Filename extension
    A filename extension is a suffix to the name of a computer file applied to indicate the encoding of its contents or usage....

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK