PaX
PaX is a patch for the Linux kernel that implements least privilege protections for memory pages. The least-privilege approach allows computer programs to do only what they have to do in order to be able to execute properly, and nothing more. PaX was first released in 2000.

PaX flags data memory as non-executable and program memory as non-writable, and randomly arranges the program memory. This effectively prevents many security exploits, such as some kinds of buffer overflows. The former prevents direct code execution absolutely, while the latter makes so-called return-to-libc (ret2libc) attacks difficult to exploit, relying on luck to succeed, but does not prevent the overwriting of variables and pointers.

PaX is maintained by The PaX Team, whose principal coder is anonymous.

Significance

Many, and perhaps even most, computer insecurities are due to errors in programs that make it possible to alter the function of the program, effectively allowing a program to be "rewritten" while running. The first 44 Ubuntu Security Notices can be categorized to show that 41% of vulnerabilities stem from buffer overflows, 11% from integer overflows, and 16% from other bad handling of malformed data. These types of bugs often open the possibility to inject and execute foreign code, or execute existing code out of order, and make up 61% of the sample group, discarding overlap. This analysis is very crude; a more comprehensive analysis of individual vulnerabilities would be likely to give very different numbers.

Many worms, viruses, and attempts to take over a machine rely on changing the contents of memory so that the malware code is executed; or on executing "data" contents by misdirection. If execution of such malware could be blocked, it could do little or even no damage even after being installed on a computer; many, such as the Sasser worm, could be prevented from being installed at all.

PaX was designed to do just that for a large number of possible attacks, and to do so in a very generally applicable way. It prevents execution of improper code by controlling access to memory (read, write, or execute access; or combinations thereof) and is designed to do so without interfering with execution of proper code. At the cost of a small amount of overhead, PaX reduces many security exploits to a denial of service (DoS) or a remote code-flow control; exploits which would normally give attackers root access, allow access to important information on a hard drive, or cause other damage will instead cause the affected program or process to crash with little effect on the rest of the system.

A DoS attack (or its equivalent) is generally an annoyance, and may in some situations cause loss of time or resources (e.g. lost sales for a business whose website is affected); however, no data should be compromised when PaX intervenes, as no information will be improperly copied elsewhere. Nevertheless, the equivalent of a DoS attack is in some environments unacceptable; some businesses have service level agreements or other conditions which make successful intruder entry a less costly problem than loss of or reduction in service. The PaX approach is thus not well suited to all circumstances; however, in many cases, it is an acceptable method of protecting confidential information by preventing successful security breaches.

Many, but not all, programming bugs cause memory corruption. Of those that do, and are triggerable by intent, some will make it possible to induce the program to do various things it was not meant to, such as give a high-privileged shell. The focus of PaX is not on the finding and fixing of such bugs, but rather on prevention and containment of exploit techniques which may stem from such programmer error. A subset of these bugs will be reduced in severity; programs terminate, rather than improperly provide service.

PaX does not directly prevent buffer overflows; instead, it effectively prevents many of these and related programming bugs from being used to gain unauthorized entry into a computer system. Other systems such as Stack-Smashing Protector and StackGuard do attempt to directly detect buffer overflows, and kill the offending program when identified; this approach is called stack-smashing protection, and attempts to block such attacks before they can be made. PaX's more general approach, on the other hand, prevents damage after the attempt begins. Although both approaches can achieve some of the same goals, they are not entirely redundant. Therefore, employing both will, in principle, make an operating system more secure. Some Linux distributions already combine PaX with stack-smashing protection.

As of mid 2004, PaX has not been submitted for the mainline kernel tree because The PaX Team does not think it yet appropriate; although PaX is fully functional on many CPU architectures, including the popular x86 architecture used by most, it still remains partially or fully unimplemented on some architectures. Those that PaX is effective on include IA-32 (x86), AMD64, IA-64, Alpha, PA-RISC, 32-bit and 64-bit MIPS, PowerPC, and SPARC architectures.

Limitations

PaX cannot block fundamental design flaws in either executable programs or in the kernel that allow an exploit to abuse supplied services, as these are in principle undetectable. For example, a script engine which allows file and network access may allow malicious scripts to steal confidential data through privileged users' accounts. PaX also cannot block some format string bug based attacks, which may allow arbitrary reading from and writing to data locations in memory using already existing code; the attacker does not need to know any internal addresses or inject any code into a program to execute these types of attacks.

The PaX documentation http://pax.grsecurity.net/docs/pax.txt, maintained on the PaX Web site, describes three classes of attacks which PaX attempts to protect against. The documentation discusses both attacks for which PaX will be effective in protecting a system and those for which it will not. All assume a full, position independent executable base with full Executable Space Protections and full Address Space Layout Randomization. Briefly, then, blockable attacks are:
  1. Those which introduce and execute arbitrary code. These types of attacks frequently involve shellcode.
  2. Those which attempt to execute existing program code out of the original order intended by the computer programmer(s). This is commonly called a return-to-libc attack, or ret2libc for short.
  3. Those which attempt to execute existing program code in the intended order with arbitrary data. This issue existed in zlib versions before 1.1.4; a corrupt compressed stream could cause a double-free.


Because PaX is aimed at preventing damage from such attacks rather than finding and fixing the bugs that permit them, it is not yet possible to prevent all attacks; indeed, preventing all attacks is impossible.

The first class of attacks is still possible with 100% reliability in spite of using PaX if the attacker does not need advance knowledge of addresses in the attacked task.

The second and third classes of attacks are also possible with 100% reliability, if the attacker needs advance knowledge of address space layout and can derive this knowledge by reading the attacked task's address space. This is possible if the target has a bug which leaks information, e.g., if the attacker has access to /proc/(pid)/maps. There is an obscurity patch which NULLs out the values for the address ranges and inodes in every information source accessible from userland to close most of these holes; however, it is not currently included in PaX.

The second and third classes of attacks are possible with a small probability if the attacker needs advance knowledge of address space layout, but cannot derive this knowledge without resorting to guessing or to a brute force search. The ASLR documentation http://pax.grsecurity.net/docs/aslr.txt describes how one can further quantify the "small probability" these attacks have of success.

The first class of attacks is possible if the attacker can have the attacked task create, write to, and mmap a file. This in turn requires the second attack method to be possible, so an analysis of that applies here as well. Although not part of PaX, it is recommended, among other things, that production systems use an access control system that prevents this type of attack.

Responsible system administration is still required even on PaXified systems. PaX prevents or blocks attacks which exploit memory corruption bugs, such as those leading to shellcode and ret2libc attacks. Most attacks that PaX can prevent are related to buffer overflow bugs. This group includes the most common schemes used to exploit memory management problems. Still, PaX cannot prevent all such attacks.

What PaX offers

PaX offers executable space protection, using (or emulating in operating system software) the functionality of an NX bit (i.e., built-in CPU/MMU support for memory contents execution privilege tagging). It also provides address space layout randomization to defeat ret2libc attacks and all other attacks relying on known structure of a program's virtual memory.

Executable space protections

The major feature of PaX is the executable space protection it offers. These protections take advantage of the NX bit on certain processors to prevent the execution of arbitrary code. This staves off attacks involving code injection or shellcode. On IA-32 CPUs where there is no NX bit, PaX can emulate the functionality of one in various ways.

Many operating systems, Linux included, take advantage of existing NX functionality in hardware to apply proper restrictions to memory. Fig. 1 shows a simple set of memory segments in a program with one loaded library; green segments are data and blue are code. In normal cases, the address space on AMD64 and other such processors will by default look more like Fig. 1, with clearly defined data and code. Unfortunately, Linux by default does not prohibit an application from changing any of its memory protections; any program may create data-code confusion, marking areas of code as writable and areas of data as executable. PaX prevents such changes, as well as guaranteeing the most restrictive default set suitable for typical operation.

When the Executable Space Protections are enabled, including the mprotect restrictions, PaX guarantees that no memory mappings will be marked in any way in which they may be executed as program code after it has been possible to alter them from their original state. The effect of this is that it becomes impossible to execute memory during and after it has been possible to write to it, until that memory is destroyed; and thus, that code cannot be injected into the application, malicious or otherwise, from an internal or external source.

The fact that programs cannot themselves execute data they originated as program code poses an impassable problem for applications that need to generate code at runtime as a basic function, such as just-in-time compilers for Java; however, most programs that have difficulty functioning properly under these restrictions can be debugged by the programmer and fixed so that they do not rely on this functionality. For those that simply need this functionality, or those that have not yet been fixed, the program's executable file can be marked by the system administrator so that it does not have these restrictions applied to it.

The PaX team had to make some design decisions about how to handle the mmap system call. This function is used either to map shared memory or to load shared libraries. Because of this, it needs to supply writable or executable RAM, depending on the conditions it is used under.

The current implementation of PaX supplies writable anonymous memory mappings by default; file backed memory mappings are made writable only if the mmap call specifies the write permission. The mmap function will never return mappings that are both writable and executable, even if those permissions are explicitly requested in the call.
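The mapping rule described above can be illustrated with a small model. This is only a sketch, not PaX source code: the function names are hypothetical, and the choice to drop the execute permission (rather than write) when both are requested is an assumption made for illustration, since the text only states that the result is never both writable and executable.

```python
# Illustrative model of a "never writable and executable" mmap policy.
# Not PaX code; names below are hypothetical.

# Protection flag values as defined in <sys/mman.h> on Linux.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def granted_mmap_prot(requested: int) -> int:
    """Return the protections this model grants for an mmap request.

    Assumption: when write and execute are requested together, execute
    is dropped and write is kept.
    """
    if requested & PROT_WRITE and requested & PROT_EXEC:
        return requested & ~PROT_EXEC
    return requested


def describe(prot: int) -> str:
    """Render a protection mask in the familiar 'rwx' notation."""
    flags = (("r", PROT_READ), ("w", PROT_WRITE), ("x", PROT_EXEC))
    return "".join(ch if prot & bit else "-" for ch, bit in flags)


if __name__ == "__main__":
    for req in (PROT_READ | PROT_EXEC,
                PROT_READ | PROT_WRITE,
                PROT_READ | PROT_WRITE | PROT_EXEC):
        print(describe(req), "->", describe(granted_mmap_prot(req)))
```

Requests that ask for only one of write or execute pass through unchanged in this model; only the combined request is narrowed.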

Enforced non-executable pages

By default, Linux does not supply the most secure usage of non-executable memory pages, via the NX bit. Furthermore, some architectures do not even explicitly supply a way of marking memory pages non-executable. PaX supplies a policy to take advantage of non-executable pages in the most secure way possible.

In addition, if the CPU does not provide an explicit NX bit, PaX can emulate (supply) an NX bit by one of several methods. This degrades performance of the system, but increases security greatly. Furthermore, the performance loss in some methods may be low enough to be ignored.

PAGEEXEC

PAGEEXEC uses or emulates an NX bit. On processors which do not support a hardware NX, each page is given an emulated NX bit. The method used to do this is based on the architecture of the CPU. If a hardware NX bit is available, PAGEEXEC will use it instead of emulating one, incurring no performance costs.

On IA-32 architectures, NX bit emulation is done by changing the permission level of non-executable pages. The Supervisor bit is overloaded to represent NX. This causes a protection fault when access occurs to the page and it is not yet cached in the translation lookaside buffer (TLB). In this case, the memory management unit alerts the operating system; on IA-32, the MMU typically has separate TLB caches for execution (ITLB) and read/write (DTLB), so this fault also allows Linux and PaX to determine whether the program was trying to execute the page as code. If an ITLB fault is caught, the process is terminated; otherwise Linux forces a DTLB load to be allowed, and execution continues as normal.
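The fault-handling decision just described can be sketched as a tiny model. It is a simplification under stated assumptions: the function name and the returned action strings are hypothetical, and the real kernel logic involves page tables and TLB state that are not modelled here.

```python
# Simplified model of the PAGEEXEC fault decision on IA-32.
# Hypothetical names; this is not kernel code.

def handle_emulated_nx_fault(is_instruction_fetch: bool) -> str:
    """Decide what happens after a protection fault on a page whose
    Supervisor bit is being used to stand in for NX.

    is_instruction_fetch -- True if the fault came from the ITLB side
                            (the program tried to execute the page),
                            False if it came from a data read or write.
    """
    if is_instruction_fetch:
        # Executing a page marked non-executable: terminate the process.
        return "kill-process"
    # Ordinary data access: allow the DTLB load and resume execution.
    return "load-dtlb-and-continue"


if __name__ == "__main__":
    print(handle_emulated_nx_fault(True))
    print(handle_emulated_nx_fault(False))
```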

PAGEEXEC has the advantage of not dividing the memory address space in half; tasks still each get a 3 GB virtual address space rather than a 1.5/1.5 split. However, when emulation is needed, it is slower than SEGMEXEC and causes a severe performance penalty in some cases.

Since May 2004, the newer PAGEEXEC code for IA-32 in PaX tracks the highest executable page in virtual memory, and marks all higher pages as user pages. This allows data pages above this limit, such as the stack, to be handled as normal, with no performance loss. Everything below this area is still handled as before. This change is similar to the Exec Shield NX implementation, and the OpenBSD W^X implementation; except that PaX uses the Supervisor bit overloading method to handle NX pages in the code segment as well.

SEGMEXEC

SEGMEXEC emulates the functionality of an NX bit on IA-32 (x86) CPUs by splitting the address space in half and mirroring the code mappings across the address space. When there is an instruction fetch, the fetch is translated across the split. If the code is not mapped there, then the program is killed.

SEGMEXEC cuts the task's virtual memory space in half. Under normal circumstances, programs get a VM space 3GiB wide, which has physical memory mapped into it. Under SEGMEXEC, this becomes a 1.5/1.5 GiB split, with the top half used for the mirroring. Despite this, it does increase performance if emulation must be done on IA-32 (x86) architectures. The mapping in the upper and lower half of the memory space is to the same physical memory page, and so does not double RAM usage.
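The address arithmetic behind the split can be made concrete with a short sketch. It assumes the 3 GiB user address space and the even 1.5/1.5 GiB split stated above; the function names and the representation of mirrored code as a list of address ranges are hypothetical and chosen only for illustration.

```python
# Model of SEGMEXEC-style mirroring arithmetic; not PaX code.

GIB = 1 << 30
TASK_SIZE = 3 * GIB      # user address space on IA-32 Linux, per the text
SPLIT = TASK_SIZE // 2   # 1.5 GiB boundary between the two halves


def mirror_address(fetch_addr: int) -> int:
    """Translate a lower-half instruction fetch to its upper-half mirror."""
    if not 0 <= fetch_addr < SPLIT:
        raise ValueError("address is outside the lower half")
    return fetch_addr + SPLIT


def fetch_allowed(fetch_addr: int, mirrored_code: list) -> bool:
    """Return True if the mirrored address falls in a mirrored code range.

    mirrored_code -- list of (start, end) upper-half ranges, end exclusive.
    """
    target = mirror_address(fetch_addr)
    return any(start <= target < end for start, end in mirrored_code)


if __name__ == "__main__":
    # Hypothetical code mapping of one page at 0x08048000, mirrored above.
    code = [(mirror_address(0x08048000), mirror_address(0x08049000))]
    print(hex(SPLIT))                        # 0x60000000
    print(fetch_allowed(0x08048010, code))   # inside the code mapping
    print(fetch_allowed(0x0804A000, code))   # data page: no mirror exists
```

A fetch from an address with no mirror corresponds to the case in which the program is killed.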

Restricted mprotect

PaX is supposed to guarantee that no RAM is both writable and executable. One function, the mprotect function, changes the permissions on a memory area. The Single UNIX Specification defines mprotect with the following note in its description:
If an implementation cannot support the combination of access types specified by prot, the call to mprotect shall fail.


The PaX implementation does not allow a memory page to have permissions PROT_WRITE and PROT_EXEC both enabled when mprotect restrictions are enabled for the task; any call to mprotect to set both (PROT_WRITE | PROT_EXEC) at the same time will fail with EACCES (Permission Denied). This guarantees that pages will not become both writable and executable, and thus fertile ground for simple code injection attacks.

A similar failure occurs if mprotect(...|PROT_EXEC) is called on a page that does not already have the PROT_EXEC permission. The failure here is justified; if a PROT_WRITE page has code injected into it, and then is made PROT_EXEC, a later retriggering of the exploit allowing code injection will allow the code to be executed. Without this restriction, a three step exploit is possible: Inject code, ret2libc::ret2mprotect, execute code.

With mprotect restrictions enabled, a program can no longer violate the non-executable pages policy that PaX initially sets down on all memory allocations; thus, restricted mprotect could be considered to be strict enforcement of the security policy, whereas the "Enforced non-executable pages" without these restrictions could be considered to be a looser form of enforcement.
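The two mprotect rules described in this section can be expressed as a small policy check. This is a sketch that models only those two rules; the function name is hypothetical, and a Python PermissionError carrying EACCES stands in for the failing system call.

```python
# Model of the restricted-mprotect rules described above; not PaX code.
import errno

# Protection flag values as defined in <sys/mman.h> on Linux.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def restricted_mprotect(current: int, requested: int) -> int:
    """Return the new protections, or raise PermissionError (EACCES).

    Rule 1: write and execute may not be requested together.
    Rule 2: execute may not be added to a page that lacks it.
    """
    if requested & PROT_WRITE and requested & PROT_EXEC:
        raise PermissionError(errno.EACCES, "write and execute requested together")
    if requested & PROT_EXEC and not current & PROT_EXEC:
        raise PermissionError(errno.EACCES, "execute added to a non-executable page")
    return requested


if __name__ == "__main__":
    # Dropping write from a data page is allowed in this model.
    print(restricted_mprotect(PROT_READ | PROT_WRITE, PROT_READ))
    # The second step of the three step exploit is refused.
    try:
        restricted_mprotect(PROT_READ | PROT_WRITE, PROT_READ | PROT_EXEC)
    except PermissionError as exc:
        print("refused:", exc.strerror)
```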

Trampoline emulation

Trampolines are usually implemented by gcc as small pieces of code generated at runtime on the stack. Thus, they require executing memory on the stack, which triggers PaX to kill the program.

Because trampolines are runtime generated code, they trigger PaX and cause the program using them to be killed. PaX is capable of identifying the setup of trampolines and allowing their execution. This is, however, considered to produce a situation of weakened security.

Address space layout randomization

Address space layout randomization, or ASLR, is a technique of countering arbitrary execution of code, or ret2libc attacks. These attacks involve executing already existing code out of the order intended by the programmer.

ASLR as provided in PaX shuffles the stack base and heap base around in virtual memory when enabled. It also optionally randomizes the mmap base and the executable base of programs. This substantially lowers the probability of a successful attack by requiring the attacking code to guess the locations of these areas.

Fig. 2 shows qualitative views of a process's address space with address space layout randomization. The half-head arrows indicate a random gap between various areas of virtual memory. Each time the kernel initializes a process, the length of each arrow can be considered to grow longer or shorter from this template, independently of the others.

During the course of a program's life, the heap, also called the data segment or .bss, will grow up; the heap expands towards the highest memory address available. Conversely, the stack grows down, towards the lowest memory address, 0.

It is extremely uncommon for a program to require a large percentage of the address space for either of these. When program libraries are dynamically loaded at the start of a program by the operating system, they are placed before the heap; however, there are cases where the program will load other libraries, such as those commonly referred to as plugins, at run time. The operating system or program must choose an acceptable offset at which to place these libraries.

PaX leaves a portion of the addresses, the MSBs (most significant bits), out of the randomization calculations. This helps ensure that the stack and heap are placed so that they do not collide with each other, and that libraries are placed so that the stack and heap do not collide with them.
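
The idea can be sketched in Python as follows. This is an illustration only, not PaX's kernel code: a random value with a limited number of bits is shifted to the required alignment and added to a fixed template base, so the high-order bits of the address are left alone. The function name, the template address, and the bit counts used here are assumptions made for the example.

```python
import secrets

def randomize_base(template_base: int, entropy_bits: int, align_shift: int) -> int:
    """Return template_base plus a random, aligned offset.

    Only `entropy_bits` bits are randomized, starting at bit `align_shift`.
    Higher (most significant) bits come from the template unchanged,
    provided the addition does not carry into them.
    """
    delta = secrets.randbits(entropy_bits) << align_shift
    return template_base + delta

# Hypothetical template address for a 32-bit layout (illustration only).
MMAP_TEMPLATE = 0x4000_0000

# 16 random bits, page aligned (4 KiB = 2**12): the result stays inside a
# 256 MiB window and the top four bits of the address never change.
mmap_base = randomize_base(MMAP_TEMPLATE, 16, 12)
print(hex(mmap_base))
```

Because the random part is bounded, each region stays inside its own window, which is what keeps randomized regions from running into each other.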

The effect of the randomization depends on the CPU. 32-bit CPUs have 32 bits of virtual address space, allowing access to 4 GiB of memory. Because Linux uses the top 1 GiB for the kernel, this is shortened to 3 GiB. SEGMEXEC splits this 3 GiB address space down the middle, restricting randomization to 1.5 GiB. Pages are 4 KiB in size, and randomizations are page aligned. The top four MSBs are discarded in the randomization, so that the heap exists at the beginning and the stack at the end of the program's address space. As a result, the stack and heap each exist at one of several million positions (23- and 24-bit randomization), and all libraries exist in any of approximately 65,000 positions.
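
The arithmetic behind these figures can be checked with a few lines of Python. The 16-bit value for libraries is an inference from the "approximately 65,000" figure above rather than a number stated in the text, and the helper name `positions` is made up for the example.

```python
GIB = 1 << 30
KIB = 1 << 10

user_space = 4 * GIB - 1 * GIB        # 3 GiB left after the kernel's top 1 GiB
segmexec_half = user_space // 2       # 1.5 GiB usable under SEGMEXEC
page_size = 4 * KIB                   # randomization is page aligned

def positions(bits: int) -> int:
    """Number of equally likely placements for `bits` bits of randomization."""
    return 1 << bits

print(segmexec_half / GIB)            # 1.5
print(positions(16))                  # 65536
print(positions(23), positions(24))   # 8388608 16777216
```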

On 64-bit CPUs, the virtual address space supplied by the MMU may be wider, allowing access to more memory. The randomization has more entropy in such situations, further reducing the probability of a successful attack in the absence of an information leak.

Randomized stack base

PaX randomly offsets the base of the stack in increments of 16 bytes, combining random placement of the actual virtual memory segment with a sub-page stack gap. The total magnitude of the randomization depends on the size of the virtual memory space; for example, the stack base is somewhere in a 256 MiB range on 32-bit architectures, giving about 16 million possible positions, or 24 bits of entropy.

The randomization of the stack base has an effect on payload delivery during shellcode and return-to-libc attacks. Shellcode attacks overwrite the return pointer field with the address of the payload, while return-to-libc attacks modify the stack frame pointer. In either case, the probability of success is diminished significantly: the position of the stack is unpredictable, and missing the payload likely causes the program to crash.

In the case of shellcode, a series of instructions called a NOP slide or NOP sled can be prepended to the payload. This adds one more success case per 16 bytes of NOP slide: 16 bytes of NOP slide increase the success rate from 1/16M to 2/16M, and 128 bytes increase it to 9/16M. The increase in success rate is directly proportional to the size of the NOP slide; doubling the length of a long NOP slide roughly doubles the chances of a successful attack.
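
These numbers follow directly from the 256 MiB range and the 16-byte step given above. The short Python sketch below reproduces them; the function names are made up for the example, and it assumes each guess is equally likely to land on any position.

```python
STACK_RANGE = 256 * 1024 * 1024   # 256 MiB window for the stack base
STEP = 16                         # the stack base moves in 16-byte increments
POSITIONS = STACK_RANGE // STEP   # 2**24 possible positions

def winning_positions(sled_bytes: int) -> int:
    """Positions that hit the payload itself or any 16-byte slot of the sled."""
    return 1 + sled_bytes // STEP

def success_chance(sled_bytes: int) -> float:
    """Chance that a single guess lands on the payload or in its NOP sled."""
    return winning_positions(sled_bytes) / POSITIONS

for sled in (0, 16, 128):
    print(f"{sled:4d} bytes of sled: {winning_positions(sled)}/{POSITIONS}")
```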

Return-to-libc attacks do not use code, but rather inject fixed-width stack frames. Because of this, repeated stack frames must be aligned exactly to 16 bytes. A stack frame is often larger than this, so a run of repeated stack frames has less impact on the success rate of attacks than a NOP sled of the same length.

Randomized mmap base

In POSIX systems, the mmap system call allows memory to be allocated at offsets specified by the process or selected by the kernel. This can be anonymous memory with nothing in it, or a file-backed memory mapping, which makes a portion of a file, or a copy of that portion, appear in memory at that point. Program libraries are loaded by using mmap to map their code and data privately: if the mapped contents are changed, the changes are made to a copy in memory rather than written back to the file on disk.
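
The two kinds of mapping can be tried out from Python's standard `mmap` module, as in the sketch below. It uses a temporary file in place of a real library, and the function names are made up for the example; it shows only the private, copy-on-write behaviour, not how a dynamic loader actually maps a library.

```python
import mmap
import os
import tempfile

def anonymous_mapping_demo() -> bytes:
    """Create an anonymous mapping; it starts out filled with zero bytes."""
    with mmap.mmap(-1, mmap.PAGESIZE) as anon:
        return anon[:4]

def private_mapping_demo() -> tuple:
    """Modify a private (copy-on-write) file mapping.

    Returns (bytes seen through the mapping, bytes still on disk).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello, library")
        # ACCESS_COPY requests a private copy-on-write mapping of the file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as view:
            view[0:5] = b"HELLO"      # changes only the in-memory copy
            in_memory = view[:]
        with open(path, "rb") as f:
            on_disk = f.read()
        return in_memory, on_disk
    finally:
        os.close(fd)
        os.remove(path)

print(anonymous_mapping_demo())
print(private_mapping_demo())
```

The write through the mapping is visible in memory, while the file on disk keeps its original contents.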

Any mmap call may or may not specify an offset in virtual memory at which to allocate the mapping. If an offset is not specified, it is up to the operating system to select one. Linux does this by calculating an offset in a predictable manner, starting from a predefined virtual address called the mmap base. Because of this, every run of a process loads initial libraries such as the C standard library, or libc, in the same place.

When Randomized mmap base is enabled, PaX randomly shifts the mmap base, affecting the positioning of all libraries and of all other mmap calls that do not specify an address. This causes all dynamically linked code, i.e. shared objects, to be mapped at a different, randomly selected offset every time. Attackers requiring a function in a certain library must guess where that library is loaded in virtual memory space to call it. This makes return-to-libc attacks difficult, although shellcode injections can still look up the address of any function in the global offset table.

PaX does not change the load order of libraries. This means that an attacker who knows the address of one library can derive the locations of all other libraries. However, an attacker who can derive the location of a library in the first place already poses a more serious problem, and extra randomization is unlikely to help with that. Further, typical attacks only require finding one library or function; other interesting elements such as the heap and stack are separately randomized and are not derivable from the mmap base.
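
The fixed load order can be illustrated with a small Python sketch that reads the text format of Linux's /proc/<pid>/maps. The two listings below are invented sample data for two runs of a hypothetical program, not output captured from a PaX system, and the function name is made up for the example.

```python
def library_bases(maps_text: str) -> dict:
    """Map each file-backed path to the lowest address it is mapped at."""
    bases = {}
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6 or not fields[5].startswith("/"):
            continue  # skip anonymous mappings, [heap], [stack], ...
        start = int(fields[0].split("-")[0], 16)
        path = fields[5]
        bases[path] = min(start, bases.get(path, start))
    return bases

# Invented sample data: same program, two runs, different mmap base.
RUN_A = """\
08048000-08049000 r-xp 00000000 08:01 1001 /bin/demo
b7d3c000-b7e90000 r-xp 00000000 08:01 2001 /lib/libc.so.6
b7e90000-b7e92000 rw-p 00154000 08:01 2001 /lib/libc.so.6
b7ea0000-b7ec0000 r-xp 00000000 08:01 2002 /lib/libm.so.6
bfa11000-bfa32000 rw-p 00000000 00:00 0 [stack]
"""
RUN_B = """\
08048000-08049000 r-xp 00000000 08:01 1001 /bin/demo
b8f70000-b90c4000 r-xp 00000000 08:01 2001 /lib/libc.so.6
b90c4000-b90c6000 rw-p 00154000 08:01 2001 /lib/libc.so.6
b90d4000-b90f4000 r-xp 00000000 08:01 2002 /lib/libm.so.6
bf8c4000-bf8e5000 rw-p 00000000 00:00 0 [stack]
"""

for run in (RUN_A, RUN_B):
    bases = library_bases(run)
    libc, libm = bases["/lib/libc.so.6"], bases["/lib/libm.so.6"]
    print(hex(libc), hex(libm), hex(libm - libc))
```

In the sample, the absolute addresses differ between runs but the distance between the two libraries is the same, which is why learning one library's address reveals the others. On a Linux system the same function can be applied to the contents of /proc/self/maps.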

When ET_DYN executables—that is, executables compiled with position independent code in the same way as shared libraries—are loaded, their base is also randomly chosen, as they are mmaped into RAM just like regular shared objects.

When combining a non-executable stack with mmap base randomization, the difficulty in exploiting bugs protected against by PaX is greatly increased due to the forced use of return-to-libc attacks. On 32-bit systems, this amounts to 16 orders of magnitude in base 2; that is, the chances of success are halved 16 times. Combined with stack randomization, the effect can be quite astounding; if every person in the world (assuming 6 billion total) attacks the system once, roughly 1 to 2 should succeed on a 32-bit system. 64-bit systems of course benefit from greater randomization.
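
The estimate can be reproduced with a one-line expectation, sketched here in Python. The text does not state the combined number of random bits behind the "1 to 2" figure; the value of 32 bits used below is an assumption chosen because it reproduces that estimate, and the function name is made up for the example.

```python
def expected_successes(attempts: int, entropy_bits: int) -> float:
    """Expected number of successful guesses when each attempt is one
    independent guess among 2**entropy_bits equally likely layouts."""
    return attempts / (1 << entropy_bits)

# Halving the chance of success 16 times leaves one chance in 65,536.
halved_16_times = 1 / (1 << 16)

# Six billion single attempts against an assumed 32 bits of combined entropy.
world = expected_successes(6_000_000_000, 32)
print(f"{world:.2f}")   # 1.40
```

With more bits of combined entropy, the expected number of successes drops by half for each additional bit.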

Randomized ET_EXEC base

PaX is able to map non-position-independent code randomly into RAM; however, this poses a few problems. First, it incurs some extra performance overhead. Second, on rare occasions it causes false alarms, leading PaX to kill the process without cause. It is strongly recommended that executables be compiled as ET_DYN, so that they are 100% position-independent code.

The randomization of the executable load base for ET_EXEC fixed-position executables was affected by a security flaw in the VM mirroring code in PaX. For those who had not upgraded, the flaw could be worked around by disabling SEGMEXEC NX bit emulation and RANDEXEC randomization of the executable base.

Binary markings

PaX allows executable files in the Executable and Linkable Format (ELF) to be marked with reduced restrictions via the chpax and paxctl tools. These markings exist in the ELF header, and thus are both filesystem independent and part of the file object itself. This means that the markings are retained through packaging, copying, archiving, encrypting, and moving of the objects. The chpax tool is deprecated in favor of paxctl.

PaX allows individual markings for both PAGEEXEC and SEGMEXEC; randomizing the mmap, stack, and heap base; randomizing the executable base for ET_EXEC binaries; restricting mprotect; and emulating trampolines.

In the case of chpax, certain tools such as strip may lose the markings; using paxctl to set the PT_PAX_FLAGS is the only reliable method. The paxctl tool uses a new ELF program header specifically created for PaX flags. These markings can be explicitly on, off, or unset. When unset, the decision on which setting to use is made by the PaX code in the kernel, and is influenced by the system-wide PaX softmode setting.

Distributions that use PaX

The grsecurity project supplies several Linux kernel security enhancements, and supplies PaX along with those features unique to grsecurity. Hardened Gentoo, a security-enhanced version of Gentoo Linux, may be configured to use both grsecurity and PaX. Tor-ramdisk, an i686 uClibc micro Linux distribution that functions solely as a secure Tor server, ships with a grsecurity/PaX-patched kernel by default. Alpine Linux also ships with a PaX-enabled kernel and has GNOME packages available.

History

This is an incomplete history of PaX to be updated as more information is located.
  • October, 2000: PaX first released with basic PAGEEXEC method
  • November, 2000: first incarnation of MPROTECT released
  • June, 2001: ASLR (mmap randomization) implemented, not released
  • July, 2001: ASLR released
  • August, 2001: ASLR with additional stack and PIE randomization released
  • July, 2002: VMA Mirroring and RANDEXEC released
  • October, 2002: SEGMEXEC released
  • October, 2002: ASLR with additional kernel stack randomization released
  • February, 2003: EI_PAX ELF marking method introduced
  • April, 2003: KERNEXEC (non-executable kernel pages) released
  • July, 2003: ASLR with additional brk randomization released
  • February, 2004: PT_PAX_FLAGS ELF marking method introduced
  • May, 2004: PAGEEXEC augmented with code segment limit tracking for enhanced performance
  • March 4, 2005: VMA Mirroring vulnerability announced, new versions of PaX and grsecurity released, all prior versions utilizing SEGMEXEC and RANDEXEC have a privilege escalation vulnerability
  • April 1, 2005: Due to that vulnerability, the PaX project was scheduled to be taken over by a new developer, but since no candidate showed up, the old developer has continued maintenance ever since.

See also

  • Security-Enhanced Linux
  • Intrusion-detection system
  • Intrusion-prevention system
  • RSBAC


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 