Stack-smashing protection
Encyclopedia
Buffer overflow protection refers to various techniques used during software development to enhance the security of executable programs by detecting buffer overflow
Buffer overflow
In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....

s on stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...

-allocated variables as they occur and preventing them from becoming serious security
Computer security
Computer security is a branch of computer technology known as information security as applied to computers and networks. The objective of computer security includes protection of information and property from theft, corruption, or natural disaster, while allowing the information and property to...

 vulnerabilities. There have been several implementations of buffer overflow protection.

This article deals with stack-based overflow; similar protections also exist against heap-based overflows, but they are implementation-specific.

How it works

Typically, buffer overflow protection modifies the organization of data in the stack frame of a function call to include a "canary" value which, when destroyed, shows that a buffer preceding it in memory has been overflowed. This gives the benefit of preventing an entire class of attacks. According to some software vendors, the performance impact of these techniques is negligible.

Canaries

Canaries or canary words are known values that are placed between a buffer and control data on the stack to monitor buffer overflows. When the buffer overflows, the first data to be corrupted will be the canary, and a failed verification of the canary data is therefore an alert of an overflow, which can then be handled, for example, by invalidating the corrupted data.

The terminology is a reference to the historic practice of using canaries in coal mines, since they would be affected by toxic gases earlier than the miners, thus providing a biological warning system.

There are three types of canaries in use: Terminator, Random, and Random XOR. Current versions of StackGuard support all three, while ProPolice supports Terminator and Random canaries.

Terminator canaries

Terminator Canaries use the observation that most buffer overflow attacks are based on certain string operations which end at terminators. The reaction to this observation is that the canaries are built of NULL
Null
-In computing:* Null , a special marker and keyword in SQL* Null character, the zero-valued ASCII character, also designated by NUL, often used as a terminator, separator or filler* Null device, a special computer file that discards all data written to it...

 terminators, CR, LF, and -1. The undesirable result is that the canary is known. Even with the protection, an attacker could potentially overwrite the canary with its known value, and control information with mismatched values, thus passing the canary check code, this latter being executed soon before the specific processor return-from-call instruction.

Random canaries

Random canaries are randomly generated, usually from an entropy
Entropy (computing)
In computing, entropy is the randomness collected by an operating system or application for use in cryptography or other uses that require random data...

-gathering daemon
Daemon (computer software)
In Unix and other multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user...

, in order to prevent an attacker from knowing their value. Usually, it is not logically possible or plausible to read the canary for exploiting; the canary is a secure value known only by those who need to know it—the buffer overflow protection code in this case.

Normally, a random canary is generated at program initialization, and stored in a global variable. This variable is usually padded by unmapped pages, so that attempting to read it using any kinds of tricks that exploit bugs to read off RAM cause a segmentation fault, terminating the program. It may still be possible to read the canary, if the attacker knows where it is, or can get the program to read from the stack.

Random XOR canaries

Random XOR Canaries are Random Canaries that are XOR scrambled using all or part of the control data. In this way, once the canary or the control data is clobbered, the canary value is wrong.

Random XOR Canaries have the same vulnerabilities as Random Canaries, except that the 'read from stack' method of getting the canary is a bit more complicated. The attacker must get the canary, the algorithm, and the control data to generate the original canary for re-encoding into the canary he needs to use to spoof the protection.

In addition, Random XOR Canaries can protect against a certain type of attack involving overflowing a buffer in a structure into a pointer to change the pointer to point at a piece of control data. Because of the XOR encoding, the canary will be wrong if the control data or return value is changed. Because of the pointer, the control data or return value can be changed without overflowing over the canary.

Although these canaries protect the control data from being altered by clobbered pointers, they do not protect any other data or the pointers themselves. Function pointers especially are a problem here, as they can be overflowed into and will execute shellcode
Shellcode
In computer security, a shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised machine. Shellcode is commonly written in...

 when called.

Attacks that cannot be protected against

Stack-smashing protection is unable to protect against certain forms of attack. For example, it cannot protect against buffer overflows in the heap.

StackGuard and ProPolice cannot protect against overflows in automatically allocated structures which overflow into function pointers. ProPolice at least will rearrange the allocation order to get such structures allocated before function pointers. A separate mechanism for pointer protection was proposed in PointGuard and is available on Microsoft Windows.

There is no sane way to alter the layout of data within a structure; structures are expected to be the same between modules, especially with shared libraries. Any data in a structure after a buffer is impossible to protect with canaries; thus, programmers must be very careful about how they organize their variables and use their structures. Structures with buffers should either be malloc
Malloc
C dynamic memory allocation refers to performing dynamic memory allocation in the C via a group of functions in the C standard library, namely malloc, realloc, calloc and free....

ed or obtained with new.

StackGuard

StackGuard was released for GCC in 1997, and published at USENIX Security 1998. StackGuard is an extension to GCC
GNU Compiler Collection
The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...

 that provides buffer overflow protection. It was invented by Crispin Cowan, first implemented as a zero canary in the i386 backend for GCC 2.7.2.2 by Aaron Grier, and verified by Peat Bakke. Perry Wagle continued maintenance of StackGuard for the Immunix project, and implemented the Terminator, Random, and Random XOR canaries.

StackGuard was made available as a standard part of the Immunix
Immunix
Immunix was a commercial operating system that provided host-based application security solutions. The last release of Immunix's GNU/Linux distribution was version 7.3 on November 27, 2003. Immunix, Inc. was the creator of AppArmor, an application security system.On May 10, 2005, Novell acquired...

 Linux distribution from 1998 to 2003, providing both Red Hat-compatible binary RPMs and patched GCC sources from GCC 2.7.2.3 through 2.96.

StackGuard was suggested for implementation in GCC according to the GCC 2003 Summit Proceedings and the StackGuard homepage; however, gcc
GNU Compiler Collection
The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...

 3.x offers no official buffer overflow protection, and the SSP concept below has been adapted for GCC 4.1 instead.

GCC Stack-Smashing Protector (ProPolice)

The "Stack-Smashing Protector" or SSP, also known as ProPolice, is an enhancement of the StackGuard concept written and maintained by Hiroaki Etoh of IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

. Its name derives from the word propolis
Propolis
Propolis is a resinous mixture that honey bees collect from tree buds, sap flows, or other botanical sources. It is used as a sealant for unwanted open spaces in the hive. Propolis is used for small gaps , while larger spaces are usually filled with beeswax. Its color varies depending on its...

. ProPolice differs from StackGuard in three ways:
  • ProPolice moves canary code generation from the back-end to the front-end of the compiler.
  • ProPolice also protects all registers saved in function's prologue (for example the frame pointer), and not only the Return Address
    Return address
    In postal mail, a return address is an explicit inclusion of the address of the person sending the message. It provides the recipient with a means to determine how to respond to the sender of the message if needed....

    .
  • ProPolice, in addition to canary protection, also sorts array variables (where possible) to the highest part of the stack frame, to make it more difficult to overflow them and corrupt other variables. It also creates copies of arguments of the function, and relocates them together with local variables, effectively protecting the arguments.


It was implemented as a patch to GCC
GNU Compiler Collection
The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...

 3.x; a less intrusive reimplementation is included in the GCC 4.1 release. Currently, SSP is standard in OpenBSD
OpenBSD
OpenBSD is a Unix-like computer operating system descended from Berkeley Software Distribution , a Unix derivative developed at the University of California, Berkeley. It was forked from NetBSD by project leader Theo de Raadt in late 1995...

, FreeBSD
FreeBSD
FreeBSD is a free Unix-like operating system descended from AT&T UNIX via BSD UNIX. Although for legal reasons FreeBSD cannot be called “UNIX”, as the direct descendant of BSD UNIX , FreeBSD’s internals and system APIs are UNIX-compliant...

 (since 8.0), Ubuntu (since 6.10 ), Hardened Gentoo
Hardened Gentoo
Hardened Gentoo is a project of Gentoo Linux that is enhancing the distribution with security addons. Current security enhancements to Gentoo Linux can be:*SELinux**A system of mandatory access controls...

 (in gcc 4.x by default since October 2010, previously also in gcc 3.x) and DragonFly BSD
DragonFly BSD
DragonFly BSD is a free Unix-like operating system created as a fork of FreeBSD 4.8. Matthew Dillon, an Amiga developer in the late 1980s and early 1990s and a FreeBSD developer between 1994 and 2003, began work on DragonFly BSD in June 2003 and announced it on the FreeBSD mailing lists on July...

. It is also available in NetBSD
NetBSD
NetBSD is a freely available open source version of the Berkeley Software Distribution Unix operating system. It was the second open source BSD descendant to be formally released, after 386BSD, and continues to be actively developed. The NetBSD project is primarily focused on high quality design,...

 (enabled by default on x86), Debian
Debian
Debian is a computer operating system composed of software packages released as free and open source software primarily under the GNU General Public License along with other free software licenses. Debian GNU/Linux, which includes the GNU OS tools and Linux kernel, is a popular and influential...

 and Gentoo
Gentoo Linux
Gentoo Linux is a computer operating system built on top of the Linux kernel and based on the Portage package management system. It is distributed as free and open source software. Unlike a conventional software distribution, the user compiles the source code locally according to their chosen...

, disabled by default.

By default, stack-smashing protection can be attained by adding the -fstack-protector flag for string protection, or -fstack-protector-all for protection of all types. On some systems, as under OpenBSD, ProPolice is enabled by default, and the -fno-stack-protector flag disables it. With GCC 4.1 and above the array size threshold for stack-smashing protection to be enabled can be tuned with --param ssp-buffer-size=....

Microsoft Visual Studio /GS

The compiler suite from Microsoft also implements buffer overflow protection since version 2003. /GS is enabled by default. Use /GS- if the protection must explicitly be disabled.

IBM Compiler

Stack-smashing protection can be turned on by the compiler flag -qstackprotect.

StackGhost (hardware-based)

Invented by Mike Frantzen, StackGhost is a simple tweak to the register window spill/fill routines which makes buffer overflows much more difficult to exploit. It uses a unique hardware feature of the Sun Microsystems
Sun Microsystems
Sun Microsystems, Inc. was a company that sold :computers, computer components, :computer software, and :information technology services. Sun was founded on February 24, 1982...

 SPARC
SPARC
SPARC is a RISC instruction set architecture developed by Sun Microsystems and introduced in mid-1987....

 architecture (that being: deferred on-stack in-frame register window spill/fill) to detect modifications of return pointers (a common way for an exploit
Exploit (computer security)
An exploit is a piece of software, a chunk of data, or sequence of commands that takes advantage of a bug, glitch or vulnerability in order to cause unintended or unanticipated behavior to occur on computer software, hardware, or something electronic...

 to hijack execution paths) transparently, automatically protecting all applications without requiring binary or source modifications. The performance impact is negligible, less than one percent. The resulting gdb issues were resolved by Mark Kettenis two years later, allowing enabling of the feature. Following this event, the StackGhost code was integrated (and optimized) into OpenBSD
OpenBSD
OpenBSD is a Unix-like computer operating system descended from Berkeley Software Distribution , a Unix derivative developed at the University of California, Berkeley. It was forked from NetBSD by project leader Theo de Raadt in late 1995...

/SPARC.

An example of canaries

Normal buffer allocation for x86 architectures and other similar architectures is shown in the buffer overflow
Buffer overflow
In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....

 entry. Here, we will show the modified process as it pertains to StackGuard.

When a function is called, a stack frame is created. A stack frame is built from the end of memory to the beginning; and each stack frame is placed on the top of the stack, closest to the beginning of memory. Thus, running off the end of a piece of data in a stack frame alters data previously entered into the stack frame; and running off the end of a stack frame places data into the previous stack frame. A typical stack frame may look as below, having a Return Address
Return address
In postal mail, a return address is an explicit inclusion of the address of the person sending the message. It provides the recipient with a means to determine how to respond to the sender of the message if needed....

 (RETA) placed first, followed by other control information (CTLI).

(CTLI)(RETA)

In C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

, a function may contain many different per-call data structures. Each piece of data created on call is placed in the stack frame in order, and is thus ordered from the end to the beginning of memory. Below is a hypothetical function and its stack frame.

int foo {
int a; /*integer*/
int *b; /*pointer to integer*/
char c[10]; /*character array*/
char d[3];
b = &a; /*initialize b to point to location of a*/
strcpy(c,get_c); /*get c from somewhere, write it to c*/
*b = 5; /*the data at the point in memory b indicates is set to 5*/
strcpy(d,get_d);
return *b; /*read from b and pass it to the caller*/
}

(d..)(c.........)(b...)(a...)(CTLI)(RETA)

In this hypothetical situation, if more than ten bytes are written to the array c, or more than 3 to the character array d, the excess will overflow into integer pointer b, then into integer a, then into the control information, and finally the return address. By overwriting b, the pointer is made to reference any position in memory, causing a read from an arbitrary address. By overwriting RETA, the function can be made to execute other code (when it attempts to return), either existing functions (ret2libc
Return-to-libc attack
A return-to-libc attack is a computer security attack usually starting with a buffer overflow in which the return address on the stack is replaced by the address of another instruction and an additional portion of the stack is overwritten to provide arguments to this function...

) or code written into the stack during the overflow.

In a nutshell, poor handling of c and d, such as the unbounded strcpy calls above, may allow an attacker to control a program by influencing the values assigned to c and d directly. The goal of buffer overflow protection is to detect this issue in the least intrusive way possible. This is done by removing what can be out of harms way and placing a sort of tripwire, or canary, after the buffer.

Buffer overflow protection is implemented as a change to the compiler. As such, it is possible for the protection to alter the structure of the data on the stack frame. This is exactly the case in systems such as ProPolice. The above function's automatic variables are rearranged more safely: arrays c and d are allocated first in the stack frame, which places integer a and integer pointer b before them in memory. So the stack frame becomes

(b...)(a...)(d..)(c.........)(CTLI)(RETA)

As it is impossible to move CTLI or RETA without breaking the produced code, another tactic is employed. An extra piece of information, called a "canary" (CNRY), is placed after the buffers in the stack frame. When the buffers overflow, the canary value is changed. Thus, to effectively attack the program, an attacker must leave definite indication of his attack. The stack frame is

(b...)(a...)(d..)(c.........)(CNRY)(CTLI)(RETA)

At the end of every function there is an instruction which continues execution from the memory address indicated by RETA. Before this instruction is executed, a check of CNRY ensures it has not been altered. If the value of CNRY fails the test, program execution is ended immediately. In essence, both serious attacks and harmless programming bugs result in a program abort.

The canary technique adds a few instructions of overhead for every function call with an automatic array, immediately before all dynamic buffer allocation and after dynamic buffer deallocation. The overhead generated in this technique is not significant. It does work, though, unless the canary remains unchanged. If the attacker knows that it's there, he may simply copy over it with itself. This is usually difficult to arrange intentionally, and highly improbable in unintentional situations.

The position of the canary is implementation specific, but it is always between the buffers and the protected data. Varied positions and lengths have varied benefits.

See also

  • Address space layout randomization
    Address space layout randomization
    Address space layout randomization is a computer security method which involves randomly arranging the positions of key data areas, usually including the base of the executable and position of libraries, heap, and stack, in a process's address space.- Benefits :Address space randomization hinders...

  • PaX
    PaX
    PaX is a patch for the Linux kernel that implements least privilege protections for memory pages. The least-privilege approach allows computer programs to do only what they have to do in order to be able to execute properly, and nothing more. PaX was first released in 2000.PaX flags data memory as...

  • Static code analysis
    Static code analysis
    Static program analysis is the analysis of computer software that is performed without actually executing programs built from that software In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code...

  • Memory debugger
    Memory debugger
    A memory debugger is a programming tool for finding memory leaks and buffer overflows. These are due to bugs related to the allocation and deallocation of dynamic memory. Programs written in languages that have garbage collection, such as managed code, might also need memory debuggers, e.g...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK