Intel iAPX 432

Intel iAPX 432

Discussion
Ask a question about 'Intel iAPX 432'
Start a new discussion about 'Intel iAPX 432'
Answer questions from other users
Full Discussion Forum
 
Encyclopedia
The Intel iAPX 432 was a commercially unsuccessful 32-bit
32-bit
The range of integer values that can be stored in 32 bits is 0 through 4,294,967,295. Hence, a processor with 32-bit memory addresses can directly access 4 GB of byte-addressable memory....

 microprocessor
Microprocessor
A microprocessor incorporates the functions of a computer's central processing unit on a single integrated circuit, or at most a few integrated circuits. It is a multipurpose, programmable device that accepts digital data as input, processes it according to instructions stored in its memory, and...

 architecture, introduced in 1981.

The project was Intel's first 32-bit microprocessor design, and intended to be the company's main product line for the 1980s. Many advanced multitasking
Computer multitasking
In computing, multitasking is a method where multiple tasks, also known as processes, share common processing resources such as a CPU. In the case of a computer with a single CPU, only one task is said to be running at any point in time, meaning that the CPU is actively executing instructions for...

 and memory management
Memory management
Memory management is the act of managing computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and freeing it for reuse when no longer needed. This is critical to the computer system.Several...

 features were implemented in hardware, leading to the design being referred to as a Micromainframe. ("iAPX" reportedly stands for intel Advanced Processor architecture, the X coming from the Greek letter Chi
Chi (letter)
Chi is the 22nd letter of the Greek alphabet, pronounced as in English.-Greek:-Ancient Greek:Its value in Ancient Greek was an aspirated velar stop .-Koine Greek:...

.)

Originally designed for clock frequencies of up to 10 MHz, actual devices sold were specified for maximum clock speeds of 4 MHz, 5 MHz, 7 MHz and 8 MHz respectively with a peak performance of 2 million instructions per second at 8 MHz.

Direct hardware support for various data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

s was intended to allow modern operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

s for the iAPX 432 to be implemented using far less program code than for ordinary processors; combined with direct support for object-oriented programming
Object-oriented programming
Object-oriented programming is a programming paradigm using "objects" – data structures consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction,...

 and garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...

, this made the hardware (mostly the microcode
Microcode
Microcode is a layer of hardware-level instructions and/or data structures involved in the implementation of higher level machine code instructions in many computers and other processors; it resides in special high-speed memory and translates machine instructions into sequences of detailed...

 part) much more complex than most processors of the era, especially microprocessors.

Early systems built around the iAPX 432 were rather slow and expensive, one of the reasons being that Intel's engineers were unable to translate this large design into a very efficient first implementation using the semiconductor technology of the day. A lack of optimization in a rather premature Ada
Ada (programming language)
Ada is a structured, statically typed, imperative, wide-spectrum, and object-oriented high-level computer programming language, extended from Pascal and other languages...

 compiler developed for the processor was another major factor, perhaps more important than the processor implementation itself. In 1982, typical benchmark tests ran at roughly 1/4 the speed of the new and fast 80286
Intel 80286
The Intel 80286 , introduced on 1 February 1982, was a 16-bit x86 microprocessor with 134,000 transistors. Like its contemporary simpler cousin, the 80186, it could correctly execute most software written for the earlier Intel 8086 and 8088...

 chip at the same clock frequency. This initial performance gap was probably the main reason why Intel's plan to replace the 8086-line (later known as the x86 architecture) with the iAPX 432 failed.

Development


The 432 project started in 1975 as the 8800, so named as a follow-on to the existing 8008
Intel 8008
The Intel 8008 was an early byte-oriented microprocessor designed and manufactured by Intel and introduced in April 1972. It was an 8-bit CPU with an external 14-bit address bus that could address 16KB of memory...

 and 8080
Intel 8080
The Intel 8080 was the second 8-bit microprocessor designed and manufactured by Intel and was released in April 1974. It was an extended and enhanced variant of the earlier 8008 design, although without binary compatibility...

 CPUs. The design was intended to be purely 32-bit from the outset, and be the backbone of Intel's processor offerings in the 1980s. As such, it was to be considerably more powerful and complex than their existing "simple" offerings. However, the design was well beyond the capabilities of the existing process technology of the era, and had to be split into several individual chips.

The core of the design—the main processor—was termed the General Data Processor (GDP) and built as two chips: one (the 43201) to fetch and decode instructions, the other (the 43202) to execute them. Most systems would also include a third chip: the 43203 Interface Processor (IP) which operated as a channel controller for I/O.

These were some of the largest IC designs of the era. The two-chip GDP had a combined count of approximately 97,000 transistor
Transistor
A transistor is a semiconductor device used to amplify and switch electronic signals and power. It is composed of a semiconductor material with at least three terminals for connection to an external circuit. A voltage or current applied to one pair of the transistor's terminals changes the current...

s while the single chip IP had approximately 49,000. By comparison, the Motorola 68000
Motorola 68000
The Motorola 68000 is a 16/32-bit CISC microprocessor core designed and marketed by Freescale Semiconductor...

 (introduced in 1979) had approximately 40,000 transistors.

In 1983, Intel released two additional integrated circuits for the iAPX 432 Interconnect Architecture: the 43204 Bus Interface Unit (BIU) and 43205 Memory Control Unit (MCU). These chips allowed for nearly glueless multiprocessor systems with up to 63 nodes.

The project's failures


Several design features of the iAPX 432 conspired to make it a little slower at the hardware level than many contemporary designs. While a relatively minor problem was that the two-chip implementation of the GDP limited it to the speed of the motherboard's electrical wiring, a more serious issue was the lack of reasonable caches in the original implementation. The instruction set also used bit-aligned variable-length instructions (as opposed to the byte or word-aligned semi-fix formats used in the majority of computer designs), which made instruction decoding a little more complex than in most other designs. In addition, the BIU was designed to support fault-tolerant systems, and in doing so added considerable overhead to the bus, with up to 40% of the bus time held up in wait state
Wait state
A wait state is a delay experienced by a computer processor when accessing external memory or another device that is slow to respond.As of late 2011, computer microprocessors run at very high speeds, while memory technology does not seem to be able to catch up: typical PC processors like the Intel...

s.

However, post-project research suggested that the major problem was the Ada
Ada (programming language)
Ada is a structured, statically typed, imperative, wide-spectrum, and object-oriented high-level computer programming language, extended from Pascal and other languages...

 compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

 developed for it, which used high-cost "general" instructions in every case, instead of high-performance simpler ones where it would have made sense to do so. For instance the iAPX 432 included a very expensive inter-module procedure call instruction, which the compiler used for all calls, despite the existence of much faster branch and link instructions. Another very slow call was enter_environment, which set up the memory protection. The compiler ran this for every single variable in the system, even though the vast majority were running inside an existing environment and did not have to be checked. To make matters worse, data passed to and from procedures was always passed by value rather than by reference: in many cases requiring huge memory copies.

Impact and similar designs


An outcome of the failure of the 432 was that microprocessor designers concluded that object support in the chip leads to a complex design that will invariably run slowly, and the 432 was often cited as a counter-example by proponents of RISC designs. However it is held by some that the OO support was not the primary problem with the 432 and that the implementation shortcomings (especially in the compiler) mentioned above would have made any CPU design slow. Since the iAPX 432 there has been only one other attempt at a similar design, the Rekursiv
Rekursiv
Rekursiv was a computer processor designed by David M. Harland in the mid-1980s for Linn Smart Computing in Glasgow, Scotland. It was one of the few computer architectures intended to implement object-oriented concepts directly in hardware. The Rekursiv operated directly on objects rather than...

 processor, although the INMOS Transputer
INMOS transputer
The transputer was a pioneering microprocessor architecture of the 1980s, featuring integrated memory and serial communication links, intended for parallel computing. It was designed and produced by Inmos, a British semiconductor company based in Bristol....

's process support was similar — and very fast.

Intel had spent considerable time, money and mindshare on the 432, had a skilled team devoted to it, and were unwilling to abandon it entirely after its failure in the marketplace. A new architect—Glenford Myers—was brought in to produce an entirely new architecture and implementation for the core processor, which would be built in a joint Intel/Siemens
Siemens AG
Siemens AG is a German multinational conglomerate company headquartered in Munich, Germany. It is the largest Europe-based electronics and electrical engineering company....

 project (later BiiN
BiiN
BiiN was a company created out of a joint research project by Intel and Siemens to develop fault tolerant high-performance multi-processor computers build on custom microprocessor designs. BiiN was an outgrowth of the Intel iAPX 432 multiprocessor project, ancestor of iPSC and nCUBE.The company was...

), resulting in the i960
Intel i960
Intel's i960 was a RISC-based microprocessor design that became popular during the early 1990s as an embedded microcontroller, becoming a best-selling CPU in that field, along with the competing AMD 29000...

-series processors. The i960 RISC subset became popular for a time in the embedded processor market, but the high-end 960MC and the tagged-memory 960MX were marketed only for military applications.

Object-oriented memory and capabilities


The iAPX 432 has hardware and microcode support for object-oriented programming
Object-oriented programming
Object-oriented programming is a programming paradigm using "objects" – data structures consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction,...

 and capability-based addressing
Capability-based addressing
In computer science, capability-based addressing is a scheme used by some computers to control access to memory. Under a capability-based addressing scheme, pointers are replaced by protected objects that can only be created through the use of privileged instructions which may only be executed by...

. The system uses segmented memory, with up to 224 segments of up to 64 kB
Kilobyte
The kilobyte is a multiple of the unit byte for digital information. Although the prefix kilo- means 1000, the term kilobyte and symbol KB have historically been used to refer to either 1024 bytes or 1000 bytes, dependent upon context, in the fields of computer science and information...

 each, providing a total virtual address space of 240 bytes. The physical address space is 224 bytes (16 MB
Megabyte
The megabyte is a multiple of the unit byte for digital information storage or transmission with two different values depending on context: bytes generally for computer memory; and one million bytes generally for computer storage. The IEEE Standards Board has decided that "Mega will mean 1 000...

).

Programs are not able to reference data or instructions by address; instead they must specify a segment and an offset within the segment. Segments are referenced by Access Descriptors (ADs), which provide an index into the system object table and a set of rights (capabilities
Capability-based security
Capability-based security is a concept in the design of secure computing systems, one of the existing security models. A capability is a communicable, unforgeable token of authority. It refers to a value that references an object along with an associated set of access rights...

) governing accesses to that segment. Segments may be "access segments", which can only contain Access Descriptors, or "data segments" which cannot contain ADs. The hardware and microcode rigidly enforce the distinction between data and access segments, and will not allow software to treat data as access descriptors, or vice versa.

System-defined objects consist of either a single access segment, or an access segment and a data segment. System-defined segments contain data or access descriptors for system-defined data at designated offsets, though the operating system or user software may extend these with additional data. Each system object has a type field which is checked by microcode, such that a Port Object cannot be used where a Carrier Object is needed. User program can define new object types which will get the full benefit of the hardware type checking, through the use of Type Control Objects (TCO).

In Release 1 of the iAPX 432 architecture, a system-defined object typically consisted of an access segment, and optionally (depending on the object type) a data segment specified by an access descriptor at a fixed offset within the access segment.

By Release 3 of the architecture, in order to improve performance, access segments and data segments were combined into single segments of up to 128 kB, split into an access part and a data part of 0–64 kB each. This reduced the number of object table lookups dramatically, and doubled the maximum virtual address space.

Garbage collection


Software running on the 432 does not need to explicitly deallocate objects that are no longer needed, and in fact no method is provided to do so. Instead, the microcode implements part of the marking portion of Edsger Dijkstra
Edsger Dijkstra
Edsger Wybe Dijkstra ; ) was a Dutch computer scientist. He received the 1972 Turing Award for fundamental contributions to developing programming languages, and was the Schlumberger Centennial Chair of Computer Sciences at The University of Texas at Austin from 1984 until 2000.Shortly before his...

's on-the-fly parallel garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...

 algorithm (a mark-sweep style collector). The entries in the system object table contain the bits used to mark each object as being white, black, or grey as needed by the collector.

The iMAX-432 operating system includes the software portion of the garbage collector.

Multitasking and interprocess communication


The iAPX 432 microcode implements multitasking, using objects in memory to represent the processor, processes, communication ports, and dispatching ports. Each processor is associated with a dispatching port, and when it is idle will attempt to dispatch a process from that dispatching port. When the process blocks or its time quantum expires, the processor re-enqueues that process at its dispatching port, then dispatches a new process from the dispatching port.

Interprocess communication is supported through the use of communication ports. A communication port is essentially a FIFO
FIFO
FIFO is an acronym for First In, First Out, an abstraction related to ways of organizing and manipulation of data relative to time and prioritization...

 that can enqueue either messages waiting to be received by a process, or processes waiting to receive a message (but never both). A program can use the Send, Receive, Conditional Send, Conditional Receive, Surrogate Send, or Surrogate Receive instructions to communicate with other processes by sending messages to or receiving messages from communication ports. If there is no message enqueued at a communication port, a normal Receive instruction on that port will block the current process until a message is available. Similarly, a normal Send instruction will block the current process if the port is full. The Conditional Send and Conditional Receive instructions do not block, instead returning a Boolean result indicating whether the operation succeeded. The Surrogate Send and Surrogate Receive instructions provide a Carrier object that can block in place of the process.

One of the elegant aspects of the iAPX 432 architecture is that a dispatching port is actually just a communication port whose messages are process objects, thus unifying the operation of process dispatching and interprocess communication and simplifying the underlying implementation.

Multiprocessing


The iAPX 432 has hardware support for multiprocessing
Multiprocessing
Multiprocessing is the use of two or more central processing units within a single computer system. The term also refers to the ability of a system to support more than one processor and/or the ability to allocate tasks between them...

, using up to 64 processors (combination of GDPs and IPs). Usually, all GDPs share a common workload by using a single system-wide dispatching port, though it is possible to partition the workload by assigning some processors to different dispatching ports. With suitably designed hardware, processors can be added to or removed from the system on the fly.

Fault tolerance


From the outset, the iAPX 432 included support for fault tolerance
Fault-tolerant design
In engineering, fault-tolerant design is a design that enables a system to continue operation, possibly at a reduced level , rather than failing completely, when some part of the system fails...

. All of the 432's chips could be configured in pairs for Functional Redundancy
Redundancy (engineering)
In engineering, redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability of the system, usually in the case of a backup or fail-safe....

 Checking (FRC), in which one component, the master, operated normally, and a second, the checker, carried out the same internal operations in parallel and verified its results against those of the master.

FRC provides for failure detection, but full fault tolerance requires a recovery mechanism. Systems based on the Interconnect Architecture supported automatic failure recovery by combining pairs of FRC modules for Quad Modular Redundancy (QMR). In a QMR configuration, at any given time one FRC module is a primary and the other is a shadow. The two modules operate in lockstep, but the roles alternate to detect latent faults. The shadow module does not drive the bus. If a fault is detected in either FRC module, that module is disabled while the nonfaulted module can continue operation. The software is notified, and can choose to let the system continue operating (without fault tolerance for that module), pair the module with a spare, or take the module offline (shifting its workload to other processors in the system for graceful performance degradation
Fault-tolerant system
Fault-tolerance or graceful degradation is the property that enables a system to continue operating properly in the event of the failure of some of its components. A newer approach is progressive enhancement...

).

I/O


The 43203 Interface Processor (IP) allows a more conventional microprocessor to be interfaced as an Attached Processor (AP) to an iAPX 432 system. The AP acts as an intelligent I/O
Input/output
In computing, input/output, or I/O, refers to the communication between an information processing system , and the outside world, possibly a human, or another information processing system. Inputs are the signals or data received by the system, and outputs are the signals or data sent from it...

 controller. The IP allows the AP to access objects in the iAPX 432 memory through the use of memory-mapped windows, but enforces the access rights applicable to the objects.

The IP provides five memory windows. Four are used to map objects for I/O operations; the fifth is the control window and is used by the AP to perform control operations such as requesting changes to the mapping of the other windows.

The IP also offers a special "physical" mode in which the AP has unrestricted access to the entire iAPX 432 address space. Physical mode is intended to be used only for system startup and debugger support.

External links