Microcode



 
 
Microcode is the lowest-level layer of instructions, used to implement machine code instructions in many computers and other processors; it resides in a special high-speed memory and translates machine instructions into sequences of detailed circuit-level operations. It helps separate the machine instructions from the underlying electronics so that instructions can be designed and altered more freely. It also makes it feasible to build complex multi-step instructions while still reducing the complexity of the electronic circuitry compared to other methods. Writing microcode is called microprogramming, and the microcode for a given processor is often called a microprogram.

The microcode is normally written by the CPU engineer during the design phase. It is generally not meant to be visible to or changeable by a normal programmer, or even an assembly programmer. Unlike machine code, which often retains backward compatibility, microcode runs only on the exact CPU model for which it was designed. Microcode can be used to let one microarchitecture emulate another, usually more powerful, architecture.

Some hardware vendors, especially IBM, also use the term microcode as a synonym for firmware, whether or not it actually implements the microprogramming of a processor. Even simple firmware, such as that used in a hard drive, is sometimes described as microcode. Such use is not discussed here.

Overview

The elements composing a microprogram exist on a lower conceptual level than a normal application program. Each element is differentiated by the "micro" prefix to avoid confusion: microinstruction, microassembler, microprogrammer, microarchitecture, etc.

The microcode usually does not reside in main memory but in a special high-speed memory called the control store. It might be either read-only or read-write memory. In the latter case the microcode would be loaded into the control store from some other storage medium as part of the initialization of the CPU, and it could be altered to correct bugs in the instruction set, or to implement new machine instructions.

Microprograms consist of a series of microinstructions. These microinstructions control the CPU at a very fundamental level of hardware circuitry. For example, a single typical microinstruction might specify the following operations:

  • Connect Register 1 to the "A" side of the ALU (arithmetic logic unit)
  • Connect Register 7 to the "B" side of the ALU
  • Set the ALU to perform two's-complement addition
  • Set the ALU's carry input to zero
  • Store the result value in Register 8
  • Update the "condition codes" with the ALU status flags ("Negative", "Zero", "Overflow", and "Carry")
  • Microjump to MicroPC nnn for the next microinstruction
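
As a rough illustration, the operations listed above can be modelled as the fields of a single record that is applied in one step. This is a minimal sketch, not the format of any real processor: the field names, the 16-bit data path, the 16-entry register file and the microjump target 0x2A are all assumptions made for the example.

```python
from dataclasses import dataclass

MASK = 0xFFFF  # assumed 16-bit data path, chosen only for illustration
SIGN = 0x8000  # sign bit of a 16-bit two's-complement value


@dataclass
class MicroInstruction:
    """One hypothetical microinstruction; every field name is illustrative."""
    src_a: int          # register connected to the ALU "A" input
    src_b: int          # register connected to the ALU "B" input
    alu_op: str         # operation the ALU performs this cycle
    carry_in: int       # value placed on the ALU carry input
    dest: int           # register that receives the ALU result
    update_flags: bool  # whether the condition codes are updated
    next_upc: int       # microjump target (next MicroPC value)


def step(mi, regs, flags):
    """Apply one microinstruction and return the next MicroPC."""
    if mi.alu_op != "ADD":
        raise ValueError("only ADD is modelled in this sketch")
    a, b = regs[mi.src_a], regs[mi.src_b]
    total = a + b + mi.carry_in
    result = total & MASK
    regs[mi.dest] = result
    if mi.update_flags:
        flags["N"] = bool(result & SIGN)
        flags["Z"] = result == 0
        flags["C"] = total > MASK
        # Signed overflow: both operands share a sign that the result lacks.
        flags["V"] = bool(~(a ^ b) & (a ^ result) & SIGN)
    return mi.next_upc


regs = [0] * 16
regs[1], regs[7] = 0x7FFF, 0x0001
flags = {}
upc = step(MicroInstruction(1, 7, "ADD", 0, 8, True, 0x2A), regs, flags)
```

In real hardware all of these actions happen in parallel within one clock cycle; the sequential Python statements only describe the combined effect.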


To control all of the processor's features simultaneously in one cycle, a microinstruction is often 50 or more bits wide. Microprograms are carefully designed and optimized for the fastest possible execution, since a slow microprogram would yield a slow machine instruction, which would in turn slow down every program using that instruction.

The reason for microprogramming

Microcode was originally developed as a simpler method of developing the control logic for a computer. Initially, CPU instruction sets were "hard-wired". Each step needed to fetch, decode and execute the machine instructions (including any operand address calculations, reads and writes) was controlled directly by combinational logic and rather minimal sequential state machine circuitry. While such designs were very efficient, the need for powerful instruction sets with multi-step addressing and complex operations (see below) made "hard-wired" processors difficult to design and debug; highly encoded and variable-length instructions can contribute to this as well, especially when very irregular encodings are used.

Microcode simplified the job by allowing much of the processor's behaviour and programming model to be defined via microprogram routines rather than by dedicated circuitry. Even late in the design process, microcode could easily be changed, whereas hard-wired CPU designs were very cumbersome to change, so this greatly facilitated CPU design.

From the 1940s through the late 1970s, much programming was done in assembly language; higher-level instructions meant greater programmer productivity, so an important advantage of microcode was the relative ease with which powerful machine instructions could be defined. During the 1970s, CPU speeds grew more quickly than memory speeds, and numerous techniques such as memory block transfer, memory pre-fetch and multi-level caches were used to alleviate this. High-level machine instructions, made possible by microcode, helped further, as fewer, more complex machine instructions require less memory bandwidth. For example, an operation on a character string could be done as a single machine instruction, thus avoiding multiple instruction fetches.

Architectures with instruction sets implemented by complex microprograms included the IBM System/360 and Digital Equipment Corporation VAX. The approach of increasingly complex microcode-implemented instruction sets was later called CISC (complex instruction set computer). A middle way, used in many microprocessors, is to use PLAs (programmable logic arrays) and/or ROMs (instead of combinational logic) mainly for instruction decoding, and let a simple state machine (without much, or any, microcode) do most of the sequencing. The practical uses of microcode and related techniques (such as PLAs) have been numerous over the years, as have approaches to where, and to what extent, it should be used. It is still used in modern CPU designs.

Other benefits

A processor's microprograms operate on a more primitive, totally different and much more hardware-oriented architecture than the assembly instructions visible to normal programmers. In coordination with the hardware, the microcode implements the programmer-visible architecture. The underlying hardware need not have a fixed relationship to the visible architecture. This makes it possible to implement a given instruction set architecture on a wide variety of underlying hardware micro-architectures.

Doing so is important if binary program compatibility is a priority. That way, previously existing programs can run on totally new hardware without requiring revision and recompilation. However, there may be a performance penalty for this approach. The tradeoffs between application backward compatibility and CPU performance are hotly debated by CPU design engineers.

The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but most System/360 implementations actually used hardware with a much simpler underlying microarchitecture. For example, the System/360 Model 30 had 8-bit data paths to the arithmetic logic unit (ALU) and main memory and implemented the general-purpose registers in a special unit of higher-speed core memory, while the System/360 Model 40 had 8-bit data paths to the ALU and 16-bit data paths to main memory and also implemented the general-purpose registers in a special unit of higher-speed core memory. The Model 50 and Model 65 had full 32-bit data paths and implemented the general-purpose registers in faster transistor circuits. In this way, microprogramming enabled IBM to design many System/360 models with substantially different hardware and spanning a wide range of cost and performance, while making them all architecturally compatible. This dramatically reduced the amount of unique system software that had to be written for each model.
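
To make the idea of a narrow data path behind a wide architecture concrete, here is a minimal sketch of how a 32-bit addition can be carried out with an ALU that handles only 8 bits at a time. It is not actual System/360 microcode; the function name and the byte-at-a-time loop are assumptions chosen to show the principle.

```python
def add32_on_8bit_alu(x, y):
    """Add two 32-bit values using only 8-bit additions.

    Returns (sum modulo 2**32, carry out of the most significant byte).
    """
    result = 0
    carry = 0
    for byte_index in range(4):            # least significant byte first
        shift = 8 * byte_index
        a = (x >> shift) & 0xFF
        b = (y >> shift) & 0xFF
        total = a + b + carry              # one pass through the 8-bit ALU
        result |= (total & 0xFF) << shift  # keep the low 8 bits of this pass
        carry = total >> 8                 # ripple the carry into the next pass
    return result, carry
```

The programmer sees a single 32-bit add; the loop, which a microprogram would spell out as several microinstructions, is hidden below the instruction set. The cost is time: four passes through the ALU instead of one.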

A similar approach was used by Digital Equipment Corporation in their VAX family of computers. Initially, a 32-bit TTL (transistor-transistor logic) processor in conjunction with supporting microcode implemented the programmer-visible architecture. Later VAX versions used different microarchitectures, yet the programmer-visible architecture did not change.

Microprogramming also reduced the cost of field changes to correct defects (bugs) in the processor; a bug could often be fixed by replacing a portion of the microprogram rather than by changes being made to hardware logic and wiring.

History


In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a way to simplify computer design and move beyond ad hoc methods. The control store was a two-dimensional lattice: one dimension accepted "control time pulses" from the CPU's internal clock, and the other connected to control signals on gates and other circuits. A "pulse distributor" would take the pulses generated by the CPU clock and break them up into eight separate time pulses, each of which would activate a different row of the lattice. When the row was activated, it would activate the control signals connected to it.

Described another way, the signals transmitted by the control store are played much like a player piano roll. That is, they are controlled by a sequence of very wide words constructed of bits, and they are "played" sequentially. In a control store, however, the "song" is short and repeated continuously.

In 1951, Maurice Wilkes enhanced this concept by adding conditional execution, a concept akin to a conditional in computer software. His initial implementation consisted of a pair of matrices: the first generated signals in the manner of the Whirlwind control store, while the second selected which row of signals (the microprogram instruction word, as it were) to invoke on the next cycle. Conditionals were implemented by providing a way for a single line in the control store to choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term microprogramming to describe this feature and distinguish it from a simple control store.
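
The two-matrix arrangement can be sketched as a table in which each row pairs a set of control signals (the first matrix) with the row to run next (the second matrix). This is only a toy model of the idea described above: the signal names, the five-row table and the single condition input are invented for illustration and do not reproduce Wilkes's actual design.

```python
# Each row: (control signals raised, successor).
# A successor that is a tuple (if_clear, if_set) makes the next row depend
# on an internal condition; a plain integer is an unconditional successor.
CONTROL_STORE = [
    ({"pc_to_mar"}, 1),
    ({"read_memory"}, 2),
    ({"acc_to_alu", "test_sign"}, (3, 4)),  # conditional row
    ({"alu_add"}, 0),
    ({"alu_subtract"}, 0),
]


def run(start_row, condition, cycles):
    """Step through the control store and return the rows visited, with signals."""
    row, trace = start_row, []
    for _ in range(cycles):
        signals, successor = CONTROL_STORE[row]
        trace.append((row, sorted(signals)))
        if isinstance(successor, tuple):
            row = successor[1] if condition else successor[0]
        else:
            row = successor
    return trace


rows_when_clear = [row for row, _ in run(0, False, 4)]
rows_when_set = [row for row, _ in run(0, True, 4)]
```

With the condition clear the sequence passes through row 3, and with it set the sequence passes through row 4; everything else about the "song" is unchanged.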

Examples of microprogrammed systems

  • Most models of the IBM System/360 series were microprogrammed:
      • The Model 25 was unique among System/360 models in using the top 16K bytes of core storage to hold the control storage for the microprogram. The 2025 used a 16-bit microarchitecture with seven control words (or microinstructions). At power up, or full system reset, the microcode was loaded from the card reader. The IBM 1410 emulation for this model was loaded this way.
      • The Model 30 (IBM 2030), the slowest model in the line, used an 8-bit microarchitecture with only a few hardware registers; everything that the programmer saw was emulated by the microprogram. The microcode for this model was also held on special punched cards, which were stored inside the machine in a dedicated reader per card, called "CROS" units (Capacitor Read-Only Storage). A second CROS reader was installed for machines ordered with 1620 emulation.
      • The Model 40 used 56-bit control words. The 2040 box implemented both the System/360 main processor and the multiplex channel (the I/O processor). This model used "TROS" dedicated readers similar to "CROS" units, but with an inductive pickup (Transformer Read-Only Store).
      • The Model 50 had two internal data paths which operated in parallel: a 32-bit data path used for arithmetic operations, and an 8-bit data path used in some logical operations. The control store used 90-bit microinstructions.
      • The Model 85 had separate instruction fetch (I-unit) and execution (E-unit) units to provide high performance. The I-unit was hardware controlled. The E-unit was microprogrammed with 108-bit control words.


  • The Digital Equipment Corporation PDP-11 processors, with the exception of the PDP-11/20, were microprogrammed.
  • The Burroughs B700 "microprocessor" executed application-level opcodes using sequences of 16-bit microinstructions stored in main memory; each of these was either a register-load operation or mapped to a single 56-bit "nanocode" instruction stored in read-only memory. This allowed comparatively simple hardware to act either as a mainframe peripheral controller or to be packaged as a standalone computer.
  • The Burroughs B1700 was implemented with radically different hardware including bit-addressable main memory but had a similar multi-layer organisation.
  • The NCR 315 was microprogrammed with hand-wired ferrite cores (a ROM) pulsed by a sequencer with conditional execution. Wires routed through the cores were enables for various data and logic elements in the processor.
  • In common with many other complex mechanical devices, Charles Babbage's analytical engine used banks of cams to control each operation, i.e. it had a read-only control store. As such it deserves to be recognised as the first microprogrammed computer to be designed, even if it has not yet been realised in hardware.
  • The VU0 and VU1 vector units in the Sony PlayStation 2 are microprogrammable; in fact, VU1 was only accessible via microcode for the first several generations of the SDK.


Implementation

Each microinstruction in a microprogram provides the bits which control the functional elements that internally compose a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less-complex programming challenge.

To take advantage of this, computers were divided into several parts:

A microsequencer picked the next word of the control store. A sequencer is mostly a counter, but usually also has some way to jump to a different part of the control store depending on some data, usually data from the instruction register and always some part of the control store. The simplest sequencer is just a register loaded from a few bits of the control store.
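
A sequencer of the kind just described can be sketched as a function that computes the next control-store address. The three operation names, the dispatch table keyed by opcode, and the example addresses are assumptions for illustration; real sequencers differ widely in how they select the next address.

```python
def next_micro_address(upc, seq_op, jump_addr=0, opcode=0, dispatch=None):
    """Return the next control-store address.

    upc       -- current micro program counter
    seq_op    -- sequencing field taken from the current control word
    jump_addr -- jump-address field taken from the current control word
    opcode    -- contents of the instruction register
    dispatch  -- hypothetical table mapping opcodes to microroutine entry points
    """
    if seq_op == "NEXT":       # behave as a plain counter
        return upc + 1
    if seq_op == "JMP":        # address comes from the control store itself
        return jump_addr
    if seq_op == "DISPATCH":   # address depends on the instruction register
        return dispatch[opcode]
    raise ValueError(f"unknown sequencer operation: {seq_op}")


dispatch_table = {0x4C: 0x20}  # made-up opcode and entry point
entry = next_micro_address(0x03, "DISPATCH", opcode=0x4C, dispatch=dispatch_table)
```

The "simplest sequencer" mentioned above corresponds to keeping only the JMP case, so that every control word names its own successor.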

A register set is a fast memory containing the data of the central processing unit. It may include the program counter, stack pointer, and other numbers that are not easily accessible to the application programmer. Often the register set is a triple-ported register file; that is, two registers can be read, and a third written, at the same time.

An arithmetic and logic unit performs calculations, usually addition, logical negation, a right shift, and logical AND. It often performs other functions, as well.

There may also be a memory address register and a memory data register, used to access the main computer storage.

Together, these elements form an "execution unit". Most modern CPUs have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code.

These elements could often be bought together as a single chip. This chip came in a fixed width which would form a 'slice' through the execution unit. These were known as 'bit slice' chips. The AMD Am2900 family is one of the best-known examples of bit slice elements.

The parts of the execution units, and the execution units themselves, are interconnected by a bundle of wires called a bus.

Programmers develop microprograms. The basic tools are software: a microassembler allows a programmer to define the table of bits symbolically. A simulator program executes the bits in the same way as the electronics (hopefully), and allows much more freedom to debug the microprogram.
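
What a microassembler does can be sketched in a few lines: it reads symbolic fields and labels and produces the table of bits. The field layout, widths, symbol tables and source syntax below are all hypothetical, invented for this example rather than taken from any real tool.

```python
# Hypothetical control-word layout: (field name, width in bits), MSB first.
FIELDS = [("src_a", 3), ("src_b", 3), ("dest", 3), ("alu", 2), ("seq", 1), ("addr", 4)]

# Hypothetical symbol tables, one per field; "addr" is resolved from labels.
SYMBOLS = {
    "src_a": {"NONE": 0, "PC": 1, "MAR": 2, "MDR": 3},
    "src_b": {"NONE": 0, "1": 1},
    "dest": {"NONE": 0, "PC": 1, "MAR": 2, "MDR": 3},
    "alu": {"COPY": 0, "ADD": 1},
    "seq": {"NEXT": 0, "JMP": 1},
}


def assemble(source):
    """Return (labels, control words) for a comma-separated microprogram."""
    labels, rows = {}, []
    # Pass 1: drop comments and blank lines, record the address of each label.
    for line in source.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(rows)
        else:
            rows.append([token.strip() for token in line.split(",")])
    # Pass 2: translate each symbolic field and pack the fields into one word.
    words = []
    for tokens in rows:
        word = 0
        for (name, width), token in zip(FIELDS, tokens):
            if name == "addr":
                value = 0 if token == "NONE" else labels[token]
            else:
                value = SYMBOLS[name][token]
            if value >= 1 << width:
                raise ValueError(f"{token} does not fit in field {name}")
            word = (word << width) | value
        words.append(word)
    return labels, words


SOURCE = """
Decode:
    NONE, NONE, NONE, COPY, NEXT, NONE   # placeholder for the decode routine
Jump:
    MDR, NONE, MAR, COPY, NEXT, NONE
    MAR, 1, PC, ADD, JMP, Decode
"""
labels, words = assemble(SOURCE)
```

Two passes are used so that a microinstruction may jump to a label defined later in the source. The resulting list of integers is the "table of bits" that would be loaded into, or manufactured as, the control store.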

After the microprogram is finalized and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data. This program is similar to those used to optimize a programmable logic array. No known computer program can produce optimal logic, but even pretty good logic can vastly reduce the number of transistors from the number required for a ROM control store. This reduces the cost of a CPU and the power it uses.

Microcode can be characterized as horizontal or vertical. This refers primarily to whether each microinstruction directly controls CPU elements (horizontal microcode), or requires subsequent decoding by combinational logic before doing so (vertical microcode). Consequently, each horizontal microinstruction is wider (contains more bits) and occupies more storage space than a vertical microinstruction.

Horizontal microcode

Horizontal microcode is typically contained in a fairly wide control store; it is not uncommon for each word to be 56 bits or more. On each tick of a sequencer clock a microcode word is read, decoded, and used to control the functional elements which make up the CPU.

In a typical implementation a horizontal microprogram word comprises fairly tightly defined groups of bits. For example, one simple arrangement might be:

register source A | register source B | destination register | arithmetic and logic unit operation | type of jump | jump address


For this type of micromachine to implement a JUMP instruction with the address following the opcode, the microcode might require two clock ticks; the engineer designing it would write microassembler source code looking something like this:

    # Any line starting with a number-sign is a comment

    # This is just a label, the ordinary way assemblers symbolically represent a
    # memory address.
    InstructionJUMP:
          # To prepare for the next instruction, the instruction-decode microcode has already
          # moved the program counter to the memory address register. This instruction fetches
          # the target address of the jump instruction from the memory word following the
          # jump opcode, by copying from the memory data register to the memory address register.
          # This gives the memory system two clock ticks to fetch the next
          # instruction to the memory data register for use by the instruction decode.
          # The sequencer instruction "next" means just add 1 to the control word address.
       MDR, NONE, MAR, COPY, NEXT, NONE
          # This places the address of the next instruction into the PC.
          # This gives the memory system a clock tick to finish the fetch started on the
          # previous microinstruction.
          # The sequencer instruction is to jump to the start of the instruction decode.
       MAR, 1, PC, ADD, JMP, InstructionDecode
          # The instruction decode is not shown, because it's usually a mess, very particular
          # to the exact processor being emulated. Even this example is simplified.
          # Many CPUs have several ways to calculate the address, rather than just fetching
          # it from the word following the op-code. Therefore, rather than just one
          # jump instruction, those CPUs have a family of related jump instructions.
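
The register transfers performed by the two microinstructions above can be traced with a small simulation. This is a simplified sketch: it models only the source, destination and ALU fields, ignores the sequencer fields and memory timing, and the starting register values are made up for the example.

```python
def alu(op, a, b):
    """Minimal ALU supporting only the two operations the example uses."""
    if op == "COPY":
        return a
    if op == "ADD":
        return a + b
    raise ValueError(f"unknown ALU operation: {op}")


# (source A, source B, destination, ALU operation) for each clock tick.
JUMP_MICROROUTINE = [
    ("MDR", "NONE", "MAR", "COPY"),  # tick 1: jump target becomes the fetch address
    ("MAR", 1, "PC", "ADD"),         # tick 2: PC = address after the fetched word
]


def run_jump(state):
    """Apply both microinstructions to a dict of register values."""
    for src_a, src_b, dest, op in JUMP_MICROROUTINE:
        a = state[src_a]
        b = 0 if src_b == "NONE" else src_b
        state[dest] = alu(op, a, b)
    return state


# Assumed starting point: the word after the jump opcode (the target address
# 0x200) has already been read into the memory data register.
state = run_jump({"PC": 0x101, "MAR": 0x101, "MDR": 0x200})
```

After the two ticks the memory address register holds the jump target and the program counter points at the word after it, which is the state the instruction decode expects on entry.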

For each tick it is common to find that only some portions of the CPU are used, with the remaining groups of bits in the microinstruction being no-ops. With careful design of hardware and microcode, this property can be exploited to parallelise operations which use different areas of the CPU; for example, in the case above, the ALU is not required during the first tick, so it could potentially be used to complete an earlier arithmetic instruction.

Vertical microcode

In vertical microcode, each microinstruction is encoded; that is, the bit fields may pass through intermediate combinational logic, which in turn generates the actual control signals for internal CPU elements (ALU, registers, etc.). In contrast, with horizontal microcode the bit fields themselves directly produce the control signals. Consequently, vertical microcode requires shorter microinstructions and less storage but takes more time to decode, resulting in a slower CPU clock.
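The difference can be sketched for a single group of control lines. The Python below is illustrative only: the eight ALU-select line names are hypothetical, and the decoder is modelled as a simple shift rather than real combinational logic.

```python
# Hypothetical group of mutually exclusive ALU-select control lines.
CONTROL_LINES = ["alu_pass", "alu_add", "alu_sub", "alu_and",
                 "alu_or", "alu_xor", "alu_shl", "alu_shr"]


def horizontal_signals(word):
    """Horizontal style: bit i of the field drives control line i directly."""
    return {name: bool((word >> i) & 1) for i, name in enumerate(CONTROL_LINES)}


def vertical_signals(code):
    """Vertical style: a decoder expands the encoded field to one active line."""
    if not 0 <= code < len(CONTROL_LINES):
        raise ValueError("code does not name a control line")
    return horizontal_signals(1 << code)


def encoded_width(line_count):
    """Bits needed to name one of line_count lines."""
    return max(1, (line_count - 1).bit_length())


horizontal_bits = len(CONTROL_LINES)               # one bit per line, no decoder
vertical_bits = encoded_width(len(CONTROL_LINES))  # fewer bits, plus a decoder
```

Here the horizontal field needs eight bits and the encoded field three, at the cost of the extra decoding step in `vertical_signals`.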

Some vertical microcodes are just the assembly language of a simple conventional computer that is emulating a more complex computer. This technique was popular in the time of the PDP-8. Another form of vertical microcode has two fields:

field select | field value


The "field select" selects which part of the CPU will be controlled by this word of the control store. The "field value" actually controls that part of the CPU. With this type of microcode, a designer explicitly chooses to make a slower CPU to save money by reducing the unused bits in the control store; however, the reduced complexity may increase the CPU's clock frequency, which lessens the effect of an increased number of cycles per instruction.
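A minimal Python sketch of this two-field format follows. The field widths (3-bit select, 5-bit value) and the unit names are assumptions chosen for illustration, not taken from any particular machine.

```python
# Hypothetical layout: 3-bit "field select" above a 5-bit "field value".
SELECT_BITS = 3
VALUE_BITS = 5
UNITS = ["source_a", "source_b", "destination", "alu_op", "sequencer"]


def encode(unit, value):
    """Pack a unit name and its setting into one control-store word."""
    if not 0 <= value < (1 << VALUE_BITS):
        raise ValueError("value does not fit in the field")
    return (UNITS.index(unit) << VALUE_BITS) | value


def decode(word):
    """Split a control-store word into (unit name, value)."""
    return UNITS[word >> VALUE_BITS], word & ((1 << VALUE_BITS) - 1)


def apply(words):
    """Each control-store word updates exactly one unit's setting."""
    settings = dict.fromkeys(UNITS, 0)
    for word in words:
        unit, value = decode(word)
        settings[unit] = value
    return settings


words = [encode("source_a", 3), encode("destination", 7), encode("alu_op", 1)]
settings = apply(words)

vertical_word_bits = SELECT_BITS + VALUE_BITS    # bits per word in this format
horizontal_word_bits = len(UNITS) * VALUE_BITS   # bits if every unit had its own field
```

Under these assumptions each word is 8 bits instead of 25, but setting three units takes three words (and so three cycles) rather than one, which is the trade-off the paragraph above describes.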

As transistors became cheaper, horizontal microcode came to dominate the design of CPUs using microcode, and vertical microcode fell out of use.

Writable control stores

A few computers were built using "writable microcode": rather than storing the microcode in ROM or hard-wired logic, the microcode was stored in a RAM called a Writable Control Store or WCS. Such a computer is sometimes called a Writable Instruction Set Computer or WISC. Many of these machines were experimental laboratory prototypes, such as the WISC CPU/16 and the RTX 32P.

There were also commercial machines that used writable microcode, such as early Xerox workstations, the DEC VAX 8800 ("Nautilus") family, the Symbolics L- and G-machines, and a number of IBM System/370 implementations. Some DEC PDP-10 machines stored their microcode in SRAM chips (about 80 bits wide x 2 Kwords), which was typically loaded on power-on through some other front-end CPU. Many more machines offered user-programmable writable control stores as an option (including the HP 2100 and DEC PDP-11/60 minicomputers). WCS offered several advantages including the ease of patching the microprogram and, for certain hardware generations, faster access than ROMs could provide. User-programmable WCS allowed the user to optimize the machine for specific purposes.
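The contrast between a ROM control store and a writable one can be sketched as follows. The class, its method names, and the microinstruction values are hypothetical; real machines loaded their control stores through machine-specific hardware paths.

```python
class ReadOnlyControlStoreError(Exception):
    """Raised when code tries to rewrite a ROM-style control store."""


class ControlStore:
    """Toy control store: a fixed-size array of microinstruction words."""

    def __init__(self, words, writable):
        self._words = list(words)
        self.writable = writable

    def fetch(self, address):
        """Return the microinstruction word at the given address."""
        return self._words[address]

    def load(self, address, new_words):
        """Overwrite microinstructions starting at address (WCS only)."""
        if not self.writable:
            raise ReadOnlyControlStoreError("microcode is fixed in ROM")
        if address < 0 or address + len(new_words) > len(self._words):
            raise IndexError("patch does not fit in the control store")
        self._words[address:address + len(new_words)] = new_words


original = [0x10, 0x20, 0x30, 0x40]      # made-up microinstruction words
rom = ControlStore(original, writable=False)
wcs = ControlStore(original, writable=True)
wcs.load(1, [0x21, 0x31])                # patch two microinstructions in place
```

Calling `rom.load(...)` raises `ReadOnlyControlStoreError`, while the same call on `wcs` replaces the affected words and leaves the rest of the microprogram untouched.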

Some CPU designs compile the instruction set to a writable RAM or FLASH inside the CPU (such as the Rekursiv processor and the Imsys Cjip), or an FPGA (reconfigurable computing). The Western Digital MCP-1600 is an older example, using a dedicated, separate ROM for microcode.

A CPU that uses microcode generally takes several clock cycles to execute a single instruction, one clock cycle for each step in the microprogram for that instruction. Some CISC processors include instructions that can take a very long time to execute. Such variations interfere with both interrupt latency and, what is far more important in modern systems, pipelining.
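The relationship between microprogram length and execution time can be illustrated with a few lines of Python. Apart from the two-tick JUMP described earlier, the instruction names and step counts are invented, and the model assumes that interrupts are only sampled between machine instructions.

```python
# Hypothetical microprogram lengths, in microinstructions (one per clock cycle).
MICROPROGRAM_STEPS = {"JUMP": 2, "ADD": 3, "BLOCK_MOVE": 40}


def cycles(program):
    """Total clock cycles for a sequence of machine instructions."""
    return sum(MICROPROGRAM_STEPS[instruction] for instruction in program)


def worst_case_interrupt_wait():
    # Assumption: an interrupt must wait for the current instruction's
    # microprogram to finish, so the longest microprogram bounds the wait.
    return max(MICROPROGRAM_STEPS.values())


program_cycles = cycles(["ADD", "ADD", "JUMP"])
longest_wait = worst_case_interrupt_wait()
```

In this toy model a single long-running instruction such as the made-up `BLOCK_MOVE` dominates the worst-case wait, even if it is rarely executed.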

Several Intel CPUs in the IA-32 architecture family have writable microcode. This has allowed bugs in the Intel Core 2 microcode and Intel Xeon microcode to be fixed in software, rather than requiring the entire chip to be replaced. Such fixes can be installed by Linux, Microsoft Windows, or the motherboard BIOS.
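On Linux, one way to see which microcode revision is currently loaded is to inspect `/proc/cpuinfo`. The Python sketch below assumes the file contains a `microcode` field holding a hexadecimal revision for each logical processor, as it does on typical x86 kernels; the field may be missing on other kernels or architectures, and the sample text is made up.

```python
def microcode_revisions(cpuinfo_text):
    """Return the sorted set of microcode revisions found in cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        # Assumption: the field is named "microcode" and holds a hex number.
        if separator and key.strip() == "microcode":
            revisions.add(int(value.strip(), 16))
    return sorted(revisions)


# Made-up sample in the general shape of /proc/cpuinfo.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "microcode\t: 0xf0\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
    "microcode\t: 0xf0\n"
)

sample_revisions = microcode_revisions(SAMPLE_CPUINFO)
```

On a real system the function could be fed the contents of `/proc/cpuinfo`; seeing more than one revision in the result would suggest that the logical processors are not all running the same microcode.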

Risks

Linux (on x86 PCs) has a patch program that fixes faulty CPU microcode. Across UNIX and UNIX-like operating systems on Intel and Intel x86-compatible PCs, there has been an ongoing requirement to patch erroneous microcode since the FPU division problem (the Pentium FDIV bug) that affected some Pentiums.
  • Microsoft Windows also has similar patches, but since Windows XP it has generally not labeled them as such.
  • So far, microcode patches are known only for x86 CPUs; none are known for RISC CPUs or general-purpose DSPs.


Microcode versus VLIW and RISC

The design trend toward heavily microcoded processors with complex instructions began in the early 1960s and continued until roughly the mid-1980s. At that point the RISC design philosophy started becoming more prominent. Its arguments included the following points:

  • Analysis shows complex instructions are rarely used, hence the machine resources devoted to them are largely wasted.
  • Programming has largely moved away from assembly level, so it's no longer worthwhile to provide complex instructions for productivity reasons.
  • The machine resources devoted to rarely-used complex instructions are better used for expediting performance of simpler, commonly-used instructions.
  • Complex microcoded instructions requiring many, varying clock cycles are difficult to pipeline for increased performance.
  • Simpler instruction sets allow direct execution by hardware, avoiding the performance penalty of microcoded execution.


There are counterpoints as well:
  • The complex instructions in heavily microcoded implementations may not take much extra machine resources (except microcode space); for instance, the same ALU is often used to calculate an effective address as well as to compute the result from the actual operands.
  • Non-RISC instructions (i.e. those involving direct memory operands) are frequently used by modern compilers; even immediate-to-stack (i.e. memory result) arithmetic operations are commonly employed. Although such memory operations, often with varying length encodings (i.e. the "CISC" characteristics), are more difficult to pipeline, doing so is still fully feasible, as exemplified by the Intel 486, Cyrix 6x86, etc.
  • Non-RISC instructions inherently perform more work per instruction (on average), and are also normally highly encoded, so they enable smaller overall size of the same program, and thus better use of limited cache memories.
  • Modern CISC implementations, most notably the x86, implement most instructions and all addressing modes "in hardware"; microcode is still used, however, for some really complex or very special instructions (such as CPUID), as well as for internal "housekeeping".


Many RISC and VLIW processors are designed to execute every instruction (as long as it is in the cache) in a single cycle. This is very similar to the way CPUs with microcode execute one microinstruction per cycle. VLIW processors have instructions that behave similarly to very wide horizontal microcode, although typically without such fine-grained control over the hardware as provided by microcode. RISC instructions are sometimes similar to the narrow vertical microcode.

See also

  • Firmware
  • Control unit
  • Finite state machine
  • Microsequencer
  • Microassembler
  • Control store
  • Execution unit
  • Arithmetic logic unit
  • Floating point unit
  • Instruction pipeline
  • Superscalar
  • Microarchitecture
  • CPU design


Further reading

  • Tucker, S. G., IBM Systems Journal, Volume 6, Number 4, pp. 222-241 (1967)


External links