INMOS transputer
Encyclopedia
The transputer was a pioneering microprocessor
Microprocessor
A microprocessor incorporates the functions of a computer's central processing unit on a single integrated circuit, or at most a few integrated circuits. It is a multipurpose, programmable device that accepts digital data as input, processes it according to instructions stored in its memory, and...

 architecture of the 1980s, featuring integrated memory and serial communication
Serial communication
In telecommunication and computer science, serial communication is the process of sending data one bit at a time, sequentially, over a communication channel or computer bus. This is in contrast to parallel communication, where several bits are sent as a whole, on a link with several parallel channels...

 links, intended for parallel computing
Parallel computing
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...

. It was designed and produced by Inmos
INMOS
Inmos Limited was a British semiconductor company, founded by Iann Barron, with both the head office and the design office at Aztec West in Bristol, it was incorporated in November 1978.- Products :...

, a British
United Kingdom
The United Kingdom of Great Britain and Northern IrelandIn the United Kingdom and Dependencies, other languages have been officially recognised as legitimate autochthonous languages under the European Charter for Regional or Minority Languages...

 semiconductor
Semiconductor device
Semiconductor devices are electronic components that exploit the electronic properties of semiconductor materials, principally silicon, germanium, and gallium arsenide, as well as organic semiconductors. Semiconductor devices have replaced thermionic devices in most applications...

 company based in Bristol
Bristol
Bristol is a city, unitary authority area and ceremonial county in South West England, with an estimated population of 433,100 for the unitary authority in 2009, and a surrounding Larger Urban Zone with an estimated 1,070,000 residents in 2007...

.

For some time in the late 1980s many considered the transputer to be the next great design for the future of computing. While Inmos and the transputer did not ultimately live up to this expectation, the transputer architecture was highly influential in provoking new ideas in computer architecture
Computer architecture
In computer science and engineering, computer architecture is the practical art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals and the formal modelling of those systems....

, several of which have re-emerged in different forms in modern systems.

Background

In the early 1980s, conventional CPUs
Central processing unit
The central processing unit is the portion of a computer system that carries out the instructions of a computer program, to perform the basic arithmetical, logical, and input/output operations of the system. The CPU plays a role somewhat analogous to the brain in the computer. The term has been in...

 appeared to reach a performance limit. Up to that time, manufacturing difficulties limited the amount of circuitry designers could place on a chip. Continued improvements in the fabrication process, however, removed this restriction. Soon the problem became that the chips could hold more circuitry than the designers knew how to use. Traditional CISC
Complex instruction set computer
A complex instruction set computer , is a computer where single instructions can execute several low-level operations and/or are capable of multi-step operations or addressing modes within single instructions...

 designs were reaching a performance plateau, and it wasn't clear it could be overcome.

It seemed that the only way forward was to increase the use of parallelism, the use of several CPUs that would work together to solve several tasks at the same time. This depended on the machines in question being able to run several tasks at once, a process known as multitasking
Computer multitasking
In computing, multitasking is a method where multiple tasks, also known as processes, share common processing resources such as a CPU. In the case of a computer with a single CPU, only one task is said to be running at any point in time, meaning that the CPU is actively executing instructions for...

. This had generally been too difficult for previous CPU designs to handle, but more recent designs were able to accomplish it effectively. It was clear that in the future this would be a feature of all operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

s.

A side effect of most multitasking design is that it often also allows the processes to be run on physically different CPUs, in which case it is known as multiprocessing
Multiprocessing
Multiprocessing is the use of two or more central processing units within a single computer system. The term also refers to the ability of a system to support more than one processor and/or the ability to allocate tasks between them...

. A low-cost CPU built with multiprocessing in mind could allow the speed of a machine to be increased by adding more CPUs, potentially far more cheaply than by using a single faster CPU design.

The first transputer designs were due to David May
David May (computer scientist)
Michael David May, born February 24, 1951, is a British computer scientist. He is Professor of Computer Science at the University of Bristol and founder and Chief Technology Officer of XMOS Semiconductor.May was lead architect for the transputer...

 and Robert Milne. In 1990, May received an Honorary DSc from University of Southampton
University of Southampton
The University of Southampton is a British public university located in the city of Southampton, England, a member of the Russell Group. The origins of the university can be dated back to the founding of the Hartley Institution in 1862 by Henry Robertson Hartley. In 1902, the Institution developed...

, followed in 1991 by his election as a Fellow of The Royal Society and the award of the Patterson Medal of the Institute of Physics
Institute of Physics
The Institute of Physics is a scientific charity devoted to increasing the practice, understanding and application of physics. It has a worldwide membership of around 40,000....

 in 1992.
Tony Fuge, a leading engineer at Inmos at the time, was awarded the Prince Philip Designers Prize in 1987 for his work on the T414 transputer.

Design

The transputer (the name deriving from transistor and computer) was the first general purpose microprocessor designed specifically to be used in parallel computing
Parallel computing
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...

 systems. The goal was to produce a family of chips ranging in power and cost that could be wired together to form a complete parallel computer. The name was selected to indicate the role the individual transputers would play: numbers of them would be used as basic building blocks, just as transistor
Transistor
A transistor is a semiconductor device used to amplify and switch electronic signals and power. It is composed of a semiconductor material with at least three terminals for connection to an external circuit. A voltage or current applied to one pair of the transistor's terminals changes the current...

s had earlier.

Originally the plan was to make the transputer cost only a few dollars per unit. Inmos saw them being used for practically everything, from operating as the main CPU for a computer to acting as a channel controller for disk drives in the same machine. Spare cycles on any of these transputers could be used for other tasks, greatly increasing the overall performance of the machines.

Even a single transputer would have all the circuitry needed to work by itself, a feature more commonly associated with microcontroller
Microcontroller
A microcontroller is a small computer on a single integrated circuit containing a processor core, memory, and programmable input/output peripherals. Program memory in the form of NOR flash or OTP ROM is also often included on chip, as well as a typically small amount of RAM...

s. The intention was to allow transputers to be connected together as easily as possible, without the requirement for a complex bus (or motherboard
Motherboard
In personal computers, a motherboard is the central printed circuit board in many modern computers and holds many of the crucial components of the system, providing connectors for other peripherals. The motherboard is sometimes alternatively known as the mainboard, system board, or, on Apple...

). Power and a simple clock signal had to be supplied, but little else: RAM, a RAM controller, bus support and even an RTOS were all built in.

Architecture

The original transputer used a very simple and rather unique architecture to achieve a high performance in a small area. It used microcode
Microcode
Microcode is a layer of hardware-level instructions and/or data structures involved in the implementation of higher level machine code instructions in many computers and other processors; it resides in special high-speed memory and translates machine instructions into sequences of detailed...

 as the principal method of controlling the data path but unlike other designs of the time, many instructions took only a single cycle to execute. Instruction opcodes were used as the entry points to the microcode ROM and the outputs from the ROM were fed directly to the data path. For multi-cycle instructions, while the data path was performing the first cycle, the microcode decoded four possible options for the second cycle. The decision as to which of these options would actually be used could be made near the end of the first cycle. This allowed for very fast operation while keeping the architecture generic.

The clock speed of 20 MHz was quite high for the era and the designers were very concerned about the practicalities of distributing a clock signal of this speed on a board. A lower external clock of 5 MHz was used and this was multiplied up to the required internal frequency using a phase-locked loop
Phase-locked loop
A phase-locked loop or phase lock loop is a control system that generates an output signal whose phase is related to the phase of an input "reference" signal. It is an electronic circuit consisting of a variable frequency oscillator and a phase detector...

 (PLL). The internal clock actually had four non-overlapping phases and designers were free to use whichever combination of these they wanted so it could be argued that the transputer actually ran at 80 MHz. Dynamic logic
Dynamic logic (digital logic)
In integrated circuit design, dynamic logic is a design methodology in combinatorial logic circuits, particularly those implemented in MOS technology. It is distinguished from the so-called static logic by exploiting temporary storage of information in stray and gate capacitances...

 was used in many parts of the design to reduce area and increase speed. Unfortunately, these techniques are difficult to combine with automatic test pattern generation
Automatic test pattern generation
ATPG is an electronic design automation method/technology used to find an input sequence that, when applied to a digital circuit, enables automatic test equipment to distinguish between the correct circuit behavior and the faulty circuit...

 scan testing so they fell out of favour for later designs.

Links

The basic design of the transputer included serial links that allowed it to communicate with up to four other transputers, each at 5, 10 or 20 Mbit/s – which was very fast for the 1980s. Any number of transputers could be connected together over even longish links (tens of metres) to form a single computing "farm". A hypothetical desktop machine might have two of the "low end" transputers handling I/O
Input/output
In computing, input/output, or I/O, refers to the communication between an information processing system , and the outside world, possibly a human, or another information processing system. Inputs are the signals or data received by the system, and outputs are the signals or data sent from it...

 tasks on some of their serial lines (hooked up to appropriate hardware) while they talked to one of their larger cousins acting as a CPU
Central processing unit
The central processing unit is the portion of a computer system that carries out the instructions of a computer program, to perform the basic arithmetical, logical, and input/output operations of the system. The CPU plays a role somewhat analogous to the brain in the computer. The term has been in...

 on another.

There were limits to the size of a system that could be built in this fashion. Since each transputer was linked to another in a fixed point-to-point layout, sending messages to a more distant transputer required the messages to be relayed by each chip on the line. This introduced a delay with every "hop" over a link, leading to long delays on large nets. To solve this problem Inmos also provided a zero-delay switch that connected up to 32 transputers (or switches) into even larger networks.

Booting

Transputers could be booted over the network links
Network booting
Network booting is the process of booting a computer from a network rather than a local drive. This method of booting can be used by routers, diskless workstations and centrally managed computers such as public computers at libraries and schools...

 (as opposed to the memory as in most machines) so a single transputer could start up the entire network. There was a pin called BootFromROM that when asserted caused the transputer to start two bytes from the top of memory (sufficient for up to a 256 byte backward jump, usually out of ROM). When this pin was not asserted, the first byte that arrived down any link was the length of a bootstrap to be downloaded, which was placed in low memory and run. The 'special' lengths of 0 and 1 were reserved for PEEK and POKE
PEEK and POKE
In computing, PEEK is a BASIC programming language extension used for reading the contents of a memory cell at a specified address. The corresponding command to set the contents of a memory cell is POKE.-Statement syntax:...

 – allowing inspection and changing of RAM in an unbooted transputer. After a peek (which required an address) or a poke (which took a word address, and a word of data – 16 or 32 bit depending on the basic word width of the transputer variant) the transputer would return to waiting for a bootstrap.

Scheduler

Supporting the links was additional circuitry that handled scheduling of the traffic over them. Processes waiting on communications would automatically pause while the networking circuitry finished its reads or writes. Other processes running on the transputer would then be given that processing time. It included two priority level
Priority level
Priority level or priority, in the Telecommunications Service Priority system, is the level that may be assigned to an NS/EP telecommunications service, which level specifies the order in which provisioning or restoration of the service is to occur relative to other NS/EP or non-NS/EP...

s to improve real-time
Real-time computing
In computer science, real-time computing , or reactive computing, is the study of hardware and software systems that are subject to a "real-time constraint"— e.g. operational deadlines from event to system response. Real-time programs must guarantee response within strict time constraints...

 and multiprocessor
Multiprocessor
Computer system having two or more processing units each sharing main memory and peripherals, in order to simultaneously process programs.Sometimes the term Multiprocessor is confused with the term Multiprocessing....

 operation. The same logical system was used to communicate between programs running on a single transputer, implemented as "virtual network links" in memory. So programs asking for any input or output automatically paused while the operation completed, a task that normally required the operating system to handle as the arbiter of hardware. Operating systems on the transputer did not have to handle scheduling: in fact, one could consider the chip itself to have an OS inside it.

To include all this functionality on a single chip, the transputer's core logic was simpler than most CPUs. While some have called it a RISC due to its rather spare nature (and because that was a desirable marketing buzzword
Buzzword
A buzzword is a term of art, salesmanship, politics, or technical jargon that is used in the media and wider society outside of its originally narrow technical context....

 at the time), it was heavily microcode
Microcode
Microcode is a layer of hardware-level instructions and/or data structures involved in the implementation of higher level machine code instructions in many computers and other processors; it resides in special high-speed memory and translates machine instructions into sequences of detailed...

d, had a limited register set, and complex memory-to-memory instructions, all of which place it firmly in the CISC
Complex instruction set computer
A complex instruction set computer , is a computer where single instructions can execute several low-level operations and/or are capable of multi-step operations or addressing modes within single instructions...

 camp. Unlike register-heavy load-store RISC CPUs, the transputer had only three data registers, which behaved as a stack. In addition a Workspace Pointer pointed to a conventional memory stack, easily accessible via the Load Local and Store Local instructions. This allowed for very fast context switch
Context switch
A context switch is the computing process of storing and restoring the state of a CPU so that execution can be resumed from the same point at a later time. This enables multiple processes to share a single CPU. The context switch is an essential feature of a multitasking operating system...

ing by simply changing the workspace pointer to the memory used by another process (a technique used in a number of contemporary designs, such as the TMS9900). The three register stack contents were not preserved past certain instructions, like Jump, when the transputer could do a context switch.

Instruction set

The transputer instruction set comprised 8-bit instructions divided into opcode
Opcode
In computer science engineering, an opcode is the portion of a machine language instruction that specifies the operation to be performed. Their specification and format are laid out in the instruction set architecture of the processor in question...

 and operand
Operand
In mathematics, an operand is the object of a mathematical operation, a quantity on which an operation is performed.-Example :The following arithmetic expression shows an example of operators and operands:3 + 6 = 9\;...

 nibble
Nibble
In computing, a nibble is a four-bit aggregation, or half an octet...

s. The "upper" nibble contained the 16 possible primary instruction codes, making it one of the very few commercialized minimal instruction set computer
Minimal instruction set computer
Minimal Instruction Set Computer is a processor architecture with a very small number of basic operations and corresponding opcodes. Such instruction sets are commonly stack based rather than register based to reduce the size of operand specifiers. Such a stack machine architecture is inherently...

s. The "lower" nibble contained the single immediate constant operand, commonly used as an offset relative to the Workspace (memory stack) pointer. Two prefix
Prefix
A prefix is an affix which is placed before the root of a word. Particularly in the study of languages,a prefix is also called a preformative, because it alters the form of the words to which it is affixed.Examples of prefixes:...

 instructions allowed construction of larger constants by prepending their lower nibbles to the operands of following instructions. Additional instructions were supported via the Operate (Opr) instruction code, which decoded the constant operand as an extended zero-operand opcode, providing for almost endless and easy instruction set expansion as newer implementations of the transputer were introduced.

The 16 'primary' one-operand instructions were:
Mnemonic Description
J Jump — add immediate operand to instruction pointer.
LDLP Load Local Pointer — load a Workspace-relative pointer onto the top of the register stack
PFIX Prefix — general way to increase lower nibble of following primary instruction
LDNL Load non-local — load a value offset from address at top of stack
LDC Load constant — load constant operand onto the top of the register stack
LDNLP Load Non-local pointer — Load address, offset from top of stack
NFIX Negative prefix — general way to negate (and possibly increase) lower nibble
LDL Load Local — load value offset from Workspace
ADC Add Constant — add constant operand to top of register stack
CALL Subroutine call — push instruction pointer and jump
CJ Conditional jump — depending on value at top of register stack
AJW Adjust workspace — add operand to workspace pointer
EQC Equals constant — test if top of register stack equals constant operand
STL Store local — store at constant offset from workspace
STNL Store non-local — store at address offset from top of stack
OPR Operate — general way to extend instruction set


All these instructions take a constant, representing an offset or an arithmetic constant. If this constant was less than 16, all these instructions coded to a single byte.

The first 16 'secondary' zero-operand instructions (using the OPR primary instruction) were:
Mnemonic Description
REV Reverse — swap two top items of register stack
LB Load byte
BSUB Byte subscript
ENDP End process
DIFF Difference
ADD Add
GCALL General Call — swap top of stack and instruction pointer
IN Input — receive message
PROD Product
GT Greater Than — the only comparison instruction
WSUB Word subscript
OUT Output — send message
SUB Subtract
STARTP Start Process
OUTBYTE Output Byte — send single-byte message
OUTWORD Output word — send single-word message

TRAMs

To provide an easy means of prototyping, constructing and configuring multiple-transputer systems, Inmos introduced the TRAM (TRAnsputer Module) standard in 1987. A TRAM was essentially a building block daughterboard
Daughterboard
A daughterboard, daughtercard or piggyback board is a circuit board meant to be an extension or "daughter" of a motherboard , or occasionally of another card...

 comprising a transputer and, optionally, external memory and/or peripheral devices, with simple standardised connectors providing power, transputer links, clock and system signals. Various sizes of TRAM were defined, from the basic Size 1 TRAM (3.66 in by 1.05 in) up to Size 8 (3.66 in by 8.75 in). Inmos produced a range of TRAM motherboards for various host buses such as ISA
Industry Standard Architecture
Industry Standard Architecture is a computer bus standard for IBM PC compatible computers introduced with the IBM Personal Computer to support its Intel 8088 microprocessor's 8-bit external data bus and extended to 16 bits for the IBM Personal Computer/AT's Intel 80286 processor...

, MicroChannel
Microchannel
Microchannel can refer to* Basic structure used in microtechnology, see Microchannel .* Micro Channel architecture in computing...

 or VMEbus
VMEbus
VMEbus is a computer bus standard, originally developed for the Motorola 68000 line of CPUs, but later widely used for many applications and standardized by the IEC as ANSI/IEEE 1014-1987. It is physically based on Eurocard sizes, mechanicals and connectors , but uses its own signalling system,...

. TRAM links operate at 10 Mbits/s or 20 Mbits/s.

Software

Transputers were intended to be programmed using the occam programming language
Occam (programming language)
occam is a concurrent programming language that builds on the Communicating Sequential Processes process algebra, and shares many of its features. It is named after William of Ockham of Occam's Razor fame....

, based on the CSP
Communicating sequential processes
In computer science, Communicating Sequential Processes is a formal language for describing patterns of interaction in concurrent systems. It is a member of the family of mathematical theories of concurrency known as process algebras, or process calculi...

 process calculus
Process calculus
In computer science, the process calculi are a diverse family of related approaches for formally modelling concurrent systems. Process calculi provide a tool for the high-level description of interactions, communications, and synchronizations between a collection of independent agents or processes...

. In fact it is fair to say that the transputer was built specifically to run occam
Occam (programming language)
occam is a concurrent programming language that builds on the Communicating Sequential Processes process algebra, and shares many of its features. It is named after William of Ockham of Occam's Razor fame....

, even more so than contemporary CISC
Complex instruction set computer
A complex instruction set computer , is a computer where single instructions can execute several low-level operations and/or are capable of multi-step operations or addressing modes within single instructions...

 designs were built to run languages like Pascal or C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

. Occam supported concurrency
Concurrency (computer science)
In computer science, concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other...

 and channel-based inter-process or inter-processor communication as a fundamental part of the language. With the parallelism and communications built into the chip and the language interacting with it directly, writing code for things like device controllers became a triviality – even the most basic code could watch the serial ports for I/O, and would automatically sleep when there was no data.

The initial occam development environment for the transputer was the Inmos D700 Transputer Development System (TDS). This was an unorthodox integrated development environment incorporating an editor, compiler, linker and (post-mortem) debugger. The TDS was itself a transputer application written in occam. The TDS text editor was notable in that it was a folding editor
Folding editor
A folding editor is a text editor which supports text folding or code folding, a mechanism allowing the user to hide and reveal blocks of text—usually named...

, allowing blocks of code to be hidden and revealed, to make the structure of the code more apparent. Unfortunately, the combination of an unfamiliar programming language and equally unfamiliar development environment did nothing for the early popularity of the transputer. Later, Inmos would release more conventional occam cross-compilers, the occam 2 Toolsets.

Implementations of more mainstream programming languages, such as C, FORTRAN
Fortran
Fortran is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing...

, Ada
Ada (programming language)
Ada is a structured, statically typed, imperative, wide-spectrum, and object-oriented high-level computer programming language, extended from Pascal and other languages...

 and Pascal were also later released by both Inmos and third-party vendors. These usually included language extensions or libraries providing, in a less elegant way, occam-like concurrency and channel-based communication.

The transputer's lack of support for virtual memory inhibited the porting of mainstream variants of the UNIX
Unix
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and Joe Ossanna...

 operating system, though ports of UNIX-like
Unix-like
A Unix-like operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification....

 operating systems (such as Minix
Minix
MINIX is a Unix-like computer operating system based on a microkernel architecture created by Andrew S. Tanenbaum for educational purposes; MINIX also inspired the creation of the Linux kernel....

 and Idris
Idris (operating system)
Idris is a multi-tasking, Unix-like, multi-user, real-time operating system released by Whitesmiths, of Westford, Massachusetts. The product was commercially available from 1979 through 1988.-Background:...

 from Whitesmiths
Whitesmiths
Whitesmiths Ltd. was a software company based in Westford, Massachusetts. It sold a Unix-like operating system called Idris, as well as the first commercial C compiler...

) were produced. An advanced UNIX-like distributed operating system
Distributed operating system
A distributed operating system is the logical aggregation of operating system software over a collection of independent, networked, communicating, and spatially disseminated computational nodes. Individual system nodes each hold a discrete software subset of the global aggregate operating system...

, HeliOS
HeliOS
HeliOS was a Unix-like operating system for parallel computers developed and sold by Perihelion Software. It was most commonly used on various Transputer systems, but also supported other architectures. The system provided a micro-kernel that implemented a distributed name space and messaging...

, was also designed specifically for multi-transputer systems by Perihelion Software
Perihelion Software
Perihelion Software was a United Kingdom company founded in 1986 by Dr. Tim King along with a number of colleagues who had all worked together at MetaComCo on AmigaOS and written compilers for both the Amiga and the Atari ST....

.

Implementations

The first transputers were announced in 1983 and released in 1984.

In keeping with their role as microcontroller
Microcontroller
A microcontroller is a small computer on a single integrated circuit containing a processor core, memory, and programmable input/output peripherals. Program memory in the form of NOR flash or OTP ROM is also often included on chip, as well as a typically small amount of RAM...

-like devices, they included on-board RAM
Ram
-Animals:*Ram, an uncastrated male sheep*Ram cichlid, a species of freshwater fish endemic to Colombia and Venezuela-Military:*Battering ram*Ramming, a military tactic in which one vehicle runs into another...

 and a built-in RAM controller which enabled more memory to be added without any additional hardware. Unlike other designs, transputers did not include I/O lines: these were to be added with hardware attached to the existing serial links. There was one 'Event' line, similar to a conventional processor's interrupt line. Treated as a channel, a program could 'input' from the event channel, and proceed only after the event line was asserted.

All transputers ran from an external 5 MHz clock input; this was multiplied to provide the processor clock.

The transputer did not include an MMU
Memory management unit
A memory management unit , sometimes called paged memory management unit , is a computer hardware component responsible for handling accesses to memory requested by the CPU...

 or a virtual memory
Virtual memory
In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage , allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which...

 system.

Transputer variants (excepting the cancelled T9000) can be categorised into three groups: the 16-bit
16-bit
-16-bit architecture:The HP BPC, introduced in 1975, was the world's first 16-bit microprocessor. Prominent 16-bit processors include the PDP-11, Intel 8086, Intel 80286 and the WDC 65C816. The Intel 8088 was program-compatible with the Intel 8086, and was 16-bit in that its registers were 16...

 T2 series, the 32-bit
32-bit
The range of integer values that can be stored in 32 bits is 0 through 4,294,967,295. Hence, a processor with 32-bit memory addresses can directly access 4 GB of byte-addressable memory....

 T4 series and the 32-bit T8 series with 64-bit IEEE 754 floating-point support.

T2: 16-bit

The prototype 16-bit transputer was the S43, which lacked the scheduler and DMA-controlled block transfer on the links. At launch, the T212 and M212 (the latter with an on-board disk controller) were the 16-bit offerings. The T212 was available in 17.5 and 20 MHz processor clock speed ratings. The T212 was superseded by the T222, with on-chip RAM expanded from 2 kB to 4 kB, and, later, the T225. This added debugging breakpoint
Breakpoint
In software development, a breakpoint is an intentional stopping or pausing place in a program, put in place for debugging purposes. It is also sometimes simply referred to as a pause....

 support (by extending the instruction J 0) plus some extra instructions from the T800 instruction set. Both the T222 and T225 ran at 20 MHz.

T4: 32-bit

At launch, the T414 was the 32-bit offering. Originally, the first 32-bit variant was to be the T424, but fabrication difficulties meant that this was redesigned as the T414 with 2 kB on-board RAM instead of the intended 4 kB. The T414 was available in 15 and 20 MHz varieties.
The RAM was later reinstated to 4 kB on the T425 (in 20, 25 and 30 MHz varieties), which also added the J 0 breakpoint support and extra T800 instructions. The T400, released in September 1989, was a low-cost 20 MHz T425 derivative with 2 kB and two instead of four links, intended for the embedded systems market.

T8: floating point

The second-generation T800 transputer, introduced in 1987, had an extended instruction set. The most important addition was a 64-bit floating point unit and three additional registers for floating point, implementing the IEEE754-1985 floating point standard. It also had 4 kB of on-board RAM and was available in 20 or 25 MHz versions. Breakpoint support was added in the later T801 and T805, the former featuring separate address and data buses to improve performance. The T805 was also later available as a 30 MHz part.

An enhanced T810 was planned, which would have had more RAM, more and faster links, extra instructions and improved microcode, but this was cancelled around 1990.

Inmos also produced a variety of support chips for the transputer processors, such as the C004 32-way link switch and the C012 "link adapter" which allowed transputer links to be interfaced to an 8-bit data bus.

T400

Part of the original Inmos strategy was to make CPUs so small and cheap that they could be combined with other logic in a single device. Although SOCs as they are commonly known, are ubiquitous now, the concept was almost unheard of back in the early 1980s. Two projects were started in around 1983, the M212 and the 'TV-toy'. The M212 was based on a standard T212 core with the addition of a disk controller for the ST 506 and ST 412 Shugart standards. 'TV-toy' was to be the basis for a games console and was joint project between Inmos and Sinclair Research.

The links in the T212 and T414/T424 transputers had hardware DMA engines so that transfers could happen in parallel with execution of other processes. A variant of the design, known as the T400, not to be confused with a later transputer of the same name, was designed where the CPU handled these transfers. This reduced the size of the device considerably since 4 link engines were approximately the same size as the whole CPU. The T400 was intended to be used as a core in what were then called 'SOS' ('systems on silicon') devices, now better known as SOCs. It was this design that was to form part of TV-toy. The project was canceled in 1985.

T100

Although the previous SOC projects had had only limited success (the M212 was in fact sold for a time), many designers still firmly believed in the concept and in 1987, a new project, the T100 was started which combined an 8-bit version of the transputer CPU with configurable logic based on state machines. The transputer instruction set is based on 8-bit instructions and can easily be used with any word size which is a multiple of 8 bits. The target market for the T100 was to be bus controllers such as Futurebus, as well as an upgrade for the standard link adapters (C011 etc.). The project was stopped when the T840 (later to become the basis of the T9000) was started. The concept didn't die though and XMOS
XMOS
XMOS is a fabless semiconductor company that develops multi-core multi-threaded processors designed to execute several real-time tasks, DSP, and control flow all at once.-Company history:...

 now designs chips for this purpose.

Markets

While the transputer was simple but powerful compared to many contemporary designs, it never came close to meeting its goal of being used universally in both CPU and microcontroller roles. In the microcontroller realm, the market was dominated by 8-bit machines where cost was the only serious consideration. Here, even the T2s were too powerful and expensive for most users.

In the computer desktop/workstation
Workstation
A workstation is a high-end microcomputer designed for technical or scientific applications. Intended primarily to be used by one person at a time, they are commonly connected to a local area network and run multi-user operating systems...

 world, the transputer was fairly fast (operating at about 10 MIPS at 20 MHz). This was excellent performance for the early 1980s, but by the time the FPU-equipped T800 was shipping, other RISC designs had surpassed it. This could have been mitigated to a large extent if machines had used multiple transputers as planned, but T800s cost about $400 each when introduced, which meant a poor price/performance ratio. Few transputer-based workstation systems were designed; the most notable probably being the Atari Transputer Workstation
Atari Transputer Workstation
The Atari Transputer Workstation was a workstation class computer released by Atari Corporation in the late 1980s. Based on the INMOS transputer, the machine was considerably more powerful than anything available on the market at the time...

.

The transputer was more successful in the field of massively parallel
Massively parallel
Massively parallel is a description which appears in computer science, life sciences, medical diagnostics, and other fields.A massively parallel computer is a distributed memory computer system which consists of many individual nodes, each of which is essentially an independent computer in itself,...

 computing, where several vendors produced transputer-based systems in the late 1980s. These included Meiko
Meiko Scientific
Meiko Scientific Ltd. was a British supercomputer company based in Bristol, founded by members of the design team working on the INMOS transputer microprocessor.-History:...

 (founded by ex-Inmos employees), Floating Point Systems
Floating Point Systems
Floating Point Systems Inc. was a Beaverton, Oregon vendor of minisupercomputers. The company was founded in 1970 by former Tektronix engineer Norm Winningstad....

, Parsytec (picture) and Parsys. Several British academic institutions founded research activities in the application of transputer-based parallel systems, including Bristol Polytechnic's Bristol Transputer Centre and the University of Edinburgh
University of Edinburgh
The University of Edinburgh, founded in 1583, is a public research university located in Edinburgh, the capital of Scotland, and a UNESCO World Heritage Site. The university is deeply embedded in the fabric of the city, with many of the buildings in the historic Old Town belonging to the university...

's Edinburgh Concurrent Supercomputer
Edinburgh Concurrent Supercomputer
The Edinburgh Concurrent Supercomputer was a large Meiko Computing Surface supercomputer. This transputer-based, massively parallel system was installed at the University of Edinburgh during the late 1980s and early 1990s.- History :...

 Project. In addition, the Data Acquisition and Second Level Trigger systems of the High Energy Physics ZEUS
Zeus
In the ancient Greek religion, Zeus was the "Father of Gods and men" who ruled the Olympians of Mount Olympus as a father ruled the family. He was the god of sky and thunder in Greek mythology. His Roman counterpart is Jupiter and his Etruscan counterpart is Tinia.Zeus was the child of Cronus...

 Experiment for the HERA
Hadron Elektron Ring Anlage
HERA was a particle accelerator at DESY in Hamburg. It began operating in 1992. At HERA, electrons or positrons were collided with protons at a center of mass energy of 318 GeV. It was the only lepton-proton collider in the world while operating...

 collider at DESY was based on a network of over 300 synchronously clocked transputers divided into several subsystems. These controlled both the readout of the custom detector electronics and ran reconstruction algorithms for physics event selection.

The parallel processing capabilities of the transputer were put to use commercially for image processing by the world's largest printing company, RR Donnelley & Sons, in the early 1990s. The ability to quickly transform digital images in preparation for print gave RR Donnelley a significant edge over their competitors. This development was led by Michael Bengtson in the RR Donnelley Technology Center. Within a few years, the processing capability of even desktop computers pushed aside the need for custom multi-processing systems for RR Donnelley.

The German company Jäger Messtechnik used transputers for their early ADwin real-time data acquisition and control products.

T9000

Inmos improved on the performance of the T8 series transputers with the introduction of the T9000 (code-named H1 during development). The T9000 shared most features with the T800, but moved several pieces of the design into hardware and added several features for superscalar
Superscalar
A superscalar CPU architecture implements a form of parallelism called instruction level parallelism within a single processor. It therefore allows faster CPU throughput than would otherwise be possible at a given clock rate...

 support. Unlike the earlier models, the T9000 had a true 16 kB high-speed cache
CPU cache
A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations...

 (using random-replacement) instead of RAM, but also allowed it to be used as memory and included MMU-like functionality to handle all of this (known as the PMI). For additional speed the T9000 cached the top 32 locations of the stack, instead of three as in earlier versions.

The T9000 used a five stage pipeline for even more speed. An interesting addition was the grouper which would collect instructions out of the cache and group them into larger packages of 4 bytes to feed the pipeline faster. Groups then completed in a single cycle, as if they were single larger instructions working on a faster CPU.

The link system was upgraded to a new 100 MHz mode, but unlike the previous systems the links were no longer downwardly compatible. This new packet-based link protocol was called DS-Link and later formed the basis of the IEEE 1355
IEEE 1355
IEEE Standard 1355-1995, IEC 14575, or ISO 14575 is a data communications standard for Heterogeneous Interconnect . It is a low-cost, low latency, scalable serial interconnection system, originally intended for communication between large numbers of inexpensive computers. It lacks many of the...

 serial interconnect standard. The T9000 also added link routing hardware called the VCP (Virtual Channel Processor) which changed the links from point-to-point to a true network, allowing for the creation of any number of virtual channels on the links. This meant programs no longer had to be aware of the physical layout of the connections. A range of DS-Link support chips were also developed, including the C104 32-way crossbar switch, and the C101 link adapter.

Long delays in the T9000's development meant that the faster load-store designs were already outperforming it by the time it was to be released. In fact it consistently failed to reach its own performance goal of beating by a factor of ten the T800: when the project was finally cancelled it was still achieving only about 36 MIPS at 50 MHz. The production delays gave rise to the quip that the best host architecture for a T9000 was an overhead projector.

This was too much for Inmos, which did not have the funding needed to continue development. By this time, the company had been sold to SGS-Thomson (now STMicroelectronics
STMicroelectronics
STMicroelectronics is an Italian-French electronics and semiconductor manufacturer headquartered in Geneva, Switzerland.While STMicroelectronics corporate headquarters and the headquarters for EMEA region are based in Geneva, the holding company, STMicroelectronics N.V. is registered in Amsterdam,...

), whose focus was the embedded systems market, and eventually the T9000 project was abandoned. However, a comprehensively redesigned 32-bit transputer intended for embedded applications, the ST20 series, was later produced, utilising some technology developed for the T9000. The ST20 core was incorporated into chipsets for set-top box
Set-top box
A set-top box or set-top unit is an information appliance device that generally contains a tuner and connects to a television set and an external source of signal, turning the signal into content which is then displayed on the television screen or other display device.-History:Before the...

 and GPS applications.

ST20

Although not strictly a transputer itself, the ST20 was heavily influenced by the T4 and T9 and did in fact form the basis of the T450 which was arguably the last of the transputers. The mission of the ST20 was to be a reusable core in the then emerging SOC market. In fact the original name of the ST20 was the RMC or Reusable Micro Core. The architecture was loosely based on the original T4 architecture with a microcode-controlled data path. It was however a complete redesign, using VHDL as the design language and with an optimized (and rewritten) microcode compiler. The project was conceived as early as 1990 when it was realized that the T9 would be too big for many applications. Actual design work started in mid 1992. Several trial designs were done, ranging from a very simple RISC-style CPU with complex instructions implemented in software via traps to a rather complex superscalar design similar in concept to the Tomasulo algorithm
Tomasulo algorithm
The Tomasulo algorithm is a hardware algorithm developed in 1967 by Robert Tomasulo from IBM. It allows sequential instructions that would normally be stalled due to certain dependencies to execute non-sequentially...

. The final design looked very similar to the original T4 core although some simple instruction grouping and a 'workspace cache' were added to help with performance.

Legacy

Ironically, additional internal parallelism has been the driving force behind improvements in conventional CPU designs. Instead of explicit thread-level parallelism (such as that found in the transputer), CPU designs exploited implicit parallelism at the instruction-level, inspecting code sequences for data dependencies and issuing multiple independent instructions to different execution units. This is known as superscalar
Superscalar
A superscalar CPU architecture implements a form of parallelism called instruction level parallelism within a single processor. It therefore allows faster CPU throughput than would otherwise be possible at a given clock rate...

 processing. Superscalar processors are suited for optimising the execution of sequentially-constructed fragments of code. The combination of superscalar processing and speculative execution
Speculation
In finance, speculation is a financial action that does not promise safety of the initial investment along with the return on the principal sum...

 delivered a tangible performance increase on existing bodies of code – which were mostly written in Pascal, Fortran, C and C++. Given these substantial and regular performance improvements to existing code there was little incentive to rewrite software in languages or coding styles which expose more task-level parallelism.

Nevertheless, the model of cooperating concurrent processors can still be found in cluster computing
Cluster Computing
Cluster Computing: the Journal of Networks, Software Tools and Applications is a journal for parallel processing, distributed computing systems, and computer communication networks....

 systems that dominate supercomputer
Supercomputer
A supercomputer is a computer at the frontline of current processing capacity, particularly speed of calculation.Supercomputers are used for highly calculation-intensive tasks such as problems including quantum physics, weather forecasting, climate research, molecular modeling A supercomputer is a...

 design in the 21st century. Unlike the transputer architecture, the processing units in these systems typically utilize superscalar CPUs with access to substantial amounts of memory and disk storage, running conventional operating systems and network interfaces. Resulting from the more complex nodes, the software architecture used for coordinating the parallelism in such systems is typically far more heavyweight than in the transputer architecture.

The fundamental transputer motivation remains, yet was masked for over 20 years by the repeated doubling of transistor counts. Inevitably, microprocessor designers finally ran out of uses for the additional physical resources – almost at the same time when technology scaling began to hit its limits. Power consumption and therefore heat dissipation requirements render further clock rate
Clock rate
The clock rate typically refers to the frequency that a CPU is running at.For example, a crystal oscillator frequency reference typically is synonymous with a fixed sinusoidal waveform, a clock rate is that frequency reference translated by electronic circuitry into a corresponding square wave...

 increases infeasible. These factors lead the industry towards solutions a little different in essence from those proposed by Inmos.

The most powerful supercomputers in the world, based on designs from Columbia University and built as IBM Blue Gene
Blue Gene
Blue Gene is a computer architecture project to produce several supercomputers, designed to reach operating speeds in the PFLOPS range, and currently reaching sustained speeds of nearly 500 TFLOPS . It is a cooperative project among IBM Blue Gene is a computer architecture project to produce...

, are real-world incarnations of the transputer dream. They are vast assemblies of identical, relatively low-performance SoC chips.

Recent trends have also tried to solve the transistor dilemma in ways that would have been too futuristic even for Inmos. On top of adding components to the CPU die and placing multiple dies in one system, modern processors increasingly place multiple cores in a single die. The transputer designers struggled to fit even one core into its transistor budget. Today designers, working with a 1000-fold increase in transistors, can now typically place many. One of the most recent commercial developments has emerged from XMOS
XMOS
XMOS is a fabless semiconductor company that develops multi-core multi-threaded processors designed to execute several real-time tasks, DSP, and control flow all at once.-Company history:...

, which has developed a family of embedded multi-core multi-threaded processors which resonate strongly with the transputer and Inmos.

The transputer and Inmos both not only left a legacy on the computing world but also established Bristol, UK as a hub for microelectronic design and innovation.

Platforms currently using Transputers

The HETE-2
High Energy Transient Explorer
The High Energy Transient Explorer was an American astronomical satellite with international participation . The prime objective of HETE was to carry out the first multiwavelength study of gamma-ray bursts with UV, X-ray, and gamma-ray instruments mounted on a single, compact spacecraft...

 Spacecraft currently uses 4 x T805 transputers and 8 x DSP56001 yielding about 100 Mips
Instructions per second
Instructions per second is a measure of a computer's processor speed. Many reported IPS values have represented "peak" execution rates on artificial instruction sequences with few branches, whereas realistic workloads typically lead to significantly lower IPS values...

 of performance.

The Myriade micro-satellite
Miniaturized satellite
Miniaturized satellites or small satellites are artificial satellites of unusually low weights and small sizes, usually under . While all such satellites can be referred to as small satellites, different classifications are used to categorize them based on mass .One reason for miniaturizing...

 platform used in the Picard satellite uses a T805 transputer yielding around 4 MIPS

See also

  • David May
    David May (computer scientist)
    Michael David May, born February 24, 1951, is a British computer scientist. He is Professor of Computer Science at the University of Bristol and founder and Chief Technology Officer of XMOS Semiconductor.May was lead architect for the transputer...

    , transputer architect
  • Atari Transputer Workstation
    Atari Transputer Workstation
    The Atari Transputer Workstation was a workstation class computer released by Atari Corporation in the late 1980s. Based on the INMOS transputer, the machine was considerably more powerful than anything available on the market at the time...

  • IEEE 1355
    IEEE 1355
    IEEE Standard 1355-1995, IEC 14575, or ISO 14575 is a data communications standard for Heterogeneous Interconnect . It is a low-cost, low latency, scalable serial interconnection system, originally intended for communication between large numbers of inexpensive computers. It lacks many of the...

     data interconnect standard derived from T9000 DS-links
  • Meiko Computing Surface
  • iWarp
    IWarp
    iWarp was an experimental parallel supercomputer architecture developed as a joint project by Intel and Carnegie Mellon University. The project started in 1988, as a follow-up to CMU's previous WARP research project, in order to explore building an entire parallel-computing "node" in a single...

  • Ease programming language
    Ease programming language
    Ease is a general purpose parallel programming language, designed by Steven Ericsson-Zenith of Yale University. It combines the process constructs of CSP with logically shared data structures called contexts...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK