Sorcerer's Apprentice Syndrome
Sorcerer's Apprentice Syndrome (SAS) is a particularly severe network protocol flaw, discovered in the original versions of TFTP (Trivial File Transfer Protocol). It was named after the "Sorcerer's Apprentice" segment of the animated film Fantasia, because its operation closely resembles the disaster that befalls the sorcerer's apprentice: the problem resulted in an ever-growing replication of every packet in the transfer. The flaw arose because the protocol designers did not take into account a known failure mode of the internetwork, which interacted with several details of TFTP's mechanisms to produce SAS.

Technical background

TFTP operates in simple lock-step: there is only ever one packet outstanding at any time, and every packet received by either party causes one packet to be sent in reply (until the transfer terminates). The original TFTP specification required the receiver of any packet to send the appropriate reply packet. Thus, the receipt of a block of data triggered the sending of an acknowledgement, and the receipt of an acknowledgement triggered the sending of the next data block. This sounds harmless, but it led to disaster.
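The original reply rule can be sketched as two small handlers. This is only an illustration: the tuple packet format and the function names are assumptions, not part of TFTP or of any real library.

```python
# Hypothetical packet representation: ("DATA", n) or ("ACK", n).

def naive_receiver_reply(packet):
    """Receiving end, original rule: every DATA block received is acknowledged."""
    kind, block = packet
    assert kind == "DATA"
    return ("ACK", block)


def naive_sender_reply(packet, last_block):
    """Sending end, original rule: every ACK received triggers the next block.

    Note that nothing here checks whether the same ACK was seen before.
    """
    kind, block = packet
    assert kind == "ACK"
    if block < last_block:
        return ("DATA", block + 1)
    return None  # the final block has been acknowledged
```

Because neither handler keeps any memory of earlier packets, a duplicated packet is answered exactly like the original one.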

Like all protocols designed to operate across an unreliable network, TFTP also includes timeouts. When a party does something to which it expects a reply from the other end (such as sending a packet), it starts a timer; if the timer expires before the reply has been received, it takes some action, usually re-sending the original packet.
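A minimal sketch of such a retransmission timer follows. The callables `send` and `wait_for_reply`, and the default values, are hypothetical placeholders for whatever transport an implementation actually uses.

```python
def send_with_retransmit(send, wait_for_reply, timeout_s=1.0, max_tries=5):
    """Send a packet and wait for its reply, re-sending on each timeout.

    send           -- hypothetical callable that transmits the packet once
    wait_for_reply -- hypothetical callable(timeout_s) that returns the reply,
                      or None if the timer expired first
    Returns (reply, number_of_transmissions).
    """
    for attempt in range(1, max_tries + 1):
        send()
        reply = wait_for_reply(timeout_s)
        if reply is not None:
            return reply, attempt
    raise TimeoutError(f"no reply after {max_tries} transmissions")
```

A timer of this kind cannot tell a lost reply from one that is merely late, so a late reply may still arrive after the packet has been re-sent.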

Details

SAS occurred when a packet was not lost in the internetwork, but rather simply delayed, and later successfully delivered, after a timeout had occurred (on either side).

The timeout caused a second copy of the previous packet to be generated, notionally to replace the 'lost' packet. However, the first copy was not lost, and since, according to the TFTP specification, receipt of any packet always forced the generation of a reply packet, two replies were generated (one to each copy). Those forced the generation of two replies to them, and so on. A typical scenario was as follows:
  • Computer S (source) sends data block X to computer D (destination)
  • Computer D receives block X, and sends an acknowledgement for X back to S
  • The packet containing the acknowledgement for X is delayed in the internetwork
  • Computer S times out, and resends data block X to D
  • Computer S receives the delayed acknowledgement for X, and sends data block X+1
  • Computer D receives the second copy of block X, and sends another acknowledgement for X back to S
  • Computer D receives block X+1, and sends an acknowledgement for X+1 back to S
  • Computer S receives the second acknowledgement for X, and sends a second copy of data block X+1
  • Computer S receives the acknowledgement for X+1, and sends data block X+2
  • Computer D receives the second copy of block X+1, and sends another acknowledgement for X+1 back to S
  • Computer D receives block X+2, and sends an acknowledgement for X+2 back to S


At this point the situation is stable and repeats: every packet from then on is duplicated (that is, two identical copies are sent across the internetwork).
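The scenario above can be reproduced with a small simulation. It is a sketch under simplifying assumptions: the internetwork is modelled as a single FIFO queue that loses nothing, and a delayed acknowledgement is modelled by letting the sender's timer fire once while that acknowledgement is still queued. The function name and parameters are illustrative only.

```python
from collections import deque


def copies_sent(num_blocks, spurious_timeouts):
    """Count the copies of each DATA block sent under the original reply rule.

    spurious_timeouts -- block numbers whose ACK is merely delayed, so that
                         the sender times out once and resends that block
    """
    sent = [0] * (num_blocks + 1)   # sent[n] = copies of DATA n (1-based)
    in_flight = deque()             # simplified internetwork: FIFO, no loss
    timed_out = set()

    def send_data(n):
        sent[n] += 1
        in_flight.append(("DATA", n))

    send_data(1)
    while in_flight:
        kind, n = in_flight.popleft()
        if kind == "DATA":
            in_flight.append(("ACK", n))    # D acknowledges every DATA block
            if n in spurious_timeouts and n not in timed_out:
                timed_out.add(n)            # S stops waiting for the ACK...
                send_data(n)                # ...and resends block n
        elif n < num_blocks:
            send_data(n + 1)                # S answers every ACK, even a duplicate
    return sent[1:]
```

For a five-block transfer with one delayed acknowledgement on block 2, `copies_sent(5, {2})` returns `[1, 2, 2, 2, 2]`: the single spurious retransmission doubles every later block.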

Even worse, the increased number of packets travelling across the internetwork was likely to cause congestion, which was in turn likely to delay a packet past the timeout yet again. That timeout would generate yet another duplicate packet, and from then on a third copy of each packet would be sent. At that point the situation would usually snowball, and further copies would be generated, hence the name given to this pattern of behaviour.

For a small file, the transfer would complete, and the duplicate packets would eventually drain from the internetwork. If the file were large, however, congestive collapse would result, and only when the transfer failed would the mass of packets drain from the internetwork.

Solution

The fix for SAS was quite simple: the TFTP specification was modified so that only the first instance of a received acknowledgement causes the next data block to be sent; further copies of the acknowledgement for a particular data block are ignored, which breaks the retransmission loop. In the new version of the protocol, a block is retransmitted only on timeout.
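The corrected sender behaviour might look like the sketch below. The class, its method names, and the assumption that the number of blocks is known in advance are illustrative choices, not taken from the specification.

```python
class FixedSender:
    """Sending side after the fix: duplicate ACKs never trigger a reply."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks   # assumed known up front for simplicity
        self.current = 1               # block currently awaiting its ACK
        self.done = False

    def on_ack(self, block):
        """Return the next packet to send, or None if nothing should be sent."""
        if self.done or block != self.current:
            return None                # duplicate or stale ACK: ignore it
        if self.current == self.num_blocks:
            self.done = True           # final block acknowledged
            return None
        self.current += 1
        return ("DATA", self.current)

    def on_timeout(self):
        """Only a timeout causes the current block to be resent."""
        return None if self.done else ("DATA", self.current)
```

With this rule a second acknowledgement for the same block returns `None`, so a spurious retransmission costs one extra packet exchange instead of duplicating the rest of the transfer.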

This change also makes it possible to simplify the implementation of the receiving end (often a bootstrap program written in a low-level language) by omitting its retransmission timer, since any lost packet causes the sender to time out and retransmit the last packet it sent. However, keeping the timer has its benefits, such as dealing with lost ACKs more efficiently.

Further reading

  • Bob Braden (editor), Requirements for Internet Hosts -- Application and Support (RFC 1123, USC/Information Sciences Institute, October 1989). See section 4.2.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 