Zipper (data structure)
Encyclopedia
A zipper is a technique of representing an aggregate data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

 so that it is convenient for writing programs that traverse the structure arbitrarily and update its contents, especially in purely
Purely functional
Purely functional is a term in computing used to describe algorithms, data structures or programming languages that exclude destructive modifications...

 functional programming languages. The zipper was described by Gérard Huet
Gérard Huet
Gérard Pierre Huet is a French computer scientist.- Biography :Gérard Huet graduated from the Université Denis Diderot , Case Western Reserve University, and the Université de Paris....

 in 1997. It includes and generalizes the gap buffer
Gap buffer
A gap buffer In computer science is a dynamic array that allows efficient insertion and deletion operations clustered near the same location. Gap buffers are especially common in text editors, where most changes to the text occur at or near the current location of the cursor. The text is stored in...

 technique sometimes used with arrays.

The zipper technique is general in the sense that it can be adapted to lists, trees
Tree (data structure)
In computer science, a tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.Mathematically, it is an ordered directed tree, more specifically an arborescence: an acyclic connected graph where each node has zero or more children nodes and at...

, and other inductively-defined data structures.
Such modified data structures are usually referred to as "a tree with zipper" or "a list with zipper" to emphasize that the structure is conceptually a tree or list, while the zipper is a detail of the implementation.

Example: Bidirectional list traversal

Many common data structures in computer science can be expressed as the structure generated by a few primitive constructor operations or observer operations.
These include the structure of finite lists, which can be generated by two operations:
  • Empty: Constructs an empty list
  • Insert(x, L): Constructs a list by inserting value x in front of list L


The list [1, 2, 3] is then constructed as Insert(1, Insert(2, Insert(3, Empty))). It is possible to describe the location of a value in a list as the number of steps from the front of the list to that value.
More formally, a location is the number of additional Insert operations used to construct the whole list, after a particular value was inserted.

A context for a location in the list is constructed by recording not just the number of Insert operations, but all of the other information about them—namely, the values that were inserted.
These are represented in a separate list that is reversed from the order of the original data structure.
Specifically, the context of "3" in the list [1, 2, 3] is represented as [2, 1].
A list with a zipper represents the entire structure, and a location within the structure.
This is a pair consisting of the location's context, and the part of the structure that begins at the location. The list [1, 2, 3, 4] with a reference to the "3" is represented as ([2, 1], [3, 4]).

With the list represented this way, it is easy to define efficient operations that move the location forward or backward and manipulate the list at that location, for example by inserting or removing items.
Similarly, applying the zipper transformation to a tree makes it easy to insert or remove values at a particular location in the tree.

Uses

The zipper is often used where there is some concept of 'focus'
Focus (computing)
In computing, the focus indicates the component of the graphical user interface which is currently selected to receive input. Text entered at the keyboard or pasted from a clipboard is sent to the component which currently has the focus. Moving the focus away from a specific user interface element...

 or of moving around in some set of data, since its semantics reflect that of moving around but in a functional non-destructive manner.

The zipper has been used in
  • Xmonad
    Xmonad
    xmonad is a tiling window manager for the X Window System, written in the functional programming language Haskell.Begun in March 2007, it is similar to dwm, larswm, StumpWM and other members of the tiling window manager family, in that it arranges windows in a nonoverlapping tiled pattern and...

    , to manage focus and placement of windows
    Window (computing)
    In computing, a window is a visual area containing some kind of user interface. It usually has a rectangular shape that can overlap with the area of other windows...

  • Huet's papers cover a structural editor based on zippers and a theorem prover
  • A filesystem (ZipperFS) written in Haskell
    Haskell (programming language)
    Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...

     offering "...transactional semantics; undo of any file and directory operation; snapshots; statically guaranteed the strongest, repeatable read, isolation mode for clients; pervasive copy-on-write for files and directories; built-in traversal facility; and just the right behavior for cyclic directory references."

Zipper contexts and differentiation

It has been shown that the type of the items in the context list produced by the zipper transformation is the "derivative" of the original type in a sense that is loosely related to differentiation
Derivative
In calculus, a branch of mathematics, the derivative is a measure of how a function changes as its input changes. Loosely speaking, a derivative can be thought of as how much one quantity is changing in response to changes in some other quantity; for example, the derivative of the position of a...

 in calculus
Calculus
Calculus is a branch of mathematics focused on limits, functions, derivatives, integrals, and infinite series. This subject constitutes a major part of modern mathematics education. It has two major branches, differential calculus and integral calculus, which are related by the fundamental theorem...

. Intuitively, each item represents the "difference" between an internal substructure and its next containing structure.
When the types are described in a particular way, the representation of the original type looks like a polynomial
Polynomial
In mathematics, a polynomial is an expression of finite length constructed from variables and constants, using only the operations of addition, subtraction, multiplication, and non-negative integer exponents...

, and the representation of the type of context items looks like the derivative of that polynomial.

For instance, a binary tree
Binary tree
In computer science, a binary tree is a tree data structure in which each node has at most two child nodes, usually distinguished as "left" and "right". Nodes with children are parent nodes, and child nodes may contain references to their parents. Outside the tree, there is often a reference to...

 with nodes of type A can be represented by the inductive definition: 1+A×T2 → T, that is: a tree is either empty, or a tuple consisting of a value of type A and two subtrees. The derivative of 1+A×T2 is 2×A×T, that is: a tuple consisting of a binary flag, a value of type A and a sibling tree.
This is precisely the information needed, given a particular subtree, to construct its parent tree.
The boolean flag is an indicator of whether the subtree was on the left or right, the value of type A is the value at the parent, and the value of type T is the sibling subtree.
Then, one can reconstruct a tree given one of its subtrees and a list of such derivatives, called a path.
The subtree has been effectively separated out (or pointed out) and each item in the list contains the information to reconstruct a node along the path from subtree to the top of the tree.

Direct modification

In a non-purely-functional programming language, it may be more convenient to simply traverse the original data structure and modify it directly (perhaps after deep cloning
Object copy
An object copy is an action in computing where a data object has its attributes copied to another object of the same data type. An object is a composite data type in object-oriented programming languages. The copying of data is one of the most common procedures that occurs in computer programs...

 it, to avoid affecting other code that might hold a reference to it).

Generic zipper

The Generic Zipper is a technique to achieve the same goal as the conventional zipper without creating any type-specific data structure. It uses continuation
Continuation
In computer science and programming, a continuation is an abstract representation of the control state of a computer program. A continuation reifies the program control state, i.e...

s to allow revisiting earlier nodes. It also keeps track of which nodes contain changes, to enable sharing. (The Haskell code given in the reference uses generic programming to generate a traversal function for any data structure, but this is optional - any suitable traversal function can be used.)

However, the Generic Zipper involves inversion of control
Inversion of Control
In software engineering, Inversion of Control is an abstract principle describing an aspect of some software architecture designs in which the flow of control of a system is inverted in comparison to procedural programming....

, so some uses of it require a state machine (or equivalent) to keep track of what to do next.

Further reading


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK