Fork (software development)
In software engineering, a project fork happens when developers take a legal copy of source code from one software package and start independent development on it, creating a distinct piece of software. The term implies not merely a development branch, but a split in the developer community, analogous to a religious schism.

Free and open source software is, by definition, software that may be forked from the original development team without prior permission and without violating copyright law. However, licensed forks of proprietary software (e.g. Unix) also happen.

Etymology

The term "fork" was first used in the sense of "branch" by Eric Allman in 1980, to describe forming branches in SCCS:
Creating a branch "forks off" a version of the program.


The term was in use on Usenet by 1983 for the process of creating a subgroup to move topics of discussion to.

"Fork" is not known to have been used in the sense of a community schism during the origins of Lucid Emacs (now XEmacs) (1991) or the BSDs (1993-94); Russ Nelson used the term "shattering" for this sort of fork in 1993, attributing it to John Gilmore. However, "fork" was in use in this sense by 1995 to describe the XEmacs split, and was an understood usage in the GNU Project by 1996.

Forking free and open source software

Free and open source software may be legally forked without the approval of those currently managing a software project or distributing the software, per the definitions of a "free software" license ("Freedom 3: The freedom to improve the program, and release your improvements to the public, so that the whole community benefits") and of "open source" ("3. Derived Works: redistribution of modifications must be allowed. (To allow legal sharing and to permit new features or repairs.)").

In free software, forks often result from a schism over differing goals or from personality clashes. In a fork, both parties start from nearly identical code bases, but typically only the larger group, or whoever controls the web site, will retain the full original name and the associated user community. Thus there is a reputation penalty associated with forking. The relationship between the different teams can be cordial or very bitter.

Forks are considered an expression of the freedom made available by free and open source software, but also a weakness, since they duplicate development efforts and can confuse users over which forked package to use. Developers have the option to collaborate and pool resources with free and open source software, but this is not ensured by free software licenses, only by a commitment to cooperation.

Eric S. Raymond, in his essay Homesteading the Noosphere, stated that "The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community". He notes in the Jargon File:
Forking is considered a Bad Thing—not merely because it implies a lot of wasted effort in the future, but because forks tend to be accompanied by a great deal of strife and acrimony between the successor groups over issues of legitimacy, succession, and design direction. There is serious social pressure against forking. As a result, major forks (such as the Gnu-Emacs/XEmacs
split, the fissioning of the 386BSD group into three daughter projects, and the short-lived GCC/EGCS split) are rare enough that they are remembered individually in hacker folklore.


In some cases, a fork can merge back into the original project or replace it. EGCS (the Experimental/Enhanced GNU Compiler System) was a fork from GCC which proved more vital than the original project and was eventually "blessed" as the official GCC project. Some have attempted to invoke this effect deliberately; for example, Mozilla Firefox started as an unofficial project within Mozilla that soon replaced the Mozilla Suite as the focus of development.

It is easy to declare a fork, but continuing independent development and support can require considerable effort. As such, forks without adequate resources can soon become inactive; one example is GoneME, a fork of GNOME by a former developer, which was soon discontinued despite attracting some publicity. Some well-known forks have enjoyed great success, however, such as the X.Org X11 server, a fork from XFree86 which gained widespread support from developers and users and notably sped up X development.

More recently, the use of distributed revision control (DVCS) tools has made the term "fork" less emotive. With a DVCS such as Mercurial or Git, the normal way to contribute to a project is to first 'fork' the repository, and later seek to have your changes integrated with the main repository. These tools have been designed to make creating, maintaining and merging branches (internal forks) much easier than with a centralised VCS, and in so doing they eliminate the difference between a branch and a fork from the point of view of the VCS tool. In addition, sites such as GitHub, Bitbucket and Launchpad provide free DVCS hosting with very easy-to-use support for this kind of forking, so that the technical, social and financial barriers to forking a source code repository are massively reduced. While forking the community necessarily remains costly and painful, having many forks of the source code has become a more natural and accepted part of the development process (blurring the distinction between forks and branches).
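
As a rough illustration of why a DVCS tool sees no difference between a branch and a fork, the following Python sketch models history as a chain of commits. Everything in it (the Commit class, the ancestors and merge_base helpers, and the commit names) is hypothetical and invented for this example; it is a simplified model under those assumptions, not how Mercurial or Git are actually implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """A hypothetical, simplified commit: a name plus a link to its parent."""
    name: str
    parent: Optional["Commit"] = None


def ancestors(commit):
    """Return the commit and all of its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = commit.parent
    return chain


def merge_base(a, b):
    """Return the most recent commit shared by both histories, or None."""
    seen = {c.name for c in ancestors(a)}
    for c in ancestors(b):
        if c.name in seen:
            return c
    return None


# Shared history of the original project.
root = Commit("initial")
shared = Commit("release-1.0", root)

# The original project keeps developing.
upstream = Commit("upstream-fix", shared)

# A "branch" inside the same repository and a "fork" in a separate copy
# are both just new commits whose parent lies in the shared history.
branch = Commit("feature-work", shared)
fork = Commit("fork-work", shared)

print(merge_base(upstream, branch).name)  # release-1.0
print(merge_base(upstream, fork).name)    # release-1.0
```

In this model, merging a fork back is the same operation as merging a branch: find the common ancestor and combine the changes made since. The costly part of a fork, the split in the community, does not appear in the data structure at all.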

Forks often restart version numbering from 0.1 or 1.0 even if the original software was at version 3.0, 4.0, or 5.0. An exception is when the forked software is designed to be a drop-in replacement for the original project, in which case, for example, forked version 5.2 is compatible with version 5.2 of the original software (as is the case with MariaDB and MySQL as of 2011).

Forking proprietary software

In proprietary software, the copyright is usually held by the employing entity, not by the individual software developers. Proprietary code is thus more commonly forked when the owner needs to develop two or more versions, such as a windowed version and a command line version, or versions for differing operating systems, such as a word processor for IBM PC compatible machines and Macintosh computers. Generally, such internal forks will concentrate on having the same look, feel, data format, and behavior between platforms so that a user familiar with one can also be productive on, or share documents generated on, the other. This is almost always an economic decision to generate a greater market share and thus pay back the associated extra development costs created by the fork.

Notable proprietary forks not of this kind are the many varieties of proprietary Unix, all derived from AT&T Unix and all called "Unix", but increasingly mutually incompatible. See Unix wars.

The BSD licenses permit forks to become proprietary software, and some say that commercial incentives thus make proprietisation almost inevitable. Examples include Mac OS X (based on NeXTSTEP and FreeBSD), Cedega and CrossOver (proprietary forks of Wine, though CrossOver tracks Wine and contributes considerably), EnterpriseDB (a fork of PostgreSQL, adding Oracle compatibility features), Fujitsu Supported PostgreSQL with their proprietary ESM storage system, and Netezza's proprietary highly scalable derivative of PostgreSQL. Some of these vendors contribute changes back to the community project, while others keep their changes as their own competitive advantage.

Other notable forks

  • Most Linux distributions are descended from other distributions, most being traceable back to Debian, Red Hat or Slackware. Since most of the content of a distribution is free and open source software, ideas and software interchange freely as is useful to the individual distribution. Merges (e.g., United Linux or Mandriva) are rare.
  • The game NetHack has spawned a number of variants using the original code, notably Slash'EM, and was itself a fork of Hack.
  • OpenSSH was forked from SSH because the license for SSH 2.x was non-free (even though the source was available); an older version of SSH 1.x, the last to have been licensed as free software, was used as the starting point. Within months, virtually all Linux distributions, BSD versions and even some proprietary Unixes had replaced SSH with OpenSSH.
  • Oracle's purchase of Sun Microsystems soon resulted in the forking of LibreOffice from OpenOffice.org, and MariaDB from MySQL, due to concerns about Oracle's commitment to open-source development.

External links

  • Forking (David A. Wheeler)
  • Right to Fork at Meatball Wiki.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.