R (programming language)

Overview
R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians for developing statistical software and for data analysis.
R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. S was created by John Chambers while at Bell Labs. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is now developed by the R Development Core Team, of which Chambers is a member. R is named partly after the first names of the first two R authors (Robert Gentleman and Ross Ihaka), and partly as a play on the name of S.

R is part of the GNU project. Its source code is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems. R uses a command line interface; however, several graphical user interfaces are available for use with R.

Statistical features


R provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. R is easily extensible through functions and extensions, and the R community is noted for its active contributions in terms of packages. There are some important differences between R and S, but much code written for S runs unaltered in R. Many of R's standard functions are written in R itself, which makes it easy for users to follow the algorithmic choices made. For computationally intensive tasks, C, C++, and Fortran code can be linked and called at run time. Advanced users can write C or Java code to manipulate R objects directly.

R is highly extensible through the use of user-submitted packages for specific functions or specific areas of study. Due to its S heritage, R has stronger object-oriented programming facilities than most statistical computing languages. Extending R is also eased by its permissive lexical scoping rules.
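
The effect of lexical scoping can be sketched with a closure: a function keeps access to the variables of the environment in which it was defined. The helper name make_counter below is hypothetical and used only for illustration.

```r
# make_counter() is a hypothetical helper used only for illustration.
make_counter <- function() {
  count <- 0                 # lives in the environment created by this call
  function() {
    count <<- count + 1      # `<<-` updates `count` in the enclosing environment
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()
counter_a()
counter_b()   # counter_b has its own, independent `count`
```

Each call to make_counter() creates a fresh environment, so the two counters do not share state.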

According to Rexer's Annual Data Miner Survey in 2010, R has become the data mining tool used by more data miners (43%) than any other.

Another strength of R is static graphics, which can produce publication-quality graphs, including mathematical symbols. Dynamic and interactive graphics are available through additional packages such as RGL.
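
As a minimal sketch of a static graph with mathematical symbols, the following writes a plot of the standard normal density to a PDF file using only base graphics. The choice of curve, the temporary file name and the figure size are assumptions made for the example.

```r
# Write a plot to a temporary PDF file; the file name is arbitrary.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 5, height = 4)

x <- seq(-3, 3, length.out = 200)
plot(x, dnorm(x), type = "l",
     xlab = expression(x),
     ylab = expression(phi(x)),
     main = expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2}))

dev.off()   # close the device so the file is completely written
```

The expression() calls are rendered as typeset mathematics in the axis labels and title rather than as plain text.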

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hard copy.

Programming features


R is an interpreted language typically used through a command line interpreter. If one types "2+2" at the command prompt and presses enter, the computer replies with "4".


> 2+2
[1] 4


Like many other languages, R supports matrix arithmetic. R's data structures include scalars, vectors, matrices, data frames (similar to tables in a relational database) and lists. The R object system has been extended by package authors to define objects for regression models, time-series and geo-spatial coordinates.
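
The data structures named above can be sketched in a few lines. The variable names and values are illustrative only.

```r
v  <- c(1.5, 2.5, 3.5)                       # numeric vector
m  <- matrix(1:6, nrow = 2, ncol = 3)        # 2 x 3 matrix, filled column by column
df <- data.frame(id = 1:3,                   # data frame: columns of equal length,
                 label = c("a", "b", "c"))   # possibly of different types
l  <- list(values = v, table = df)           # list: arbitrary, named components

length(5)        # a "scalar" is a vector of length one
m %*% t(m)       # matrix product, a 2 x 2 result
m * 2            # element-wise arithmetic
df$label[2]      # column access by name
l$values[3]      # list component access by name
```

Note that `*` works element by element, while `%*%` is the matrix product.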

R supports procedural programming with functions and, for some functions, object-oriented programming with generic functions. A generic function acts differently depending on the type of arguments it is passed. In other words, the generic function dispatches the function (method) specific to that type of object. For example, R has a generic print function that can print almost every type of object in R with a simple "print(objectname)" syntax.
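
Dispatch of this kind can be sketched with R's S3 system, in which a generic calls UseMethod() and the method is chosen by the class of the first argument. The generic describe and the classes "circle" and "square" are hypothetical names invented for the example.

```r
# describe() and the class names "circle" and "square" are hypothetical.
describe <- function(shape, ...) UseMethod("describe")

describe.circle <- function(shape, ...) {
  paste("circle with area", round(pi * shape$r^2, 2))
}
describe.square <- function(shape, ...) {
  paste("square with area", shape$side^2)
}
describe.default <- function(shape, ...) "unknown shape"

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")

describe(circle)   # dispatches to describe.circle
describe(square)   # dispatches to describe.square
describe(42)       # no matching class, falls back to describe.default
```

The built-in print function works the same way: print(objectname) selects a method according to the class of the object.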

Although R is mostly used by statisticians and other practitioners requiring an environment for statistical computation and software development, it can also be used as a general matrix calculation toolbox with performance benchmarks comparable to GNU Octave or MATLAB.
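
A short sketch of R used as a matrix calculator, with base functions only. The matrix A and vector b are arbitrary example values.

```r
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(3, 5)

x <- solve(A, b)     # solve the linear system A x = b
A %*% x              # reproduces b, up to floating-point error
t(A)                 # transpose
det(A)               # determinant
solve(A)             # matrix inverse
```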

Example 1


The following examples illustrate the basic syntax of the language and use of the command-line interface.

In R and S, the assignment operator is an arrow made from two characters "<-".

> x <- c(1,2,3,4,5,6) # Create ordered collection (vector)
> y <- x^2 # Square the elements of x
> print(y) # print (vector) y
[1] 1 4 9 16 25 36
> mean(y) # Calculate average (arithmetic mean) of (vector) y; result is scalar
[1] 15.16667
> var(y) # Calculate sample variance
[1] 178.9667
> lm_1 <- lm(y ~ x) # Fit a linear regression model "y = f(x)" or "y = B0 + (B1 * x)"
# store the results as lm_1
> print(lm_1) # Print the model from the (linear model object) lm_1

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -9.333        7.000

> summary(lm_1) # Compute and print statistics for the fit of the (linear model object) lm_1

Call:
lm(formula = y ~ x)

Residuals:
      1       2       3       4       5       6
 3.3333 -0.6667 -2.6667 -2.6667 -0.6667  3.3333

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -9.3333     2.8441  -3.282 0.030453 *
x             7.0000     0.7303   9.585 0.000662 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.055 on 4 degrees of freedom
Multiple R-squared: 0.9583, Adjusted R-squared: 0.9478
F-statistic: 91.88 on 1 and 4 DF, p-value: 0.000662

> par(mfrow=c(2, 2)) # Request 2x2 plot layout
> plot(lm_1) # Diagnostic plot of regression model
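
Beyond printing, the fitted model object can be queried from code. The sketch below refits the same model as a script (without the ">" prompt) and pulls out individual quantities with standard accessor functions. The new value x = 7 passed to predict() is an arbitrary choice for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- x^2
lm_1 <- lm(y ~ x)

coef(lm_1)                      # named vector: intercept and slope
residuals(lm_1)                 # one residual per observation
summary(lm_1)$r.squared         # multiple R-squared
predict(lm_1, newdata = data.frame(x = 7))   # fitted value at a new x
```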

Example 2


Short R code calculating the Mandelbrot set through the first 20 iterations of the equation z = z² + c, plotted for different complex constants c. This example demonstrates:
  • use of community-developed external libraries (called packages), in this case the caTools package
  • handling of complex numbers
  • multidimensional arrays of numbers used as basic data type, see variables C, Z and X


library(caTools) # external package providing write.gif function
jet.colors <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F",
                                 "yellow", "#FF7F00", "red", "#7F0000"))
m <- 1200 # define size
C <- complex( real=rep(seq(-1.8,0.6, length.out=m), each=m ),
              imag=rep(seq(-1.2,1.2, length.out=m), m ) )
C <- matrix(C,m,m) # reshape as square matrix of complex numbers
Z <- 0 # initialize Z to zero
X <- array(0, c(m,m,20)) # initialize output 3D array
for (k in 1:20) { # loop with 20 iterations
  Z <- Z^2+C # the central difference equation
  X[,,k] <- exp(-abs(Z)) # capture results
}
write.gif(X, "Mandelbrot.gif", col=jet.colors, delay=100)
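
The same iteration can be followed for single constants without any external package. The helper escapes below is hypothetical; it uses the standard escape criterion |z| > 2. A result of FALSE only means the point did not escape within the given number of steps, not that it is proven to belong to the set.

```r
# escapes() is a hypothetical helper: it iterates z <- z^2 + c0 from z = 0 and
# reports whether |z| exceeds 2 within `n` steps.
escapes <- function(c0, n = 20) {
  z <- 0
  for (k in seq_len(n)) {
    z <- z^2 + c0
    if (Mod(z) > 2) return(TRUE)
  }
  FALSE
}

escapes(complex(real = 0,  imaginary = 0))   # c = 0: z stays at 0
escapes(complex(real = -1, imaginary = 0))   # c = -1: z cycles between -1 and 0
escapes(complex(real = 1,  imaginary = 1))   # c = 1 + 1i: z grows quickly
```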

Packages


The capabilities of R are extended through user-created packages, which allow specialized statistical techniques, graphical devices, import/export capabilities, reporting tools, etc. These packages are developed primarily in R, and sometimes in Java, C and Fortran. A core set of packages is included with the installation of R, with more than 4300 available at the Comprehensive R Archive Network (CRAN), Bioconductor, and other repositories.
The "Task Views" page (subject list) on the CRAN website lists the wide range of applications (Finance, Genetics, Machine Learning, Medical Imaging, Social Sciences and Spatial statistics) to which R has been applied and for which packages are available.
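
Working with packages from the command line can be sketched as follows. The example uses stats, which is part of the core set, and checks for caTools (the package from Example 2), which may or may not be installed on a given machine. The download step is only named in a message, not executed.

```r
# Attach a package that ships with R; library() fails if it is not installed.
library(stats)

# Check for an optional package without attaching it.
if (requireNamespace("caTools", quietly = TRUE)) {
  message("caTools is available")
} else {
  message("caTools is missing; install.packages(\"caTools\") would fetch it from CRAN")
}

# Names of all packages installed in the current library paths.
installed <- rownames(installed.packages())
head(installed)
```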

Other R package resources include Crantastic, a community site for rating and reviewing all CRAN packages, and also R-Forge, a central platform for the collaborative development of R packages, R-related software, and projects. It hosts many unpublished, beta packages, and development versions of CRAN packages.

The Bioconductor project provides R packages for the analysis of genomic data, such as Affymetrix and cDNA microarray object-oriented data handling and analysis tools, and has started to provide tools for analysis of data from next-generation high-throughput sequencing methods.

Reproducible research and automated report generation can be accomplished with packages such as Sweave and odfWeave that support execution of R code embedded within LaTeX, OpenDocument format and other markups.
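
As a rough sketch of the Sweave workflow, the code below writes a tiny LaTeX document containing one R code chunk and one inline \Sexpr{} expression, then processes it. The file name report.Rnw and the document contents are assumptions made for the example. Turning the resulting .tex file into a PDF needs a separate LaTeX run, which is not shown.

```r
# A minimal Sweave source file; its contents are an illustrative assumption.
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<>>=",
  "x <- c(1, 2, 3, 4, 5, 6)",
  "mean(x^2)",
  "@",
  "The variance of the squares is \\Sexpr{round(var(x^2), 2)}.",
  "\\end{document}"
)

old_dir <- setwd(tempdir())        # Sweave writes its output to the working directory
writeLines(rnw_lines, "report.Rnw")
utils::Sweave("report.Rnw")        # runs the R chunk and produces report.tex
tex_lines <- readLines("report.tex")
setwd(old_dir)
```

Because the numbers in the report are computed when the document is processed, rerunning Sweave after the data change updates the text automatically.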

Milestones



The full list of changes is maintained in the NEWS file. Some highlights are listed below.
  • Version 0.16 – This is the last alpha version developed primarily by Ihaka and Gentleman. Much of the basic functionality from the "White Book" (see S history) was implemented. The mailing lists commenced on April 1, 1997.
  • Version 0.49 – April 23, 1997 – This is the oldest available source release, and compiles on a limited number of Unix-like platforms. CRAN is started on this date, with 3 mirrors that initially hosted 12 packages. Alpha versions of R for Microsoft Windows and Mac OS are made available shortly after this version.
  • Version 0.60 – December 5, 1997 – R becomes an official part of the GNU Project. The code is hosted and maintained on CVS.
  • Version 1.0.0 – February 29, 2000 – Considered by its developers stable enough for production use.
  • Version 1.4.0 – S4 methods are introduced and the first version for Mac OS X is made available soon after.
  • Version 2.0.0 – October 4, 2004 – Introduced lazy loading, which enables fast loading of data with minimal expense of system memory.
  • Version 2.1.0 – Support for UTF-8 encoding, and the beginnings of internationalization and localization for different languages.
  • Version 2.11.0 – April 22, 2010 – Support for Windows 64 bit systems.
  • Version 2.13.0 – April 14, 2011 – Adding a new compiler function that allows speeding up functions by converting them to byte-code.
  • Version 2.14.0 – October 31, 2011 – Added mandatory namespaces for packages. Added a new parallel package.
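
The byte-code compilation mentioned for version 2.13.0 can be tried with cmpfun() from the compiler package. The function sum_squares is a hypothetical example, and any speed-up depends on the R version and on the code being compiled.

```r
library(compiler)   # provides cmpfun()

# sum_squares() is a hypothetical function used only for illustration.
sum_squares <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i^2
  total
}

sum_squares_bc <- cmpfun(sum_squares)   # byte-compiled copy of the same function

sum_squares(100)
sum_squares_bc(100)   # same result; only the evaluation strategy differs
```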

Graphical user interfaces

  • RGUI – comes with the pre-compiled version of R
  • Java Gui for R – cross-platform stand-alone R terminal and editor based on Java (also known as JGR)
  • Deducer – GUI for menu driven data analysis (similar to SPSS/JMP/Minitab).
  • Rattle GUI – cross-platform GUI based on RGtk2 and specifically designed for data mining
  • R Commander – cross-platform menu-driven GUI based on tcltk (several plug-ins to Rcmdr are also available)
  • RapidMiner
  • RExcel – using R and Rcmdr from within Microsoft Excel
  • Red-R – visual analysis interface that uses R for statistics
  • RKWard – extensible GUI and IDE for R
  • R AnalyticFlow – analysis flowcharts with R (freeware)
  • RStudio – cross-platform open source IDE (which can also be run on a remote Linux server)
  • Weka – allows for the use of the data mining capabilities in Weka and statistical analysis in R.

Editors and IDEs


Text editors and integrated development environments (IDEs) with some support for R include: Bluefish, Crimson Editor, RStudio, ConTEXT, Eclipse, Emacs (Emacs Speaks Statistics), Vim, Tinn-R, Geany, jEdit, Kate, R Productivity Environment (part of Revolution R Enterprise), TextMate, gedit, SciTE, WinEdt (R Package RWinEdt), and Notepad++.

Scripting languages


R functionality has been made accessible from several scripting languages such as Python (by the RPy interface package), Perl (by the Statistics::R module) and Ruby (with the rsruby rubygem).
Scripting in R itself is possible via littler as well as via Rscript.
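
As a rough illustration of driving R from a scripting language, the sketch below calls the Rscript front end from Python through the standard subprocess module. It does not use RPy; it only relies on the `-e` option of Rscript, which evaluates an R expression given on the command line. The helper names `build_rscript_command` and `run_r_expression` are hypothetical and chosen for this example, and the code assumes Rscript is on the PATH (it returns None otherwise).

```python
import shutil
import subprocess
from typing import List, Optional


def build_rscript_command(expression: str) -> List[str]:
    """Return the argument list that asks Rscript to evaluate one R expression."""
    # "-e" tells Rscript to evaluate the following string as R code.
    return ["Rscript", "-e", expression]


def run_r_expression(expression: str) -> Optional[str]:
    """Evaluate an R expression with Rscript and return its standard output.

    Returns None when no Rscript executable is found on the PATH
    (an assumption of this sketch, not a feature of R itself).
    """
    if shutil.which("Rscript") is None:
        return None
    result = subprocess.run(
        build_rscript_command(expression),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text
        check=True,           # raise CalledProcessError if R exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    output = run_r_expression("cat(mean(c(1, 2, 3)))")
    print(output if output is not None else "Rscript not found on PATH")
```

Starting a new R process per call is simple but slow for repeated use; an in-process bridge such as the RPy package mentioned above avoids that startup cost.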

Comparison with SAS, SPSS and Stata


R is generally considered to compare well with other popular statistical packages, such as SAS, SPSS and Stata. In January 2009, the New York Times ran an article about R gaining acceptance among data analysts and posing a potential threat to the market share of commercial statistical packages, such as SAS.

Commercial support for R


In 2007, Revolution Analytics was founded to provide commercial support for Revolution R, its distribution of R, which also includes components developed by the company. Major additional components include: ParallelR, the R Productivity Environment IDE, RevoScaleR (for big data analysis), RevoDeployR, a web services framework, and the ability to read and write data in the SAS file format.

In October 2011, Oracle announced the Big Data Appliance, which integrates R, Apache Hadoop, Oracle Enterprise Linux, and a NoSQL database with the Exadata hardware.

Other major commercial software systems supporting connections to R include: Spotfire, SPSS, STATISTICA, Platform Symphony, and SAS.

See also



  • List of statistical packages
  • Comparison of statistical packages
  • List of numerical analysis software
  • Comparison of numerical analysis software
  • Free statistical software
  • Sweave
  • ggplot2


External links



  • Official website of the R project
  • The R wiki, a community wiki for R
  • R books, an extensive list (with brief comments) of R-related books
  • The R Graphical Manual, a collection of R graphics from all R packages, and an index to all functions in all R packages
  • R seek, a custom frontend to the Google search engine that helps find results related to the R language