OpenEpi
OpenEpi is a free, web-based, open source, operating system-independent series of programs for use in epidemiology, biostatistics, public health, and medicine, providing a number of epidemiologic and statistical tools for summary data. OpenEpi was developed in JavaScript and HTML, and can be run in modern web browsers. The program can be run from the OpenEpi website or downloaded and run without a web connection. The source code and documentation are downloadable and freely available for use by other investigators. OpenEpi has been reviewed both by media organizations and in research journals.

The OpenEpi developers have had extensive experience in the development and testing of Epi Info, a program developed by the Centers for Disease Control and Prevention (CDC) and widely used around the world for data entry and analysis. OpenEpi was developed to perform the analyses found in the StatCalc and EpiTable modules of the DOS version of Epi Info, to improve upon the types of analyses provided by these modules, and to provide a number of tools and calculations not currently available in Epi Info. It is the first step toward an entirely web-based set of epidemiologic software tools. OpenEpi can be thought of as an important companion to Epi Info and to other programs such as SAS, PSPP, SPSS, Stata, SYSTAT, Minitab, EpiData, and R (see the R programming language). Another functionally similar Windows-based program is WinPepi. See also list of statistical packages and comparison of statistical packages. Both OpenEpi and Epi Info were developed with the goal of providing tools for low and moderate resource areas of the world. The initial development of OpenEpi was supported by a grant from the Bill and Melinda Gates Foundation to Emory University.

The types of calculations currently performed by OpenEpi include:
  • Various confidence intervals for proportions, rates, standardized mortality ratio, mean, median, percentiles
  • 2x2 crude and stratified tables for count and rate data
  • Matched case-control analysis
  • Test for trend with count data
  • Independent t-test and one-way ANOVA
  • Diagnostic and screening test analyses with receiver operating characteristic (ROC) curves
  • Sample size for proportions, cross-sectional surveys, unmatched case-control, cohort, randomized controlled trials, and comparison of two means
  • Power calculations for proportions (unmatched case-control, cross-sectional, cohort, randomized controlled trials) and for the comparison of two means
  • Random number generator
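
As an illustration of the first item in the list, a confidence interval for a proportion can be computed from summary counts alone. The sketch below uses the Wilson score method, which is only one of several common interval methods; it is not taken from the OpenEpi source, and the function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    # The interval is centred on a value pulled slightly toward 0.5.
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width

# Illustrative data: 15 events observed among 100 subjects.
low, high = wilson_interval(15, 100)
print(f"proportion 0.15, 95% CI {low:.4f} to {high:.4f}")
```

Only the numerator and denominator are needed, which matches the summary-data style of input described above.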


For epidemiologists and other health researchers, OpenEpi performs a number of calculations based on tables not found in most epidemiologic and statistical packages. For example, for a single 2x2 table, in addition to the results presented in other programs, OpenEpi provides estimates for:
  • Etiologic or prevented fraction in the population and in exposed with confidence intervals, based on risk, odds, or rate data
  • The cross-product and MLE odds ratio estimate
  • Mid-p exact p-values and confidence limits for the odds ratio
  • Calculations of rate ratios and rate differences with confidence intervals and statistical tests.
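
Several of the single-table measures listed above follow directly from the four cell counts. The following is a minimal sketch, assuming textbook formulas: the cross-product odds ratio with a Wald (log-scale) interval, the risk ratio, and the etiologic fraction in the exposed and in the population based on risk. It does not reproduce the MLE or mid-p exact methods, and the function name and cell labels are hypothetical rather than OpenEpi's own.

```python
import math

def two_by_two_measures(a, b, c, d, z=1.96):
    """Summarize one 2x2 table of counts.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch needs all four cells to be positive")

    # Cross-product odds ratio with a Wald interval on the log scale.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )

    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_overall = (a + c) / (a + b + c + d)

    return {
        "odds_ratio": odds_ratio,
        "odds_ratio_ci": or_ci,
        "risk_ratio": risk_exposed / risk_unexposed,
        # Share of risk among the exposed that is attributable to exposure.
        "etiologic_fraction_exposed": (risk_exposed - risk_unexposed) / risk_exposed,
        # Share of risk in the whole population that is attributable to exposure.
        "etiologic_fraction_population": (risk_overall - risk_unexposed) / risk_overall,
    }

# Illustrative counts only.
result = two_by_two_measures(20, 80, 10, 90)
for name, value in result.items():
    print(name, value)
```

With these illustrative counts the odds ratio is 2.25 and the risk ratio is 2.0, so half of the risk among the exposed is attributed to the exposure.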


For stratified 2x2 tables with count data, OpenEpi provides:
  • Mantel-Haenszel (MH) and precision-based estimates of the risk ratio and odds ratio
  • Precision-based adjusted risk difference
  • Tests for interaction for the risk ratio, odds ratio, and risk difference
  • Four different confidence limit methods for the odds ratio.
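
The Mantel-Haenszel summary estimates named in the list above can be sketched as weighted sums over strata. This is a minimal illustration using the standard Mantel-Haenszel formulas for point estimates only; it omits confidence limits and interaction tests, and the function names and data are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio over strata given as (a, b, c, d) tuples.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def mantel_haenszel_risk_ratio(strata):
    """Pooled risk ratio over the same (a, b, c, d) strata."""
    numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Two illustrative strata, for example two age groups.
strata = [(10, 90, 5, 95), (30, 70, 20, 80)]
mh_or = mantel_haenszel_odds_ratio(strata)
mh_rr = mantel_haenszel_risk_ratio(strata)
print(f"MH odds ratio {mh_or:.3f}, MH risk ratio {mh_rr:.3f}")
```

Each stratum contributes in proportion to its size, so small strata do not dominate the pooled estimate.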


Similar to Epi Info, in a stratified analysis, both crude and adjusted estimates are provided so that confounding can be assessed. With rate data, OpenEpi provides adjusted rate ratios and rate differences, and tests for interaction. Finally, with count data, OpenEpi also performs a test for trend, for both crude data and stratified data.
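
For rate data, the building block behind these estimates is the comparison of two person-time rates. The sketch below handles a single unstratified comparison and uses large-sample (Wald) intervals; it is an assumption-laden illustration rather than OpenEpi's method, and the function name `rate_comparison` is hypothetical.

```python
import math

def rate_comparison(cases_exposed, pt_exposed, cases_unexposed, pt_unexposed, z=1.96):
    """Rate ratio and rate difference for events per unit of person-time."""
    if cases_exposed <= 0 or cases_unexposed <= 0:
        raise ValueError("this sketch needs at least one case in each group")
    if pt_exposed <= 0 or pt_unexposed <= 0:
        raise ValueError("person-time must be positive")

    rate_exposed = cases_exposed / pt_exposed
    rate_unexposed = cases_unexposed / pt_unexposed

    # Rate ratio: interval computed on the log scale.
    ratio = rate_exposed / rate_unexposed
    se_log_ratio = math.sqrt(1 / cases_exposed + 1 / cases_unexposed)
    ratio_ci = (
        math.exp(math.log(ratio) - z * se_log_ratio),
        math.exp(math.log(ratio) + z * se_log_ratio),
    )

    # Rate difference: interval computed on the natural scale.
    difference = rate_exposed - rate_unexposed
    se_difference = math.sqrt(
        cases_exposed / pt_exposed ** 2 + cases_unexposed / pt_unexposed ** 2
    )
    difference_ci = (difference - z * se_difference, difference + z * se_difference)

    return ratio, ratio_ci, difference, difference_ci

# Illustrative data: 30 cases in 1000 person-years versus 15 cases in 1500 person-years.
ratio, ratio_ci, difference, difference_ci = rate_comparison(30, 1000, 15, 1500)
print(f"rate ratio {ratio:.2f}, rate difference {difference:.3f} per person-year")
```

An adjusted estimate would combine such comparisons across strata, in the same spirit as the pooled estimates for count data.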

In addition to being used by health researchers to analyze data, OpenEpi has been used as a training tool for teaching epidemiology to students at Emory University, University of Massachusetts, University of Michigan, University of Minnesota, Morehouse College, Columbia University, University of Wisconsin, San Jose State University, University of Medicine and Dentistry of New Jersey, University of Washington, and elsewhere, in both campus-based and distance learning courses. Because OpenEpi is easy to use, requires no programming experience, and can be run on the internet, students can use the program and focus on the interpretation of results.

Version 2.2 of OpenEpi was released on November 11, 2007, and added the ability to run in English, French, Spanish, or Italian. Version 2.3 was released on May 20, 2009, and fixed a number of issues identified by users.

Comments and suggestions for improvements are welcomed, and the developers respond to user queries. The developers encourage others to develop modules that could be added to OpenEpi and provide a developer's tool at the website. Planned future development includes improvements to existing modules, development of new modules, translation into other languages, and the ability to cut and paste data and/or read data files.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 