Relative risk

In statistics and mathematical epidemiology, relative risk (RR) is the risk of an event (or of developing a disease) relative to exposure. Relative risk is a ratio of the probability of the event occurring in the exposed group versus a non-exposed group.
Consider an example where the probability of developing lung cancer among smokers was 20% and among non-smokers 1%. This situation is expressed in the 2 × 2 table below.
                Disease status
Risk            Present    Absent
Smoker          a          b
Non-smoker      c          d

Here, a = 20(%), b = 80, c = 1, and d = 99. Then the relative risk of cancer associated with smoking would be

$$RR = \frac{a/(a+b)}{c/(c+d)} = \frac{20/100}{1/100} = 20.$$

Smokers would be twenty times as likely as non-smokers to develop lung cancer.

Another term for the relative risk is the risk ratio, because it is the risk in the exposed group divided by the risk in the unexposed group.
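
The calculation for the smoking example can be sketched in a few lines of Python. The function name `relative_risk` is chosen here for illustration only; the cell labels a, b, c, d follow the 2 × 2 table.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group.

    a, b: exposed subjects with / without the event
    c, d: unexposed subjects with / without the event
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Smoking example from the text: a = 20, b = 80, c = 1, d = 99
rr = relative_risk(20, 80, 1, 99)
print(f"RR = {rr:.2f}")
```

The same function works with counts or with percentages, since only the ratios within each row matter.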

Statistical use and meaning


Relative risk is used frequently in the statistical analysis of binary outcomes where the outcome of interest has relatively low probability. It is thus often suited to clinical trial data, where it is used to compare the risk of developing a disease in people not receiving the new medical treatment (or receiving a placebo) versus people who are receiving an established (standard of care) treatment. Alternatively, it is used to compare the risk of developing a side effect in people receiving a drug as compared to people who are not receiving the treatment (or are receiving a placebo). It is particularly attractive because it can be calculated by hand in the simple case, but it is also amenable to regression modelling, typically in a Poisson regression framework.

In a simple comparison between an experimental group and a control group:
  • A relative risk of 1 means there is no difference in risk between the two groups.
  • An RR of < 1 means the event is less likely to occur in the experimental group than in the control group.
  • An RR of > 1 means the event is more likely to occur in the experimental group than in the control group.


As a consequence of the Delta method, the log of the relative risk has a sampling distribution that is approximately normal, with a variance that can be estimated by a formula involving the number of subjects in each group and the event rates in each group (see Delta method). This permits the construction of a confidence interval (CI) which is symmetric around log(RR), i.e.,

$$\mathrm{CI} = \log(RR) \pm \mathrm{SE} \times z_\alpha$$

where $z_\alpha$ is the standard score for the chosen level of significance and SE the standard error. The antilog can be taken of the two bounds of the log-CI, giving the high and low bounds for an asymmetric confidence interval around the relative risk.
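
A minimal sketch of this interval in Python follows. The text does not spell out the variance formula, so the code assumes the commonly used delta-method estimate SE = sqrt(1/a − 1/(a+b) + 1/c − 1/(c+d)), which requires actual counts. Treating the smoking table as 100 subjects per group is likewise an assumption made only for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def rr_confidence_interval(a, b, c, d, level=0.95):
    """Return (RR, lower, upper) using a normal approximation on log(RR).

    a, b, c, d are counts from the 2 x 2 table (not percentages).
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Assumed delta-method standard error of log(RR)
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    # Two-sided standard score for the chosen confidence level
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    log_rr = math.log(rr)
    # Symmetric on the log scale, asymmetric after taking the antilog
    return rr, math.exp(log_rr - z * se), math.exp(log_rr + z * se)


# Assumption: 100 smokers and 100 non-smokers, matching the percentages above
rr, lower, upper = rr_confidence_interval(20, 80, 1, 99)
print(f"RR = {rr:.1f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With only one event in the unexposed group the interval is very wide, which illustrates how strongly the bounds depend on the number of events and not only on the point estimate.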

In regression models, the treatment is typically included as a dummy variable along with other factors that may affect risk. The relative risk is normally reported as calculated for the mean of the sample values of the explanatory variables.

Association with odds ratio


Relative risk is different from the odds ratio, although it asymptotically approaches it for small probabilities. In the example of the association of smoking with lung cancer considered above, if a is substantially smaller than b, then a/(a + b) ≈ a/b. Similarly, if c is sufficiently smaller than d, then c/(c + d) ≈ c/d. Thus

$$RR = \frac{a/(a+b)}{c/(c+d)} \approx \frac{a/b}{c/d} = \frac{ad}{bc}.$$

This is nothing other than the odds ratio.
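
The quality of this approximation can be checked numerically. In the sketch below the function names are illustrative, and the second table (a rare outcome) uses made-up numbers purely to show the two measures converging.

```python
def relative_risk(a, b, c, d):
    """Ratio of risks: a/(a+b) divided by c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Ratio of odds: (a/b) divided by (c/d), i.e. ad/(bc)."""
    return (a / b) / (c / d)


smoking = (20, 80, 1, 99)   # example from the text
rare = (2, 998, 1, 999)     # hypothetical rare outcome, for illustration

for label, cells in (("smoking example", smoking), ("rare outcome", rare)):
    print(f"{label}: RR = {relative_risk(*cells):.3f}, "
          f"OR = {odds_ratio(*cells):.3f}")
```

In the smoking example a (20) is not much smaller than b (80), so the odds ratio is noticeably larger than the relative risk; for the rare outcome the two are nearly identical.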

In fact, the odds ratio has much wider use in statistics, since logistic regression, often associated with clinical trials, works with the log of the odds ratio, not relative risk. Because the log of the odds ratio is estimated as a linear function of the explanatory variables, the estimated odds ratio associated with type of treatment would be the same for 70-year-olds and 60-year-olds in a logistic regression model where the outcome is associated with drug and age, although the relative risk might be significantly different. In cases like this, statistical models of the odds ratio often reflect the underlying mechanisms more effectively.
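
The point about age can be illustrated with a toy logistic model. All coefficients below are invented for the sketch and do not come from any study; the names `INTERCEPT`, `BETA_AGE` and `BETA_DRUG` are hypothetical.

```python
import math

# Hypothetical coefficients of a logistic model: log-odds = b0 + b_age*age + b_drug*treated
INTERCEPT = -6.0
BETA_AGE = 0.08    # per year of age
BETA_DRUG = -0.7   # treated (1) versus untreated (0)


def risk(age, treated):
    """Predicted probability of the outcome under the toy model."""
    log_odds = INTERCEPT + BETA_AGE * age + BETA_DRUG * treated
    return 1 / (1 + math.exp(-log_odds))


def odds(p):
    return p / (1 - p)


results = {}
for age in (60, 70):
    p_treated, p_untreated = risk(age, 1), risk(age, 0)
    results[age] = {
        "odds_ratio": odds(p_treated) / odds(p_untreated),
        "relative_risk": p_treated / p_untreated,
    }
    print(age, results[age])
```

Under these assumed coefficients the odds ratio equals exp(BETA_DRUG) at both ages, while the relative risk changes with the baseline risk.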

Since relative risk is a more intuitive measure of effectiveness, the distinction is especially important in cases of medium to high probabilities. If action A carries a risk of 99.9% and action B a risk of 99.0%, then the relative risk is just over 1, while the odds associated with action A are about 10 times higher than the odds with B.
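
The arithmetic behind this comparison is short enough to check directly; the variable names are illustrative.

```python
risk_a = 0.999  # action A
risk_b = 0.990  # action B

relative_risk = risk_a / risk_b
# Odds are p / (1 - p); the odds ratio compares the odds of A with the odds of B
odds_ratio = (risk_a / (1 - risk_a)) / (risk_b / (1 - risk_b))

print(f"RR = {relative_risk:.4f}, OR = {odds_ratio:.2f}")
```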

In medical research, the odds ratio is favoured for case-control studies and retrospective studies. Relative risk is used in randomized controlled trials and cohort studies.

In statistical modelling, approaches like Poisson regression (for counts of events per unit exposure) have relative risk interpretations: the estimated effect of an explanatory variable is multiplicative on the rate, and thus leads to a risk ratio or relative risk. Logistic regression (for binary outcomes, or counts of successes out of a number of trials) must be interpreted in odds-ratio terms: the effect of an explanatory variable is multiplicative on the odds and thus leads to an odds ratio.
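
As a sketch of why the two models lead to different ratios, consider a single binary explanatory variable $x$. The symbols $\lambda$ (event rate), $p$ (event probability), $\beta_0$ and $\beta_1$ are notation assumed here and are not taken from the text above.

$$\log \lambda(x) = \beta_0 + \beta_1 x \;\Rightarrow\; \frac{\lambda(1)}{\lambda(0)} = e^{\beta_1} \quad \text{(Poisson regression: rate ratio)}$$

$$\log \frac{p(x)}{1 - p(x)} = \beta_0 + \beta_1 x \;\Rightarrow\; \frac{p(1)/(1 - p(1))}{p(0)/(1 - p(0))} = e^{\beta_1} \quad \text{(logistic regression: odds ratio)}$$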

Statistical significance (confidence) and relative risk


Whether a given relative risk can be considered statistically significant depends on the relative difference between the conditions compared, the amount of measurement and the noise associated with the measurement (of the events considered). In other words, the confidence one has in a given relative risk being non-random (i.e. not a consequence of chance) depends on the signal-to-noise ratio and the sample size.

Expressed mathematically, the confidence that a result is not by random chance is given by the following formula by Sackett:

$$\text{confidence} = \frac{\text{signal}}{\text{noise}} \times \sqrt{\text{sample size}}$$

For clarity, the above formula is presented in tabular form below.

Dependence of confidence on noise, signal and sample size (tabular form)
Parameter      Parameter increases     Parameter decreases
Noise          Confidence decreases    Confidence increases
Signal         Confidence increases    Confidence decreases
Sample size    Confidence increases    Confidence decreases


In words, the confidence is higher if the noise is lower and/or the sample size is larger and/or the effect size (signal) is larger. The confidence in a relative risk value (and its associated confidence interval) does not depend on effect size alone. If the sample size is large and the noise is low, a small effect size can be measured with great confidence. Whether a small effect size is considered important depends on the context of the events compared.

In medicine, small effect sizes (reflected by small relative risk values) are usually considered clinically relevant (if there is great confidence in them) and are frequently used to guide treatment decisions. A relative risk of 1.10 may seem very small, but over a large number of patients will make a noticeable difference. Whether a given treatment is considered a worthy endeavour is dependent on the risks, benefits and costs.
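
To make the scale concrete, the sketch below counts the additional events implied by a relative risk of 1.10. The baseline risk of 5% and the population of 100,000 patients are hypothetical numbers chosen only for illustration, as is the function name.

```python
def extra_events(baseline_risk, rr, n_patients):
    """Expected additional events when risk rises from baseline to baseline * rr."""
    return n_patients * baseline_risk * (rr - 1)


# Hypothetical inputs: 5% baseline risk, RR = 1.10, 100,000 patients
extra = extra_events(0.05, 1.10, 100_000)
print(f"about {extra:.0f} additional events")
```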

Worked example


  • Example 3: Proportions are given separately for the patient group and the healthy (control) group. In the disease-risk 2 × 2 table above, suppose a + c = 1 and b + d = 1, and let the total numbers of patients and healthy people be m and n, respectively. Then the prevalence is p = m/(m + n). We can put q = m/n = p/(1 − p). Thus

$$RR = \frac{am/(am+bn)}{cm/(cm+dn)} = \frac{a(d+cq)}{c(b+aq)} = \frac{ad}{bc} \cdot \frac{1+(c/d)\,q}{1+(a/b)\,q}.$$

If p is small enough, then q is small enough that both (c/d)q and (a/b)q are negligible compared with 1, and RR reduces to the odds ratio ad/(bc), as above.

Among Japanese patients with Behçet's disease, a sizeable fraction carry a specific HLA type, namely the HLA-B51 gene. In one survey, 63% of patients carried this gene, compared with 21% of healthy people. If these figures are taken as representative of the Japanese population, then using 12,700 patients in Japan in 1984 and a Japanese population of about 120 million in 1982, RR = 6.40. Compare this with the odds ratio of 6.41.
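
The Behçet's disease figures can be reproduced with a short script. The sketch assumes that the number of healthy people n is the quoted population minus the number of patients, and it scales the within-group proportions by m and n to obtain approximate cell counts; the variable names are illustrative.

```python
# Proportions carrying HLA-B51 (exposed) within each group, from the text
a, c = 0.63, 0.37   # patients: with gene, without gene (a + c = 1)
b, d = 0.21, 0.79   # healthy:  with gene, without gene (b + d = 1)

m = 12_700                 # patients in Japan (1984)
n = 120_000_000 - m        # assumption: healthy = population minus patients

# Approximate counts in each cell of the 2 x 2 table
risk_with_gene = (a * m) / (a * m + b * n)
risk_without_gene = (c * m) / (c * m + d * n)

relative_risk = risk_with_gene / risk_without_gene
odds_ratio = (a * d) / (b * c)

print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```

Because the disease is rare, q = m/n is of the order of 0.0001, so the correction factor is very close to 1 and the two measures differ only in the third significant figure.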

See also

  • Absolute risk reduction
  • (Population) attributable risk
  • Confidence interval
  • Number needed to treat (NNT)
  • Number needed to harm (NNH)
  • OpenEpi
  • Epi Info
  • The rare disease assumption
