In statistical surveys conducted by means of structured interviews or questionnaires, a subset of the survey items having binary (e.g., YES or NO) answers forms a Guttman scale if they can be ranked in some order so that, for a rational respondent, the response pattern can be captured by a single index on that ordered scale. In other words, on a Guttman scale, items are arranged in an order so that an individual who agrees with a particular item also agrees with items of lower rank-order. For example, a series of items could be (1) "I am willing to be near ice cream"; (2) "I am willing to smell ice cream"; (3) "I am willing to eat ice cream"; and (4) "I love to eat ice cream". Agreement with any one item implies agreement with the lower-order items.
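The ordering rule above can be checked mechanically. The following is a minimal sketch, assuming responses are coded 1 (agree) and 0 (disagree) and are already listed from the lowest-ranked to the highest-ranked item; the function name `is_guttman_pattern` is hypothetical and chosen only for illustration.

```python
from typing import Sequence


def is_guttman_pattern(responses: Sequence[int]) -> bool:
    """Return True if the responses form a run of 1s followed only by 0s.

    Assumes `responses` is ordered from the lowest-ranked item to the
    highest-ranked item, with 1 = agree and 0 = disagree.
    """
    seen_disagree = False
    for r in responses:
        if r == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agreement with a higher-ranked item after disagreeing with a
            # lower-ranked one violates the cumulative ordering.
            return False
    return True


# Ice cream items (1)-(4): willing to be near, smell, eat, love to eat.
print(is_guttman_pattern([1, 1, 1, 0]))  # True: agrees with items 1-3 only
print(is_guttman_pattern([1, 0, 1, 0]))  # False: eats it but will not smell it
```

A pattern of all 0s or all 1s also passes, since both are consistent with a single cut point on the scale.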
The concept of the Guttman scale likewise applies to series of items in other kinds of tests, such as achievement tests, that have binary outcomes. For example, a test of math achievement might order questions by difficulty and instruct the examinee to begin in the middle. The assumption is that if the examinee can successfully answer items of that difficulty (e.g., summing two 3-digit numbers), they would also be able to answer the earlier questions (e.g., summing two 2-digit numbers). Some achievement tests are organized as a Guttman scale to reduce the duration of the test.
Another example is the original Beaufort wind force scale, which assigned a single number to observed conditions of the sea surface ("Flat", ..., "Small waves", ..., "Sea heaps up and foam begins to streak", ...) and was in fact a Guttman scale. The observation "Flat = YES" implies "Small waves = NO".
By designing surveys and tests so that they contain Guttman scales, researchers can simplify the analysis of the outcomes and make the results more robust. Guttman scales also make it possible to detect and discard randomized answer patterns, such as those given by uncooperative respondents.
A hypothetical, perfect Guttman scale consists of a unidimensional set of items that are ranked in order of difficulty from least extreme to most extreme position. For example, a person scoring "7" on a ten-item Guttman scale will agree with items 1-7 and disagree with items 8, 9 and 10. An important property of Guttman's model is that a person's entire set of responses to all items can be predicted from their cumulative score, because the model is deterministic.
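Because the model is deterministic, the reconstruction of a response pattern from a score is a one-line operation. A minimal sketch, assuming items are indexed from least to most extreme and coded 1 (agree) and 0 (disagree); `responses_from_score` is a hypothetical helper name.

```python
from typing import List


def responses_from_score(score: int, n_items: int) -> List[int]:
    """Rebuild the full response pattern implied by a perfect Guttman scale."""
    if not 0 <= score <= n_items:
        raise ValueError("score must lie between 0 and n_items")
    # Agree with the `score` least extreme items, disagree with the rest.
    return [1] * score + [0] * (n_items - score)


print(responses_from_score(7, 10))  # [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
```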
Deterministic model
An important objective in Guttman scaling is to maximize the reproducibility of response patterns from a single score. A good Guttman scale should have a coefficient of reproducibility (the percentage of original responses that could be reproduced by knowing the scale scores used to summarize them) above .85. Other commonly used metrics for assessing the quality of a Guttman scale are Menzel's coefficient of scalability and the coefficient of homogeneity (Loevinger, 1948; Cliff, 1977; Krus and Blackman, 1988). To maximize unidimensionality, misfitting items are rewritten or discarded.
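The coefficient of reproducibility can be illustrated with a short computation. This is a sketch under stated assumptions, not a reference implementation: error-counting conventions differ between authors, and the version here counts an error wherever an observed response differs from the ideal pattern implied by the respondent's total score. The items are assumed to be already ordered from least to most extreme, and the data are invented for illustration.

```python
from typing import Sequence


def guttman_errors(responses: Sequence[int]) -> int:
    """Count responses that differ from the ideal pattern for the same score."""
    score = sum(responses)
    ideal = [1] * score + [0] * (len(responses) - score)
    return sum(1 for observed, expected in zip(responses, ideal)
               if observed != expected)


def coefficient_of_reproducibility(data: Sequence[Sequence[int]]) -> float:
    """1 - (total errors / total responses), under the convention above."""
    total_responses = sum(len(row) for row in data)
    total_errors = sum(guttman_errors(row) for row in data)
    return 1 - total_errors / total_responses


# Hypothetical data: five respondents, four items ordered by extremity.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],  # deviates from the ideal [1, 1, 0, 0] in two positions
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(f"{coefficient_of_reproducibility(data):.2f}")  # 0.90
```

With 2 errors among 20 responses the coefficient is 0.90, which would clear the .85 threshold mentioned above.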
Stochastic models
Guttman's deterministic model is brought within a probabilistic framework in item response theory models, and especially Rasch measurement. The Rasch model requires a probabilistic Guttman structure when items have dichotomous responses (e.g. right/wrong). In the Rasch model, the Guttman response pattern is the most probable response pattern for a person when items are ordered from least difficult to most difficult (Andrich, 1985). In addition, the polytomous Rasch model is premised on a deterministic latent Guttman response subspace, and this is the basis for integer scoring in the model (Andrich, 1978, 2005). Analysis of data using item response theory requires comparatively longer instruments and larger datasets to scale item and person locations and evaluate the fit of data to model.
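The claim that the Guttman pattern is the most probable pattern under the Rasch model can be checked numerically. The sketch below uses the dichotomous Rasch response function, P(correct) = exp(θ − b) / (1 + exp(θ − b)), and enumerates every possible response pattern; the person location `theta` and the item difficulties are hypothetical values chosen for illustration.

```python
import math
from itertools import product
from typing import Sequence


def rasch_p(theta: float, b: float) -> float:
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def pattern_probability(pattern: Sequence[int], theta: float,
                        difficulties: Sequence[float]) -> float:
    """Probability of a whole response pattern, assuming independent items."""
    prob = 1.0
    for x, b in zip(pattern, difficulties):
        p = rasch_p(theta, b)
        prob *= p if x == 1 else 1.0 - p
    return prob


# Hypothetical item difficulties, ordered from least to most difficult.
difficulties = [-1.5, -0.5, 0.5, 1.5]
theta = 0.2  # hypothetical person location

# Enumerate all 2**4 patterns and keep the most probable one.
best = max(product([0, 1], repeat=len(difficulties)),
           key=lambda pat: pattern_probability(pat, theta, difficulties))
print(best)  # (1, 1, 0, 0)
```

The most probable pattern is a run of 1s followed by 0s, but other patterns still have non-zero probability, which is what distinguishes the probabilistic structure from Guttman's deterministic one.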
In practice, actual data from respondents do not closely match Guttman's deterministic model. Several probabilistic models of Guttman implicatory scales were developed by Krus (1977) and Krus and Bart (1974).
Applications
The Guttman scale is used mostly when researchers want to design short questionnaires with good discriminating ability. The Guttman model works best for constructs that are hierarchical and highly structured such as social distance, organizational hierarchies, and evolutionary stages.
Unfolding models
A class of unidimensional models that contrast with Guttman's model are unfolding models. These models also assume unidimensionality but posit that the probability of endorsing an item decreases with the distance between the item's standing on the unidimensional trait and the standing of the respondent. For example, an item like "I think immigration should be reduced" on a scale measuring attitude towards immigration would be unlikely to be endorsed both by those favoring open policies and by those favoring no immigration at all. Such an item might be endorsed by someone in the middle of the continuum. Some researchers feel that many attitude items fit this unfolding model, while most psychometric techniques are based on correlation or factor analysis and thus implicitly assume a linear relationship between the trait and the response probability. The effect of using these techniques would be to include only the most extreme items, leaving attitude instruments with little precision to measure the trait standing of individuals in the middle of the continuum.
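The contrast between a cumulative item and an unfolding item can be sketched with two response functions. The cumulative curve is the logistic form used in the Rasch model; the single-peaked curve is a hypothetical illustrative shape (a squared-distance kernel), not a specific published unfolding model. It captures only the idea in the immigration example: endorsement is most likely when the respondent's standing is close to the item's standing.

```python
import math


def cumulative_p(theta: float, delta: float) -> float:
    """Cumulative (Guttman/Rasch-like) item: endorsement rises with theta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def unfolding_p(theta: float, delta: float) -> float:
    """Hypothetical single-peaked item: endorsement peaks where theta == delta."""
    return math.exp(-((theta - delta) ** 2))


delta = 0.0  # hypothetical item location in the middle of the continuum
for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(cumulative_p(theta, delta), 3),
          round(unfolding_p(theta, delta), 3))
```

Respondents at both extremes get a low endorsement probability from the single-peaked function, whereas the cumulative function treats one extreme as near-certain endorsement.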
Example
Here is an example of a Guttman scale, the Bogardus Social Distance Scale:
(Least extreme)
- Are you willing to permit immigrants to live in your country?
- Are you willing to permit immigrants to live in your community?
- Are you willing to permit immigrants to live in your neighbourhood?
- Are you willing to permit immigrants to live next door to you?
- Would you permit your child to marry an immigrant?
(Most extreme)
E.g., agreement with item 3 implies agreement with items 1 and 2.
See also
- Bogardus Social Distance Scale (a well-known example of a Guttman scale)
- Likert scale
- Thurstone scale
- Mokken scale
- Diamond of opposites
- Homogeneity (statistics)
- Nonparametric Item Response Theory
External links