A norm-referenced test is a type of test, assessment, or evaluation which yields an estimate of the position of the tested individual in a predefined population, with respect to the trait being measured. This estimate is derived from the analysis of test scores and possibly other relevant data from a sample drawn from the population. That is, this type of test identifies whether the test taker performed better or worse than other test takers, but not whether the test taker knows either more or less material than is necessary for a given purpose.
The term normative assessment refers to the process of comparing one test taker to his or her peers.
Norm-referenced assessment can be contrasted with criterion-referenced assessment and ipsative assessment. In a criterion-referenced assessment, the score shows whether the test taker performed well or poorly on a given task, but not how that performance compares to that of other test takers; in an ipsative system, the test taker is compared to his or her own previous performance.
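The three interpretations can be illustrated with a minimal Python sketch. The function names, the sample scores, and the cut score of 80 are hypothetical and chosen only for illustration. Percentile rank is computed here as the percentage of the norm sample scoring strictly below the given score, which is only one of several conventions in use.

```python
from bisect import bisect_left

def norm_referenced(score, norm_sample):
    """Percentile rank: percentage of the norm sample scoring strictly below `score`."""
    ordered = sorted(norm_sample)
    below = bisect_left(ordered, score)  # count of sample scores < score
    return 100.0 * below / len(ordered)

def criterion_referenced(score, cut_score):
    """True if the score meets a fixed standard, regardless of other test takers."""
    return score >= cut_score

def ipsative(score, previous_score):
    """Change relative to the same test taker's earlier performance."""
    return score - previous_score

# Hypothetical data for illustration only.
norm_sample = [52, 58, 61, 65, 67, 70, 72, 75, 81, 90]
score = 72

print(norm_referenced(score, norm_sample))  # 60.0
print(criterion_referenced(score, 80))      # False
print(ipsative(score, 65))                  # 7
```

The same raw score of 72 is above average in the norm-referenced reading, fails the hypothetical fixed standard in the criterion-referenced reading, and counts as an improvement in the ipsative reading.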
Other types
As an alternative to normative testing, tests can be ipsative; that is, the individual's performance is compared with his or her own performance over time.
By contrast, a test is criterion-referenced when provision is made for translating the test score into a statement about the behavior to be expected of a person with that score. The same test can be used in both ways. Robert Glaser originally coined the terms norm-referenced test and criterion-referenced test.
Standards-based education reform is based on the belief that public education should establish what every student should know and be able to do. Students should be tested against a fixed yardstick, rather than against each other or sorted into a mathematical bell curve. By requiring every student to meet these new, higher standards, education officials believe that all students will earn a diploma that prepares them for success in the 21st century.
Common use
Most state achievement tests are criterion-referenced. In other words, a predetermined level of acceptable performance is set, and students pass or fail depending on whether they reach it. Tests that set goals for students based on the average student's performance are norm-referenced tests. Tests that set goals for students based on a set standard (e.g., 80 words spelled correctly) are criterion-referenced tests.
Many college entrance exams and nationally used school tests are norm-referenced. The SAT, Graduate Record Examination (GRE), and Wechsler Intelligence Scale for Children (WISC) compare individual student performance to the performance of a normative sample. Test takers cannot "fail" a norm-referenced test, as each test taker receives a score that compares the individual to others who have taken the test, usually expressed as a percentile. This is useful when there is a wide range of acceptable scores that is different for each college.
By contrast, nearly two-thirds of US high school students will be required to pass a criterion-referenced high school graduation examination. One high fixed score is set at a level adequate for university admission, whether the high school graduate is college bound or not. Each state gives its own test and sets its own passing level, with states like Massachusetts showing very high pass rates, while in Washington State even average students are failing, as are 80 percent of students in some minority groups. This practice is opposed by many in the education community, such as Alfie Kohn, as unfair to groups and individuals who don't score as high as others.
Advantages and limitations
An obvious disadvantage of norm-referenced tests is that they cannot measure the progress of the population as a whole, only where individuals fall within the whole. Thus, only measurement against a fixed goal can be used to gauge the success of an educational reform program which seeks to raise the achievement of all students against new standards, and which seeks to assess skills beyond choosing among multiple choices. However, while this is attractive in theory, in practice the bar has often been moved in the face of excessive failure rates, and improvement sometimes occurs simply because of familiarity with, and teaching to, the same test.
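The limitation described above can be checked with a small Python sketch, assuming a hypothetical cohort in which every student gains the same ten points and a hypothetical fixed cut score of 70. The ranking within the cohort is identical before and after, even though every student improved, whereas the share of students meeting the fixed cut score rises.

```python
def ranks_within_cohort(scores):
    """Rank each student within the cohort (1 = lowest); tied scores share the lowest rank."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]

def pass_rate(scores, cut_score):
    """Fraction of the cohort scoring at or above a fixed cut score."""
    return sum(s >= cut_score for s in scores) / len(scores)

# Hypothetical scores before and after every student gains 10 points.
before = [55, 62, 68, 71, 84]
after = [s + 10 for s in before]

same_ranks = ranks_within_cohort(before) == ranks_within_cohort(after)
print(same_ranks)             # True
print(pass_rate(before, 70))  # 0.4
print(pass_rate(after, 70))   # 0.8
```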
With a norm-referenced test, grade level was traditionally defined by the middle 50 percent of scores. By contrast, the National Children's Reading Foundation believes that it is essential to assure that virtually all of our children read at or above grade level by third grade, a goal which cannot be achieved with a norm-referenced definition of grade level.
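The arithmetic behind this can be sketched as follows, under the assumption that a norm-referenced grade level is taken to be the median score of the cohort (one reading of the middle of the distribution; the text does not specify the exact rule). The scores are hypothetical. With distinct scores, half of the students fall below the median however much the whole cohort improves.

```python
from statistics import median

def share_below_grade_level(scores):
    """Fraction of students scoring strictly below the cohort median."""
    grade_level = median(scores)  # assumed norm-referenced definition of grade level
    return sum(s < grade_level for s in scores) / len(scores)

# Hypothetical cohorts: the second is the first with every score raised by 20 points.
cohort = [40, 48, 55, 61, 66, 73, 79, 88]
improved = [s + 20 for s in cohort]

print(share_below_grade_level(cohort))    # 0.5
print(share_below_grade_level(improved))  # 0.5
```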
Advantages of this type of assessment include that students and teachers alike know what to expect from the test and just how it will be conducted and graded. Likewise, every school conducts the exam in the same manner, reducing such inaccuracies as time differences or environmental differences that may distract students. This also makes these assessments fairly accurate as far as results are concerned, a major advantage for a test.
Critics of criterion-referenced tests point out that judges set bookmarks around items of varying difficulty without considering whether the items actually are compliant with grade level content standards or are developmentally appropriate. Thus, the original 1997 sample problems published for the WASL 4th grade mathematics test contained items that were difficult for college educated adults, or easily solved with 10th grade level methods such as similar triangles.
The difficulty level of the items themselves, as well as the cut scores that determine passing levels, also changes from year to year. Pass rates also vary greatly from the 4th to the 7th and 10th grade graduation tests in some states.
One of the limitations of No Child Left Behind is that each state can choose or construct its own test, whose results cannot be compared with those of any other state.
A RAND study of Kentucky results found indications of artificial inflation of pass rates that was not reflected in rising scores on other tests, such as the NAEP or SAT, given to the same student populations over the same period.
Graduation test standards are typically set at a level consistent with that expected of native-born applicants to four-year universities. An unusual side effect is that while colleges often admit immigrants with very strong math skills who may be deficient in English, there is no such leeway in high school graduation tests, which usually require passing all sections, including language. Thus, it is not unusual for institutions like the University of Washington to admit strong Asian American or Latino students who did not pass the writing portion of the state WASL test, but such students would not even receive a diploma once the testing requirement is in place.
Although tests such as the WASL are intended as a minimal bar for high school, 27 percent of 10th graders applying for Running Start in Washington State failed the math portion of the WASL. These students applied to take college level courses in high school, and achieve at a much higher level than average students. The same study concluded that the level of difficulty was comparable to, or greater than, that of tests intended to place students already admitted to the college.
A norm-referenced test has none of these problems because it does not seek to enforce any expectation of what all students should know or be able to do other than what actual students demonstrate. Present levels of performance and inequity are taken as fact, not as defects to be removed by a redesigned system. Goals of student performance are not raised every year until all are proficient. Scores are not required to show continuous improvement through Total Quality Management systems. A disadvantage is that such assessments measure students' current level against where their peers currently are, rather than against the level that students should be at.
A rank-based system only produces data that tell which students perform at an average level, which students do better, and which students do worse. This contradicts the fundamental belief, whether optimistic or simply unfounded, that all will perform at one uniformly high level in a standards-based system if enough incentives and punishments are put into place. This difference in beliefs underlies the most significant differences between a traditional and a standards-based education system.
See also
- Concept inventory
- Criterion-referenced assessment
- Educational assessment
- Ipsative assessment
- Psychometrics
- Standardized test
External links
- A webpage about instruction that discusses assessment