Educational assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes and beliefs. Assessment can focus on the individual learner, the learning community (class, workshop, or other organized group of learners), the institution, or the educational system as a whole. According to the Academic Exchange Quarterly: "Studies of a theoretical or empirical nature (including case studies, portfolio studies, exploratory, or experimental work) addressing the assessment of learner aptitude and preparation, motivation and learning styles, learning outcomes in achievement and satisfaction in different educational contexts are all welcome, as are studies addressing issues of measurable standards and benchmarks".

The final purposes and practices of assessment in education depend on the theoretical framework of the practitioners and researchers: their assumptions and beliefs about the nature of the human mind, the origin of knowledge, and the process of learning.

Alternate meanings

According to the Merriam-Webster online dictionary the word assessment comes from the root word assess which is defined as:
  1. to determine the rate or amount of (as a tax)
  2. a: to impose (as a tax) according to an established rate; b: to subject to a tax, charge, or levy
  3. to make an official valuation of (property) for the purposes of taxation
  4. to determine the importance, size, or value of (assess a problem)
  5. to charge (a player or team) with a foul or penalty

Assessment in education is best described as an action "to determine the importance, size, or value of."


The term assessment is generally used to refer to all activities teachers use to help students learn and to gauge student progress. Though the notion of assessment is generally more complicated than the following categories suggest, assessment is often divided for the sake of convenience using the following distinctions:
  1. formative and summative
  2. objective and subjective
  3. referencing (criterion-referenced, norm-referenced, and ipsative)
  4. informal and formal.

Formative and summative

Assessment is often divided into formative and summative categories for the purpose of considering different objectives for assessment practices.
  • Summative assessment - Summative assessment is generally carried out at the end of a course or project. In an educational setting, summative assessments are typically used to assign students a course grade. Summative assessments are evaluative.
  • Formative assessment - Formative assessment is generally carried out throughout a course or project. Formative assessment, also referred to as "educative assessment," is used to aid learning. In an educational setting, formative assessment might be a teacher (or peer) or the learner providing feedback on a student's work, and would not necessarily be used for grading purposes. Formative assessments are diagnostic.

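Educational researcher Robert Stake explains the difference between formative and summative assessment with the following analogy: "When the cook tastes the soup, that's formative. When the guests taste the soup, that's summative."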

Summative and formative assessment are often referred to in a learning context as assessment of learning and assessment for learning respectively. Assessment of learning is generally summative in nature and intended to measure learning outcomes and report those outcomes to students, parents, and administrators. Assessment of learning generally occurs at the conclusion of a class, course, semester, or academic year. Assessment for learning is generally formative in nature and is used by teachers to consider approaches to teaching and next steps for individual learners and the class.

A common form of formative assessment is diagnostic assessment. Diagnostic assessment measures a student's current knowledge and skills for the purpose of identifying a suitable program of learning. Self-assessment is a form of diagnostic assessment which involves students assessing themselves. Forward-looking assessment asks those being assessed to consider themselves in hypothetical future situations.

Performance-based assessment is similar to summative assessment, as it focuses on achievement. It is often aligned with the standards-based education reform and outcomes-based education movement. Though ideally significantly different from a traditional multiple choice test, performance-based assessments are most commonly associated with standards-based assessment, which uses free-form responses to standard questions scored by human scorers on a standards-based scale (meeting, falling below, or exceeding a performance standard) rather than being ranked on a curve. A well-defined task is identified and students are asked to create, produce, or do something, often in settings that involve real-world application of knowledge and skills. Proficiency is demonstrated by providing an extended response. Performance formats are further differentiated into products and performances. The performance may result in a product, such as a painting, portfolio, paper, or exhibition, or it may consist of a performance, such as a speech, athletic skill, musical recital, or reading.

Objective and subjective

Assessment (either summative or formative) is often categorized as either objective or subjective. Objective assessment is a form of questioning which has a single correct answer. Subjective assessment is a form of questioning which may have more than one correct answer (or more than one way of expressing the correct answer). There are various types of objective and subjective questions. Objective question types include true/false answers, multiple choice, multiple-response and matching questions. Subjective questions include extended-response questions and essays. Objective assessment is well suited to the increasingly popular computerized or online assessment.


Some have argued that the distinction between objective and subjective assessments is neither useful nor accurate because, in reality, there is no such thing as "objective" assessment. In fact, all assessments are created with inherent biases built into decisions about relevant subject matter and content, as well as cultural (class, ethnic, and gender) biases.

Basis of comparison

Test results can be compared against an established criterion, or against the performance of other students, or against previous performance:

Criterion-referenced assessment, typically using a criterion-referenced test, as the name implies, occurs when candidates are measured against defined (and objective) criteria. Criterion-referenced assessment is often, but not always, used to establish a person's competence (whether s/he can do something). The best known example of criterion-referenced assessment is the driving test, when learner drivers are measured against a range of explicit criteria (such as "Not endangering other road users").

Norm-referenced assessment (colloquially known as "grading on the curve"), typically using a norm-referenced test, is not measured against defined criteria. This type of assessment is relative to the student body undertaking the assessment. It is effectively a way of comparing students. The IQ test is the best known example of norm-referenced assessment. Many entrance tests (to prestigious schools or universities) are norm-referenced, permitting a fixed proportion of students to pass ("passing" in this context means being accepted into the school or university rather than an explicit level of ability). This means that standards may vary from year to year, depending on the quality of the cohort; criterion-referenced assessment does not vary from year to year (unless the criteria change).
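The contrast between the two bases of comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the cohorts, the cutoff of 60 and the pass fraction of 0.5 are all hypothetical and do not come from any real test.

```python
def criterion_referenced(scores, cutoff):
    """Pass every candidate whose score meets the fixed cutoff."""
    return {name: score >= cutoff for name, score in scores.items()}


def norm_referenced(scores, pass_fraction):
    """Pass a fixed proportion of the cohort, ranked by score.

    Ties at the boundary are resolved by insertion order in this sketch.
    """
    n_pass = int(len(scores) * pass_fraction)
    ranked = sorted(scores, key=scores.get, reverse=True)
    passed = set(ranked[:n_pass])
    return {name: name in passed for name in scores}


# Hypothetical cohorts: candidate "A" scores 70 in both years.
weak_year = {"A": 70, "B": 55, "C": 50, "D": 40}
strong_year = {"A": 70, "E": 90, "F": 85, "G": 80}

CUTOFF = 60          # assumed criterion
PASS_FRACTION = 0.5  # assumed fixed proportion allowed to pass

criterion_weak = criterion_referenced(weak_year, CUTOFF)
criterion_strong = criterion_referenced(strong_year, CUTOFF)
norm_weak = norm_referenced(weak_year, PASS_FRACTION)
norm_strong = norm_referenced(strong_year, PASS_FRACTION)

print(criterion_weak["A"], criterion_strong["A"])  # True True
print(norm_weak["A"], norm_strong["A"])            # True False
```

With the same score of 70, candidate "A" passes in both years under the fixed criterion, but under norm-referencing passes only in the weaker cohort, which is the year-to-year variation described above.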

Ipsative assessment is self-comparison, either in the same domain over time or relative to other domains within the same student.

Informal and formal

Assessment can be either formal or informal. Formal assessment usually implies a written document, such as a test, quiz, or paper. A formal assessment is given a numerical score or grade based on student performance, whereas an informal assessment does not contribute to a student's final grade. An informal assessment usually occurs in a more casual manner and may include observation, inventories, checklists, rating scales, rubrics, performance and portfolio assessments, participation, peer and self evaluation, and discussion.

Internal and external

Internal assessment is set and marked by the school (i.e., teachers). Students receive the mark and feedback on the assessment. External assessment is set by the governing body and is marked by unbiased personnel. With external assessment, students receive only a mark, so they do not learn how they actually performed (i.e., which questions they answered correctly).

Standards of quality

In general, high-quality assessments are considered those with a high level of reliability and validity. Approaches to reliability and validity vary, however.


Reliability relates to the consistency of an assessment. A reliable assessment is one which consistently achieves the same results with the same (or similar) cohort of students. Various factors affect reliability, including ambiguous questions, too many options within a question paper, vague marking instructions and poorly trained markers. Traditionally, the reliability of an assessment is based on the following:
  1. Temporal stability: Performance on a test is comparable on two or more separate occasions.
  2. Form equivalence: Performance among examinees is equivalent on different forms of a test based on the same content.
  3. Internal consistency: Responses on a test are consistent across questions. For example: In a survey that asks respondents to rate attitudes toward technology, consistency would be expected in responses to the following questions:
    • "I feel very negative about computers in general."
    • "I enjoy using computers."

Reliability can also be expressed in mathematical terms as:
Rx = Vt / Vx
where Rx is the reliability of the observed (test) score X, and Vt and Vx are the variability in 'true' (i.e., the candidate's innate performance) and measured test scores respectively. Rx can range from 0 (completely unreliable) to 1 (completely reliable). An Rx of 1 is rarely achieved, and an Rx of 0.8 is generally considered reliable.
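The ratio above can be illustrated with a minimal Python sketch. The data are hypothetical: true scores cannot be observed directly in practice, so this only shows the arithmetic of the definition, assuming measurement errors that average to zero and are uncorrelated with the true scores. The function name `reliability` is an illustrative choice, not part of any library.

```python
from statistics import pvariance


def reliability(true_scores, observed_scores):
    """Return Rx = Vt / Vx: true-score variance over observed-score variance."""
    v_t = pvariance(true_scores)
    v_x = pvariance(observed_scores)
    if v_x == 0:
        raise ValueError("observed scores have no variance")
    return v_t / v_x


# Hypothetical true scores for five candidates (not observable in practice).
true_scores = [50, 60, 70, 80, 90]
# Assumed measurement errors: mean zero and uncorrelated with the true scores.
errors = [4, -8, 8, -8, 4]
observed_scores = [t + e for t, e in zip(true_scores, errors)]

r_x = reliability(true_scores, observed_scores)
print(round(r_x, 3))  # 0.817
```

Here the true-score variance is 200 and the error variance is 44.8, so the observed variance is 244.8 and Rx is about 0.82, just above the 0.8 level mentioned above.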


A valid assessment is one which measures what it is intended to measure. For example, it would not be valid to assess driving skills through a written test alone. A more valid way of assessing driving skills would be through a combination of tests that help determine what a driver knows, such as through a written test of driving knowledge, and what a driver is able to do, such as through a performance assessment of actual driving. Teachers frequently complain that some examinations do not properly assess the syllabus upon which the examination is based; they are, effectively, questioning the validity of the exam.

Validity of an assessment is generally gauged through examination of evidence in the following categories:
  1. Content – Does the content of the test measure stated objectives?
  2. Criterion – Do scores correlate to an outside reference? (e.g., Do high scores on a 4th grade reading test accurately predict reading skill in future grades?)
  3. Construct – Does the assessment correspond to other significant variables? (e.g., Do ESL students consistently perform differently on a writing exam than native English speakers?)
  4. Face – Does the item or theory make sense, and is it seemingly correct to the expert reader?

A good assessment has both validity and reliability, plus the other quality attributes noted above for a specific context and purpose. In practice, an assessment is rarely totally valid or totally reliable. A ruler which is marked wrong will always give the same (wrong) measurements. It is very reliable, but not very valid. Asking random individuals to tell the time without looking at a clock or watch is sometimes used as an example of an assessment which is valid, but not reliable. The answers will vary between individuals, but the average answer is probably close to the actual time.

In many fields, such as medical research, educational testing, and psychology, there will often be a trade-off between reliability and validity. A history test written for high validity will have many essay and fill-in-the-blank questions. It will be a good measure of mastery of the subject, but difficult to score completely accurately. A history test written for high reliability will be entirely multiple choice. It is not as good at measuring knowledge of history, but can easily be scored with great precision. We may generalize from this: the more reliable our estimate is of what we purport to measure, the less certain we are that we are actually measuring that aspect of attainment.

It is also important to note that there are at least thirteen sources of invalidity, which can be estimated for individual students in test situations. They never are. Perhaps this is because the social purpose of assessments demands the absence of any error, and validity errors are usually so high that they would destabilize the whole assessment industry.
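The ruler and clock examples can be put in numerical terms with a small Python sketch. As a simplifying assumption for illustration only, the spread of repeated readings stands in for (un)reliability and the average offset from the true value stands in for (in)validity. All readings and the helper names `bias` and `spread` are hypothetical.

```python
from statistics import mean, pstdev


def bias(readings, true_value):
    """Average offset from the true value (a rough proxy for invalidity)."""
    return mean(readings) - true_value


def spread(readings):
    """Standard deviation of the readings (a rough proxy for unreliability)."""
    return pstdev(readings)


# A mis-marked ruler: every reading of a 10.0 cm object is the same wrong value.
TRUE_LENGTH_CM = 10.0
ruler_readings = [10.5, 10.5, 10.5, 10.5, 10.5]

# People guessing the time (in minutes after noon) without a clock.
TRUE_TIME_MIN = 180
time_guesses = [150, 200, 175, 195, 180]

print(bias(ruler_readings, TRUE_LENGTH_CM), spread(ruler_readings))  # 0.5 0.0
print(bias(time_guesses, TRUE_TIME_MIN), spread(time_guesses) > 10)  # 0 True
```

The ruler shows no spread but a constant offset (reliable, not valid), while the time guesses scatter widely around an average that matches the true time (valid on average, not reliable).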

It is useful to distinguish between "subject-matter" validity and "predictive" validity. The former, used widely in education, predicts the score a student would get on a similar test but with different questions. The latter, used widely in the workplace, predicts performance. Thus, a subject-matter-valid test would assess knowledge of driving rules, while a predictively valid test would assess whether the potential driver could follow those rules.

Testing standards

In the field of psychometrics, the Standards for Educational and Psychological Testing place standards about validity and reliability, along with errors of measurement and related considerations under the general topic of test construction, evaluation and documentation. The second major topic covers standards related to fairness in testing, including fairness in testing and test use, the rights and responsibilities of test takers, testing individuals of diverse linguistic backgrounds, and testing individuals with disabilities. The third and final major topic covers standards related to testing applications, including the responsibilities of test users, psychological testing and assessment, educational testing and assessment, testing in employment and credentialing, plus testing in program evaluation and public policy.


Evaluation standards

In the field of evaluation, and in particular educational evaluation, the Joint Committee on Standards for Educational Evaluation has published three sets of standards for evaluations. The Personnel Evaluation Standards was published in 1988, The Program Evaluation Standards (2nd edition) was published in 1994, and The Student Evaluation Standards was published in 2003.

Each publication presents and elaborates a set of standards for use in a variety of educational settings. The standards provide guidelines for designing, implementing, assessing and improving the identified form of evaluation. Each of the standards has been placed in one of four fundamental categories to promote educational evaluations that are proper, useful, feasible, and accurate. In these sets of standards, validity and reliability considerations are covered under the accuracy topic. For example, the student accuracy standards help ensure that student evaluations will provide sound, accurate, and credible information about student learning and performance.

Summary table of the main theoretical frameworks

The following table summarizes the main theoretical frameworks behind almost all the theoretical and research work, and the instructional practices in education (one of them being, of course, the practice of assessment). These different frameworks have given rise to interesting debates among scholars.
Philosophical orientation
  • Empiricism: Hume (British empiricism)
  • Rationalism: Kant, Descartes (Continental rationalism)
  • Cultural dialectic: Hegel, Marx (cultural dialectic)

Metaphorical orientation
  • Empiricism: Mechanistic/Operation of a Machine or Computer
  • Rationalism: Organismic/Growth of a Plant
  • Cultural dialectic: Contextualist/Examination of a Historical Event

Leading theorists
  • Empiricism: B. F. Skinner (behaviorism); Herb Simon, John Anderson, Robert Gagné (cognitivism)
  • Rationalism: Jean Piaget/Robbie Case
  • Cultural dialectic: Lev Vygotsky, Luria, Bruner/Alan Collins, Jim Greeno, Ann Brown, John Bransford

Nature of mind
  • Empiricism: Initially blank device that detects patterns in the world and operates on them. Qualitatively identical to lower animals, but quantitatively superior.
  • Rationalism: Organ that evolved to acquire knowledge by making sense of the world. Uniquely human, qualitatively different from lower animals.
  • Cultural dialectic: Unique among species for developing language, tools, and education.

Nature of knowledge
  • Empiricism: Hierarchically organized associations that present an accurate but incomplete representation of the world. Assumes that the sum of the components of knowledge is the same as the whole. Because knowledge is accurately represented by components, one who demonstrates those components is presumed to know.
  • Rationalism: General and/or specific cognitive and conceptual structures, constructed by the mind and according to rational criteria. Essentially these are the higher-level structures that are constructed to assimilate new info to existing structure and as the structures accommodate more new info. Knowledge is represented by ability to solve new problems.
  • Cultural dialectic: Distributed across people, communities, and physical environment. Represents culture of community that continues to create it. To know means to be attuned to the constraints and affordances of systems in which activity occurs. Knowledge is represented in the regularities of successful activity.

Nature of learning (the process by which knowledge is increased or modified)
  • Empiricism: Forming and strengthening cognitive or S-R associations. Generation of knowledge by (1) exposure to pattern, (2) efficiently recognizing and responding to pattern, (3) recognizing patterns in other contexts.
  • Rationalism: Engaging in active process of making sense of ("rationalizing") the environment. Mind applying existing structure to new experience to rationalize it. You don't really learn the components, only structures needed to deal with those components later.
  • Cultural dialectic: Increasing ability to participate in a particular community of practice. Initiation into the life of a group, strengthening ability to participate by becoming attuned to constraints and affordances.

Features of authentic assessment
  • Empiricism: Assess knowledge components. Focus on mastery of many components and fluency. Use psychometrics to standardize.
  • Rationalism: Assess extended performance on new problems. Credit varieties of excellence.
  • Cultural dialectic: Assess participation in inquiry and social practices of learning (e.g. portfolios, observations). Students should participate in assessment process. Assessments should be integrated into larger environment.


Concerns over how best to apply assessment practices across public school systems have largely focused on the use of high-stakes testing and standardized tests, which are often used to gauge student progress, teacher quality, and school-, district-, or state-wide educational success.

No Child Left Behind

For most researchers and practitioners, the question is not whether tests should be administered at all. There is a general consensus that, when administered well, tests can offer useful information about student progress and curriculum implementation, as well as serving formative purposes for learners. The real issue, then, is whether testing practices as currently implemented can provide these services for educators and students.

In the U.S., the No Child Left Behind Act mandates standardized testing nationwide. These tests align with state curriculum and link teacher, student, district, and state accountability to the results of these tests. Proponents of NCLB argue that it offers a tangible method of gauging educational success, holding teachers and schools accountable for failing scores, and closing the achievement gap across class and ethnicity.

Opponents of standardized testing dispute these claims, arguing that holding educators accountable for test results leads to the practice of "teaching to the test." Additionally, many argue that the focus on standardized testing encourages teachers to equip students with a narrow set of skills that enhance test performance without actually fostering a deeper understanding of subject matter or key principles within a knowledge domain.

High-stakes testing

The assessments that have caused the most controversy in the U.S. are high school graduation examinations, which are used to deny diplomas to students who have attended high school for four years but cannot demonstrate that they have learned the required material. Opponents say that no student who has put in four years of seat time should be denied a high school diploma merely for repeatedly failing a test, or even for not knowing the required material.

High-stakes tests have been blamed for causing sickness and test anxiety in students and teachers, and for leading teachers to narrow the curriculum toward what they believe will be tested. In an exercise designed to make children comfortable about testing, a Spokane, Washington newspaper published a picture of a monster that feeds on fear. The published image is purportedly the response of a student who was asked to draw a picture of what she thought of the state assessment.

Other critics, such as Washington State University's Don Orlich, question the use of test items far beyond standard cognitive levels for students' age.

Compared to portfolio assessments, simple multiple-choice tests are much less expensive, less prone to disagreement between scorers, and can be scored quickly enough to be returned before the end of the school year. Standardized tests (all students take the same test under the same conditions) often use multiple-choice tests for these reasons. Orlich criticizes the use of expensive, holistically graded tests, rather than inexpensive multiple-choice "bubble tests", to measure the quality of both the system and individuals for very large numbers of students. Other prominent critics of high-stakes testing include FairTest and Alfie Kohn.

The use of IQ tests has been banned in some states for educational decisions, and norm-referenced tests, which rank students from "best" to "worst", have been criticized for bias against minorities. Most education officials support criterion-referenced tests (each individual student's score depends solely on whether he answered the questions correctly, regardless of whether his neighbors did better or worse) for making high-stakes decisions.
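
The difference between the two scoring approaches can be illustrated with a short sketch. The student names, scores, cut score, and function names below are hypothetical and chosen only for illustration; the percentile-rank formula is one simple convention among several, not a description of how any particular state test is scored.

```python
from typing import Dict


def criterion_referenced(scores: Dict[str, int], cut_score: int) -> Dict[str, bool]:
    """Pass/fail depends only on a student's own score versus a fixed cut score."""
    return {name: score >= cut_score for name, score in scores.items()}


def norm_referenced(scores: Dict[str, int]) -> Dict[str, float]:
    """Percentile rank: percentage of the group scoring strictly below each student."""
    values = list(scores.values())
    n = len(values)
    return {
        name: 100.0 * sum(v < score for v in values) / n
        for name, score in scores.items()
    }


# Hypothetical class results (assumed data, not taken from any real assessment).
scores = {"Ana": 82, "Ben": 74, "Caro": 91, "Dev": 74}
passed = criterion_referenced(scores, cut_score=75)
ranks = norm_referenced(scores)

# Raise one neighbor's score and rescore: Ana's own score is unchanged.
changed = dict(scores, Ben=90)
passed_after = criterion_referenced(changed, cut_score=75)
ranks_after = norm_referenced(changed)
```

In this sketch Ana's criterion-referenced result is the same before and after Ben's score changes, while her norm-referenced rank drops, which is the property the paragraph above describes.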

21st century assessment

It has been widely noted that with the emergence of social media and Web 2.0 technologies and mindsets, learning is increasingly collaborative and knowledge increasingly distributed across many members of a learning community. Traditional assessment practices, however, focus in large part on the individual and fail to account for knowledge-building and learning in context. As researchers in the field of assessment consider the cultural shifts that arise from the emergence of a more participatory culture, they will need to find new methods of applying assessments to learners.

Assessment in a democratic school

Schools that follow the Sudbury model of democratic education do not perform or offer assessments, evaluations, transcripts, or recommendations. They assert that they do not rate people and that a school is not a judge; comparing students to each other, or to some set standard, is for them a violation of the student's right to privacy and to self-determination. Students decide for themselves how to measure their progress as self-starting learners through a process of self-evaluation, which these schools hold to be real lifelong learning and the proper educational assessment for the 21st century.

According to Sudbury schools, this policy does not cause harm to their students as they move on to life outside the school. They admit that it makes the process more difficult, but maintain that such hardship is part of the students learning to make their own way, set their own standards, and meet their own goals.

The no-grading and no-rating policy helps to create an atmosphere free of competition among students or battles for adult approval, and encourages a positive cooperative environment amongst the student body.

The final stage of a Sudbury education, should the student choose to take it, is the graduation thesis. Each student writes on the topic of how they have prepared themselves for adulthood and for entering the community at large. This thesis is submitted to the Assembly, which reviews it. The thesis process ends with an oral defense in which the student opens the floor to questions, challenges, and comments from all Assembly members. At the end, the Assembly votes by secret ballot on whether or not to award a diploma.

See also

  • Adaptive comparative judgement
  • Computer aided assessment
  • Concept inventory
  • Confidence-Based Learning measures a learner's knowledge quality by assessing both the correctness of the learner's knowledge and the learner's confidence in that knowledge.
  • E-scape, a technology and approach that looks specifically at the assessment of creativity and collaboration.
  • Educational evaluation deals specifically with evaluation as it applies to an educational setting. For example, it may be used in the No Child Left Behind (NCLB) program instituted by the U.S. government.
  • Electronic portfolio is a personal digital record containing information such as a collection of artifacts or evidence demonstrating what one knows and can do.
  • Evaluation is the process of looking at what is being assessed to make sure the right areas are being considered.
  • Grading is the process of assigning a (possibly mutually exclusive) ranking to learners.
  • Health Impact Assessment looks at the potential health impacts of policies, programs and projects.
  • Measurement is a process of assessment or evaluation in which the objective is to quantify the level of attainment or competence within a specified domain. See the Rasch model for measurement for elaboration on the conceptual requirements of such processes, including those pertaining to grading and the use of raw scores from assessments.
  • Program evaluation is essentially a set of philosophies and techniques to determine if a program "works".
  • Psychometrics, the science of measuring psychological characteristics.
  • Rubrics for assessment
  • Science, Technology, Society and Environment Education
  • Social Impact Assessment looks at the possible social impacts of proposed new infrastructure projects, natural resource projects, or development activities.
  • Standardized testing is any test that is used across a variety of schools or other situations.
  • Standards-based assessment
  • Robert E. Stake is an educational researcher in the field of curriculum assessments.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.