Measurement, assessment, and evaluation mean very different things, and yet most of my students are unable to adequately explain the differences. So, in keeping with the ADPRIMA approach to explaining things in as straightforward and meaningful a way as possible, here are what I think are useful descriptions of these three fundamental terms.
Measurement refers to the process by which the attributes or dimensions of some physical object are determined. One exception seems to be the use of the word "measure" in determining a person's IQ; the phrase "this test measures IQ" is commonly used, and measuring such things as attitudes or preferences also applies. However, when we measure, we generally use some standard instrument to determine how big, tall, heavy, voluminous, hot, cold, fast, or straight something actually is. Standard instruments include rulers, scales, thermometers, pressure gauges, and the like. We measure to obtain information about what is. Such information may or may not be useful, depending on the accuracy of the instruments we use and our skill at using them. Few instruments in the social sciences approach the validity and reliability of, say, a 12" ruler. We measure how big a classroom is in terms of square feet, we measure the temperature of the room by using a thermometer, and we use a multimeter to determine the voltage, amperage, and resistance in a circuit. In all of these examples, we are not assessing anything; we are simply collecting information relative to some established rule or standard. Assessment is therefore quite different from measurement, and has uses that suggest very different purposes. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb measure is: To apply a standard scale or measuring device to an object, series of objects, events, or conditions, according to practices accepted by those who are skilled in the use of the device or scale.
Assessment is a process by which information is obtained relative to some known objective or goal. We assess at the end of a lesson or unit. We assess progress at the end of a school year through testing, and we assess verbal and quantitative skills through such instruments as the SAT and GRE. Whether implicit or explicit, assessment is most usefully connected to some goal or objective for which the assessment is designed. An assessment is another way of saying a test. A test or assessment yields information relative to an objective or goal; in that sense, we test or assess to determine whether an objective or goal has been attained. Assessment of skill attainment is rather straightforward: either the skill exists at some acceptable level or it doesn't, and skills are readily demonstrable. Assessment of understanding is much more difficult and complex. Skills can be practiced; understandings cannot. We can assess a person's knowledge in a variety of ways, but there is always a leap, an inference we make about what a person's performance signifies about what he knows. In the section of this site on behavioral verbs, to assess means: To stipulate the conditions by which the behavior specified in an objective may be ascertained. Such stipulations are usually in the form of written descriptions.
Evaluation is perhaps the most complex and least understood of the terms. Inherent in the idea of evaluation is "value." When we evaluate, we are engaging in some process designed to provide information that will help us make a judgment about a given situation. Generally, any evaluation process requires information about the situation in question. A situation is an umbrella term that takes into account such ideas as objectives, goals, standards, procedures, and so on. When we evaluate, we are saying that the process will yield information regarding the worthiness, appropriateness, goodness, validity, legality, etc., of something for which a reliable measurement or assessment has been made. For example, I often tell my students that if they wanted to determine the temperature of the classroom, they would need to get a thermometer, take several readings at different spots, and perhaps average the readings. That is simple measuring. The average temperature tells us nothing about whether or not the room is appropriate for learning; to determine that, students would have to be polled in some reliable and valid way. That polling process is what evaluation is all about. A classroom average temperature of 75 degrees is simply information. It is the context of the temperature for a particular purpose that provides the criteria for evaluation: a temperature of 75 degrees may not be very good for some students, while for others it is ideal for learning. We evaluate every day. Teachers, in particular, are constantly evaluating students, and such evaluations are usually done in the context of comparisons between what was intended (learning, progress, behavior) and what was obtained. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb evaluate is: To classify objects, situations, people, conditions, etc., according to defined criteria of quality.
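The thermometer example above can be sketched in a few lines of code: averaging readings is measurement (information only), while comparing the result against a criterion is evaluation. The comfort range used here is a hypothetical criterion, not one taken from the article.

```python
def measure_average_temp(readings):
    """Measurement: combine several thermometer readings into one value."""
    return sum(readings) / len(readings)

def evaluate_for_learning(avg_temp, comfort_range=(68, 76)):
    """Evaluation: judge the measured value against a criterion of quality."""
    low, high = comfort_range
    return low <= avg_temp <= high

# Readings (degrees Fahrenheit) taken at different spots in the classroom.
readings = [74, 75, 76, 75]
avg = measure_average_temp(readings)   # 75.0 -- simply information
print(avg)                             # measurement says nothing about "good"
print(evaluate_for_learning(avg))      # True under this (hypothetical) criterion
```

Note that the same measured value could evaluate differently under a different criterion, which is exactly the article's point: the criteria, not the number, carry the judgment.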
Indication of quality must be given in the defined criteria of each class category. Evaluation differs from general classification only in this respect.
To sum up, we measure distance, we assess learning, and we evaluate results in terms of some set of criteria. These three terms are certainly connected, but it is useful to think of them as separate but connected ideas and processes.
Here is a great link that offers different ideas about these three terms, with well-written explanations. Unfortunately, most information on the Internet concerning this topic amounts to little more than advertisements for services.
Assessment, measurement, research, and evaluation are part of the processes of science and issues related to each topic often overlap. Assessment refers to the collection of data to describe or better understand an issue, measurement is the process of quantifying assessment data, research refers to the use of data for the purpose of describing, predicting, and controlling as a means toward better understanding the phenomena under consideration, and evaluation refers to the comparison of data to a standard for the purpose of judging worth or quality. Assessment and/or measurement are done with respect to variables (phenomena that can take on more than one value or level). For example, the variable "gender" has the values or levels of male and female and data could be collected relative to this variable. Data on variables are normally collected by one or more of four methods: paper/pencil, systematic observation, participant observation, and clinical. Three types of research studies are normally performed: descriptive, correlational, and experimental.
Collecting data (assessment), quantifying that data (measurement), making judgments (evaluation), and developing understanding about the data (research) always raise issues of reliability and validity. Reliability attempts to answer concerns about the consistency of the information (data) collected, while validity focuses on accuracy or truth. The relationship between reliability and validity can be confusing because measurements (e.g., scores on tests, recorded statements about classroom behavior) can be reliable (consistent) without being valid (accurate or true). However, the reverse is not true: measurements cannot be valid without being reliable.
The same statement applies to findings from research studies. Findings may be reliable (consistent across studies) but not valid (accurate or true statements about relationships among "variables"); however, findings cannot be valid if they are not reliable. At a minimum, for an instrument to be reliable it must produce a consistent set of data each time it is used; for a research study to be reliable it should produce consistent results each time it is performed.
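The asymmetry described above (reliable-but-not-valid is possible; valid-but-not-reliable is not) can be illustrated with two simulated instruments. The readings, the true value, and the tolerance thresholds below are made-up assumptions for illustration only.

```python
import statistics

true_value = 100.0

biased_scale  = [104.9, 105.0, 105.1, 105.0]  # consistent, but 5 units off
erratic_scale = [90.0, 110.0, 95.0, 105.0]    # centered on truth, but inconsistent

def is_reliable(readings, tolerance=1.0):
    """Reliable: the readings agree closely with one another (low spread)."""
    return statistics.stdev(readings) < tolerance

def is_valid(readings, target, tolerance=1.0):
    """Valid: accurate -- and, per the text, an instrument cannot be valid
    unless it is first reliable, so reliability is a precondition here."""
    return (is_reliable(readings, tolerance)
            and abs(statistics.mean(readings) - target) < tolerance)

print(is_reliable(biased_scale))            # True:  consistent
print(is_valid(biased_scale, true_value))   # False: consistent but inaccurate
print(is_reliable(erratic_scale))           # False: inconsistent
print(is_valid(erratic_scale, true_value))  # False: cannot be valid if unreliable
```

The erratic scale's average happens to hit the true value, yet it still fails the validity check, mirroring the rule that validity presupposes reliability.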
ASSESSMENT, MEASUREMENT, EVALUATION & RESEARCH
Bill Huitt, John Hummel, and Dan Kaeck
Department of Psychology, Counseling & Guidance
Valdosta State University
KNOWLEDGE ASSESSMENT (EVALUATION) THEORY
KNOWLEDGE (skills or attitudes) ASSESSMENT = a systematic examination procedure, carried out through testing, whose goal is to establish desired characteristics and to gather evidence about the level and quality of the acquired knowledge (skills or attitudes).
The required characteristics (of knowledge, skills, or attitudes) are defined by instructional goals, so the type of examination depends on the educational goals. Goals clearly and concretely define what knowledge and which skills or attitudes a student should have at the end of the instructional process. Therefore, the selection of goals is the most important decision made by experts when planning a curriculum for a certain educational profile, and teachers should follow it! The content and range are chosen from three areas (see Bloom's Taxonomy):
1) the cognitive area of knowledge and understanding - knowledge is defined as a systematic overview of acquired and permanently memorized facts; cognitive knowledge is defined as knowledge related to mental ability or function;
2) the affective area of attitudes;
3) the psycho-motor area of skills.
10 Golden Rules for Writing Multiple Choice Questions
In a classical multiple choice question, a student must choose the correct answer from among several (optimally five) offered answers.
Multiple choice questions consist of three obligatory parts: the stem (the question itself), the key (the correct answer), and the distracters (the incorrect alternatives).
Writing a good exam question with multiple answers is a skill that usually comes with experience (often bitter :-) ). Feedback gathered through analysis of student answers ("item analysis") is very important for the authors of the test. There are several rules we can follow to improve the quality of this type of written examination.
1. Examine only the important facts!
2. Use simple language!
3. Make the questions brief and clear!
4. Form the questions correctly!
5. Take into consideration the independence of questions!
6. Offer uniform answers!
7. Avoid asking negative questions!
8. Avoid distracters in the form of "All the answers are correct" or "None of the answers is correct"!
9. Distracters must be significantly different from the right answer (key)!
10. Offer an appropriate number of distracters!
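The "item analysis" mentioned above, feedback gathered from student answers, is commonly summarized with two statistics per question: difficulty (the proportion of students answering correctly) and discrimination (how much better top scorers do on the item than bottom scorers). A minimal sketch, using a made-up response matrix:

```python
def item_difficulty(responses):
    """Proportion of students who answered the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

def item_discrimination(item_scores, total_scores, group_size):
    """Upper-minus-lower discrimination index: rank students by total test
    score, then compare the item's pass rate in the top group vs. the bottom
    group. Near zero (or negative) suggests a weak item or a bad distracter."""
    ranked = sorted(zip(total_scores, item_scores), reverse=True)
    top = [item for _, item in ranked[:group_size]]
    bottom = [item for _, item in ranked[-group_size:]]
    return (sum(top) - sum(bottom)) / group_size

# Hypothetical data: rows = students, columns = items, 1 = correct answer.
matrix = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
totals = [sum(row) for row in matrix]         # [3, 2, 1, 0]
item0 = [row[0] for row in matrix]            # [1, 1, 1, 0]
print(item_difficulty(item0))                 # 0.75 -- a fairly easy item
print(item_discrimination(item0, totals, 2))  # 0.5  -- top scorers do better
```

Running such numbers after each administration is one concrete way to apply the rules above: items that nearly everyone gets right (or wrong), or that fail to separate strong from weak students, are candidates for revision.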
Writing and taking MCTs
Testing & Assessment
Types of written questions
Multiple choice question - MCQ
Multiple response questions
True / False questions
Fill in the blank
There are two types of essays: restricted-response and extended-response.
What Does Research Say About Assessment?
Purposes of Assessment
Effects of Traditional Tests
Jill Kerper Mora, Ed.D.
San Diego State University
Characteristics of Good Assessment
Trends Stemming from the Behavioral to Cognitive Shift
Checklist for Excellence in Assessment
Glossary
Thorndike and Hagen (1986) define measurement as "the process of quantifying observations [or descriptions] about a quality or attribute of a thing or person" (p. 5).
The process of measurement involves three steps: identifying and defining the quality or attribute that is to be measured, determining a set of operations by which the attribute may be made manifest and perceivable, and establishing a set of procedures for translating observations into quantitative statements of degree or amount.
Methods of data collection
Data are generally collected through one or more of the following methods: paper/pencil, systematic observation, participant observation, and clinical.
Evaluation includes the process of making judgments about the value of data collected through observations and descriptions. It is closely related to the concept of assessment, which is defined as "the process of collecting, interpreting, and synthesizing information in order to make decisions" (Gage & Berliner, 1991, p. 568). It is generally agreed that it is better to base judging and decision making on quantitative data as much as possible.
There are a variety of issues related to measurement and evaluation that are relevant to classroom and school settings.
In general, a rubric is a scoring guide used in subjective assessments. A rubric implies that a rule defining the criteria of an assessment system is followed in evaluation. A rubric can be an explicit description of performance characteristics corresponding to a point on a rating scale; a scoring rubric makes explicit the expected qualities of performance on a rating scale, or the definition of a single scoring point on a scale.

Rubrics are explicit schemes for classifying products or behaviors into categories that vary along a continuum. They can be used to classify virtually any product or behavior, such as essays, research reports, portfolios, works of art, recitals, oral presentations, performances, and group activities. Judgments can be self-assessments by students, or judgments can be made by others, such as faculty, other students, or field-work supervisors. Rubrics can be used to provide formative feedback to students, to grade students, and/or to assess programs.
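A rubric of this kind, explicit categories along a continuum, can be sketched as a small lookup. The performance levels, descriptions, and cut points below are hypothetical, invented purely to illustrate the structure:

```python
# Hypothetical four-level essay rubric: each level pairs a score band with an
# explicit description of performance characteristics.
RUBRIC = {
    4: "Exemplary: clear thesis, strong evidence, polished prose",
    3: "Proficient: clear thesis, evidence mostly supports it",
    2: "Developing: thesis present but evidence is thin",
    1: "Beginning: no clear thesis",
}

def classify(points_earned, points_possible):
    """Map a raw score onto the rubric's 1-4 performance levels
    using (hypothetical) percentage cut points."""
    ratio = points_earned / points_possible
    if ratio >= 0.9:
        level = 4
    elif ratio >= 0.75:
        level = 3
    elif ratio >= 0.5:
        level = 2
    else:
        level = 1
    return level, RUBRIC[level]

level, description = classify(31, 40)  # 77.5% falls in the level-3 band
print(level, description)
```

Because both the bands and the descriptions are written down in advance, two raters (or a student self-assessing) can apply the same rule and defend the resulting judgment, which is the point of using a rubric rather than an unstated impression.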
Rubrics have many strengths:
HCC Assessment Website - Information on HCC assessment endeavors.
Classroom Assessment Techniques - Techniques for better teaching and learning.
Classroom Assessment Examples - Five examples from the previous article.
Quizzes, Tests, and Exams - Types, Bloom bases, guidelines, construction.
Assessment is More than Keeping Score - Moving from inquiry, through interpretation, to action.
Test Item Bias Review - When decisions are made based on test scores, it is critical to avoid bias.
The Knowledge Survey - A Tool for All Reasons.
Portfolio Assessment - Using a collection of student work representing a selection of performance.
Student Passports - A formal document presenting student mastery of skills.
A Mid-Semester Survey - Use this simple survey to get feedback from your students.
BOOKS on ASSESSMENT