University of Groningenfounded in 1614  -  top 100 university

Things to consider when developing your assessment method

Choosing an appropriate assessment method for your course is an important first step in the development of assessment. But how do you ensure that what you develop is of high quality after you have made that initial determination?

It may be especially helpful to keep four important quality criteria in mind: validity, reliability, transparency, and efficiency. You can think of them as guiding principles. Ask yourself during every step of the assessment process whether there is anything you can do to maximize the quality of your assessment method regarding these criteria.

Below, each criterion is discussed (based on Downing & Haladyna, 1997), along with some concrete tips that can help improve your assessment.


Validity

Validity concerns the extent to which you measure what you intend to measure. Ensuring a high degree of validity thus starts at the very foundation of your course, by clearly establishing what it is that you intend to measure. In other words, what are the intended learning outcomes of your course?

Once you have established this, it is important that your assessment method(s) cover these learning outcomes (a concept called content validity). For example, if you have formulated four learning outcomes, then you do not want to create an assessment that targets only one of the four. In addition, you want to make sure that your chosen method can be used to assess the relevant levels of learning or skills (this is called construct validity). For example, if one of your learning outcomes states that students should be able to develop a research proposal by the end of the course (which pertains to their ability to create something), then you cannot assess this with an exam consisting of multiple-choice questions (which targets their ability to remember or apply information). Filling out a test grid can simplify the process of picking a suitable assessment method.

This same principle applies during later steps of the assessment cycle: your learning outcomes should be reflected in the concrete instructions, questions and grading criteria you develop. For example, the highest possible grade should indicate that a student has fully mastered the learning outcomes. Carefully developing grading criteria and rubrics can help to ensure that this is the case. The term we often use for this is constructive alignment; its principles are explained in our short video.


Reliability

Reliability concerns the degree to which chance plays a role in the determination of assessment results. Like validity, this is something that you can aim to maximize during various steps of the assessment cycle. For example, when developing exam questions, it is not only important that your questions pertain to your learning outcomes (which has to do with validity), but also that they are clearly formulated, free of grammatical errors, and unambiguous. Reliability is also a factor while an exam is taking place: you can minimize the role of chance by ensuring that students have enough time to work on their exam and by allowing them to complete it in a calm environment.

Lastly, reliability should be a major consideration during the grading process. For example, if multiple graders are grading an assignment, then it is important to make sure that they do so using the same standards. Developing clear, unambiguous grading criteria and/or rubrics, and discussing these ahead of time with all graders, can be of great value in this regard.


Transparency

Transparency concerns the extent to which you provide students with information regarding the aim, design and content of assessment. While it has become standard practice to inform students about the learning outcomes of a course, detailed information regarding assessment methods is shared less frequently. Doing so has benefits, however: when students know what is expected of them, their motivation and ability to reach the intended learning outcomes increase. For instance, it can be helpful to let them practice with the types of questions that you intend to include in your exam (this also helps improve reliability) and to inform them about the grading criteria for an assignment. Describing the utility of the assessment beyond the current course has also been shown to increase students’ motivation. For example, if your students are writing a research report, then it may be helpful to explain that this mirrors the work professionals in their field do, or that the skills they will develop when writing the report will be of value in future courses.


Efficiency

A final quality criterion is efficiency: the extent to which the invested time and resources are in proportion to the desired outcomes. While it is laudable to strive for pedagogically optimal assessment, this is simply not always feasible. For example, you may determine that the ideal assessment for your course would require students to submit an extensive written assignment. However, if over 500 students are enrolled in your course, the capacity to grade all these assignments may simply not exist, or the added pedagogical benefits that this assessment method would offer over another (e.g. an exam consisting of open-ended questions) may not outweigh the enormous investment of time and resources required. It is thus important to make a realistic cost-benefit calculation.

Finding a balance

It is important to note that it can be difficult to maximize all four criteria at once. For example, a realistic assessment (e.g. an internship or simulation) scores high on validity, but a high level of reliability is harder to achieve because many factors are out of your control. The opposite can be said for a multiple-choice exam: its reliability is high because most factors can be objectified, but its validity is limited to lower cognitive levels and its relevance may be less straightforward for a student. Therefore, it is important to find a balance across the criteria that suits your assessment. Striking the right balance can be simplified by using a test grid. This tool will help you determine which learning outcomes and levels of learning you primarily want to focus on. You can subsequently choose an assessment method that aligns with this, and then try to enhance its quality with regard to the four criteria.

Whom to contact?

Contact EDU Support or your faculty's Embedded Expert from ESI for tailored didactic advice on using these suggestions in your teaching. For technical assistance, please contact Nestorsupport.


Downing, S. M., & Haladyna, T. M. (1997). Test item development: Validity evidence from quality assurance procedures. Applied Measurement in Education, 10(1), 61-82.

Last modified: 13 December 03:42 PM