Teachers’ practices of non-standardised oral exams and implications for validity
Abstract
In educational assessment, the methods applied must ensure high validity (the feasible gathering and use of assessment outcomes), especially in situations where the stakes are high and the assessment results have serious implications for students. However, all too little of the existing literature offers insights into teachers’ practices of non-standardised oral exams and how these practices might affect validity – a gap that has been recognised in both research and policy. Therefore, this study seeks to contribute new knowledge to the field of educational assessment through its analysis of a unique data sample consisting of video recordings of authentic oral exams conducted in four secondary schools in Norway. Each oral exam was carried out by two teachers: one internal examiner (the student’s teacher) and one external examiner (a practising teacher with no previous knowledge of the student). The current study was guided by an overarching research question: What are the characteristics of teachers’ practices of non-standardised oral exams, and how do the practices affect validity?
The research question is explored through three sub-studies, each with a different focal point involving various aspects of teachers’ oral exam practices and subsequent implications for validity. The first sub-study investigated the organisation of oral exams in terms of time use and the activities to which students and examiners orient. Also examined were the parts of the subject curriculum assessed and how different practices might impact validity and fairness. The focus of Sub-study II was on teachers’ questions during oral exams, including analysis of the themes and cognitive levels of the questions. In the final sub-study, the focal point was grading conversations in which co-assessors had to reach an agreement on what grades to award students.
This thesis follows a qualitative approach. Thematic analysis and content analysis were applied to video recordings and transcripts of 36 authentic oral exams in the subject of Norwegian Language and Literature in two lower and two upper secondary schools in Norway. The findings from the sub-studies reveal that the teachers’ practices varied extensively, for example, in relation to time use, what was assessed, and the number of questions posed. The findings also show that the oral exams comprised five clearly sequenced activity phases and that most questions were asked by the internal examiner. When analysing these questions, themes were often easier to identify than cognitive levels. Furthermore, the verbs used to describe the learning outcomes in the curriculum were seldom included in the question formulations. Finally, the analysis of the conversations between the teachers co-assessing the exams demonstrated that oral exams may, as boundary objects, function across contexts despite variations in practices and in how students are assessed, thus allowing examiners who take different roles to come to a consensus regarding what grade to award each student.
Discussing the findings in light of the works of Messick (1989) and Crooks et al. (1996) raises several concerns about using grades from non-standardised oral exams for high-stakes purposes. Furthermore, the oral exams studied differed markedly from oral exams that are planned, practised, and validated based on principles of standardisation. This implies that the validity chain (Crooks et al., 1996), which was developed within an assessment culture heavily inspired by principles of standardisation, does not include guidance for validating many of the characteristic features of non-standardised oral exams. However, considering the findings of this study in light of the quality criteria applied in qualitative research (Tracy, 2010) suggests the possibility of a supplementary perspective.
The findings from this thesis provide an important empirical contribution to the knowledge base and discourse on the validity of non-standardised oral exams in secondary education. Much of the existing research on oral assessments is based on self-reported data or quantitative methods. In contrast, the current study is qualitative and involves video recordings of authentic oral exams, thus addressing a long-acknowledged gap in the literature. Theoretically, this study contributes to the field of educational assessment by confirming that a unitary view of validity and the validity chain may be insufficient for investigating validity issues related to non-standardised assessments. The thesis also suggests applying a supplementary approach to validating non-standardised oral exams.
Stakeholders across educational fields are concerned with high-validity assessment methods; thus, these findings are essential for understanding and furthering the development of oral exam practices in secondary education in Norway. Lastly, the study’s relevance is bolstered by its discussion of oral assessment practices, national assessment systems, and validity-related issues that extend beyond both the Norwegian context and secondary education.
Has parts
Article 1: Syverud, Marte S. (under review). Oral exams in four Norwegian secondary schools – characteristics and variations in practice and possible threats to validity and fairness. Assessment in Education: Principles, Policy & Practice, minor revisions submitted April 30, 2024
Article 2: Syverud, Marte S. (under review). Questions and tasks during non-standardised oral exams – construct, assessed domain, and possible implications for validity. Educational Assessment, Evaluation and Accountability, submitted July 14, 2023
Article 3: Syverud, Marte S. & Prøitz, Tine S. (under review). Co-assessment at the boundary – judges and advocates grading student’s performance in oral exams. Education Inquiry, submitted March 18, 2024