Evaluation of the effectiveness of progressive disclosure questions as an assessment tool for knowledge and skills in a problem based learning setting among third year medical students at The University of The West Indies, Trinidad and Tobago

Background At the University of the West Indies, Trinidad and Tobago, third year undergraduate teaching is a hybrid of problem-based learning (PBL) and didactic lectures. PBL discourages students from simply acquiring basic factual knowledge and instead encourages them to integrate these basic facts with clinical knowledge and skills. Recently, progressive disclosure questions (PDQs), also known as modified essay questions (MEQs), were introduced as an assessment tool, which is reported to be in keeping with the PBL philosophy. Objective To describe the effectiveness of the PDQ as an assessment tool in a course that integrates the sub-specialties of Anatomical Pathology, Chemical Pathology, Haematology, Immunology, Microbiology, Pharmacology and Public Health. Methods A descriptive analysis of the examination questions in PDQs, and of the students’ performance in these examinations, was performed for the academic years 2011–2012, 2012–2013 and 2013–2014 in one third-year course that integrates Anatomical Pathology, Chemical Pathology, Haematology, Immunology, Microbiology, Pharmacology and Public Health. Results The PDQs reflected real life scenarios and were composed of questions of different levels of difficulty by Bloom's Taxonomy, from basic recall through to more difficult questions requiring analytical, interpretative and problem solving skills. In the integrated PDQs for 2011–2012, 2012–2013 and 2013–2014, simple recall of facts accounted for 52.9, 52.5 and 58 % of the questions, respectively. By sub-specialty this ranged from 26.7 to 100 %, 18.8 to 70 %, and 23.1 to 100 % in the 3 years, respectively. The rest required higher order cognitive skills. For some sub-specialties, students’ performance was better where the examination was mostly basic recall, and poorer where there were more higher-order questions. The sub-specialties contributed different percentages to the integrated examinations, ranging from 4 % for Public Health to 22.9 % for Anatomical Pathology.
Conclusion The PDQ asked students questions in an integrated fashion in keeping with the PBL process. More care should be taken to ensure appropriate questions are included in the examinations to assess higher order cognitive skills. However, in an integrated course, some sub-specialties may not have content requiring higher cognitive level questions in certain clinical cases. More care should be taken in choosing clinical cases that integrate all the sub-specialties.
Keywords: Assessment, Integration, Progressive disclosure questions, Problem-based learning


Background
Problem based learning (PBL) is presently one of the most accepted modes of curriculum delivery in medical education [1]. It discourages students from simply acquiring basic factual knowledge, and instead encourages them to integrate basic and clinical knowledge and skills [2]. An important difficulty with PBL is coming up with assessment modalities that are in keeping with the PBL philosophy [1]. Assessment modalities should always match whatever teaching format or method of content delivery is used, as well as whatever competencies are being learnt or acquired [1].
The multiple choice question (MCQ) examination is currently a widely accepted assessment modality and has been used for many years. However, some researchers have expressed concerns about this mode of assessment in a PBL setting. While the MCQ format examines a broader portion of the curriculum, Samy Azer [2] was concerned that standard MCQs generally assess factual or basic knowledge rather than deeper understanding of the content or the use of basic information. They often focus on the finer detail in textbooks rather than the cognitive skills emphasized by the PBL philosophy. Other authors disagree, arguing that well written MCQs do assess higher level cognitive skills, although creating such items requires more skill than writing basic-recall questions [3-5]. On the other hand, essay-type questions and free-response short answer questions (SAQs) may be easy to set and do call for deeper comprehension and analysis of content, but they are time consuming for both staff and students and are usually associated with marking discrepancies and variations [4].
Because of such concerns, some schools have introduced extended matching questions (EMQs) and others integrated clinical scenario (case cluster) MCQs, which have been shown to test analytical, problem solving and cognitive skills and the integration of knowledge [2, 5-7]. The Modified Essay Question (MEQ) examination (also known as Progressive Disclosure Questions, PDQ) was developed as a compromise between the multiple choice question and essay types of examination [4]. The PDQ features an evolving case scenario and thus tests the candidate's problem solving and reasoning ability rather than mere factual recall, which is in keeping with the PBL philosophy [4]. Different authors have recorded different experiences with PDQs. Palmer et al. [8] reported that while MEQs (PDQs) are easier to set than MCQs, they showed some discrepancies in marking among examiners compared with MCQs, and they tested more lower-order Bloom's Taxonomy cognitive skills than MCQs. They also highlighted issues of "sampling" with MEQs (PDQs), whereas MCQs examined more of the content in the curriculum. They did note, though, that reliability was higher in longer than in shorter examinations. Similarly, Moeen-uz-Zafar-Khan et al. [9], in their study on undergraduate medicine examinations, concluded that well-constructed MCQs were superior to MEQs (PDQs) in testing higher order cognitive skills in a PBL setting. They showed that questions testing higher level cognitive skills (problem solving skills) constituted only 40 % of MEQs (PDQs) compared with 60 % of MCQs. On the other hand, they appreciated that MEQs (PDQs) force students to think and construct their own answers, and thus test their writing skills too, whereas in MCQs students choose an answer from the options provided, which may sometimes just encourage them to "recognize" correct answers rather than work through the information.
This paper analyzes the use of the newly introduced PDQs as an assessment method for a third year medical students' course in Para-clinical Sciences. Para-clinical Sciences bridge the gap between the pre-clinical and the clinical years. Students study, in an integrated way, the sub-specialties of Anatomical Pathology, Chemical Pathology, Haematology, Immunology, Microbiology, Pharmacology and Public Health (Table 1).
Teaching is systems based. It is a hybrid of didactic lectures and PBL, the latter following more of a "guided discovery" than an "open discovery" approach. PBL problems are developed collectively by the staff of the department of Para-clinical Sciences, with contributions from all the sub-specialties. Students generate their own learning objectives, and tutors (facilitators) guide them and ensure that the learning objectives set by the developers are covered completely. (These objectives are not made available to the students until they have completed their own student-directed learning process. The tutor guides the students towards the objectives they missed.) This guidance is important since students develop learning objectives based on what they themselves think is important [10] and may miss some important content. All examinations are set by the same developers based on the course content and learning objectives.
In the PDQ case scenario, clinical information is disclosed progressively, and questions are asked at each stage of development by the different sub-specialties. Students are tested on their ability to explain and describe pathological processes, sequentially and logically solve a clinical problem, request investigations, interpret the results of investigations, design therapeutic plans, predict side effects of management, and suggest methods of preventing these side effects and of managing them should they occur. All of this integrates the sub-specialties and is in keeping with the PBL process. Thus the PDQ should be a suitable assessment tool for PBL. The aim of this study was to describe the effectiveness of the PDQ as an assessment modality, in assessing knowledge and cognitive skills, among these third year medical students in the selected course. In the past, each sub-specialty examined the students independently of the others in all assessments. Examination papers had separate sections for each sub-specialty with no integration.

Methods
A descriptive analysis of the examination questions in the PDQ examinations and of the students' performance in these examinations was performed for the course Applied Para-clinical Sciences III (APS-III) for the academic years 2011-2012, 2012-2013 and 2013-2014. Examination questions were assigned a difficulty level by the authors, based on the level of Bloom's Taxonomy [11] objectives that the questions required of the students. Bloom's Taxonomy was modified into four levels according to what the students were being asked to do: Level I, basic recall of simple facts, where the instructional verb was "list/name"; Level II, recall of more difficult facts and comprehension, where the instructional verb was "explain/describe" (e.g. concepts, mechanisms, pathogenesis); Level III, comprehension and application of basic facts to the clinical scenario; and Level IV, problem solving and interpretation, e.g. of sets of results or a clinical presentation, with suggestions for further investigations or management. A chi-square (χ²) test of equality for the percentage of questions at each level in the combined papers was used to assess the significance of the differences in distribution across the four levels I, II, III and IV.
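The χ² test of equality described above can be illustrated with a minimal sketch. It assumes the test is a goodness-of-fit test against equal expected frequencies across the four levels, which is our reading of "test of equality" rather than a detail stated in the paper. The question counts are hypothetical and are not the study's data; the function name and the hard-coded critical value for 3 degrees of freedom at the 0.01 level are introduced here for illustration only.

```python
# Critical value of the chi-square distribution for df = 3 at alpha = 0.01
CHI2_CRIT_DF3_ALPHA_01 = 11.345


def chi_square_equality(observed):
    """Chi-square statistic for H0: all categories are equally frequent."""
    total = sum(observed)
    expected = total / len(observed)  # same expected count in every category
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical numbers of questions at Levels I, II, III and IV
# (illustrative only, NOT taken from the study's tables).
counts = [36, 12, 14, 8]

stat = chi_square_equality(counts)
significant = stat > CHI2_CRIT_DF3_ALPHA_01
print(f"chi2 = {stat:.2f}, df = {len(counts) - 1}, "
      f"significant at 0.01: {significant}")
```

With counts this heavily weighted towards Level I, the statistic exceeds the critical value, so the hypothesis of an equal spread across the four levels would be rejected at the 0.01 level.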

Results
The amount of examination content (percentage contribution) in the PDQ examinations varied by sub-specialty. This is reflected in the maximum possible scores per sub-specialty (Tables 2, 3 and 4). Questions were spread across all four levels of Bloom's Taxonomy, both by sub-specialty and for the overall combined integrated papers. For the combined integrated examinations, the questions consisted mostly of basic recall of simple facts, i.e. 52.9, 52.5 and 58 % respectively in the three years (Tables 2, 3 and 4). The calculated χ² tests of equality across the four levels (I, II, III and IV) were significant at the 0.01 level in all three years. By sub-specialty the Level I contribution ranged from 26.7 to 100 %, 18.8 to 70 %, and 23.1 to 100 % in the 3 years, respectively.

For some sub-specialties in 2011-2012 (Table 2) with a higher proportion of Level I questions (e.g. Microbiology, 85.7 %, and Chemical Pathology, 100 %), more students obtained a passing score (99 and 92.6 %, respectively). However, in the same year Pharmacology had a higher percentage of Level III questions (26.7 %), and only 13.8 % of the students passed the Pharmacology component. In Anatomical Pathology 37.5 % of the questions were Level III and 12.5 % were Level IV, and 25.6 % of the students passed Anatomical Pathology. The trend was different in Haematology, where 26.7 % of the questions were Level III and 20 % were Level IV, and 83.7 % of the students passed the Haematology component.
In the following year, 2012-2013 (Table 3), Microbiology again had a higher Level I content (70 %), with a higher percentage of students obtaining a passing score (97.4 %). In Pharmacology 37.5 % of the questions were Level III and 18.8 % were Level IV, yet 71 % of the students passed. In 2013-2014 (Table 4), 100 % of the questions in Public Health were Level I and 99.5 % of the students passed. In Pharmacology, with only 23.1 % Level I questions, only 46.3 % of the students passed the Pharmacology component. Figures 1, 2 and 3 show graphically the percentage contributions of the different sub-specialties in terms of the four cognitive levels I, II, III and IV.

Discussion
With reference to MCQs, authors recommend a wide range of difficulties in examination questions, from easy through average to difficult [3,12]. The Medical Council of Canada, 2010 [13], recommends a difficulty index (p) of between 0.2 and 0.9. Kartik et al. [14] used a difficulty index (p) of <30 % or >70 % as acceptable. In this study the difficulty indices were not calculated; however, the PDQ questions were spread through all levels of difficulty by Bloom's Taxonomy, from Level I to Level IV. Nevertheless, a substantial percentage of the examination content (52-58 %) required basic recall of simple facts, in some sub-specialties more so than in others. This is similar to what was shown by Palmer et al. [8] and Moeen-uz-Zafar-Khan et al. [9]. Well-constructed questions may be designed to test particular levels of Bloom's Taxonomy in MEQs (PDQs), as is possible with MCQs. Some specialties do tend to stress higher level cognitive skills. Moeen-uz-Zafar-Khan's team [9] showed that cardiology had mostly high cognitive level questions (mostly Level III Bloom's Taxonomy skills) in comparison to other medical specialties.
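Although difficulty indices were not calculated in this study, the calculation itself is simple and can be sketched as follows. The sketch assumes the conventional definition of the difficulty index p as the proportion of examinees who answer an item correctly, and checks each item against the 0.2-0.9 range cited above. The function names, item labels and responses are hypothetical and are not data from this study.

```python
def difficulty_index(item_responses):
    """Return p, the proportion of examinees who answered the item correctly."""
    return sum(item_responses) / len(item_responses)


def in_recommended_range(p, low=0.2, high=0.9):
    """Check p against the 0.2-0.9 range cited in the text."""
    return low <= p <= high


# Hypothetical responses (1 = correct, 0 = incorrect); NOT data from this study.
items = {
    "item_A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # every examinee correct
    "item_B": [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],  # 6 of 10 correct
    "item_C": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # 1 of 10 correct
}

for name, responses in items.items():
    p = difficulty_index(responses)
    print(f"{name}: p = {p:.2f}, within 0.2-0.9: {in_recommended_range(p)}")
```

For PDQ items marked with partial credit, an analogous quantity would be the mean fraction of the available marks obtained; that extension is an assumption of this sketch, not something the study reports.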
In this study, students performed better in some of the sub-specialties with more Level I questions than in those with more higher level questions. More practice with higher order questions is encouraged for all students, in all sub-specialties. Indeed, one major emphasis in medical education is to develop students' problem solving skills, since practicing doctors spend a great deal of time assessing and solving patients' clinical problems [9].
In this study a possible contributing explanation for the high percentage of Level I questions could be the fact that only one clinical case was developed in each PDQ examination. Not all sub-specialties may have relevant higher order objectives pertaining to the one case. This is similar to the learning objectives generated for the cases used in the PBL process. Each sub-specialty initiates one PBL problem, which is then circulated for input from all other sub-specialties for development of content and learning objectives. Some sub-specialties may not have higher order objectives for some case scenarios. For example, one sub-specialty may ask students simply to list the risk factors of a certain condition, and in the progression of the PDQ another sub-specialty may ask students to interpret a set of results, or ask about the mechanism of action of the drug used to treat the condition being discussed. Clearly these require different levels of thinking from students, but they do reflect real life situations. This, though, also raises the problem of sampling in PDQs (MEQs). With MCQs, in comparison, more course content can be tested. One possible solution in this setting may be to develop two cases in the examination, although this would make the examination longer for the students to write and for the teachers to mark, resulting in delayed feedback to the students. According to Palmer et al., in their study in 2010 [8], reliability was higher with longer examinations than with short examinations. They showed that in a 3 h examination, the Cronbach alpha reliability was 0.84 for both MCQs and MEQs. Another possible contributing factor for the high percentage of Level I questions, besides the different content, may be that in the integrated examination some sub-specialties are simply more advanced in developing PDQ questions. PDQs must be properly constructed [4]. Construction of these questions, and of their model answers, is not a simple task and requires expertise and training [9]. In assessments, other factors besides course content are also important, including human resources and time constraints. On the question of time, the time spent should be long enough for the assessment to be efficient and productive and to achieve its purposes [5].
Palmer's team [8] noted that there may be an underrepresentation of some sub-specialties and an overrepresentation of others in an MEQ (PDQ) examination. This is true in this study too, as shown by the different maximum possible scores in the different sub-specialties. This also speaks to the fact that some sub-specialties may not have relevant content and learning objectives for a given clinical scenario. Similar findings were reported by Moeen-uz-Zafar-Khan's team [9], who showed a higher representation of cardiology compared to other medical specialties. If certain content is not being covered in the chosen PBL cases, curriculum developers have to make a special effort to find cases that will cover all the relevant content [15].
In this study, no comparison was made between the newly introduced PDQ and the older assessment used (essays/SAQ). Wilkinson et al. in 2004 [16], in a study comparing older methods (essay type examinations) with newer methods (PDQs, EMQs and MCQs), showed that the newer methods of undergraduate assessment predicted subsequent performance significantly better than the older methods. In an earlier study, the authors of this paper showed high correlations between the PDQ and the final end of course examinations [17], higher than with MCQs and EMQs.
The PDQ examination encourages reflection and analysis by students. Thus it may be used as a formative or summative method of assessment [10]. In this study the PDQ is used as an in course/continuous assessment (CA) (formative). As it is a CA, it should help direct students to study harder for the final examinations; indeed, it has been said that assessments should motivate students [18]. The timely feedback given should help to improve their knowledge and skills before the final examinations, so that, according to Diane Campbell, the assessment achieves its purpose [5].
The Royal College of General Practitioners [19] also agrees that MEQs (PDQs) are a valuable learning or instructive experience, since they are constructed from real clinical situations; they can even be used as a teaching method. In the department of Para-clinical Sciences, where teaching and examinations integrate all the pathology sub-specialties, the PDQ gives a more complete picture of real life clinical cases. PDQs require students to solve clinical problems logically and systematically, which will be helpful when they become junior doctors. Because real life cases are used, the PBL process directs the students towards the challenges they will face as junior doctors, and thus provides relevance and motivation for learning. The clinical cases give the students important points to focus on and help them realize how to integrate the large amount of information from the many specialties. All of this is important for easier recall of information, which is needed for application in real clinical problems [15]. Similarly, the multispecialty, integrated PDQ requires students to apply relevant information and reinforces the focus on important points in clinical cases. Construction of MEQs (PDQs) can be difficult [9]. In their paper in 2011, Moeen-uz-Zafar-Khan and his team showed 16 % of their MEQ questions to have "item-writing flaws" [9]. However, PDQs are easier to construct than MCQs and, with clear marking schemes, the marking is easier than with essays/SAQs. In the department of Para-clinical Sciences, the same team that develops and agrees on the PDQ examination also reviews, agrees on and approves the marking schemes. Furthermore, the marking is sub-specialty based (each sub-specialty marks only its own section of the integrated examination), and the system of "table-marking" is used. This minimizes the "item-writing flaws" and marking discrepancies. In some centres the marking discrepancies are minimized by having each examination script reviewed by multiple markers. However, double marking, for example, is expensive in terms of the time needed by the examiners [20], and in this setting would delay feedback to the students.
The final examination in this third year setting is in MCQ/EMQ format (Table 1). The combination of the three formats (PDQ, MCQ and EMQ) assesses both depth and breadth of the curriculum. Furthermore, it is believed that using different formats of assessment helps students, as they may have different strengths in certain formats. A combination of different assessment modalities results in reliable and valid evaluation of students [9].

Limitations
The students' performance in the older assessment modalities (essays/SAQ) was not compared with their performance in the PDQ in this paper. However, in an analysis of students' perceptions of the newly introduced PDQ [21], only 10.6 % of the surveyed students said that the PDQ was a poor method of assessment. About 85 % said it was fair, good or excellent (combined), and 75 % said that they would like the PDQ to remain as a continuous/in course assessment (formative). "Made me think" and "Good way of assessing… Put the student in a hospital setting" were among the selected students' comments that the authors reported.
The PDQ examination questions were analyzed in terms of Bloom's Taxonomy levels of difficulty. However, the "difficulty indices" were not calculated to determine whether the higher Bloom's Taxonomy level questions were invariably more "difficult" for all types of students. Nevertheless, Tables 2, 3 and 4 show that for some of the sub-specialties, students' performance was better when the examinations contained higher percentages of lower level questions than when they contained more higher cognitive level questions.
Other factors may indeed account for the differences in students' performance between the sub-specialties. One question that has been asked is whether students in an integrated examination choose to study and concentrate on some sub-specialties at the expense of others [22].

Conclusions
Introduction of the PDQ examination presented an opportunity for an integrated multi-specialty assessment in the Para-clinical Sciences. The PDQ examinations consisted of questions of all levels of difficulty, though the majority were Level I. Students performed better on the lower cognitive level questions across sub-specialties. More questions of higher cognitive levels should be encouraged across sub-specialties. Perhaps more than one clinical case could be developed to ensure that all sub-specialties have a chance to contribute relevant content and to develop higher order questions.
Assessments are useful for overall teaching-programme evaluation. This study also gives an opportunity to review PBL cases and learning objectives and assess them for levels of difficulty as per Bloom's taxonomy.