Abstract
Massive Open Online Courses appear to have high attrition rates, involve students in peer-assessment marked by patriotic bias, and largely serve people who are already educated. This paper suggests a formative assessment model which takes these issues into consideration. Specifically, this paper focuses on the assessment of open-format questions in Massive Open Online Courses. It describes the current assessment methods in Massive Open Online Courses and argues that self-assessment should replace peer-assessment as the sole method of formative assessment for the essays of xMOOCs.
Introduction: Assessment in Massive Open Online Courses
Massive Open Online Courses (MOOCs) have become increasingly popular over the last decade (Jordan, 2015). They are called 'massive' in relation to the number of registered students, and 'open' because the course content is free of charge. They can be accessed online, and they are courses in that they have a specific structure with defined material to be studied (Siemens, 2013).
This paper focuses on assessment in MOOCs. Assessment is an important topic to examine because it can be a powerful learning tool, even in the case of MOOCs. Boud and Falchikov (2007: 3) argued that 'assessment, rather than teaching, has a major influence on students' learning'. They maintained that assessment has an impact on what learners do and how they do it, whilst it can help them understand what they can or cannot do. Additionally, assessments can have positive or negative 'washback', meaning that they can affect students' learning and motivation. Therefore, the relationship between assessment and learning should be examined (Baird et al., 2017), since assessments can both measure and support learning. Furthermore, assessment itself can be considered a learning event: testing what has been learnt is more likely to lead to the retention of knowledge in memory than restudying the same material (Halamish and Bjork, 2011). For these reasons, this paper treats assessment as an important part of the learning process in MOOCs.
Since there are thousands of MOOCs, it is not possible to examine each course individually. This paper scrutinises assessment in xMOOCs, which are MOOCs 'offered in a traditional university model' (Siemens, 2013: 7). Two popular examples of platforms offering xMOOCs are Coursera and EdX. Some xMOOCs can be evaluated with multiple-choice items or computational activities, which can be automatically graded by a computer in a reliable way. However, other MOOCs use open-ended questions or essays as the preferred assessment method, since it is debatable whether multiple-choice items evaluate particular higher-order skills as effectively as open-response questions (Bennett et al., 1991; Hancock, 1994).
Undoubtedly, it is infeasible to involve tutors in assessing thousands of open-response assignments (Koller, 2012), since MOOCs are massive and hundreds to thousands of students can take each course (Siemens, 2013). For this reason, EdX uses a system called Automated Essay Scoring (Balfour, 2013). This system has been criticised for grading on the basis of superficial factors, for lacking the ability to recognise creativity, humour or sarcasm (Zhang, 2013), and for being unable to assess texts in the same way as human assessors (Balfour, 2013). It is therefore questionable to what extent computer-based assessment can evaluate students' assignments. Hence, there remains a need for human assessors.
Since it is infeasible to have the assignments of thousands of students corrected by university tutors, both Coursera (2016) and EdX (2016) use peer-assessment and self-assessment to evaluate the large number of assignments when a course demands written work. Specifically, Coursera uses a calibrated peer review assessment system. In calibrated peer review, students initially mark some assignments which have previously been marked by the instructor. The system checks the agreement between the instructor's and the student's grades and calibrates the grades given by the students. According to the extent of the agreement, each student is attributed a Reviewer Competency Index. Then, when students grade the assignments of their peers, each grade counts more or less towards the peer's final grade according to the grader's Reviewer Competency Index (Balfour, 2013).
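To illustrate the mechanism, the following is a minimal sketch of how such calibrated weighting could work. The term Reviewer Competency Index comes from Balfour (2013), but the agreement measure and weighting formula below are illustrative assumptions, not Coursera's actual (unpublished) algorithm.

```python
# A minimal sketch of calibrated peer review weighting. The agreement
# measure and weighting formula are illustrative assumptions, not
# Coursera's actual algorithm (Balfour, 2013).

def competency_index(reviewer_grades, instructor_grades, max_grade=10):
    """Index in [0, 1]: 1.0 means the reviewer matched the instructor
    exactly on the calibration assignments."""
    errors = [abs(r - i) for r, i in zip(reviewer_grades, instructor_grades)]
    return 1.0 - (sum(errors) / len(errors)) / max_grade

def calibrated_grade(peer_grades, peer_indices):
    """Weight each peer's grade by that peer's competency index."""
    return sum(g * w for g, w in zip(peer_grades, peer_indices)) / sum(peer_indices)

# A reviewer who closely matched the instructor on the calibration
# assignments carries more weight than one who did not.
idx_a = competency_index([8, 6, 9], [8, 7, 9])   # high agreement -> ~0.97
idx_b = competency_index([3, 9, 5], [8, 7, 9])   # low agreement  -> ~0.63
print(calibrated_grade([7, 4], [idx_a, idx_b]))  # pulled towards reviewer A's 7
```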
During peer-assessment, students grade the assignments of four or five of their peers using a rubric, and their own assignment is graded by the same number of peers. Some courses offer the option of self-assessment; students who self-assess their work do so using the same rubric, after having completed the peer-assessment. Nevertheless, in EdX, when students are both peer- and self-evaluated on open-response assignments, they are awarded only the peer grade, and the self-assessment is not taken into account (EdX, 2016). This implies that self-assessment is underestimated to some extent, even though, as long as the feedback remains formative, self-assessment could contribute equally.
Assessments in Coursera and EdX can be either summative or formative. Assessment for Learning, or formative assessment, is intended to support learning, while summative assessment is associated with grading, certification and accountability (Gardner et al., 2010). When users register for a course in Coursera or EdX, they can choose between two routes: they can either pursue a certification or simply audit the course. Thus, summative assessment here can be defined as the type of assessment focused on certification; in this case, learners are required to pay a fee to gain a certificate on both platforms. The option of auditing the course for free is also available. In this case, students can still complete the assessments to examine their level of knowledge, understand what they need to improve and promote their learning. This latter case constitutes formative assessment, and it is the type of assessment on which this paper focuses.
Specifically, this paper focuses on formative assessment because the second route corresponds to the needs of the majority of learners registered in MOOCs. Shrader et al. (2016) asked participants registered in different courses why they were taking these MOOCs. The study found that only a small percentage of participants (3.3%) were registered in MOOCs in order to gain a course certificate. The majority of participants reported the broadening of knowledge (65.6%) and curiosity and general interest in the topic (35.6%) as their prime motivations for taking a MOOC. Similarly, Salmon et al. (2017) found that the motivation of MOOC students is mostly intrinsic, and Barak, Watted and Haick (2016) reported high ratings of intrinsic motivation among the MOOC learners in their study. Therefore, the majority of students who participate in a MOOC appear not to pursue certification but are instead interested in learning the MOOC content. Hence, formative assessment, which supports students in evaluating their own learning, can be considered more meaningful for the majority of MOOC learners than summative assessment, which focuses on accreditation.
Finally, despite the high number of registered students, MOOCs appear to have high attrition rates and low engagement. Even though the success of a course should not necessarily be equated with completion (Pursel et al., 2016), these figures indicate that several aspects of MOOCs need improvement. This paper addresses these issues by examining the most suitable form of assessment for these courses. In the following sections, the reasons why peer-assessment in MOOCs should be abandoned, and how self-assessment could improve the MOOC experience, are discussed.
Peer-assessment and interaction
Peer-assessment does not appear to be implemented under ideal conditions in the case of MOOCs. A meta-analysis of peer-assessment articles published in the last 15 years (Li et al., 2016) found that peer-assessment correlates more strongly with teacher assessment when it is paper-based rather than computer-based. Moreover, the same meta-analysis showed that this correlation is higher when peer-assessment is voluntary. In Coursera and EdX, peer-assessment is computer-based and compulsory, and its completion is a prerequisite for students to have their own marks returned. Last but not least, peer-assessment is more accurate when students participate in the creation of the rating criteria (Li et al., 2016). This is not the case in MOOCs, where the scoring rubric is given to the students.
Peer-assessment in MOOCs can be argued to benefit students when they reflect on and evaluate the work of their peers (Comer and White, 2016). However, peer feedback in MOOCs has attracted heavy criticism. In Coursera in particular, it has been criticised for being anonymous and for not incorporating a check for plagiarism (McEwen, 2013), and it has been accused of being inconsistent and of offering no feedback on the peer-assessment itself (Watters, 2012). Thus, learners do not always appear satisfied with the peer feedback they receive. Almost half of the participants in a large-scale survey reported that they put considerable effort into evaluating their peers, yet felt that their peers did not comprehend their own work (Kulkarni et al., 2013). Meanwhile, students question the trustworthiness of peer-assessment (Floratos et al., 2015).
As a result, the evidence on the use of peer feedback in MOOCs does not encourage the continued implementation of this form of assessment. Lee and Rofe (2016) found that peer interaction in MOOCs plays no role in students' final grades. Additionally, research with MOOC participants revealed a negative correlation between students' performance and the extent to which they prioritise interaction with their peers (Phan et al., 2016). However, if peer-assessment is abandoned, social interaction in MOOCs is likely to be reduced. This might be problematic, since educators have emphasised social interaction as a crucial factor in the learning process, as in Vygotsky's (1978) zone of proximal development. Nevertheless, students who participate in MOOCs can still be offered the choice of peer interaction; this does not have to occur during the process of peer-assessment, as students can still interact through peer forums.
Having demonstrated that the research findings do not fully support the implementation of peer-assessment or the benefits of peer interaction, the following section demonstrates how self-assessment could address crucial issues in MOOCs which peer-assessment fails to solve.
The current issues in MOOCs
As discussed above, learners sometimes appear unsatisfied with the implementation of peer-assessment in MOOCs. There is a need to respond better to the needs of learners, and platforms providing MOOCs still have crucial matters to consider. Specifically, the issues that MOOCs face are high attrition rates, patriotic bias emerging in peer-assessment, and the failure to make education accessible to all learners. This paper argues that self-assessment, as the sole assessment method, could respond more effectively to each of these issues, and to the learning needs of MOOC students, than the current assessment method.
High attrition rate
One of the main problems in MOOCs is the high dropout rate. Jordan (2015) examined 221 MOOCs and found that the completion rate varies from 0.7% to 52.1%. Specifically, courses using peer grading, or a combination of peer grading and autograding, were completed by fewer than 10% of enrolled students, whereas courses with autograding alone were usually completed by more than 20% of students. She therefore concluded that courses with peer grading have high attrition rates. The use of autograding could be a solution, but not all courses can be evaluated with autograding and multiple-choice items. Thus, the investigation of an assessment method that reduces attrition rates for open-ended questions is crucial.
What is more, a recent survey (Nawrot and Doucet, 2014) found that, among 12 sample reasons, more than 50% of participants chose time as a reason for withdrawal. Specifically, by 'time' the authors meant 'time organisation, real life responsibilities and too much time consuming course' (Nawrot and Doucet, 2014). The participants argued that they lacked time and hence decided to quit. Consequently, in order to reduce attrition rates, MOOCs should implement less time-consuming assessment methods.
Time could also explain why Jordan (2015) found that courses implementing peer-assessment have the highest attrition rates. Peer-assessment can be extremely time-consuming, as it requires students to grade at least four assignments by other students. Meek et al. (2016) used a survey in a MOOC to investigate the appropriateness of peer review, and their findings are in line with the argument that peer review is a time-consuming method. They found that women with full-time jobs were less likely to be involved in the peer review process, while retired students were the most likely to complete it. The authors therefore also concluded that time constraints can discourage students from engaging in the peer review process. Likewise, combined peer- and self-assessment can be time-consuming. By contrast, it takes less time for students to grade only their own assignments. Consequently, self-assessment corresponds better to the needs of MOOC learners, who appear to regard time as an important factor in their participation in the course.
Rating bias in peer-assessment
Bias is an important factor to consider when judging the quality of an assessment framework. A test is regarded as fair and unbiased when it does not attribute differences to groups where none exist, and does not indicate larger or smaller differences than those which exist in reality (Hunter and Schmidt, 1976). Koretz (2008) argued that it is not the test itself which can be (un)biased, but an inference drawn from it. Assessment in MOOCs should not promote biased inferences. MOOCs are open to people from different social, economic and cultural backgrounds; hence, their student bodies are immensely diverse. However, Kulkarni et al. (2013: 15) identified 'patriotic grading', meaning that students tend to award higher grades to peers who come from the same country. Thus, peer-assessment in MOOCs appears to be biased. Self-assessment, on the other hand, can provide a solution to the rating bias found in MOOC research, since it leaves no room for patriotic bias. In research on self-assessment, students were found to agree that self-assessment is 'fairer' because it enables them to include complementary performance dimensions, such as effort (Ross et al., 1998). It is likely that self-assessment can reduce bias deriving from the diverse backgrounds of MOOC participants: during self-assessment, students have no opportunity to apply bias to their peers, since they evaluate only their own work and are aware of their own background, effort and learning goals.
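For illustration, patriotic grading can be checked for empirically by comparing the grades awarded within and across countries. The sketch below assumes a hypothetical record format of (reviewer country, author country, grade); it is a toy illustration, not the analysis actually performed by Kulkarni et al. (2013).

```python
# A toy illustration of checking for 'patriotic grading' by comparing
# grades awarded within and across countries. The record format and the
# comparison are hypothetical assumptions, not the analysis performed
# by Kulkarni et al. (2013).

from statistics import mean

# Each record: (reviewer_country, author_country, grade_awarded)
reviews = [
    ("GR", "GR", 9), ("GR", "US", 6), ("US", "US", 8),
    ("US", "GR", 5), ("IN", "IN", 9), ("IN", "US", 7),
]

same_country = [g for rc, ac, g in reviews if rc == ac]
cross_country = [g for rc, ac, g in reviews if rc != ac]

# A persistent positive gap over many reviews would suggest that
# reviewers grade compatriots more generously.
print(f"same-country mean:  {mean(same_country):.2f}")
print(f"cross-country mean: {mean(cross_country):.2f}")
print(f"patriotic gap:      {mean(same_country) - mean(cross_country):.2f}")
```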
Education not accessible to all
The most important element to consider is probably who can access MOOCs. Even though MOOCs are promoted as a means of achieving 'education for all' and making education accessible to everyone, research evidence has shown that the courses are usually taken by educated and employed people from developed countries (Christensen et al., 2013; Eichhorn and Matkin, 2016; Kulkarni et al., 2013). Furthermore, MOOC participants who hold a master's or a PhD degree are twice as likely to complete a course (Shrader et al., 2016).
This means that research examining the demographic characteristics of MOOC participants clearly demonstrates that the students who take MOOCs come from a particular educational and economic background. This might be partly because some MOOCs are designed only for educated participants; there are examples of research examining MOOCs specifically designed for students who already have some content background (Phan et al., 2016; Rieber, 2017).
Even though this finding is discouraging for the accessibility of MOOCs to all learners, it reveals that a particular type of student attends these courses. As a consequence, specific adjustments to the assessment model could correspond better to the needs of these students. The students most involved in MOOCs have high levels of self-regulation, and the assessment model should acknowledge this. In particular, research has disclosed that learners working as professionals in a field relevant to the MOOC content, and students working towards a higher education degree, have higher self-regulation levels (Hood et al., 2015; Kizilcec et al., 2017). Hence, formal education and prior knowledge are associated with higher self-regulation and performance in the course.
In addition, about 10% of participants in a recent survey (Shrader et al., 2016) reported refreshing and reviewing existing knowledge as the reason they registered in a MOOC. Thus, there is a group of MOOC students who take a course to expand upon existing knowledge. If students have basic prior knowledge related to the course content, or are familiar with the task, their self-assessment is more accurate (Boud and Falchikov, 1989; Fitzgerald et al., 2003). A recent study of online assessment confirmed that students who performed poorly overestimated their abilities, whilst accuracy in self-assessment improved as students' skills increased (Domínguez et al., 2016).
For the participants of MOOCs, then, self-assessment can fit the model of their self-regulated learning (Zimmerman et al., 1996), and it can therefore be the most appropriate assessment method for these self-regulated students, particularly when they are interested in the course content and knowledge rather than in obtaining a qualification. Nevertheless, the fact that students do not always perceive themselves and their performance objectively cannot be disregarded. Bias may be introduced even into self-assessment, based on the way students perceive themselves, and it remains questionable to what extent students tend to be lenient and overrate themselves. Students sometimes regard themselves and their performance as above average, underestimate the time needed to complete a task and have little insight into their errors of omission (Dunning et al., 2004). Thus, self-image bias is likely to be present in the self-assessment procedure.
Self-image bias should not be deemed more problematic in a formative self-assessment context than in a peer-assessment context. It has been found that a large proportion of students undertake an online course to satisfy their curiosity (Hew and Cheung, 2014). These students want to learn about the course topic and have an intrinsic motivation for the course; as long as the assessment remains formative, they have no reason to cheat. Furthermore, it has been found that there are several 'auditing' students who watch the lectures but do not complete the assessments (Kizilcec et al., 2013). This suggests that these students are interested in the content of the course, but not in gaining a certificate or in the assessment of a MOOC. Self-assessment will benefit their own learning, as they have intrinsic motivation to complete the assessment and evaluate their learning. They can also be expected to be concerned about the real outcomes of their learning and to try to assess themselves accurately.
The fact that students are intrinsically motivated does not exclude the possibility that they are incapable of recognising their own omissions. Nevertheless, peers should not be judged to be more capable assessors, because peer assessors can also fail to identify mistakes in assignments. For instance, Suen (2014) highlighted the likelihood of peers grading some assignments favourably based on their own misconceptions.
It is also possible that students who wish to maintain a false self-image of their performance will simply disregard the feedback of their peers. Several students who attend MOOCs report that their peers did not understand their work (Kulkarni et al., 2013), challenge the ability of their peers to assess their work accurately (Floratos et al., 2015) and express the opinion that their peers are not qualified to provide feedback (Meek et al., 2016). Consequently, students can simply retain their self-image and be biased against the peer assessors. In answer to this issue, it can be argued that it is more feasible for course instructors to develop students' self-regulation and try to limit their self-image bias than to persuade them that their peers are competent evaluators. Students with weaker metacognitive skills and self-regulation can be supported with scaffolding (Kizilcec et al., 2017).
Limitations
The arguments of this conceptual paper are based on the available empirical research evidence, so this section reports the limitations of that evidence. The two main limitations of research on MOOCs are small response rates and the lack of samples representative of all types of MOOC learners. That is, MOOC research usually has a small response rate compared to the overall number of students registered in the MOOCs (Lee and Rofe, 2016), and the research samples include mainly educated participants (Loizzo and Ertmer, 2016; Salmon et al., 2017).
Pursel et al. (2016) recognised that their survey sample differed significantly from the general population of students registered in MOOCs who could have been involved in the study. MOOC research projects, which are mostly surveys, tend to recruit as participants those students who are already more engaged and involved in the course. For instance, Barba et al. (2016) examined students' motivation for participating in an eight-week MOOC; their sample consisted of students who remained engaged in the course during its last three weeks. Considering the high attrition rates in MOOCs, it is apparent that the majority of students had already dropped out by that point, and the motivation of those still engaged during the last three weeks of a course cannot represent or explain the motivation of all the students initially registered in it.
To summarise, since the response rate is low and the participants in MOOC surveys cannot be representative of the whole population of MOOC learners, the research findings cannot be generalised. Thus, the external validity of the currently available research studies is not well established. As a result, the argumentation of this conceptual paper, which is based on the current research on MOOCs, may offer only a partial insight into the topic.
Future research
This paper has argued in favour of self-assessment as the most effective and appropriate method of formative assessment for open-ended questions and essays in MOOCs. This conceptual paper can be followed up by the collection and analysis of empirical evidence. As discussed, research has already examined learners' perceptions of peer-assessment. Similarly, future interviews could investigate whether learners in MOOCs perceive that self-assessment supports their learning more than peer-assessment, and whether they feel they can track their learning through their participation in self-assessment. Moreover, an evaluation of MOOCs which have adopted different approaches to open-ended assessment could provide empirical evidence about potential causal relationships between the assessment methods used and the learning that occurs. Finally, since this paper argued that self-assessment can reduce attrition in MOOCs, a study similar to the research conducted by Jordan (2015) could take place. Specifically, a study could be conducted to identify whether courses which use only self-assessment have higher completion rates than MOOCs which use auto-grading, peer-assessment or a combination of peer- and self-assessment.
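As a rough illustration of what such a replication might compute, the sketch below compares mean completion rates across assessment methods. All course figures and group labels are hypothetical placeholders, not real findings; an actual study would follow Jordan's (2015) data collection approach and test for statistical significance.

```python
# A rough sketch of the completion-rate comparison proposed above,
# in the spirit of Jordan (2015). All figures and group labels are
# hypothetical placeholders, not real findings.

from statistics import mean

# completion_rate = completers / enrolled, per course
courses_by_method = {
    "self-assessment only":   [0.24, 0.31, 0.19],
    "auto-grading":           [0.22, 0.28, 0.25],
    "peer-assessment":        [0.07, 0.05, 0.09],
    "peer + self-assessment": [0.08, 0.11, 0.06],
}

# A formal study would also test significance and control for
# confounds such as course length, topic and enrolment size.
for method, rates in courses_by_method.items():
    print(f"{method:<24} mean completion: {mean(rates):.1%}")
```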
Conclusions
This paper discussed the assessment system currently implemented for open-ended questions in MOOCs. The emphasis was placed on formative assessment, because the majority of MOOC students report being more focused on learning and the course content than on gaining accreditation when they register in a MOOC. Moreover, MOOC learners are usually educated, employed people from developed countries who take the courses with intrinsic motivation and tend to drop out when they lack the time to complete them. The reasons why peer-assessment is an inappropriate assessment method for these learners were explained. Instead, self-assessment was argued to be the most suitable assessment method for meeting the needs of these self-regulated learners, and a potential solution to the high attrition rates and the patriotic grading bias observed in peer-assessment.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
