Abstract
Higher education institutions have access to greater volumes, variety and granularity of student data, often in real time, than ever before. As such, the collection, analysis and use of student data are increasingly crucial to operational and strategic planning, and to delivering appropriate and effective learning experiences to students. Student data – not only in terms of what data is (not) collected, but also in how the data is framed and used – has material and discursive effects, both permanent and fleeting. We have to critically engage with claims that artificial intelligence and the ever-expanding systems of algorithmic decision-making provide speedy, accessible, revealing, panoramic, prophetic and smart analyses of students' risks, potential and learning needs. We need to pry open the black boxes higher education institutions (and increasingly venture capital and learning management system providers) use to admit, steer, predict and prescribe students' learning journeys.
This article is part of the special theme on The Black Box Society. A full list of all articles in this special theme is available at: https://journals.sagepub.com/page/bds/collections/revisitingtheblackboxsociety
Introduction
In this brief commentary, I map the social imaginary (and its impact) pertaining to algorithmic decision-making in the context of the measurement, collection, analysis and use of student data, a practice known as learning analytics, in higher education. This imaginary allows ‘data themselves to come to life and begin to have consequences when they are analysed and when those analyses are integrated into social, governmental and organisational structures’ (Beer, 2019: 14). The data imaginary and gaze include not only those who have the power to define, collect, analyse and use data, but also those whose data is collected, analysed and used. As such, this ‘data gaze’ in both ‘its material and discursive configurations’ (Beer, 2019: 7) affects us all – whether as the one who gazes or the one who is gazed upon. In higher education in particular, this data gaze functions as a ‘one-way mirror’ (Pasquale, 2015: 9): students (and increasingly faculty) have no way to escape this gaze or, at best, to return it. The collection, analysis and use of student data furthermore resembles an ‘algocracy’, described as the normalisation and naturalisation of governance by algorithms that excludes, per se, any disagreement or user interrogation, or other discursive or material options (Aneesh, 2006, 2009; see also Danaher, 2016a, 2016b). For students, there is ‘no exit’ (Pasquale, 2015: 52) and surveillance-as-service is an integral part of terms and conditions (Khalil et al., 2018).
Higher education can be classified as a ‘data frontier’ (Beer, 2019: 19), a data-uncharted territory where data-led decision-making is still in its relative infancy, and where ‘those invested in this industry are inevitably seeking to push these boundaries back’ (Beer, 2019: 19). The notion of the ‘frontier’ is fraught with historical narratives of conquest, colonisation and subjugation (Gouge, 2007; Kwet, 2018), and, as such, adds a specific heuristic lens to understanding the datafication of learning. As ‘the frontier’, institutions have access to higher volumes and a greater variety and granularity of student data, often in real time, than ever before (Slade et al., 2019). As such, higher education is part of the ‘data revolution’ and ‘data deluge’ (Kitchin, 2014), with many believing that more or big(ger) data is necessarily better data (Prinsloo et al., 2015). Data, whether small or big, is, like technology, ‘neither good nor bad; nor is it neutral … technology’s interaction with the social ecology is such that technical developments frequently have environmental, social, and human consequences that go far beyond the immediate purposes of the technical devices and practices themselves' (Kranzberg, 1986: 545). As Pasquale (2015) states: ‘ … we should never lose sight of the fact that the numbers on their computer terminals have real effects, deciding who gets funded and found, and who is left discredited or obscure’ (215; see also Eynon, 2013).
A growing part of the data imaginary of higher education is ‘authority … increasingly expressed algorithmically’ (Pasquale, 2015: 8), with the power ‘to include, exclude, and rank to ensure that certain public impressions become permanent, while others remain fleeting’ (Pasquale, 2015: 14; see also Prinsloo, 2017, 2019). Data scientists in particular ‘must recognize themselves as political actors engaged in normative constructions of society and, as befits political work, evaluate their work according to its downstream material impacts on people’s lives’ (Green, 2018: 1).
The lure of faster, smarter and more effective
Amid the various issues and often dramatic changes facing higher education (e.g., Altbach et al., 2009), higher education increasingly finds its security in the mantra of evidence-based decision-making (Gagliardi et al., 2018; Prinsloo, 2016), ‘performativities and fabrications’ (Ball, 2004: 143), managerialism (Lynch et al., 2015) and new public management (Bleiklie, 2018). The culture of performativity is a ‘technology, a culture and a mode of regulation, or even a system of “terror” … , that employs judgements, comparisons, and displays of means of control, attrition, and change’ (Ball, 2004: 144). ‘These quantifications and metrics facilitate the making and remaking of judgements about us, the judgements we make of ourselves and the consequences of those judgements as they are felt and experienced in our lives. We play with metrics and we are more often played by them’ (Beer, 2016: 3).
Algorithmic decision-making in higher education functions as a ‘prosthetic vision’ (Beer, 2019: 7), allowing automated insights into data that were not only not possible before, but also not accessible to those without the necessary training and knowledge. As such, the data gaze presents itself as ‘speedy, accessible, revealing, panoramic, prophetic and smart’ (Beer, 2019: 22; emphasis in the original). In order to realise these six characteristics, human involvement in, and engagement with, algorithmic decision-making systems is increasingly taken out of the equation (Knox, 2010: 211; Prinsloo, 2017, 2019). Danaher (2015) developed a tentative model for classifying algocratic decision-procedures, using the four tasks of sensing, processing, acting and learning as the basis for mapping. Algocratic modes of decision-making, or algocratic systems, refer to operational or governance systems in which algorithms act or ‘rule’ with or without a range of human supervision or engagement – ranging from human actors supported by algorithms, to humans sharing tasks with algorithms, to algorithms performing tasks under human supervision, to automated, independent algorithms (Danaher, 2015). Realising the characteristics of ‘speedy, accessible, revealing, panoramic, prophetic and smart’, however, inherently points to the exclusion of human participation in the ‘black box’.
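To make Danaher's mapping more concrete, the following is a minimal, purely illustrative Python sketch (my own construction under stated assumptions, not Danaher's formalism): it classifies a single, hypothetical decision procedure, per task, along the spectrum of human involvement described above.

```python
from dataclasses import dataclass
from enum import Enum

class Task(Enum):
    """The four tasks Danaher (2015) uses to map algocratic procedures."""
    SENSING = "sensing"
    PROCESSING = "processing"
    ACTING = "acting"
    LEARNING = "learning"

class HumanInvolvement(Enum):
    """Spectrum of human engagement, from algorithm-assisted to autonomous."""
    ALGORITHM_SUPPORTS_HUMAN = 1    # humans decide, algorithms assist
    SHARED_TASKS = 2                # humans and algorithms share the task
    HUMAN_SUPERVISES_ALGORITHM = 3  # algorithms act under human oversight
    FULLY_AUTOMATED = 4             # algorithms act independently

@dataclass
class AlgocraticProcedure:
    """A decision procedure classified, per task, by degree of automation."""
    name: str
    involvement: dict[Task, HumanInvolvement]

# Hypothetical example: an early-warning system that senses and processes
# student data autonomously, but leaves the intervention to a human advisor.
early_warning = AlgocraticProcedure(
    name="at-risk early-warning system",
    involvement={
        Task.SENSING: HumanInvolvement.FULLY_AUTOMATED,
        Task.PROCESSING: HumanInvolvement.FULLY_AUTOMATED,
        Task.ACTING: HumanInvolvement.ALGORITHM_SUPPORTS_HUMAN,
        Task.LEARNING: HumanInvolvement.HUMAN_SUPERVISES_ALGORITHM,
    },
)
```

Even in this toy form, the sketch shows the point at issue: how far each task drifts toward FULLY_AUTOMATED is a design decision, and one that is rarely visible to those being classified.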
The ‘black box’ of algorithmic decision-making or algocracy delivers speedy, real-time information based on real-time analysis, on a continuous basis, closing the ‘gap between data and knowledge’ (Beer, 2019: 23). As such, it makes us ‘addicted to speed’ (Pasquale, 2015: 195). Software provides the analysis and, as such, the analysis is accessible to anyone – even those with no experience or analytical or technical expertise. Circumventing the subjectivity of human judgement, algorithmic decision-making reveals that which is hidden to the naked eye and so acts as séance (Prinsloo, 2016); it is also panoramic, acting as a ‘prosthetic eye’, all-seeing and omnipotent (Beer, 2019: 25). The ‘black box’ also has the ability to ‘grasp the future and use it in the present’ (Beer, 2019: 27) and fulfils a prophetic role, opening up new possibilities. As such, algorithmic decision-making systems produce and enable futures and outcomes in smart and responsive ways. Self-learning algorithms increasingly also act autonomously, without human involvement or supervision. Indeed, to fully realise ‘speedy, accessible, revealing, panoramic, prophetic and smart’ (Beer, 2019: 22), humans and human oversight and supervision should be removed from the equation. These six characteristics also require data to be increasingly flattened and processed faster than before, and individuals to be fitted into historical categories of race, household, income and marital status – while there is increasing evidence that these categories may, in fact, be part of a ‘zombie sociology’ (Beck, 2001) and act as ‘zombie categories’ (Gullion, 2018: 68). For example, when we assume ‘household’ as a category, what do we mean, include and exclude?
So where does this leave the collection, analysis and use of student data by algorithmic decision-making systems?
Student data and algorithmic decision-making
When ‘The black box society’ (Pasquale, 2015) was published in 2015, learning analytics as a research field and practice was already four years old. At the inaugural learning analytics conference in 2011 in Banff, Canada, the Conference Call mentioned the reality that the ‘growth of data surpasses the ability of organizations to make sense of it’ (Siemens, 2010). The Conference Call also stated that ‘[l]earning institutions and corporations make little use of the data learners “throw off” [sic] in the process of accessing learning materials, interacting with educators and peers, and creating new content’ (Siemens, 2010). Among the reasons necessitating a more intentional and integrated collection, analysis and use of student data, the organisers mentioned, inter alia, pressures ‘to reduce costs and increase efficiency’ as well as ‘increased competitiveness and productivity’ (ibid.). While algorithmic decision-making, artificial intelligence (AI), machine learning and neural networks are not mentioned per se, the organisers did note that ‘[a]dvances in knowledge modelling and representation, the semantic web, data mining, analytics, and open data form a foundation for new models of knowledge development and analysis’ (ibid.). The Conference Call concluded by proposing the first, seminal definition of learning analytics as ‘the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs’ (ibid.).
In the year of the inaugural learning analytics conference (LAK’11), a search for scholarly articles mentioning ‘learning analytics’ in combination with ‘artificial intelligence’ yields 46 results (excluding patents and citations). For example, Elias (2011) mentions that the ‘University of Phoenix and Capella University consistently make extensive use of artificial intelligence and predictive modelling in marketing, recruitment, and retention and have shaped their cultures around performance’ (16). In referring to the tool set available to learning analytics, Elias mentions ‘data visualization, decision trees, neural networks, regression analysis, machine learning, and artificial intelligence’ (12). Blikstein (2011) mentions two examples where AI was used: one in a computer-based application where a machine learning algorithm allowed them ‘to most accurately classify each student in terms of their content knowledge, based on comparisons with expert-formulated responses’ (110). The other example refers to identifying student sentiment by studying and coding students’ facial expressions and applying machine learning to the data produced. Of specific interest is the value added by AI techniques, and the following are mentioned: ‘1) detailed, multimodal student activity data (gestures, sketches, actions) as a primary component of analysis, 2) automation of data capture and analysis, 3) multidimensional data collection and analysis’ (111). In broad terms, these may fall into the category of surveillance and analysis as discussed above. Outside of algorithmic decision-making as surveillance, there is also evidence of algorithmic decision-making acting autonomously. For example, Verbert et al. (2011) report on small-scale experiments with recommender systems and state that while such experiments offer valuable insights into the usefulness and relevancy of recommender systems for learning, ‘stronger conclusions about the validity and generalizability of scientific experiments could be drawn if researchers have the possibility of verification, repeatability, and comparisons of results based on large datasets that capture learner interactions in real settings’ (44).
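To ground what such predictive modelling can look like in practice, here is a deliberately minimal sketch in Python using scikit-learn. Everything in it is an assumption for illustration: the data is synthetic, the proxy variables (logins, forum posts, assignment scores) and the ‘at risk’ label are hypothetical, and the sketch illustrates the general technique rather than any institution's actual system.

```python
# A minimal, hypothetical sketch of predictive classification in learning
# analytics: labelling students as 'at risk' from interaction data.
# All data is synthetic; real pipelines are far more complex and consequential.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(seed=0)
n_students = 500

# Synthetic proxy features: logins per week, forum posts, assignment score.
X = np.column_stack([
    rng.poisson(5, n_students),       # logins per week
    rng.poisson(2, n_students),       # forum posts
    rng.uniform(0, 100, n_students),  # average assignment score
])
# A synthetic 'at risk' label tied to low engagement and low scores.
y = ((X[:, 0] < 4) & (X[:, 2] < 50)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
# The fitted coefficients encode which proxies 'count' as risk: exactly the
# kind of embedded assumption the 'black box' critique asks us to open up.
```

The design choice worth noticing is not the classifier but the labelling step: deciding what ‘at risk’ means, and which proxies stand in for engagement, is where the normative work described in this commentary actually happens.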
Avoiding Frankenstein and Kafka
While there is evidence of the huge potential of algorithmic decision-making systems to solve problems or to contribute to human flourishing, there are equally very real concerns about such systems. Consider, for a moment, the use of AI to identify and inform cancer treatment (Coccia, 2020) and to forecast the progression of COVID-19 in China (Hu et al., 2020), alongside concerns regarding the roll-out of AI-driven surveillance systems to identify and control individuals’ movement and risk (Kupferschmidt and Cohen, 2020), and the persistence of racial bias in AI systems (Allen, 2019). Amid the many concerns and uncertainties about the various aspects of algorithmic decision-making systems, likening these systems to the monster created by Dr Frankenstein comes effortlessly: we have to recognise AI and algorithmic decision-making systems as the result of human endeavour, and as such, these systems are our creation (Latour, 2012; Prinsloo, 2017). We have to face our creature, yet increasingly our creature eludes our gaze, hides in ‘black boxes’ and, worse still, turns its gaze upon us. Algorithms and algorithmic decision-making systems are ‘not just abstract computational processes; they also have the power to enact material realities by shaping social life to various degrees’ (Bucher, 2017: 40), and individuals have, increasingly, very little control. I have a sense that we are condemned to an ‘ouroborotic Kafkaesque journey where the tensions between the dangers and the potentials of algorithmic decision-making are never (really) resolved’ (Prinsloo, 2017: 144). These tensions are furthermore visible in research on using algorithmic decision-making systems to determine who gets accepted into institutions, who has access to student funding and support, and which students will not be ‘worth’ the investment of additional support to retain them in a programme (Johnson, 2017; Jones and McCoy, 2019; Prinsloo and Slade, 2014, 2017).
In the specific context of students’ engagement with an institution during their learning, we have to critically consider the scope and function of automation, how visible the algorithm and its assumptions and functions are, how engaged the data-object is with the collection and analysis of data, the scope and function of these data assemblages, and the temporal frame of collection, analysis and use (see Knox, 2010; Prinsloo, 2017). This aligns with Pasquale’s (2015) suggestion of ‘watching (and improving) the watchers’ (140). We need to pry open the black boxes higher education institutions (and increasingly venture capital and learning management system providers) use to admit, steer, predict and prescribe students’ learning journeys. Students have a right not only of access, but also to provide additional information and to correct institutions’ assumptions about and categorisations of them (Slade and Prinsloo, 2013).
Within the learning analytics community of scholars, researchers and practitioners, there is evidence of an increasing recognition of, and sensitivity towards, the many issues in the context of algorithmic decision-making in learning analytics. For example, Shum (2017) acknowledges that learning analytics functions as a ‘black box’ and proposes an accountability analysis that focuses on system integrity from various perspectives and pushes beyond much of the current thinking about the ethical issues in learning analytics. Chen and Zhu (2019) propose Value Sensitive Design as a ‘systemic approach with specific strategies and methods to help researchers and designers explicitly incorporate the consideration of human values into design’ (344). Their research foregrounds the tension between utility and the ‘values of students’ autonomy, privacy, social well-being and self-image’ (347; emphasis in the original). Other values include freedom from bias, fairness, epistemic agency and ease of information seeking (Chen and Zhu, 2019).
To what extent these steps will make a difference in the context of the commercial value embedded in the ‘data gaze’ (Beer, 2019) and in ‘black boxes’ in service of the ‘lords of the information age’ entrenching ‘a digital aristocracy’ (Pasquale, 2015: 218) remains to be seen. Until then, we are called to face, dialogue with and care for our monster (Latour, 2012).
Conclusion
Despite a greater sensitivity to the ethical issues surrounding the ‘black box’ of the collection, analysis and use of student data (Slade and Prinsloo, 2013), and practice-oriented proposals for algorithmic accountability (Shum, 2017) and values (Chen and Zhu, 2019), these proposals, frameworks and methods may not have resolved ‘the normative questions of what impacts are desirable and how to negotiate between conflicting perspectives (nor the practical question of how to leverage technology toward these ends)’ (Green, 2018: 6). And though codes of ethics, accountability and transparency may assist in addressing the issues arising from the ‘black box’ of the collection, analysis and use of student data, central to (re)considering Pasquale’s (2015) ‘The black box society’ is the recognition of the inherently political nature of data. Data is not raw and/or objective but is entangled in ‘specific arrangements and interests in the nexus of economic, political, social, technological, environmental and legal apparatuses, structures, and elements’ and, increasingly autonomously, realises ‘ideological and political ontologies and epistemologies’ (Prinsloo, 2019: 3).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
