Abstract
Machine learning (ML) algorithms are still a novel research object in the field of media studies. While existing research focuses on concrete software on the one hand and on the socio-economic context of the development and use of these systems on the other, this paper studies online ML courses, a research object that has received little attention so far. Through a walkthrough and critical discourse analysis of Google's Machine Learning Crash Course and IBM's introductory course to Machine Learning with Python, we shed light not only on the technical knowledge, assumptions, and dominant infrastructures of ML as a field of practice, but also on the economic interests of the companies providing the courses. We demonstrate how the online courses help Google and IBM consolidate and even expand their position of power, both by recruiting new AI talent and by establishing their infrastructures and models as the dominant ones. Further, we show how the companies not only greatly influence how ML is represented, but also how these representations in turn shape and direct current ML research and development, as well as the societal effects of their products. The companies project an image of fair and democratic artificial intelligence, which stands in stark contrast to the ubiquity of their corporate products and to the directives of efficiency and performativity they pursue. This underlines the need for alternative infrastructures and perspectives.
Machine learning (ML) algorithms are widely used to filter, sort, and classify information. They find application in a variety of areas, such as recommender systems and search engines, news feeds and targeted advertising, as well as in bank lending, predictive policing, healthcare, and insurance companies. In computer and data science—and particularly in the ACM FAccT community 1 —there is growing awareness of such algorithms being biased (Barocas and Selbst, 2016; Bolukbasi et al., 2016; Buolamwini and Gebru, 2018) as well as a clear trend towards fair, accountable, and transparent AI systems (Hutchinson et al., 2021; Selbst et al., 2019). Critical media and data studies have added to this awareness by offering societal critiques of the discriminatory effects of algorithms and their role in the proliferation of echo chambers (Apprich et al., 2019; Eubanks, 2017; Noble, 2018; O'Neil, 2016; West et al., 2019). However, in daily practice a tendency to technological solutionism often prevails (Laufer et al., 2022: 409–410).
Following Rieder (2020), one can say that most media and critical data studies (CDS) research on the societal implications of ML either focuses on specific algorithmic systems while disregarding wider epistemological concepts, or remains at the level of societal critique without scrutinizing technical specifications. A further obstacle is that algorithms are often not accessible for research, as they appear black-boxed and opaque, or cannot be grasped in detail due to proprietary limitations (Pasquale, 2015: 6–8). Instead, Rieder (2020: 13) proposes to focus on “algorithmic techniques,” which are “the defined yet-malleable units of technicity and knowledge developers draw on when designing the function and behavior of computers acting in and on the world.” According to him, this makes it possible to “more specifically […] target a space that is somewhat abstracted from the level of concrete algorithms” (Rieder, 2020: 99). Following this approach, we shift the focus from specific algorithmic systems and their instantiations to the fundamental conceptualization and representation of ML algorithms in development practice and computer science education. More precisely, we look at how ML is represented and conceptualized in educational material provided by Big Tech companies that aims at the understanding and implementation of core ML concepts. Previous research has already studied ML infrastructures surrounding data practices (Mackenzie, 2015), for instance focusing on the developer platform GitHub (Burkhardt, 2019; Mackenzie, 2018). The field of education, however, is particularly relevant as it touches upon a political-economic perspective. It enables us to trace what it means when Big Tech actors engage in the education of future ML practitioners, dominating not only infrastructures but also discursively shaping ML research and development.
By studying introductory online ML courses, we provide a novel perspective on ML as a field of practice and as a discipline. ML courses attract a large audience at a time when data scientists are desperately needed. They teach a uniform corpus of standard ML techniques (Mackenzie, 2017: 29), which allows for gaining insight into the core concepts, infrastructures, and epistemological foundations of the field. With the mediation of knowledge comes a wide array of discursive elements that enable a critical lens on how ML is represented and understood. This results in a better understanding of the assumptions and beliefs that prominently guide data classification today. Finally, by concentrating primarily on large and dominant technology companies and shedding light on their impact as educators, our research reveals how the transmitted knowledge can be connected back to the power relations in which it emerges. Taken together, we investigate how ML is represented and conceptualized in two of Big Tech's online ML courses. Moreover, by critically reflecting on why it is these discursive elements that are currently dominant in data classification and how they manifest the companies’ position of power, we show how the prevailing ML discourse is dominated by the economic interests of the Big Tech companies that provide these online courses.
Given the companies’ forefront position in the domain of ML—and thus their immense influence on its current research and development—we focus on Google's Machine Learning Crash Course (MLCC) and IBM's introductory course to Machine Learning with Python. With the open-access offer of its ML technologies and educational material on the one hand, and its economic endeavors on the other, Google presents an interesting object of analysis. The company profits immensely from open-sourcing its latest technologies—fueling ML research and thus further improving Google's products. IBM, too, has an evident economic interest in the advancement of ML research and development, as it provides AI infrastructures for businesses. Contrary to Google, however, the company offers its course on the Massive Open Online Course (MOOC) platform Coursera. This course is explicitly career-oriented and focused on training future ML practitioners. IBM's platform Watson Studio as well as IBM Cloud are available only as a limited free trial and must be purchased for long-term use. Thus, a comparison of the two courses allows us to trace possible differences in the economic strategies of the two providers and the extent to which these affect course design.
In the following, we will first present the state of research on the study of algorithms in media and CDS. Subsequently, we show how an analysis of online courses adds to our understanding of ML algorithms and their societal consequences. After introducing the data selection, we will discuss our methodological approach. Online courses are so far rarely studied in research on (ML) algorithms—hence, we adapt the walkthrough method (Light et al., 2018) that has been developed for the analysis of digital platforms. For a further in-depth analysis of the narratives and references found within the courses, we will conduct a critical discourse analysis (CDA). The second part of the paper revolves around the results of the analysis, first addressing the core components of the online courses, and then examining revealed narratives and directives. Ultimately, we point out the importance of considering the implications the discursive dominance of Big Tech companies has for current ML research and development within the broader AI industry.
State of research: How to study (ML) algorithms?
The emerging interdisciplinary field of CDS investigates processes of the capture and analysis of big data, challenging the assumption that large masses of data can be regarded as objective truth (e.g. boyd and Crawford, 2012; Iliadis and Russo, 2016). It offers the analytical framework to place processes of big data within their historical development, questioning their neutrality and acknowledging big data's cultural influences and contingent role. This allows for challenging the power dynamics behind its production (Dalton and Thatcher, 2014). Scholars studying algorithms have adapted this approach (e.g. Bucher, 2018; Gillespie, 2014; Seaver, 2013) and have pointed to their discriminatory effects, conducting ethnographic studies (Eubanks, 2017; Noble, 2018), critically discussing technological conditions (Apprich et al., 2019; Bechmann and Bowker, 2019; O'Neil, 2016) and material dimensions (Crawford, 2021), and analyzing the workforce behind AI (Svensson, 2021; West et al., 2019). Other research focuses specifically on ML algorithms and their implications (e.g. Amoore, 2020; Apprich, 2018; Burrell, 2016; Engemann and Sudmann, 2018; Mackenzie, 2015, 2017).
Research addressed at ML concepts more broadly has been conducted by Chun (2019, 2021), who shows how the homophily principle—and thus the assumption that similarity breeds connection—is deeply influencing ML classification algorithms, applied for instance in filter systems. More elaborate work on ML and classification algorithms has been conducted by Mackenzie (2017) who traces specific data practices and exposes the forms of knowledge and power they establish, and Rieder (2020) who follows up on specific algorithmic techniques of information ordering against their conceptual and historical background, showing how specific “units of knowledge” (2020: 17) repeatedly find application. In CDS, research has focused on specific algorithmic systems, studying these either through documentational material, for instance investigating design constraints (Polack, 2020), or through ethnographic studies of data science projects. CDS scholars have shown how norms and values find their way into the design of ML systems (Grosman and Reigeluth, 2019; Passi and Sengers, 2020) and the training data sets they are based on (Denton et al., 2021). They revealed how biases are a necessary part of the learning process of ML systems as part of “ground-truthing practices” (Jaton, 2021).
Just as the field is widely dispersed, methodological approaches to researching software, algorithms, and data vary. Kitchin (2017) summarizes different ways to analyze algorithms. While one can perform ethnographic work, carrying out interviews with developers and designers or conducting an ethnography of a coding team, it is also possible to examine pseudo or source code, or even to reflexively produce code oneself. The strand of critical code studies in particular, proposed by Marino (2020), immerses itself in concrete lines of code and their wider cultural meaning. Another approach Kitchin (2017) and Rogers (2013, 2019) propose is reverse-engineering to generate knowledge about black-boxed or proprietary software, for example by recombining and repurposing digital tools to pursue socio-technical questions. These digital tools are also applied for algorithmic auditing purposes 2 (Sandvig et al., 2014) to assess how algorithms are “negatively impacting some interests (or rights) of people” (Brown et al., 2021: 2). Scholars of machine behavior apply methods such as “randomized experiments, observational inference and population-based descriptive statistics” to estimate the societal impact of algorithms (Rahwan et al., 2019: 479).
Positioning himself between concrete code and technologies on the one hand and the analysis of their societal effects on the other, Rieder (2020) proposes to study “algorithmic techniques” as the essential concepts behind specific algorithmic systems. He suggests directly investigating the native material of computer science to research the “rich reservoirs of existing knowledge” computer scientists and software engineers draw on when configuring ML algorithms and classification systems. These techniques are—in contrast to the specific code underlying algorithms—quite accessible, for instance in “scholarly publications, software libraries, and communities of practice” (Rieder, 2020: 100, 13).
Current research often focuses on concrete entities (i.e. specific data sets or developer teams around specific algorithmic systems), gaining knowledge about ML algorithms after they have already taken their concrete form. In contrast, we trace the knowledge, the common technological concepts, and the assumptions and values that go with them as present in online ML courses. Heuer et al. (2021) have already conducted empirical research on online tutorials as a site of self-educative material, using a quantitative approach to analyze how ML as a term has been constructed and what implications this framing has for critically investigating ML systems. Their study provides a first insight into the narratives of ML in self-education material, such as the idea of a universal applicability of ML and the encouragement of applying ML without specialized knowledge on the side of the practitioner (2021: 11). While our research partly confirms the findings of Heuer et al. (2021), we aim to shed light on online ML courses—a topic that has not yet received scholarly attention—and on their political-economic context. Instead of a quantitative mapping of ML framings, we provide a qualitative study of the narratives around ML and their position in the power dynamics of the AI industry.
Following up on previous research on the economics of the AI industry (Dyer-Witheford et al., 2019; Zuboff, 2019), we study how Big Tech companies such as Google and IBM increasingly position themselves as educators in the field. These companies argue that they are crucial players for the advancement of technologies that “benefit people and society” (Google AI, 2021). Here, the call for a “democratization of AI” (Burkhardt, 2019) through access to educational material and ML infrastructures plays an important role in their efforts to advance their infrastructural power (Plantin et al., 2018). Against this background, we posit that educational material is an important area not only for studying the core technological concepts and assumptions guiding ML research and development, but also for understanding AI as an economically disputed field more generally.
Online courses as research object
MOOCs and their political-economic dimension have been researched extensively (e.g. Hall, 2015). In this paper, however, we focus particularly on online courses specialized in introducing ML to beginners. In recent decades, the application of ML technologies has grown immensely, and with it the demand for skilled ML practitioners. As Heuer et al. (2021: 4) state, “only about one per cent of software developers have the skills to engage with AI and ML as novel paradigms of programming.” Hence, “self-education in Machine Learning can be expected to increase significantly considering its growing importance and demand for workforce.” Online resources are of particular importance in ML, as the field is changing continuously. A 2021 industry survey among developers shows that 60% of the around 80,000 respondents learn how to code from online resources, such as videos and blogs; 40% stated that they had participated in online courses or certifications (Stack Overflow, 2021). This illustrates the importance of these courses in contemporary software development.
The two courses selected for this research are provided by the Big Tech companies Google/Alphabet and IBM, which have both contributed significantly to the development of AI in the last decades. They belong—alongside Amazon, Apple, and Facebook—to the dominant companies in AI research and development in the US (Dyer-Witheford et al., 2019: 36). 3 Google integrates ML algorithms into a variety of its widely used products, for instance in Google Search, Google Photos, and its YouTube recommender system (Google Developers, 2017). IBM provides businesses and industries with ML infrastructures on an international level, for example with cloud database environments as well as predesigned data models for analytics tasks in different fields (IBM, 2021: [2]). 4
As Dyer-Witheford et al. (2019: 43) state, companies such as Google and IBM profit from a tremendous position of power within the AI industry, due to their “[control] of cloud computing facilities, ownership of large data sets, and the wealth to hire the best from a limited pool of AI talent.” In the field of AI- or ML-as-a-service, Google and IBM belong to the top companies in the market. They provide ready-to-use AI technologies, but also rent out their vast infrastructures for the training and development of ML models through cloud computing (Luitse and Denkena, 2021; Seaver, 2019). Google Cloud in particular has a wide reach and is considered the second most-used cloud platform after Amazon's AWS (Stack Overflow, 2021). TensorFlow, too, finds wide application, especially since Google decided to make the framework open source in 2015. This is a useful measure to drive ML research, which, in turn, leads to further improvement of Google's products. As Jeff Dean, lead of Google AI, stated in relation to the open sourcing of TensorFlow: “Any advances in machine learning […] will be advances for us as well” (Metz, 2015). Further, Google counts as “the most aggressive acquirer” of AI talent in the field (Zuboff, 2019: 190), recruiting graduates by offering enormous salaries. Here, frameworks such as TensorFlow are “a powerful way to get the best talent in the world to work with your companies” (Srnicek, 2019). In this sense, the decision to open source TensorFlow is also a means to train more people on Google's own technologies, making it easier for them to work for Google at a later point.
IBM centers its AI efforts on its platform IBM Watson. Here, the company provides tools for model development, ready-to-use AI applications, and APIs for enterprises that seek to implement AI into their processes (IBM, 2021: [5]). This allows IBM to draw a variety of companies into business—from healthcare to the car industry, from telecommunication providers to education—providing them with its AI systems (IBM, 2021: [6]). In 2020, the company had “more than 40,000 Watson client engagements across 20 industries, where market leaders are using IBM Watson to work smarter” (IBM, 2021: [1]). However, IBM does not pursue the same open-access strategy as Google. According to Crosby (2018), this is one of the main reasons why Google is far more successful with its products, as “[successful] open-source projects attract developers and researchers, and successful ML open-source software projects become focal points of innovation for the industry.” This is also reflected in the 2021 Stack Overflow developer survey: while 31% of developers work with Google's Cloud Platform, only around 2.5% use IBM Cloud/IBM Watson.
Given their prominent position as well as the outreach of their online courses, 5 how Google and IBM represent ML has a major impact on how ML is dominantly negotiated within research and development in general. While online courses generally form only a small part of a developer's self-education, analyzing the representation and conceptualization of ML in the courses provides us not only with broader insight into the main technological concepts of the field, but also into the socio-economic interests of the companies and the subsequent consolidation of their influence in ML research and development.
Methodology
To address the interactive environment of online ML courses, we combine a walkthrough method with a CDA of the material the walkthrough generates. With their “walkthrough method,” Light et al. (2018) propose an approach for the critical investigation of software applications. Drawing from Science and Technology Studies (STS) and cultural studies, the method allows for an analysis of an app's context—its vision, operating model, and governance—and for a systematic analysis of its components and intended use through a consecutive “technical walkthrough” (2018: 891). As the framework was created for the investigation of apps, it needs adaptation in some respects for application to online courses as intended in this paper.
We first establish the context of the online courses, which helps to estimate the economic interests of the Big Tech companies providing them and how these, in turn, shape the courses’ design. This first step of the walkthrough method reveals detailed information on the course components and participant requirements, enabling critical insights into the target user base and the presented purpose of the online course—for instance, the notion of making AI accessible. This constitutes the vision of the application (Light et al., 2018: 889). On a second level, we gain access to the operating model behind the course, which involves the companies’ business strategies and revenue sources. This indicates the providers’ political or economic interests (Light et al., 2018: 890). Revenue sources in relation to ML might be participation costs or further costs incurred for certificates, processing resources, or additional learning content. Indirect revenues are also considered here, for instance a wider reach of the companies’ ML technologies and the training of potential future AI talent. Last, we analyze the mode of governance of the application, which concerns the management and regulation of participants on the side of the providers “to sustain their operating model and fulfill their vision” (Light et al., 2018: 890). For online courses, this translates to the questions of what is required to participate and how openly the courses are designed—that is, to what degree is interactive participation possible? This helps determine whether the courses give users space to develop their own ideas and experiences or whether, on the contrary, users are only able to learn and act within the given environment.
The three aspects of vision, operating model, and governance will generate first insights into the diverse aspects of the online ML courses’ socio-economic environment, drawing from the provided course information as well as advertising materials on the companies’ websites.
The concluding technical walkthrough generates as much data about the research object as possible and thus offers detailed insights into the courses’ content. By assuming a user's position, one moves through these online environments, identifying the key components and narratives of the course. In this regard, we completed each online course by following the lectures, reading the accompanying explanations, and participating in programming exercises. 6 For a structural analysis, we systematically collected the themes and concepts we found in a template of descriptive fieldnotes for each course. These were structured into five main categories: technological concepts, examples, narratives, objectives, as well as references to outside material and theoretical ideas. The walkthrough provided detailed insight into the wide array of knowledge—or algorithmic techniques—present in the online courses. This entails not only information about infrastructures and technological concepts of the field (e.g. commonly used ML algorithms, programming frameworks, and software libraries), but also about the general understanding of ML systems and the assumptions guiding their design (e.g. concerning the myriad of decisions a data scientist needs to take in data preparation and model training).
We complemented the technical walkthrough with a CDA of the material thus obtained to study the discourses established around ML within the online courses. Hence, after completion of the walkthrough, following Carvalho (2008) we looked for repetitively applied objects, actors, language and images, discursive strategies, and ideological standpoints in our structured data, which we categorized under subthemes separately for each course. This was followed by a comparative-synchronic analysis (Carvalho, 2008) of the narratives of the two courses to identify what language is used across the courses and what images are commonly evoked. Through our CDA, we identified three specific narratives in the representation of ML: (1) ML as a universally applicable, easy-to-apply, and efficient tool; (2) an emphasis on the practical nature of the ML development process, particularly in regard to technical shortcomings; and (3) a clash between the AI fairness discourse the companies advertise and the values of productivity and performance found within the courses.
The CDA allowed us to draw conclusions as to how these discourses are related to the power relations in which they emerge and how they manifest the dominant positions of the companies offering the courses (Catalano and Waugh, 2020; Jäger and Maier, 2016). We thus analyzed the identified thematic clusters and connected them back to the aspirations and interests of the companies drawn from our contextual analysis. This analysis indicates how the narratives are influenced by the position of the companies and their corporate interests, but also how the narratives themselves affect further ML development and research.
In the following, we will present our findings. The first section predominantly draws on our contextual analysis, presenting the courses’ vision, business model and governance and their relation to the economic interests of the course providers. Here, we will also present commonly applied technological concepts and ML infrastructures. In the subsequent section we will present our findings from the technical walkthrough and CDA to analyze the conceptualization and representation of ML within the online courses and its implications for the wider field.
Context of online ML courses: set-up and participant requirements
In March 2018, Google released its MLCC, following the philosophy that “AI will have the greatest impact when everyone can access it” (Google AI, 2021). In this vein, the course is cost-free and does not require a user account. Rather, it is an open environment, described as “self-study guide” (Google MLCC, 2021: [7]). The goal of the MLCC is to “build intuition in fundamental machine learning concepts.” Hence, what is required for the course are not “programming skills,” but rather a “technical mind” (Rosenberg, 2018). In this way, the course addresses a quite broad target group, conveying the idea that nearly everyone can learn how to do ML without prior knowledge required. The course thus attracts a large audience—which serves corporate interests.
IBM's course Machine Learning with Python (MLWP) is provided on the MOOC platform Coursera, requiring participants to create a Coursera account. The course can then either be audited for free or attended as part of a Coursera subscription. The latter enables participants to receive the IBM Professional Certificate, which verifies their acquired skills in the job market (Coursera, 2021: [3]). The goal of the course, as outlined in the course description, is to give an overview of relevant ML concepts with real-world examples. For this, the course requires initial knowledge of data analysis with Python (Coursera, 2021: [6]). Furthermore, and contrary to Google's “open” philosophy, the MLWP course explicitly states its focus on preparing participants for the job market, teaching them to integrate ML into businesses and industries. To this end, the course is embedded in a larger network of training activities and certificates within the “IBM Training program.” After completion, participants are invited to “IBM's Talent Network,” which connects them with “the tools you need to land a dream job with IBM—sent directly to your inbox!” (Coursera, 2021: [4]). The course consequently explicitly encourages people to enter the field of data science—preferably at IBM—which reflects the great shortage of AI expertise.
The content in both courses is conveyed primarily one-directionally, allowing only little room for reflection on or engagement with what has been learned. This is the case particularly for the pre-recorded lectures and text passages the courses mainly rely on, which encourage mere adoption of the content by participants. Additionally, the technological concepts are applied practically through programming exercises, as well as—in the case of Google—through TensorFlow Playground. This “program that visualizes how different hyperparameters influence model (primarily neural network) training” (Google MLCC, 2021: [5]) lets participants experiment with ML principles in a more playful way. It acts primarily on a visual and abstract level, obfuscating the programming elements of ML. Especially for participants without programming knowledge or technical expertise, TensorFlow Playground offers an engaging way to better understand the logic behind ML models. A comprehensive technological understanding, however, is not established.
For the programming exercises, the courses follow a rather low-code approach. Not only is it relatively easy to complete the programming examples through simple copy and paste, but even in more advanced settings ML models are often imported from common libraries (with Google's TensorFlow and scikit-learn among the most popular) without reflection on their concrete functioning and development. Consequently, certain technological concepts find repeated application. As Rieder (2020: 15) exemplifies, this practice is very common in the making of software. Exactly these methods can be regarded as recurring algorithmic techniques, leading developers to “draw on technicity and knowledge that they understand only in broad terms or not at all.” The libraries “enable software-makers to step further faster, not merely regarding resource efficiency but in terms of what can be considered possible in the first place. Such packages widen the spaces of expressivity, broaden the scope of ambitions, but also structure, align, and standardize.” Mackenzie (2017: 98) also emphasizes the reliance on off-the-shelf models in the development of ML systems as common practice. Referring to a specific algorithm, stochastic gradient descent, he underlines that “the point is not to read and understand them directly […]. Many people who directly use machine learning techniques in industry and science would […] mostly take them for granted and simply execute via functions supplied by software libraries (e.g. GradientDescentOptimizer in the TensorFlow library or StochasticGradient in torch).”
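To make the low-code pattern described above concrete, the following sketch shows the kind of exercise such courses typically build on. It is our own illustration, not taken from either course: a complete classifier in a handful of scikit-learn calls, with the optimizer, loss function, and regularization all hidden behind library defaults.

```python
# Illustrative sketch (not course material) of the low-code style described
# above: a working classifier in a few library calls. The solver, loss
# function, and regularization are all left to scikit-learn's defaults --
# the practitioner never has to see or understand them.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One line to "do machine learning": the model is imported, not developed.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.2f}")
```

In Rieder's terms, everything that makes the model work is encapsulated in the imported algorithmic technique; the exercise consists of wiring library calls together rather than engaging with the underlying mathematics.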
These practices of reusing pre-built models are reinforced by the online courses. Given the economic context, TensorFlow in particular is of interest for this research. As stated on the official website, the library makes it possible for “researchers [to] push the state-of-the-art in ML and developers [to] easily build and deploy ML powered applications” (TensorFlow, 2021). It allows for easier integration of research and product development, offering “one set of tools that researchers can use to try out their crazy ideas and if those ideas work, they can move them directly into products without having to rewrite code” (Google, 2015). 7 In this way, beginners especially may easily become reliant on TensorFlow, as they only learn how to implement existing models instead of developing them themselves. More experienced programmers, too, are drawn into Google's TensorFlow environment, so that the course serves as a good opportunity for Google to establish its ML products. IBM shows a similar tendency. To complete the course through a final project, one must create an account in IBM Cloud and IBM Watson Studio to obtain a free trial. IBM Object Storage is also required to access and save data (IBM MLWP, 2021: [18]). IBM's focus with the online course thus seems to lie particularly in marketing its ML infrastructures. The course explicitly targets employees of different businesses, training them in ML with IBM products so that these products find a greater market across businesses and industries.
As phrased at the beginning of the chapter, Google AI is grounded in the belief that accessibility is key to a productive development of AI and ML. However, as this analysis shows, companies such as Google and IBM rather solidify their monopoly status in ML infrastructures, so that the question of access remains largely limited to their own products.
Narratives and representations of ML
As Crawford (2021: 7) describes, “each way of defining artificial intelligence is doing work, setting a frame for how it will be understood, measured, valued, and governed.” Hence, studying how companies such as Google and IBM frame AI and ML allows for estimating how these representations might in turn influence and direct current ML research and development, as well as the societal effects of the companies’ products. As a result of our CDA, we identified three dominant themes that we will critically discuss against the background of the current state of research surrounding ML algorithms and the AI industry. The themes and their specific narratives are represented separately here—however, they are deeply interrelated.
The potential of ML: a universally applicable, easy-to-apply, and efficient tool
In the welcome video of the IBM course, the instructor lists several areas for the application of ML, such as healthcare, finance, and the automotive industries, emphasizing the technology's universal applicability (IBM MLWP, 2021: [8]). In the subsequent video, he continues: “One could easily presume that only a doctor with years of experience could diagnose that tumor and say if the patient is developing cancer or not. Right? […] This is machine learning! It is the way that a machine learning model can do a doctor's task or at least help that doctor make the process faster” (IBM MLWP, 2021: [5]).
Likewise, Peter Norvig, director of research at Google, describes ML in the MLCC introduction as a “valuable tool.” He lists three reasons. First, ML allows data scientists and engineers to reduce the time they spend formalizing rules by instead using “off-the-shelf machine learning tools.” Second, it offers the possibility to customize and scale products, “making them better for specific groups of people.” And third, “machine learning lets you solve problems that you, as a programmer, have no idea how to do by hand” (Google MLCC, 2021: [6]). What Norvig hints at with his last point is the shift from symbolic AI—where the programmer defines specific rules according to which the algorithm transforms input to output—to the connectionist approach of ML, where the algorithm recognizes patterns in large amounts of data from which it draws generalizations that it applies to new input data (Mitchell, 1997). What is further conveyed is the belief that ML models—and this aligns with the image conjured in the advertisement of the companies’ AI departments in general—are powerful mechanisms that enable us to solve society's most pressing problems. This narrative goes along with a recurring emphasis on the efficiency and simplicity of ML algorithms, namely, that it “works very well” (IBM MLWP, 2021: [6]) and that it is “really fast, it's extremely efficient to train and it's efficient to make predictions” (Google MLCC, 2021: [22]).
Even more interestingly, the objectives of efficiency and simplicity do not only refer to the broader practice of applying ML systems—they also enter the design logic of ML models themselves. As stated in the MLCC: “[…] a model should be as simple as possible” (Google MLCC, 2021: [21]). Balancing between the two common phenomena of overfitting and underfitting when applying ML algorithms—meaning that the model is trained either too narrowly, only recognizing what it already knows, or too broadly, viewing everything as relevant—Google describes: “The fundamental tension of machine learning is between fitting our data well, but also fitting the data as simply as possible” (Google MLCC, 2021: [4]). 8 In the narration of the online courses, ML models are thus not about the best possible representation of reality, but rather about its simplest version. What is not reflected, though, is that these representations can only ever be partial, as they result from a series of subjective decisions that the data scientist must take in the process. Consequently, and in contrast to the notion that “it works very well,” ML technologies always involve processes of marginalization and exclusion of certain entities (or people) in favor of others, resulting in a number of discriminatory problems.
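This “fundamental tension” has a direct technical counterpart in regularized training objectives, where a penalty term trades accuracy against simplicity. The following minimal sketch is our own illustration (the function names and data are invented, not drawn from the course): a one-parameter model whose loss adds an L2 penalty on the weight.

```python
# The "fundamental tension" as a single training objective:
# minimize data-fit error plus a simplicity penalty, weighted by lam.
def regularized_loss(w, data, lam):
    """Squared error of y = w*x plus an L2 penalty on the weight."""
    data_fit = sum((w * x - y) ** 2 for x, y in data)
    complexity = w ** 2  # larger weights count as a "less simple" model
    return data_fit + lam * complexity

def best_w(data, lam, candidates):
    """Grid search: pick the candidate weight minimizing the loss."""
    return min(candidates, key=lambda w: regularized_loss(w, data, lam))

data = [(1.0, 2.0), (2.0, 4.0)]             # generated from y = 2x
candidates = [i / 100 for i in range(301)]  # w in [0.00, 3.00]

w_fit = best_w(data, lam=0.0, candidates=candidates)     # pure data fit
w_simple = best_w(data, lam=50.0, candidates=candidates) # heavy penalty
```

With lam=0 the grid search recovers the exact fit w = 2; under a heavy penalty the “simplest” admissible model drifts towards 0, sacrificing fidelity to the data—the trade-off the course presents as a purely technical balancing act.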
However, this social context is not taken into account within the online ML courses. Agency and responsibility are attributed not to the data scientist taking the decisions or applying the algorithms, but rather to the ML technologies themselves. In the MLCC, for instance, the ML model is defined as “the thing that is doing the predicting” (Google MLCC, 2021: [20]). At another point, the programmer herself is said not to need much expertise, as she does not “need to tell the algorithm what to do” (Google MLCC, 2021: [6]) but can rather show it a set of examples from which it can learn itself. In this sense, the participants of the course are also informed that they do not necessarily need to know how the techniques work in detail but can simply import the necessary models from ML libraries. In relation to the application of neural networks and the method of backpropagation, for instance, the Google MLCC states: “One thing that you don’t need to know about back prop is how to implement it. That's one of the brilliant things that TensorFlow does for us, is it takes the internals of back propagation and does that all for us underneath the hood” (Google MLCC, 2021: [24]). Emphasizing the agency of ML libraries, models, and algorithms at this point supports the notion that biases arising in algorithmic systems based on ML, or other methodological limitations that come with these technologies, are not a result of the social context of their development. Rather, they are considered technical errors. In this way, both companies and participants are stripped of the responsibility of a more critical engagement with the technological concepts. These problematic narrations become even more apparent when taking a closer look at the process from data preparation through model training to the concluding classification and prediction tasks.
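What TensorFlow does “underneath the hood” can nevertheless be spelled out. The following sketch—our own illustration, not material from the MLCC—implements backpropagation by hand for a single sigmoid neuron, making visible the chain-rule bookkeeping that the library call hides:

```python
import math

# Manual backpropagation for one sigmoid neuron y = sigmoid(w*x + b):
# the chain-rule bookkeeping that a framework performs automatically.
def train_neuron(samples, lr=1.0, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            # forward pass
            z = w * x + b
            y = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
            # backward pass: chain rule for the squared-error loss (y - t)^2
            dloss_dy = 2 * (y - target)
            dy_dz = y * (1 - y)             # derivative of the sigmoid
            dz_dw, dz_db = x, 1.0
            w -= lr * dloss_dy * dy_dz * dz_dw
            b -= lr * dloss_dy * dy_dz * dz_db
    return w, b

# Learn a simple threshold: output near 0 for x = -1, near 1 for x = 1.
w, b = train_neuron([(-1.0, 0.0), (1.0, 1.0)])
predict = lambda x: 1.0 / (1.0 + math.exp(-(w * x + b)))
```

After training on the two examples, predict(1.0) approaches 1 and predict(-1.0) approaches 0; every line of the backward pass corresponds to a decision the library takes silently on the programmer's behalf.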
The practical nature of the ML developing process: exploration, experimentation, and technical conundrums
Both ML courses convey the image that there is a model for every problem, as long as enough data is available to conduct the learning process or, as phrased in the MLCC: “If we have a very large data set, things are good” (Google MLCC, 2021: [23]). This underlying idea becomes clearer when we place it into the context of network science and the belief of “The End of Theory,” as Anderson (2008) phrased it in an often-cited WIRED article. He argues that the knowledge that can be drawn from data sets makes scientific theorizing obsolete. Instead of looking for hypotheses or causal explanations for the relation of data points, “correlation is enough.” At the same time, the courses present how in ML, these correlations are not always apparent right away, but must be found through a series of trials and errors. As Norvig elaborates in the introduction of the MLCC: “Machine learning changes the way you think about a problem. Software engineers are trained to think logically and mathematically; we use assertions to prove properties of our program are correct. With machine learning, the focus shifts from a mathematical science to a natural science: we’re making observations about an uncertain world, running experiments, and using statistics, not logic, to analyze the results of the experiment. The ability to think like a scientist will expand your horizons and open up new areas that you couldn’t explore without it” (Google MLCC, 2021: [6]).
The limitations of this experimental process—as well as the ambivalent nature of its results—are addressed throughout the course. Instead of a genuine confrontation with them, however, often only technical solutions are proposed. In the MLCC, for instance, the instructors acknowledge that data sets are often unreliable. Thus, in order to mitigate “dirty data”—for example, in the form of omitted values, duplicate examples, or bad labels (Google MLCC, 2021: [19])—the data needs to be “cleaned,” for instance by removing such entries from the dataset. Another mitigation of the problem is presented in the necessity to “know your data” (Google MLCC, 2021: [25]). However, this “knowledge” does not necessarily imply investigating under what criteria the data has been selected and what assumptions might have guided the process. Instead, rather formal problems within the data are addressed. Additionally, “knowing your data” goes along with “[keeping] in mind what you think your data should look like” (Google MLCC, 2021: [19])—possibly aligning with assumptions that might already have been formed concerning the respective task. 9 As D’Ignazio and Klein (2020: 66) point out: “[…] data science is often framed as an abstract and technical pursuit. Steps like cleaning and wrangling data are presented as solely technical conundrums; there is less discussion of the social context, ethics, values, or politics of data.”
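How “cleaning” figures as a solely technical conundrum can be illustrated with a short sketch of our own (the records, field names, and cleaning criteria are invented for illustration):

```python
# "Dirty data" in the courses' sense: omitted values and duplicate examples,
# mitigated by purely technical filtering.
records = [
    {"age": 34, "income": 52000, "label": "approved"},
    {"age": 34, "income": 52000, "label": "approved"},  # duplicate example
    {"age": None, "income": 48000, "label": "denied"},  # omitted value
    {"age": 29, "income": 61000, "label": "approved"},
]

def clean(rows):
    """Drop rows with missing values, then drop exact duplicates."""
    complete = [r for r in rows if all(v is not None for v in r.values())]
    seen, unique = set(), []
    for r in complete:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

cleaned = clean(records)  # two rows survive the technical "fix"
```

The two filters look neutral, yet each encodes a judgment—what counts as a duplicate, whether rows with omitted values are dropped or imputed—precisely the non-technical decisions that, following D’Ignazio and Klein, remain undiscussed.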
In the ML courses, we find an apparent clash between the mediated philosophy of ML—its presumed potentiality and functionality—on the one side and the explicit narration of potential shortcomings on the other. Even though the latter are addressed, there is no intensive engagement that would allow for a deeper reflection on ML as a methodology. Rather, models are evaluated based on their best possible accuracy, and biases are mitigated through technical fixes. Here, particularly the shift of agency towards the technologies themselves stands in stark contrast to the many tasks the feature engineer must perform—from data preparation to model adaptation—as well as the myriad of assumptions and decisions she must take—for example, that closeness of data points indicates similarity—in order for the ML models to function correctly. These socially inflected procedures are often the cause of issues such as algorithmic discrimination, as the current state of research shows. However, as we will show in the following section, the discussion of fairness in AI in the courses, too, is grounded in a belief in technological solutionism.
Fair AI versus values of productivity and performance: the best possible fit
Both Google and IBM predominantly conjure the image that they are aiming for responsible and fair AI practices. At the end of the MLCC, for instance, Margaret Mitchell—former co-leader of the Ethical Artificial Intelligence team at Google 10 —introduces the participants to the notion of fairness in AI (Google MLCC, 2021: [2]). Similarly, IBM states: “Moving forward, ‘build for performance’ will not suffice as an AI design paradigm” (IBM Research, 2021). In contrast to these narratives of fair AI stand the dominant values that appear in the courses: the model's performance in terms of efficiency and simplicity, as well as its evaluation according to the accuracy of its predictions. Here, the representation of ML enables the companies to deal with the problems and limitations that come with ML models only in a confined, technical way, stripping them of the responsibility to take true accountability for their models.
Another critical point is Google's statement that the technologies are developed with “everyone's benefit in mind” (Google AI, 2021). As the authors of the book Discriminating Systems (2019) describe, ML systems are built “almost exclusively in a handful of technology companies and a small set of elite university laboratories, spaces that in the West tend to be extremely white, affluent, technically oriented, and male” (West et al., 2019: 6). At the same time, the company has frequently struggled with discriminatory issues arising from its technologies in recent years (Noble, 2018). However, there is no reflection on the role the companies themselves play in the research and development of ML. Rather, the presented perspective is regarded as the default, and thus the values connected to ML within the courses—such as its potential, but also how models and data are assessed—become the dominant narration. As Rieder (2016: 51) describes, “[…] the legitimacy of data-driven decision-making hinges not only on the presumed objectivity of its methods, but on the unquestioned acceptance of productivity, performance, merit, and, in short, of ‘economic morality [as a] guiding logic that conditions and directs our daily lives’.”
The background of those who provide ML infrastructures greatly influences not only how ML is represented but also how technological concepts are designed. If models are constructed and classifications made with aspirations such as efficiency and performativity in mind, the results will differ greatly from those of an approach in which marginalized groups also benefit (D’Ignazio and Klein, 2020: 63). Similarly, the goal of representing reality in the simplest manner possible contradicts the complexity our society consists of. The models thus developed are always subject to the partial views of their creators. It is therefore important to ask: “Who does the work (and who is pushed out)? Who benefits (and who is neglected or harmed)? Whose priorities get turned into products (and whose are overlooked)?” (D’Ignazio and Klein, 2020: 47)
Conclusion and future research
Previous research has intensively engaged with the social implications of ML algorithms. Scholars have underlined the problem of algorithmic discrimination by tracing societal implications (Eubanks, 2017; Noble, 2018), investigating technological operations (Apprich et al., 2019; Bechmann and Bowker, 2019; Buolamwini and Gebru, 2018), and proposing ways to increase transparency, accountability, and explainability (Hutchinson et al., 2021). We augment this research by focusing on online ML courses as a research site that enables insight into the technical knowledge, assumptions, and dominant infrastructures of ML as a field of practice. In this sense, we add to present research on ML—often revolving around specific algorithmic systems and technological processes—by focusing on common algorithmic techniques (Rieder, 2020) and ML representations (Heuer et al., 2021). Mackenzie (2015, 2018) and Burkhardt (2019) have already investigated the broader technical principles of developer work and its platforms. Here, we introduce a political-economic dimension, as we not only explicate dominant themes and corporate ML infrastructures, but also show how these are influenced by Big Tech. Additionally, we open up ML education as a new field of research that has received little acknowledgment so far. In this regard, we contribute to a growing field of research on the infrastructural power of Big Tech that goes beyond the focus on platforms and development practice.
Google and IBM hold a tremendous position of power in the AI industry, setting the standards in ML research and development. With their online ML courses, they expand their power into the field of education. We show that they reinforce their position here not only by recruiting new AI talent, but also by securing their infrastructures and models to become the dominant ones. This takes place particularly through the mediation of ML as a simple tool that requires no profound knowledge for its application, while at the same time promoting the implementation of the companies’ off-the-shelf ML models and infrastructures. We further show that the companies have an enormous influence on how ML is understood and represented, and consequently according to which values and directives its technologies are designed. Our analysis reveals how ML models are advertised through their universal applicability and the simplicity of their implementation. Knowledge and expertise are attributed not to the developer but to the model itself, which potentially leads to faulty systems being countered only with technical solutions. At the same time, the companies boast an image of fair and democratic AI. This, however, stands in stark contrast to the ubiquity of their corporate products and the advertised directives of efficiency and performativity Google and IBM strive for with their infrastructures.
In the face of ever-emerging discriminatory problems with contemporary algorithmic systems, it is of particular importance to address their underlying technologies as well as the economic context in which they are created. Our research underlines the importance of asking what the implications are when Big Tech corporations consolidate their position in the AI industry and make their ML infrastructure indispensable. What is significant here is not only that it is becoming increasingly difficult to access the necessary data and infrastructures beyond Big Tech, but also that these companies are taking over the education of future ML practitioners by poaching them from universities and training them on their products. The analysis of online ML courses underlines the importance of these questions—however, this research is also subject to a number of limitations. While the importance of online courses can be estimated from surveys such as the Stack Overflow Developer Survey, it is hard to estimate the actual outreach and reception of the particular courses under consideration. Further, it would be important to investigate how users deal with the knowledge they have acquired: do they critically reflect on the content, and might there be subversive practices?
Researching online ML courses gives further insight into how we could imagine ML algorithms differently. Here, online courses are just one part of a larger collection of educational resources on ML. While Heuer et al. (2021) have already researched online tutorials, further work should be done on the reception of ML introductory books, online videos, or technology blogs and fora. It would also be of particular interest to look at ML courses from European, Asian, or African contexts, tracing alternative narratives and approaches to ML that might offer a counter-perspective. The leading questions in this research would be to what degree we are able to develop ML technologies that do not rely on currently dominant infrastructures, and how we might implement alternative epistemologies into ML algorithms outside of the commercial realm. The decidedly technological perspective adopted in this paper plays an important role in this endeavor.
Supplemental Material
Supplemental material, sj-pdf-1-bds-10.1177_20539517231153806, for “Learning machine learning: On the political economy of big tech's online AI courses” by Inga Luchs, Clemens Apprich and Marcel Broersma in Big Data & Society.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
