Abstract
This paper re-examines the concept of affordance within the context of generative AI, challenging the recent embodiment framework of human-computer interaction. Drawing on Gibson’s ecological psychology and Actor-Network Theory, it proposes a ‘fold diagram’ to illustrate how generative AI fundamentally alters the relationship between users and digital environments. Unlike previous tools requiring user skill for specific outcomes, generative AIs, exemplified by Compton’s generative art toy Idle Hands, procedural content generation, and LLM-based AI chatbots, respond meaningfully even to unintentional actions or underdeveloped queries, facilitating a new trade-off where control is ceded for increased power. The paper argues that affordances are no longer opportunities for action discovered by the user, but rather the machine’s capacity to flesh out predicted user interests. This shift is viewed through the lens of a radical symmetry between human and AI agency, where both actively fold the digital web to create intra-active niches. Ultimately, the paper presents a critical theory of HCI, asserting that software capitalism leverages generative AI to continuously re-world the web, preserving and exploiting pre-individuated user interests for profit.
Affordances of generative AI
Idle Hands is a generative art toy created by digital artist and game designer Kate Compton. Initially designed for an installation at an art festival, Idle Hands used a motion-tracking device to transform the exact hand gestures of the audience into giant skeleton hands projected on a wall (Figure 1) with no perceivable delay (Compton and Mateas, 2018). The ‘generative’ aspect of this art toy lay in its presentation: every tiny perturbation from each joint of the skeleton hands spread tessellating patterns across the wall, as if their clenching and unclenching moves were folding space infinitely. Without any pre-existing objects for pre-defined moves to grasp, players experienced toying with this generative algorithm as a constant reworking of their body-world coupling from the cuts drawn across these joints, which we call fingers but were then embedded in a literal plane of immanence. The pleasure of toying with this artwork was not about players’ skillful grasping, but rather the algorithm’s responsiveness to unskillful user actions and its ability to generate outputs graspable even by unintentional gestures.
Figure 1. Kate Compton, Idle Hands. Captured by the author from the internet version available at https://galaxykate.com/apps/idlehands/.
Unlike other art tools like Photoshop, which give users total control over the output, Compton notes that the audiences for her creations ‘are willing to cede control to a generator, as long as it gives them more power’ (2017, 164). The new sense of empowerment her generative AI promises to players is thus a trade-off for their previous habits as tool users. This detachment of power from control suggests a changed meaning of poiesis, or the act of creation at the human-computer interface. It used to be under the grasp of skillful users’ embodied knowledge of software’s affordances, as Ivan Sutherland’s Sketchpad prototype (1963) exemplifies. Now creating is a more robust process where users simply ‘feel engaged’ as providers of random seeds for algorithmic generation (164). From this changed condition of human tool use that generative AI puts us into, this paper retheorizes affordance not only as a concept describing the practical meanings of the AI-generated web, but also as a critical tool to rethink how the current software industry profits from our effortless and intentionless engagements that AI affords.
Since Don Norman’s instrumentalization of it as a concept in human-computer interaction (HCI), affordance has often been used to indicate certain design elements that developers put on the software interface to ‘determine just how the thing could possibly be used’, such as buttons to click and brushes to drag (2002, 9). While previous software tools with these action-affording properties could bring desired effects only in response to the right actions users performed, the new software applications generative AIs promise are conversely autopoietic, as users here ‘have only a triggering role in the internally determined activity’ (Maturana and Varela, 1980, xv). For example, to videogame developers, procedural content generation (PCG), a genre of generative AI Compton aims to publicize through her art toys, enables more expansive and interactive design of the game world, commonly called a sandbox (De Araujo and Souto, 2017). To deliver the unlimited playability the term ‘sandbox’ suggests, the game world must include a sufficient variety of geological formations and biomes to accommodate all events, quests, stories, and small achievements promised to gamers. For this, PCG can generate unlimited new assemblages of these playable elements in response to certain user-caused perturbations, ranging from the gamer’s choice of random numbers called ‘seeds’ that initialize a pseudorandom generation of a game world to simple gamer actions, such as an avatar stepping into a new area.
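The two perturbations named above, a gamer-chosen seed and an avatar stepping into a new area, can be illustrated with a minimal sketch. It is not drawn from Compton’s work or from any actual game engine: the biome list, the `generate_chunk` function, and the `Sandbox` class are hypothetical names introduced here only to show how a seed can initialize a pseudorandom world that is generated on demand as the avatar wanders.

```python
import random

# Hypothetical list of playable elements; a real sandbox would have far more.
BIOMES = ["plains", "forest", "desert", "tundra", "swamp"]


def generate_chunk(seed: int, x: int, y: int) -> str:
    """Pick a biome for the area at (x, y), derived only from the seed and coordinates."""
    # A string seed makes the pseudorandom choice reproducible for the same inputs.
    rng = random.Random(f"{seed}:{x}:{y}")
    return rng.choice(BIOMES)


class Sandbox:
    """A world that exists only where the avatar has already stepped."""

    def __init__(self, seed: int):
        self.seed = seed
        self.chunks = {}  # (x, y) -> biome, filled in lazily

    def step_into(self, x: int, y: int) -> str:
        # The user's step has only a triggering role: what appears is
        # determined by the generator, not by the intent behind the step.
        if (x, y) not in self.chunks:
            self.chunks[(x, y)] = generate_chunk(self.seed, x, y)
        return self.chunks[(x, y)]


if __name__ == "__main__":
    world = Sandbox(seed=42)
    for position in [(0, 0), (1, 0), (1, 1), (0, 0)]:  # an aimless wander
        print(position, world.step_into(*position))
```

In this sketch, every step, however aimless, is met with a response, and the same seed always unfolds the same world; the player supplies the trigger while the generator supplies the content.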
In an ideal case, the algorithmically generated sandbox affords meaningful responses even to an avatar’s meaningless wandering or adventitious behaviors, allowing the player to develop emergent gameplay between actions taken and perceptions of how the world responds. This new digital landscape does not merely ‘offer’ players opportunities for meaningful actions; it constantly regenerates the world’s interface to make even unintentional actions feel meaningful in hindsight, as they engage with algorithm-led processes of worlding. PCG, as a development tool and also a representative of a broad generative AI family that includes ChatGPT and other LLM-based chatbots, suggests in this light that the engagement between H and C in Human-Computer Interaction that affordance initiates is no longer precipitated by humans’ instrumentalist concern to realign heterogeneous responses/effects of software tools around their intentions. Just as Jakob Nielsen claims regarding ChatGPT, ‘the newest paradigm’ of UX design – whose ‘Intent-Based Outcome Specification’ he contrasts with the previous ‘Command-Based’ paradigm (2023) – is characterized by its capability of generating outcomes that flesh out interests, which users often do not fully notice at the moment of interaction. What is then challenged by ‘this paradigm [that] completely reverses the locus of control’ (2023) from H to C is the embodied and situated knowledge of human tool users.
The ‘third paradigm’ (Frauenberger, 2019) of HCI and its postphenomenological approach (Rosenberger and Verbeek, 2015) theorize embodiment as the process in which a tool user’s subjectivity and the world she experiences are co-constructed from her habitual incorporation or speculative projection of how the tool engages with its own world. In the new digital landscape, however, the user’s feeling of embodiment and situatedness seems to pertain less to her ability to reorganize her actions according to how the tool responds. It pertains more to how the machine always responds in ways that satisfy the predicted intents implied in her actions. In this reversed locus of control, affordance no longer means the opportunity for action that the user discovers from consistent responses of the machine. Rather, the machine promises opportunities for virtually any action or inquiry she can make, whether these express the intents she notices or the nonconscious interests that, the machine claims, all of her actions must reflect. While the postphenomenological approach refuses to ‘give up’ the distinction between humans and nonhumans in agency (Rosenberger and Verbeek, 2015, 19), this machinic process to flesh out the user’s yet underdeveloped interests in prompts requires a more symmetrical relation to exist between H and C. What Idle Hands presents alongside its plane of immanence is this symmetry of agency, which I understand as one’s (no matter whether H’s or C’s) ability to reorganize its actions to mobilize and register consistent responses from others to fulfill its interests.
As in postphenomenology, agency on this plane is analyzed at the level of how actions precipitated by the hands are responded to by AI and vice versa. Without assuming any a priori or habitual skillset for the hands to perform to grasp the things at the fingertips as manipulable objects, it demonstrates how the boundary between subject and object is constantly recreated as the plane is folded along a certain group of actions and how they are consistently responded to by another set of actions. Affordance, or the feeling of agency that every action her hands take is involved with an infinitely creative process, is, therefore, less of a communicational event between two pre-defined relata in interaction. Instead, as Frauenberger suggests in his post-humanist ‘entanglement HCI’ framework (2019), affordance is a product of intra-action (Barad 2007), the world-bifurcating fluctuation through which inconsistent movements within a flux are aligned into a pair of actions and responses, or the process that I illustrate in this paper as the ‘folding’ of the world into two parts of grasping and grasped. 1 What makes this conceptual plane symmetrical is not such a postphenomenological speculation about the ‘mutual constitution between subjects and objects’ (Rosenberger and Verbeek, 2015, 19). Symmetry rather lies in its refusal to locate the transcendental faculty of worlding exclusively on either side of the fold, whether it be the hands’ ability to reorganize habitual gestures according to how the surrounding responds, or PCG’s ability to generate responses graspable even by any accidental hand gestures.
The ontological plane or flux on which this paper rethinks the meaning of affordance is, therefore, not meant to repeat another embodiment theory, nor to explain all over again how a body’s habitual reorganization of its actions bifurcates her subjectivity as a tool user and the world furnished with objective tools. It is simply not what we do with generative AI today. The feeling of unlimited opportunities for actions, which characterizes the affordance of generative AI, is available even with no embodied skill or knowledge about the AI. This fake feeling of empowerment is rather created as the AI folds the web always into the perfect niche in which the user feels, in hindsight, that her interests, yet underdeveloped but just prompted, were well responded to. Instead of a postphenomenological ‘ethical theory’ (Rosenberger and Verbeek, 2015, 31), this symmetrical framework contributes to a critical theory of HCI. Don Ihde (1990, 200), the founder of postphenomenology, says his ethical theory could bring the ‘gestalt switch in sensibilities … from within technological cultures’ to encourage humans as tool users and designers to think of other possible opportunities for actions that differently embodied technologies could afford their differently situated users. This paper’s critical theory, by contrast, focuses on how the web, imagined to be another plane of immanence, is folded by AI into multiple niches – where intra-actions occur, thus intra-faces – with unlimited affordances, in exchange for the control we used to exercise as old-fashioned tool users.
Affordance in this paper is thus not a concept about objects as tool beings, but, as implied by Gibson’s original ecological conceptualization, a concept to describe certain emergent features of the world from our dwelling. In this respect, taskscape, the term Tim Ingold (1993) suggested to illustrate the world as multiplicities of Gibsonian action opportunities, is notable as a starting point to rethink affordance from the immanence of the world and its intra-active folding of human niches. He says a pragmatically meaningful ‘landscape’ that provides all the opportunities for actions necessary to sustain one’s way of life results as ‘a congealed form of taskscape’ from someone’s actual ‘dwelling in the world’ (162). Likewise, the sandbox that PCG generates also becomes functional as the player realigns how its lands, biomes, and NPCs respond to her actions. However, when Ingold claims that affordances embedded in his anthropologically imagined plane of immanence are actualized only as the exclusive consequence of humans’ or animals’ ‘constitutive acts of dwelling’ (158), it also, like the postphenomenological HCI, jumps to an asymmetrical embodiment theory. On the other hand, the procedurally generated landscape is full of meaningful responses even to the gamer’s meaningless wanderings, just as the web on Google Search, now integrated with Gemini, its AI model, is capable of fleshing out our search interests even before our fully embodied Googling finger manually scrolls down search results pages.
In the section that follows, I will critically reexamine this embodiment-centered interpretation of affordance and reread the world’s intra-active becoming implied in Gibson’s ecological psychology from the perspective of actor-network theory (ANT). The radically symmetrical relational ontology of ANT is what both Ingold (2011) and postphenomenologists (Rosenberger and Verbeek, 2015) consciously distance their versions of the world of immanence from. ANT’s principle of symmetry nevertheless allows this paper to interpret the multiplicities of inconsistent movements on a plane that Idle Hands aesthetically presents as being expressive of two distinctive less-than-or-more-than-human interests thriving on the current AI-generated web. On the one hand, there are pre-individuated interests of users, those yet undeveloped through well-calibrated actions and perceptions of a body but persisting in between. On the other hand, there are nonhuman interests of AI, which fold human niches within the web to help us individuate our otherwise nonconscious interests. In the second part of the paper, this examination leads to the critical reappropriation of affordance to describe how software capitalism after generative AI reprograms Human-Computer Intra-faces to preserve, mobilize, and cultivate these profitable pre-individuated interests between our toying hands and wandering eyes.
Folding affordances like ANT
In a chapter of Being Alive (2011) titled ‘When ANT meets SPIDER’, Ingold discusses ‘Social Theory for Arthropods’, using SPIDER (Skilled Practice Involves Developmentally Embodied Responsiveness) to argue against ANT (Actor-Network Theory). In the preceding chapter, Ingold addressed an inherent contradiction in Gibson’s ecological psychology – specifically, its unsuccessful distinction between ‘the animal environment’ that ‘exists only in relation to’ a living system inhabiting it and the ‘physical world’ existing regardless of animal interests (77, emphasis in original). Gibson nevertheless considered action possibilities the environment offers to animals – in other words, affordances – as real as physical reality, and, according to Ingold, that was the source of confusion.
Ingold asserts that Gibson, to reconcile these two worlds, assumed that ‘the environment comprises a world furnished with objects’ that exist ‘independently and in advance of the creatures that come to inhabit it’ (78). These furniture-like objects in the physical world influence which creatures would find their niches, from where their species-specific actions are selected as better responded to by certain objects. But Ingold finds it dissatisfying that this realist interpretation of affordance came at the expense of the concept’s initial relational explanation of system-environment co-construction out of the immanence of the world.
Through SPIDER, Ingold thus aims to demonstrate how SPIDER’s habitation – her life as spinning the web – transforms the physical world into a material continuum. According to Ingold’s reading, which I will later argue is a misinterpretation, ANT claims that the world comprises stems, twigs, flies, and ANT and SPIDER themselves, each as a ‘self-contained object that is set over against other objects with which it may then be juxtaposed or conjoined’. Regarding this world, SPIDER states:
It is rather a bundle or tissue of strands, tightly drawn together here but trailing loose ends there, which tangle with other strands from other bundles. For the twigs or stems to which I attach these trailing ends are themselves but the visible tips of complex underground root systems. Every plant, too, is a living tissue of lines. And so, indeed, am I. It is as though my body were formed through knotting together threads of life that run out through my many legs into the web and thence to the wider environment. The world, for me, is not an assemblage of bits and pieces but a tangle of threads and pathways (91-92).
SPIDER argues that the body is not self-contained but part of the web, and what distinguishes it as a living system from other parts of the fabric is its strands drawn tightly together as a coupling of ‘movement and perception’ (94). Unlike ‘the leaf of a tree in the summer breeze’, which is merely ‘a-quiver’, the movement the SPIDER body radiates is, through its perceptual counterpart sensitive to how the web responds, continually attuned to ‘perturbations in the perceived environment’ (94). SPIDER’s body, in this sense, represents a folded part of the web between its arthropod sensors and actuators, which I called elsewhere a sensor-actuator arc (Ahn 2021a). This fold is also where new strands are woven, either as neural connections within the fold or silky fibers along its movements. This co-construction of subject and object from the folding of the web illustrates embodiment as a process in which certain patterns of perturbations consistently perceived by its sensors are registered as things responsive to its movements.
SPIDER asserts that these stranded perturbations on the web are what ANT mistakes for something objective. Furthermore, SPIDER claims that her body’s ‘attentiveness’, which makes her movements always attuned to perturbations of the web, and vice versa, qualifies the movements as actions and thus ‘qualifies me as an agent’ (94).
Besides the key likeness between SPIDER’s perturbation-attuned, web-folding legs and the embedded joints of Compton’s Idle Hands, this fable merits revisiting for two reasons. First, the Cartesian world ‘furnished with objects’ that Ingold assumes underlies Gibson’s affordances appears merely to repeat a common instrumentalist misreading of Gibson, one that, I will argue, his original ecological conceptualization does not support. Second, even for Ingold’s ambition to suggest an antidote to the instrumentalization of the world’s affordances, SPIDER’s arthropod social theory seems insufficiently social to enable SPIDER, ‘solitary by nature’ (89), to perceive the web’s perturbations as manifesting others’ interests – the ‘gestalt switch’ the ethical theory of postphenomenological HCI aims to bring through its emphasis on embodiment (Ihde, 1990, 200). Meanwhile, ANT, according to actor-network theorists (whose world Ingold misreads again as an assemblage of self-contained ‘entities’ (91)), would interpret all entities on the web as pre-individuated interests preceding anything objective.
Affordance as folding
Gibson (2015) conceived our perceptual environments as comprising three fundamental materials: substances, surfaces, and the encompassing medium. At the same time, his concept of direct perception suggests that these substances and surfaces, with which the world is furnished, are not simply delivered through the medium to be picked up by some mental representations. For Gibson, mediums such as earth, water, and air are the only matters that constitute a continuum of sensations that he terms ‘stimulus flux’ (238) because mediums transform every surface and substance they touch into ‘reverberations’ (13) within their molecular flux (just as SPIDER’s web turns flies into vibrations).
In his ecological optics, this flux is illustrated as ‘an ambient optic array’ surrounding a point in space and enfolding ‘an arrangement of some sort, that is, a pattern, a texture, or a configuration’ of things from all directions that reverberate through the flux (45). Visual perception then occurs as this point is occupied by an observer – not a transcendental being but a fold that can be analyzed into another array of bodily mediums with different degrees of sensibility and motility, described as ‘the eyes-in-the-head-on-the-body-resting-on-the-ground’ (195). In this materialist description of our transcendental condition, affordance for Gibson is what our perception directly picks up from this stimulus flux as it comes to the fore as ‘an invariant combination of variables’ (126) while the flux is folded along the sensor-actuator arc that the body embeds (Figure 2). In topological terms, we can perceive clearly only a certain arrangement of sensations that remains invariant under the flux’s transformation caused by our bodily action. 2 After being registered as consistent responses to this action, the recurrence of this arrangement signifies the opportunity for that action.
Figure 2. A flux folded around a sensor and motor/actuator. Created by the author.
This folding of the flux into two parts of grasping and grasped across two distinguishable arrays of sensations, namely, ‘proprioception’ (about how bodies act) and ‘exteroception’ (about how bodies are responded to) (175), provides a concrete illustration of Gibson’s idea of the directness of perception. 3 In this light, a fold that replaces a point of view in Gibson’s ecological optics encompasses the reverberations of everything, but perceived with sufficient clarity within the fold are only those appearing invariant under its movement – only those parts of the world that offer pragmatic meanings to its habitual and intentional actions. This all-encompassing flux is where Gibson’s ecological psychology radically departs from the Skinnerian reduction of perceptual meanings to reproducible stimulus-behavior pairs. It is also where my fold diagram approaches Deleuze’s diagrammatization of the Leibnizian monad (1992) 4 (which Latour also refers to for his ANT-based illustration of data search as discussed in the following section).
In this formal description, however, Gibson provides only limited explanations about the life process that facilitates the ontogenetic development of folds within the world. He notes animals’ tendency to utilize affordances to unfold from environments the niches that respond consistently to their actions and thus are exploitable by their species’ ways of life. He also states that ‘the environment as a whole with its unlimited possibilities existed prior to animals’ (120). From this realist remark on affordances, which Ingold disputes, I focus on the role that these action possibilities seem to play as attractors. As the virtualities within the world of immanence, affordances induce creative occurrences of intra-actions along the folds that animals’ sensorimotor arcs inscribe. They drive the evolutionary history of the plane of immanence by constantly rewiring these manifolds through ontogenetic and phylogenetic processes to unfold more niches from the world: multiplicities of Organism-Environment Intra-faces, which fold the world into itself infinitely to unfold more pragmatic meanings of the world to itself.
Regarding this creative generation of liveable worlds, Elizabeth Grosz and Deleuzian geographers (Grosz et al., 2017) conceptualize ‘geopower’ as the attractor at the planetary level that induces living things’ creative re/de-territorialization of the landscape. Phenomenologists such as Merleau-Ponty (1962) and Dreyfus (2002) alternatively propose for this attractor some pre-individual intentions that constantly fold the ‘flesh’ of the world (Merleau-Ponty, 1968) into a coupling between body and environment to update the maximal grip between the two. However, a more radically ecological answer to this question of attractors is suggested by actor-network theory.
ANT’s fold-monism
It should be noted that, contrary to Ingold’s description of ANT’s world as an assemblage of self-contained entities, actor-network theorists such as Michel Callon and John Law use the term ‘entities’ very strategically to equivocate the ontological status of things before they are registered as actors in a network. According to Law, actors’ existence cannot be explained by their self-identities but only through their ability ‘to make their presence individually felt’ on a network (2012, 103). That means an actor ‘does not exist’ unless it continues ‘influencing the structure of the network in a noticeable and individual way’ (125). Regarding the process in which previously anonymous entities are registered as individual actors through the noticeable influences they exercise on other actors, Callon employs the term ‘interessement’, which describes ‘the group of actions by which an entity… attempts to impose and stabilize the identity of the other actors’ as those interested in its actions (1986, 207-208). In this context, their use of the term ‘entities’ is strategic as it equivocates something not yet interested in (and thus not responsive to) others’ actions. This suggests ANT functions as a descriptive tool only in hindsight, unable to specify what kind of existence these supposedly sentient entities might have until they display consistent responses to something, particularly an observer’s inquiry. At the same time, as a gesture to push ANT’s principle of symmetry to its limits, the term ‘entities’ also preserves their ontological assumption that our world should be full of those things not interested in humans but in each other, or existing as mere interests not yet mobilized by any others but manifesting non-individualized interests within the world.
Latour suggests calling these entities not yet metricized into exploitable matters of fact ‘matters of concern’ (2004). Interpreted as inconsistent multiplicities of these matters of concern, the flux in our diagram suggests a new ontological sense of affordance. Responsive or unresponsive to each other’s actions according to their own inalienable interests, each matter that makes up the flux is just another fold between its own perception and action. As a sort of monist element of the world, the fold now illustrates how every entity is entangled within the flux, rather than limiting entanglement to organisms’ well-calibrated actions and perceptions, as Ingold’s criticism of the Cartesian residue in Gibson still does. Gibson’s mediums, which Ingold’s SPIDER considered just non-agential transmitters of undulations, also exist as folds and become as agential as SPIDER is as a fold. In this light, what looked like mediums’ ‘transmission’ of signals from distanced agents attests, in hindsight, to how these backgrounds of the world are also interested in and responding to one another according to their own supposed matters of concern: not transmission but ‘transduction’ (Latour, 1999).
So, while SPIDER ‘admits that she would be more inclined to eat others of her kind than to work with them’ (Ingold, 2011, 89), the ANTs unfold networks from the same continuum, showing greater patience in waiting for some entities to emerge as interested in their actions. Whereas SPIDER interprets perturbations of the web as triggers for solipsistic, systems-theoretical autopoiesis, ANTs experience them as manifestations of the world’s microphysical interests. For ANTs, their ecological niche, which Gibson defines as ‘a set of affordances’ (2015, 120), represents the sum of matters interested in their ways of life and consistently responsive to their actions. The ‘fundamental asymmetry’ that Ingold and postphenomenologists posit between living systems’ ability to attune their perception-action coupling and the mere reactions of nonliving things (Ingold, 2011, 57) is refuted by ANT’s principle of symmetry. According to the latter, there is no mere reaction but only responses, and every response assumes the ontological priority of interests that fold and transduce the world around them.
In case studies by Law (2012), Callon (1986, 2012), and Latour (1984), entities like electrons, winds, scallops, and microbes are studied as these matters of concern. Their eventual transformation into actors and participation in human projects as exploitable facts are discussed not only as a result of humans’ perception of their affordances but also as expressive of their nonhuman perception of affordances from human actions. From this radical relationism of Actor-Network Theory, the perception of affordance – or direct pickup of opportunities for actions in the world – is not something that happens between two self-contained entities. Instead, it is an ontological event where the world is folded into two parts: acting on itself and responding to itself, or interesting and interested. As Kadar and Effken (1994) and Sanders (1997) suggest in their ontological interpretation of the Gibsonian concept, this intra-active view of affordance could ‘have a vital role to play in the understanding of ontology-building all the way down’ to the sub-atomic level of the universe (Sanders, 1997, 110) where everything exists as a pure wave of possibilities, until a certain cut is made by a measuring device whereby something, such as electrons, is perceived by its sensor as invariantly responding to its actuator. 5
On the other hand, extended to macrophysics and social science, this fold diagram of affordance suggests a radical cosmopolitical view that allows us to speculate about the multiplicity of unknown niches, even from the interface of two things that appear to interact seamlessly. This is because, between one’s action and the other’s response, the diagram opens ‘the political scene to accept being inhabited, even haunted, by those who present themselves as not interested in the creation of partial connections’ to others (Stengers, 2018, 94-95). It asks us to acknowledge the presence of unknown matters of concern between sensors and actuators; there are always multiple others withdrawing from registration as they refuse to show any consistent responses to the sensors about the actions executed by the actuators. As Latour (2002) notes regarding the hammer – a classic example of affordance and Heideggerian tool-being – departing from both Normanian instrumentalization and its systems theoretical reduction to opportune perturbation (against which SPIDER operationally closes its organismic system), we can experience things’ affordances as a re-folding of our action-perception along with how the things are interested in and respond to the world. Through ‘this alternation of that folding’, we then re-envision the world as ‘the flux of possibilities’ (Latour, 2002, 250).
Fold diagram as a new critical theory of HCI
This radical cosmopolitics of nonhuman interests is, however, not merely suggestive of other possible worlds that might have been dismissed by our too-habitual incorporation of matters as embodied tools, as ethical theories of HCI are supposed to pay attention to. This is because, on the web after generative AI or PCG-based sandboxes, embodiment – or such a feeling that things are already well-arranged to serve our interests, even before we individuate these interests clearly – is less relevant to our effort of bodily reincorporation of surroundings, which Ingold means by dwelling. If affordances redefined in the previous section mean opportunities for action that the world offers to itself by folding it into itself, the attractors that induce these intra-active niche foldings on the web are no longer our instrumental concerns as tool users. Instead, it is the interest of software capitalism that expects more yet-unindividuated interests of ours to still oscillate between our eyes and fingertips, and constantly refolds the web into as many new niches as possible to flesh out these otherwise unrecognized or underrepresented minority interests.
This section repurposes the fold diagram of affordance developed so far into a critical theory of HCI, describing how the web is currently generated out of real-world matters of concern, such as data collected from human or nonhuman responses, and how AI profits from folding this web around our pre-individuated interests. Beyond this methodological benefit, the diagram’s emphasis on symmetry also suggests that, while some of those interests are caught in well-calibrated couplings between AI-pushed/generated content and user responses – and so registered as user preferences in their profiles – some others are always withdrawing from the current sensor-actuator calibration and preserved to be exploited by new calibrations. In this respect, the diagram exposes a vernacular form of cosmopolitics concerning non-represented others, which serves as discursive support for software capitalism’s current speculative economy.
Data as folds
Firstly, data points as measurements of real-life events are congealed forms of folds. De Freitas illustrates data folds as ‘infinitesimal or infinitely small intervals’, where ‘an infolding or contraction of the continuous fabric of life’ precedes ‘all individuation and separation into parts’ (2016, 463). In my diagram, these intervals are those between a sensor and an actuator, or the cuts made on the continuum by a datafication tool, whereby something is detected by its sensor as consistently responding to its actuation. In this context, datafication is the measurement of how interested and therefore responsive things are to the tool’s nudging.
Things folded into the intervals are not self-contained entities but other folds that manifest some pre-individuated or nonhuman interests of the world, and this fold-monism holds true even when things under datafication are the world’s physical reactions to digital nudges, as Barad’s agential-realist interpretation of quantum physics suggests (2007). However, more satisfactorily illustrated as this ‘folding of another fold’ is the generation of user profiles from one’s online behaviors. The most common web interface for this is analyzable into multiple intervals between the stimuli it actuates on screen (e.g., posts, videos, and ads) and the buttons or social plugins to sense user responses (e.g., clicking, dragging, and swiping) – in other words, between actuators and sensors. Other folds re-folded into these intervals are then neurophysiological folds between users’ eyes and hands: the intervals between their exposure to algorithmically curated stimuli and motor responses through button actions. Under the embodiment framework, the neural connections within these folds, called sensorimotor arcs, tend to more or less fixate on incorporating tools as their stable extensions. But, under the web’s current behavioral economics, they are required to be kept always flexible and plastic enough to manifest the multiplicity of interests yet un-individualized as consumer preferences.
Algorithms as folds
Algorithms are also diagrammed as the folding of these data folds alongside their encapsulated functions, whereby ‘diverse things’ are ‘brought together to produce different versions of the social and natural orders’ (Lee et al., 2019). Machine learning is how algorithmic models are generated regarding these yet unknown social and natural orders, supposedly embedded in manifold data being infolded between the input and output layers of the machine-learning algorithms. A fold eventually forms between the two layers as a certain equation is chosen, after long input-output calibrations, as the best proxy of the supposed ‘invariant combination of variables’ within the infolding flux (Gibson, 2015, 126). In its simplest form, such as linear or logistic regression, the proxy is a linear or curved graph capable of gathering a larger number of data folds alongside it, while many uninterested others withdraw into the background.
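To make the simplest case concrete, the following is a minimal sketch, not drawn from any of the cited sources, of a least-squares line fitted to a handful of data points. The function names `fit_line` and `split_by_fit` and the `tolerance` threshold are illustrative assumptions; the split only crudely mimics the distinction between data folds gathered alongside the line and those that withdraw into the background.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Spread of x around its mean, and co-variation of x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def split_by_fit(points, slope, intercept, tolerance):
    """Separate points lying near the fitted line from those far from it.

    `tolerance` is a hypothetical threshold chosen for illustration only.
    """
    gathered, withdrawn = [], []
    for x, y in points:
        residual = abs(y - (slope * x + intercept))
        (gathered if residual <= tolerance else withdrawn).append((x, y))
    return gathered, withdrawn
```

In this sketch the line is the chosen proxy, and whatever falls outside the tolerance is simply left unregistered; where that threshold sits is itself a calibration decision.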
In more advanced artificial neural networks, the input-output calibration is repeated through the process called backpropagation across multiple layers of artificial neurons, each of which is diagrammable as a fold between its receptor for firing from neurons in previous layers and its actuator to fire toward the next layer. Folding artificial intelligence within these manifold neurons – the process called deep learning – is tentatively complete when the collective neural firings in the middle layers, triggered by infolding data, become sufficiently invariant to label each input as a distinct kind at the output layer. Dreyfus (2002) illustrates this folding of intelligence across neurons as the ‘maximal grip’ the network achieves upon an embedded order of the infolding outside, as the AI finds its niches in the world as data folds.
Algorithmic culture as superfold
In this diagrammatization, the current algorithmic culture, which takes (user) data plus (machine learning) algorithms as its fundamental components (Dourish, 2016), is therefore described as the system of permanent recoupling of these fingers and eyes, buttons and content, graphs and edges, and many algorithmic, neurophysiological, habitual, and nonconscious folds in between. The primary interest of software capitalism that runs through this multiply intra-folded system or ‘superfold’ (Galloway, 2012) is to make these folds with heterogeneous materialities more responsive to one another, and thus offer more opportunities for actions to one another. In this respect, the Gibsonian imagery of the web ‘as a whole with its unlimited possibilities [that] existed prior to’ AI’s arrival (Gibson, 2015, 120) promises the sustainability of their business – namely, folding as many niches/intra-faces on the web as possible for yet unactualized pragmatic meanings vis-à-vis the multiplicity of pre-individuated user interests. This new form of capitalism is, as McKenzie Wark (2019) calls it, ‘vectoralist’ in that what it does for this infinite folding business is distribute and move small vectors, such as tiny graphs and edges with varying weights, across the web, and register those people or entities – as data donors, attention givers, subscribers, or mere correlations – interested in and responding to the ways each vector folds the web.
For instance, this new corporate power redistributes the vectors between content and buttons on a web interface by constantly recalibrating its embedded sensors and actuators. Vectors here lie within the interstice between such matters as which content to display where and in what manner on the actuator’s side, and how often, how long, and by whom it is clicked, scrolled, and viewed on the sensor’s side. The goal of this AI-generated interface, which Thaler and Tucker (2013) call a choice engine, is to make the web capable of responding even to our inadvertent clicks by always showing something interesting and relevant through subsequent algorithmic content selection. The vectors it distributes are thus meant to lure us into constantly re-folding our perception and action – our exposures to and clicks on something, or simply put, our choice – around statistical or nonconscious orders that its AI assumes to exist.
On the other hand, during procedural generation of sandboxes or text generation by an LLM, the vectors generative AI redistributes are those among digitized cultural assets, such as vectors across words and objects, that comprise large language models or game engines. These webs of assets are extracted as statistical proxies of human cultures and real worlds, whose semantic structures are analyzed by the AI in terms of how often certain words or things are folded together into texts or real-world landscapes. What this algorithmic re-folding of the real effectively brackets are the intentions of human authors’ folding hands and the historical formations of those cultural and geological sedimentations. In response to the prompted user intent, the AI then refolds these webs of digital assets into a new niche that appears meaningful even to some disoriented vectors created by our meaningless prompts or by our wandering around a sandbox without making any choices.
Googling as web folding
One interesting operational example of this niche folding on the web appears in an article that Latour co-authored on a monadology of database search. In ‘The Whole is Always Smaller Than its Parts’, the authors write that a monad ‘is not a part of a whole, but a point of view on all the other entities taken severally and not as a totality’ (Latour et al., 2012, 598, emphasis in original). Their case of database search can be modified into a Googling scenario, and the totality of entities that a monad in this description enfolds is then a web of all webpages, whose relevance to one another was once measured by numbers of inbound hyperlinks but is now woven by tiny web applications called crawlers, the fold-weavers that extract and index information from each page. The search results page, an ad hoc webpage that Google Search generates in response to my search query (my matter of concern), is a monad folded within the web around that matter, for instance, my name. According to Latour et al., what this monad perceives is always the ‘whole world’ just ‘grasped’ from an ‘idiosyncratic point of view’ (599), even though what it perceives with clarity are only those entities severed from the rest as relevant to the monad’s interest (e.g., webpages that detail the university where I work or my publications). By enfolding all these webpages responsive to its matter of concern, the monad is individuated into my online profile. However, this does not mean that my workplace is merely a part of the world perceived from this ego-centric concern, because it still has many other aspects to lend to other search interests. More importantly, it could also be a search interest itself around which a new monad, a new search results page, can be folded as other parts of the web (e.g., my profile as its employee) lend themselves to its folding. Regarding this fold-monism called monadology, the authors write: The whole is now much smaller than the sum of its parts.
To be part of a whole is no longer to ‘enter into’ a higher entity or to ‘obey’ a dispatcher (no matter if this dispatcher is a corporate body, a sui generis society, or an emergent structure), but for any given monad it is to lend part of itself to other monads without either of them losing their multiple identities (607).
In their description, a monad not only perceives the world as a whole from its unique interest but also lends part of itself to other monads’ perception of the world. Its perception is always of how interested its world is in its action, whereas its actions are expressive of how interested it is in other matters of concern. In this monadological expansion of the fold diagram, a whole is thus neither something at a higher level nor a complex structure emergent from the interactions of simple elements. A whole is, as the authors write following Gibson, ‘what remains constant through the shifting of viewpoints’ (607). In other words, a whole is not an organic sum bigger than its components, but rather a remainder after multiplicities of uninterested others withdraw, which still preserve lots of differently interested wholes to unfold.
How generative AI refolds the web
In the case of generative AIs, such as Gemini’s ‘AI Overview’ of the search query integrated at the top of Google Search, which often impedes us from scrolling (or folding?) down to search results, the totality of entities that matter is, on the other hand, a web of digitized cultural assets, which is not connected through relevance in the way that the webs woven by crawlers are. Here, the web is woven by another kind of fold-weaver called parsers, which dissect vast amounts of text or cultural sedimentations created or inhabited by humans. The underlying semantics of our text cultures are these sedimentations, which postphenomenologists may argue are actually under constant rearrangement through what Merleau-Ponty (1962) calls ‘intentional arcs’ between humans’ writing hands and reading eyes. Writing, according to this embodiment theory, is the process of folding the words or things available around their bodies into each new text, to make novel grips between their embodied minds and the worlds – the inside and outside of the folds. On the other hand, parsing by generative AI is to map these semantics onto a vector space, where a set of vectors that characterizes each word or thing is, to simplify, merely indicative of the probabilities that it follows or is folded together with another word or thing in sampled texts.
In our Googling scenario, the search query, composed of several words, is now turned into a prompt and also mapped onto this vector space as a short combination of vectors across its constituent words. Then, a monad folded around this matter of concern embedded within the web is the Gemini page, where new text is generated word by word as the cursor moves. Algorithmic text generation in this Latourian monadology is, therefore, when the whole web is folded into and refracted along these embedded seed vectors that express the monad’s unique interest and, in turn, those entities interested (with higher probabilities) in its prompt lend themselves to the text in the making. An AI-generated text is each new perception of the web grasped by a monad’s unique point of view. And this web is capable of generating (at least spuriously) sense-making texts in response to any prompts, even to meaningless ones, insofar as the semantic structures of the cultural sedimentations can be parsed as a sort of stochastic manifold, concerning how every two words at each moment of writing are folded into an author’s perception of the first and choice of the second. But the meanings that humans communicate from one word to another are not its concern at all.
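The word-by-word generation described above can be illustrated, in a deliberately reduced form, with a bigram model that records only how often one word follows another in sampled texts. This is a sketch under strong assumptions: bigram counts stand in for the far richer vector spaces of actual language models, nothing here reflects how Gemini is implemented, and the names `train_bigrams`, `next_word_probabilities`, and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(texts):
    """Count, for every word, which words follow it in the sampled texts."""
    follows = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        for first, second in zip(words, words[1:]):
            follows[first][second] += 1
    return follows


def next_word_probabilities(follows, word):
    """Turn the raw follow counts for `word` into probabilities."""
    counts = follows.get(word)
    if not counts:
        return {}
    total = sum(counts.values())
    return {candidate: count / total for candidate, count in counts.items()}


def generate(follows, prompt, max_new_words=8, seed=0):
    """Extend the prompt one word at a time by sampling from follow probabilities."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        probabilities = next_word_probabilities(follows, words[-1])
        if not probabilities:
            break  # the last word was never followed by anything in the samples
        candidates = list(probabilities)
        weights = [probabilities[c] for c in candidates]
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)
```

Unlike the generative AIs discussed here, this toy halts whenever the last word never appeared in its samples. What it shares with them is that each next word is selected by probability alone, with no regard for what an author meant by it.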
Despite its displacement of semantics within the web of probabilities, this way of diagramming generative AIs does not reduce writings at human-computer intra-faces to Skinnerian verbal behaviors responding to pre-defined stimuli. They are rather pragmatistic machines, suggesting that every writing as word-folding is meaningful to the extent that it contributes to the ‘practical success’ of fulfilling its interest (Koopman, 2007, 718): namely, to keep the enacted texts always interesting and graspable to others, thus worth engaging. In this view, what large language models are trained on from every click and type humans input is a pragmatistic re-engineering of semantics, by which the truthfulness of inputs (e.g., words, hashtags, images) becomes ‘a name for the satisfaction of felt interests’ (Koopman, 2006, 111). Gibson makes clear that what we perceive are not things but their affordances – the world’s responses to our actions. For these pragmatist AIs, the meaning of content is likewise measured as its effectiveness in drawing desired responses from the world.
For instance, during training, the base model of Gemini constantly reworks its probabilistic web from the feedback of another algorithmic model, the preference model, which has been trained on actual responses from human annotators (Lambert et al., 2022). Throughout ongoing interactions with users, language models can also fine-tune each word’s responsiveness to another based on how satisfied or dissatisfied users are with the words selected and how they are folded into the responses to their queries. And it is needless to say that, on the post-truth Internet, the semantics of human authors are also reworked through similar reinforcement learning from other humans’ favorable responses to (or recommendation algorithms’ preferences of) their word choices.
Re-worlding the web for profit
If affordances are these pragmatic meanings that the world yields by folding itself around its pre-individuated or nonhuman matters of concern, the web’s new affordances that generative AIs promise even to unskilful users relate to its novel mechanism of folding. We examined how the embodiment paradigm of postphenomenology and Ingold can undo the Normanian instrumentalization of affordance in HCI, and how Latour’s ANT-based monadology suggests folding as the process in which a subject-world intra-face is formed within the web imagined to be a sort of plane of immanence. However, even in these discourses, the vectors that fold the world into multiple niches still depend on the tool users’ or skillful search users’ actions and their ability to fine-tune their methods of inquiry according to how the world responds accordingly. Conversely, the vectors that constantly refold the new web are those with varying weights between words in the base model of Gemini, or those between artificial neurons in deep reinforcement learning.
Embodying the creativity of artificial intelligence, these vectors can fold the web into multiple niches that respond to almost every inquiry we have, even to the actions we exercise with no serious intentions but just toying with it, signaling yet-unindividuated interests at our fingertips. While the Latourian thesis ‘the whole is always smaller than its parts’ suggests how inexhaustible the totality of matters of concern in the world is by any niches or taskscapes that living systems fold around their narrow concerns, the current software capitalism emphasizes that the new generative webs can offer unlimited affordances even to unskilful and unintentional user actions. PCG as game engines, in this light, demonstrates how a vector space formed around even very limited digital assets, such as those used in designing simple dungeons in Roguelike games, can be folded, along the seed vectors the gamer actions activate, into an unlimited number of playable digital landscapes. The affordances of these sandboxes are less a matter of what the gamer discovers through her gradual embodiment of how the world responds to her conscious goal-oriented actions than of the feeling that every action, whether she takes it consciously or nonconsciously, is always well responded to by the AI and contributes to its process of worlding.
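How a very limited set of assets can be folded into many playable landscapes along a seed may be sketched as follows. The sketch assumes a ‘drunkard’s walk’ carving routine, one common Roguelike technique chosen here purely for illustration; the function name `generate_dungeon`, the two tile symbols, and the idea of deriving the seed from a string of player actions are all hypothetical and are not taken from any particular game engine.

```python
import random

WALL, FLOOR = "#", "."  # the entire 'asset library' of this toy world


def generate_dungeon(seed, width=20, height=10, steps=80):
    """Carve a cave-like dungeon with a random walk driven by `seed`.

    The same seed always yields the same map, so any trace of player
    activity that can be turned into a seed selects one landscape.
    """
    rng = random.Random(seed)
    grid = [[WALL] * width for _ in range(height)]
    x, y = width // 2, height // 2
    grid[y][x] = FLOOR
    for _ in range(steps):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        # Clamp the walker inside the outer wall so the border stays solid.
        x = min(max(x + dx, 1), width - 2)
        y = min(max(y + dy, 1), height - 2)
        grid[y][x] = FLOOR
    return ["".join(row) for row in grid]


if __name__ == "__main__":
    # Hypothetical seed: a log of idle, unintentional inputs.
    for row in generate_dungeon("idle,left,idle,idle"):
        print(row)
```

Even an aimless input log works as a seed here, and the generator answers it with a walkable map.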
On the other hand, the software industry’s current interest in using AI to generate synthetic data for training other machine-learning models is based on a diagnosis that the ground truth of algorithmic culture has actually been biased due to poor datafication tools unable to interest and draw responses from minorities in gender, race, class, or in less-than-human status. These gaps within the ground truth are what AI-generated synthetic data are to complement. If more search and recommendation algorithms are trained on these synthetic data to represent previously underrepresented minority matters of concern, the web regenerated accordingly would be more like the totality in Latour’s monadology. To any minority groups, the web will unfold each new whole with the maximum graspability to fulfill their unique matters of concern, even if they remain silent to the AI’s request to participate in data collection and machine training.
These webs with multiplicities of yet-unacted intra-active niches are, however, not simply conjured by gamers’ or developers’ optimistic speculation about the magnanimous universe that responds to any human inquiries. Too good to be true! The feeling of abundant affordances is rather constructed as each new intra-active niche is indeed folded along the vectors that preempt yet-untaken human actions of grasping. So, if the ‘power’ that Compton said we gain through generative AI pertains to the new web’s unlimited affordances or its unlimited foldability-graspability, the ‘control’ we trade for this power is what we used to associate with our ability to make the world interested in our actions. We used to exploit the world’s affordances when we (like Latour’s search user) folded things around our interests, all by our tool-using hands. We may still believe we exploit the web’s affordances since we at least push the machines to get trained on the choices we have made within it. The pragmatic meanings of the web for these tool users are thus doubly articulated, firstly by the habitual and intentional tuning of their choices to how the web responds, then by machine learning’s extrapolation of these choices. For users of generative AI, whom I elsewhere (Ahn, 2023) compared to toy players instead of tool users, this structure of double articulation is redefined as the initial choices are now made by the AIs completing the vectors left unfinished by human actions, and more non-generative AIs are trained on AI-generated synthetic data instead of human inputs.
On the flip side of Compton’s celebration of our engagement in the AI’s creative power by lending our unskilful actions as the seed vectors that trigger AI’s completion, the diagram this paper develops critically re-examines the creative process of worlding, which the embodiment framework in the current new materialism often neutralizes as merely descriptive of the co-creation of the world from human-nonhuman entanglements. Worlding is the expression of the world’s pragmatist interests to make itself useful to itself, by folding many niches within itself. Our satisfaction with the procedurally generated worlds is relevant to the feeling that every tiny action we take is entangled with this worlding at the scale of the Internet. And what Idle Hands, alongside many toy-like generative AIs, celebrates is our decision not to be instrumentalists of the web, to the extent of celebrating this giving-up as the precondition of our being part of the world’s worlding. Meanwhile, what this celebration strategically conceals behind this new media ecosystem is the interest of software capitalism: mobilizing our playful engagements with AI-generated niches to maximize the web’s pragmatic values with more user-AI intra-faces.
Regarding this complicity between the ecological imagery of the web and the corporate interest in its yet unacted niches, the fold diagram developed here provides a critical framework to analyze the latest version of the digital ecosystems owned by tech giants, such as Google. Their evangelism of micro-entrepreneurship is currently upgraded with their AI tools, which expand the opportunity to start a small platform-based online business even to people with no coding or marketing skills, but just a willingness to toy with AI. The affordances of these new ecosystems are none other than their infinite intra-active re-worlding, always capable of unfolding each new niche where the user finds that her underdeveloped business interests are always already responded to by unknown demands and vice versa. My fold diagram can also illustrate this speculative process of the current platform capitalism.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
AI statement
The author acknowledges the use of Gemini (2.5 Flash) for copyediting this manuscript. The AI tool was primarily used for grammar and spell check. The author has reviewed and approved all changes and takes full responsibility for the content of this manuscript.
