Submission 94
Deterritorialized Ocularcentrism: AI-Generated Imagery's Haptic Potential
Poster-09
Presented by: Mingyang Li
In February 2024, the American AI research company OpenAI released Sora, its first text-to-video model, sending immediate shockwaves through the traditional film and television industries. Industry insiders feared that the technology could render conventional image-making techniques obsolete—little more than the proverbial "dragon-slaying skills," expertise with no remaining object, in a world of generative imagery. Ordinary audiences, however, greeted the tool not with anxiety but with Dionysian exuberance: cheap, frictionless access to image production has democratized visual creation. Behind this euphoria lies a deeper epistemic shift: AI-generated imagery deterritorializes the ocularcentric regime that has dominated moving-image art since the inception of the medium.
In the classical cinematic apparatus, a fixed viewing distance constructs an ontological gap between spectator and image; the audience is positioned as a passive, silent subject, consuming pre-programmed optical flows from afar. The apparatus thereby enacts the ideological capture that Laura Mulvey termed "to-be-looked-at-ness." Recognizing this asymmetry, scholars have turned to Merleau-Pontian phenomenology to restore corporeal agency. Vivian Sobchack's "film's body" reconceives the cinematic apparatus as an embodied interlocutor that both sees and is seen, while Laura U. Marks's "haptic visuality" argues that certain images solicit a proximal, almost tactile gaze that scans the surface rather than plunging into depth. Extending the paradigm to digital media, Mark Hansen demonstrates how virtual-reality environments enlist the viewer's body schema, eliciting involuntary motor responses that blur the boundary between kinesthesis and touch. Yet these accounts remain phantasmatic: the "touch" they describe is either metaphorical or vision-mediated, and the darkened cinema or the VR headset still subordinates the body to a luminous projection. Crucially, they treat touch as a unilateral sensation—something the recipient "has"—and overlook the bilateral metamorphosis produced by genuine contact, in which both image and observer are altered by the forces exchanged.
AI image-generation technology offers a material escape route from this impasse. First, the user operates in an illuminated, unconfined space; no headset or black box encloses the body. Second, the act of prompting, refining, and re-generating images activates real-time changes in proprioception, micro-gesture, and retinal strain—measurable somatic feedback that exceeds purely visual address. Most importantly, the image itself is subjected to force: latent vectors are redirected, pixels are re-weighted, and attention maps are recalibrated under the affective pressure of the user's prompt. Image and observer thus enter a reciprocal modulation that satisfies the Deleuzian criterion of "becoming" rather than mere "viewing."
To theorize this emergent tactile economy, we must move beyond the residual visualism of Merleau-Ponty and enlist a conceptual lens that privileges mobility, multiplicity, and surface-to-surface contact. Gilles Deleuze’s figure of the nomad provides such a lens. AI-generated imagery functions as a weapon of nomadic warfare: it converts sedentary spectators into data-navigating nomads, deterritorializing the striated, distance-based space of traditional projection into a smooth, data-intensive space centered on haptic relay. Within this smooth space, vision is no longer the sovereign sense; it is one vector among others in a field of tactile, affective, and algorithmic forces. The result is not the representation of touch but its actualization—an encounter in which both image and body are rewritten by the event of contact.