Tuesday, July 12 • 10:31 - 12:00
Embedded Metadata and the Circulation of Images: Tracking, Storing and Stripping

Location: PSH (Professor Stuart Hall Building) - LG01, Goldsmiths, University of London, Building 2
Contributor: Nathalie Casemajor, Université du Québec en Outaouais, Canada


Metadata is a set of descriptive, technical and administrative information that plays a key role in image storage, processing and circulation. Recent literature in media studies highlights the roles metadata plays in domains such as digital economies and informational infrastructures. Many of these studies focus on the music sector: in particular, they tackle the automated or manual classification of songs, as well as the algorithmic systems of recommendation (Beer, 2013; Morris, 2012, 2015). In the field of visual culture, various publications emphasize the role that metadata plays in the classification of images by amateur and professional photographers (Van Dijck, 2010; Boullier and Crépel, 2013). But few broach the distinction between embedded metadata and platform-specific metadata. The former refers to data directly stored within the image file, which allows the data to travel with the picture on its journey across platforms, whereas the latter refers to data separately stored on proprietary web servers, including keywords, geotags and other folksonomies, which are lost when the picture is copied from one platform to another.
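
To make the notion of embedded metadata concrete, the following Python sketch lists which common metadata containers are present in the header of a JPEG file. It is an illustration only and not part of the study: it assumes a well-formed JPEG, looks only for the standard EXIF, XMP and Photoshop/IPTC segment signatures, and ignores other carriers (and other image formats). The helper `_segment` and the byte strings in the demo are hypothetical stand-ins for real files.

```python
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def embedded_metadata_blocks(jpeg_bytes):
    """Return labels of the metadata segments found before the image data.

    Simplified sketch: assumes every header segment carries a 2-byte length
    field and stops at the start-of-scan marker.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    found = []
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # not at a marker; give up rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data follows
            break
        # The length field is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.append("EXIF")
        elif marker == 0xE1 and payload.startswith(XMP_HEADER):
            found.append("XMP")
        elif marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
            found.append("IPTC")
        pos += 2 + length
    return found


def _segment(marker, payload):
    """Hypothetical helper: build one JPEG header segment for the demo."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic headers (no real image data): one tagged file, one stripped copy.
tagged = (b"\xff\xd8"
          + _segment(0xE1, b"Exif\x00\x00" + b"placeholder")
          + _segment(0xE1, XMP_HEADER + b"<x/>")
          + _segment(0xED, b"Photoshop 3.0\x00" + b"placeholder")
          + b"\xff\xda\x00\x02")
stripped = b"\xff\xd8" + _segment(0xE0, b"JFIF\x00") + b"\xff\xda\x00\x02"

print(embedded_metadata_blocks(tagged))
print(embedded_metadata_blocks(stripped))
```

Platform-specific metadata has no counterpart in this sketch: because it is stored on the platform's servers rather than in the file, nothing in the bytes of a downloaded copy reveals it.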


This paper focuses on image metadata (in particular, photographic images) to illustrate how web platforms handle images, and how these technical choices are tied to different economic models of content and audience retention. Its aim is to challenge the assumption that the more an image circulates and is appropriated on social media, the more metadata it subsequently accumulates. This paper suggests instead that there is a critical distinction between the way embedded and platform-specific metadata accumulate.


This study is based on a set of experiments conducted on a small corpus of photographs. In collaboration with a Canadian visual artist, five images of artworks were marked with embedded metadata and steganographic information before being posted on three different platforms (WordPress, Facebook and Instagram). Six months later, all the copies in circulation were collected through Google reverse image search and TinEye, and their metadata was extracted with ExifTool. A quantitative and qualitative analysis was conducted on the metadata to compare 1) how the transit through each platform affected the embedded information and 2) what kind of platform-specific metadata was attached to these images. A complementary analysis was conducted on Flickr and Twitter with a random set of images.
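
The comparison step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: it assumes ExifTool is installed and on the PATH (its `-j` option prints the tags as JSON), and the function names `read_embedded_metadata` and `compare_metadata`, as well as the tag values in the demo, are invented for the example.

```python
import json
import subprocess


def read_embedded_metadata(path):
    """Read a file's tags via ExifTool's JSON output.

    Assumption: the `exiftool` executable is available on the PATH.
    `-j` yields a JSON array with one object per input file.
    """
    result = subprocess.run(
        ["exiftool", "-j", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)[0]


def compare_metadata(original, copy, ignore=("SourceFile",)):
    """Classify tags by what happened between the original and a copy."""
    keys_original = set(original) - set(ignore)
    keys_copy = set(copy) - set(ignore)
    shared = keys_original & keys_copy
    return {
        "retained": sorted(k for k in shared if original[k] == copy[k]),
        "altered": sorted(k for k in shared if original[k] != copy[k]),
        "stripped": sorted(keys_original - keys_copy),
        "added": sorted(keys_copy - keys_original),
    }


# Made-up tag sets for illustration; these are not the study's data.
original_tags = {"SourceFile": "original.jpg", "Artist": "A. Artist",
                 "Copyright": "(c) A. Artist", "ImageWidth": 2000}
copy_tags = {"SourceFile": "copy.jpg", "Artist": "A. Artist",
             "ImageWidth": 960}

report = compare_metadata(original_tags, copy_tags)
for category, tags in report.items():
    print(category, tags)
```

With real files, the two dictionaries would come from `read_embedded_metadata("original.jpg")` and the same call on a collected copy (both file names are placeholders). ExifTool also reports file-system properties alongside embedded tags, so the `ignore` list would likely need extending before any counts are interpreted.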


The preliminary results suggest that, unlike platform-specific metadata, which accumulates stably on web servers, embedded metadata is shaped by a complex dynamic of accumulation and degradation. On the one hand, social media platforms tend to strip embedded metadata out of their users’ images (this is especially the case with platforms designed for non-professional image-sharing practices, such as Facebook); on the other hand, they encourage users to recreate this data in a proprietary format tied to the platform. Therefore, the more an image circulates beyond the thresholds of proprietary platforms, the more its metadata becomes degraded. The images that cross boundaries between platforms and travel through various social media datascapes are the most portable (see Sterne, 2006), but the quality of their metadata is poorer. In terms of audience retention, metadata stripping increases content captivity: it makes it more difficult for users to move their archives from one platform to another, since metadata (re)creation is a time-consuming operation.

Future Work:

This paper argues that paying close attention to the specificities of image metadata lays the groundwork for an understanding of the broader ecology of social media. Further work on larger datasets could foster insights into the power and economic dynamics of data streams within and across social media (Manovich, 2012; Hochman, 2014), while also reflecting on the politics of web platforms (Gillespie, 2010; Helmond, 2015).


