Abstract
Graffiti art is a controversial art form, and as such there has been little empirical work assessing its aesthetic value. A recent study examined image statistical properties of text-based artwork and revealed that images of text contain less global structure relative to fine detail than artworks do. However, previous research did not include graffiti tags or murals, which reside in the space between text and visual art. The current study investigated the image statistical properties and attractiveness of graffiti relative to other text-based and pictorial art forms, focusing additionally on the role of expertise. A series of images (N = 140; graffiti, text and paintings) was presented to a group of observers with varying degrees of art interest and expertise (N = 169). Findings revealed that image statistics predicted attractiveness ratings of images, and that biases against graffiti art are less salient in an expert sample.
Introduction
Graffiti, derived from the Italian word ‘to scribble’, has its roots in ancient forms of communication and expression, having been found in ancient Greece, Rome, and Egypt, and most notably at the site of Pompeii (Baird & Taylor, 2010). While graffiti has been the target of a substantial amount of sociological analysis (Snyder, 2011), very little attention has been paid to the formal and aesthetic properties of graffiti, in contrast to calligraphic art, in which the role of visual elements such as balance, prototypicality and low-level image statistics has previously been investigated (Fillinger & Hübner, 2018; Gershoni & Hochstein, 2011; Melmer et al., 2013). This is somewhat surprising, as there is strong emphasis in modern graffiti on aspects of visual style (particularly the dynamics and flow of the movements underlying the creation of a graffiti artwork) and how these underpin composition and the appearance of expertise (Arte, 2015; Schacter, 2016). The lack of focus on graffiti as an aesthetic object may be due in part to the sociological context within which graffiti is placed. Graffiti is created not in the white cube of the gallery, but on city streets, often deliberately placed in locations which are difficult or dangerous to reach. The creation of graffiti is considered an act of criminal damage in the UK, and graffiti tags and murals may be taken down as quickly as they are thrown up, making the stimulus a transient one. Nevertheless, there is increasing public acceptance of and interest in graffiti and street art, with pieces by street artists such as Banksy fetching hundreds of thousands of pounds at auction, a $6.7 million lawsuit involving destroyed graffiti murals in New York, and graffiti-like designs frequently co-opted by fashion brands.
In addition to its sociological purpose, graffiti art also has clear aesthetic aims. In an exploration of the development of graffiti style, Arte (2015) states that graffiti writers are evaluated in terms of the uniqueness, recognizability and aesthetic quality of their work. Graffiti writers play with text in the same way as calligraphy artists, in which letters are ‘modified, extended, looped, shortened, thickened … while correct orthography is frequently violated for the sake of composition’ (Grabar, 1992, p. 106). Text is sampled from and edited, bringing to mind the compositional methods used in the creation of hip-hop music, which developed over the same time period and in the same geographical location as modern graffiti (Arte, 2015). Both graffiti and calligraphy writers select, amend and adorn text keeping in mind ‘the crucial elements of unity, proportion, scale, contrast, balance and rhythm’ (Schacter, 2016, p. 144). Here, the traditional tenets of empirical aesthetics, encapsulated in the theories of Daniel Berlyne (1974) and Rudolf Arnheim (1965), are considered as important in the generation of graffiti art as in any other artistic medium. It follows then that in order to fully characterise graffiti as an art form it is necessary to consider both sociological and perceptual accounts of its generation and reception.
In the current study we focus on two categories of graffiti art: tags and murals. Graffiti tags are usually based on letters of the alphabet and are considered the signature of a graffiti writer. Graffiti murals, on the other hand, are more complex artworks, often incorporating pictorial elements and three-dimensional rendering of text forms. Through the process of editing and abstracting previously mentioned, each graffiti tag becomes a highly stylized and individualised visual form. An essential characteristic of graffiti, and the tag in particular, is the mastering of very rapid and fluid drawing movements (Berio et al., 2017; Berio & Leymarie, 2015; Wacławek, 2011). This in turn influences the tag’s aesthetics and style, also referred to as the hand-style. The role of the visual properties of a tag or mural in determining preference for it is yet to be tested empirically. In other work, we are exploring the role of fluency of movements in preference for graffiti tags. In the current study we focus on visual features such as complexity and self-similarity, to integrate research on the aesthetics of graffiti with more traditional pictorial and text-based art forms.

Figure 1. Mean Attractiveness Ratings for Image Categories. Error bars represent ± 1 standard error of the mean. GrafMur = Graffiti Mural; GrafTag = Graffiti Tag; LetCal = Calligraphy; LetIni = Initium; LetOrn = Ornate letter; PaiAbs = Abstract painting; PaiRep = Representational painting.
The aim of the current study was to explore the image statistical properties of graffiti tags and murals in relation to other text-based artwork and visual artwork, and to link differences in image statistics with observers’ attractiveness ratings. Image statistics were explored using Pyramid Histograms of Orientation Gradients (PHOG), a shape descriptor which characterises an image by its local shape and spatial layout at different spatial scales. From PHOG we were able to produce measures of self-similarity, complexity and anisotropy for each image. Previous research on image statistical properties of artworks is suggestive of ways in which graffiti may be more or less visually pleasing to observers. Melmer et al. (2013) conducted a study in which a large sample of images was submitted for image statistical analysis, including text-based artworks (ornamental writing and calligraphy) and regular text (that which is used for writing word documents such as the one you are reading here). The authors analysed the Fourier power spectrum (to detect whether it displays the scale-invariant properties possessed by artworks and natural scenes) and derived a measure of anisotropy (how distributed the orientations of the contrast gradients in the image are). They discovered that images of regular text and handwriting possessed less global image structure in comparison with artworks, reflecting a lower degree of compositional structure. However, with a higher level of artistic claim, text-based images (like ornamental text and calligraphy) display image statistical properties that more closely resemble those of visual artworks. As graffiti and calligraphy share a common aetiology, and use many of the same techniques for amending and adorning letter-forms, it is likely that their image statistical properties will be similar.
Due to the sociological constraints of graffiti as an artistic medium previously discussed, it is possible that an aesthetic bias exists against graffiti art (in comparison to more socially acceptable forms of text-based art such as calligraphy). This bias may override lower-level determinants of attractiveness ratings, such as image statistical properties. It is the additional aim of the current study to explore this potential graffiti art bias, and to explore whether expertise modulates observer responses to graffiti art. Previous research suggests that biases against computer-generated art are not overridden by expertise (Chamberlain et al., 2018), and therefore it is unclear whether expertise will override a graffiti bias. However, expertise effects on preference for abstract/representational art are well-known and appear to be reliable (e.g. van Paasschen et al., 2015) and therefore are anticipated in the current study. In summary, the current study takes into account low-level (image statistical) and high-level (expertise and the social status of graffiti as a criminal act) factors and assesses how they shape observer responses to graffiti art.
Method
Stimuli
Stimuli were digitized images of graffiti tags (n = 20), graffiti murals (n = 20), initiums (n = 20), ornate text (n = 20), calligraphy (n = 20), abstract paintings (n = 20), and representational paintings (n = 20). Calligraphy and initiums were chosen as examples of text-based artwork against which we could compare attractiveness ratings of graffiti, while ornate text was selected as a baseline example of aesthetic text which did not contain the global compositional structures of the initiums or the calligraphy. The graffiti images, calligraphy and initiums were collected from online sources by DB, CM and RC, whilst the ornate text and painting categories were selected from a larger set used in previous studies on image statistics of text-based and fine art (Melmer et al., 2013; Redies et al., 2007).
Participants
A total of 169 participants completed the study (77 female; mean age = 33.75 years, SD = 13.14). In a demographic questionnaire participants were asked, ‘Do you consider yourself to be a practicing artist or designer?’. If participants responded ‘yes’ to this question they were asked to indicate their artistic specialism from a pre-specified list, including an ‘other’ category. In addition, participants were asked about their interest in art (on a 7-point Likert scale), how often they visited a gallery (Never; Once; 2–3 times per year; Often), and how many art books they owned (None; 1–5; 5–10; 10+). Twenty-nine participants reported that they were practicing artists with experience ranging from 2 to 35 years, in a range of disciplines, most commonly drawing, fine art, graffiti, photography, and film. This group constituted our expert group in subsequent analysis. Participants reported a wide range of nationalities (Asian = 82; European = 75; Australian = 1; Northern American = 9; South American = 3).
Procedure
The study was conducted online, through a specifically designed platform used by the GestaltReVision research group at KU Leuven. Participants were recruited through mailing lists and social media platforms (e.g. Facebook, Twitter) with a particular focus on recruitment from groups with specialism in the arts. Participation was voluntary and participants were not compensated for their time. Participants were informed about the ethical guidelines of the study prior to providing informed consent. Participants were informed that they would view a series of artworks onscreen and would be asked to rate how visually pleasing they found each artwork on a Likert scale (1 = unattractive; 7 = very attractive). They were provided with a set of example images with one image from each category (graffiti tags, graffiti murals, ornate letters, calligraphy, initiums, abstract paintings, representational paintings) to obtain an overview of the stimulus set before viewing each individual stimulus. Trials (N = 140) were then fully randomized within and between image categories. Once all 140 images had been rated, participants were presented with an open-field response question: “Please indicate in the space below if you feel you used any visual criteria in rating the images”.
Ethics
The study was approved by the Ethical Committee of the Laboratory of Experimental Psychology, KU Leuven.
Results
Aesthetics of Image Types
Stimuli were split into categories (graffiti tags, graffiti murals, ornate letters, calligraphy, initiums, abstract paintings, representational paintings) for analysis. Table 1 and Figure 1 show the descriptive statistics of attractiveness ratings for each stimulus type.
Table 1. Descriptive Statistics for Attractiveness Ratings of Image Categories for All Participants, and for Expert and Non-Expert Subgroups.

Figure 2. Attractiveness Ratings for Image Categories by Expert Group. Error bars represent ± 1 standard error of the mean. GrafMur = Graffiti Mural; GrafTag = Graffiti Tag; LetCal = Calligraphy; LetIni = Initium; LetOrn = Ornate letter; PaiAbs = Abstract painting; PaiRep = Representational painting.
When inspecting attractiveness ratings for the image types, a complex pattern emerged. Participants preferred representational paintings to abstract paintings in the painting group, and abstract paintings received similar mean attractiveness ratings to initiums and calligraphy. Ornate text, on the other hand, appeared to be least preferred compared to all other stimuli, including both graffiti forms. Statistical analysis confirmed that there were significant differences in attractiveness ratings given to each of the narrow image categories, F(6, 990) = 65.25, p < .001, ηp² = 0.32. Planned post-hoc t-tests (Supplemental Table 1) with correction for multiple comparisons (α = 0.05/21 ≈ 0.002) revealed that there were significant differences between all stimulus categories, except between graffiti murals and graffiti tags, between calligraphy and abstract paintings, and between initiums and abstract paintings (Figure 2).
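The corrected threshold follows from the number of pairwise comparisons among the seven image categories. A minimal sketch of that arithmetic, assuming a plain Bonferroni correction (the function name is illustrative and not taken from the study's analysis code):

```python
from math import comb


def bonferroni_threshold(n_categories: int, alpha: float = 0.05) -> tuple[int, float]:
    """Return (number of pairwise comparisons, per-comparison alpha)."""
    n_pairs = comb(n_categories, 2)  # unordered pairs of categories
    return n_pairs, alpha / n_pairs


n_pairs, threshold = bonferroni_threshold(7)
print(n_pairs, round(threshold, 4))  # 21 0.0024
```

To three decimal places this is the 0.002 threshold quoted in the text.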
Attractiveness Ratings of Image Types by Expertise
Mean attractiveness ratings for each image category were derived separately for the expert (practicing artist) and non-expert participant groups (Table 1).
On inspection of Figure 2 it can be seen that the expert group shows little difference in its ratings of abstract and representational paintings, whereas the non-expert group shows a preference for representational work. In addition, experts show a stronger preference for calligraphy, graffiti tags and graffiti murals, compared to non-experts. A 2 × 7 (group by image type) ANOVA revealed a significant main effect of image category, F(6, 984) = 66.32, p < .001, ηp² = 0.29, no significant main effect of group, F(1, 164) = 1.15, p = 0.29, ηp² = 0.01, and a significant interaction between group and image category, F(6, 984) = 3.71, p < 0.01, ηp² = 0.02. Simple effects analysis of comparisons by group for each level of image type revealed that there were numerical differences between experts and non-experts in their ratings for graffiti tags and calligraphy (experts provided higher attractiveness ratings for these image categories), but these did not survive correction for multiple comparisons (Table 2).

Figure 3. Image Statistics (Anisotropy, Complexity and Self-Similarity) Across Image Categories. Error bars represent ± 1 standard error of the mean. GrafMur = Graffiti Mural; GrafTag = Graffiti Tag; LetCal = Calligraphy; LetIni = Initium; LetOrn = Ornate letter; PaiAbs = Abstract painting; PaiRep = Representational painting.
Table 2. Simple Effects Analysis of Group Differences for Each Image Category.
Notes: *p < .05 (uncorrected p-value).
Table 3. Correlation Between Art Interest and Engagement and Attractiveness Ratings for Each Image Category.
Notes: *p < .05 (uncorrected p-value); **p < .002 (Bonferroni-corrected p-value).
Table 4. Image Statistics (Anisotropy, Complexity and Self-Similarity) Across Image Categories.
Table 5. Correlations Between Attractiveness Ratings and Image Statistics for the Image Categories.
Notes: *p < .05 (uncorrected p-value).
Correlation of Attractiveness Ratings With Art Interest and Engagement
When looking at the attractiveness ratings in relation to the art engagement of the sample as a whole, similar patterns emerge to the analysis of expertise differences (Table 3). After correcting for multiple comparisons there was a significant correlation between the extent to which participants reported being interested in art and their attractiveness ratings across stimuli, as well as specifically for graffiti tags, calligraphy and abstract art. There was a less robust link between ownership of art books and visits to art galleries and attractiveness ratings. The only correlations surviving correction in these instances were for ratings of abstract paintings.
Image Statistics
Image statistics were derived using PHOG to produce measures of self-similarity, complexity and anisotropy. These image statistical measures have previously been used to study the aesthetic properties of artworks (Redies et al., 2012). To calculate PHOG, HOG image features are calculated at multiple levels, representing increasingly fine levels of detail. First, HOG features are calculated at the global level (the whole image; Level 0), then at 4 equally-sized subcomponents of the image (Level 1), then at 16 subcomponents of the image (Level 2), obtained by dividing each component of Level 1 into 4 equally-sized subcomponents. This procedure of division and HOG calculation can be executed as many times as the resolution of the image permits. Individual gradient orientations are binned and the normalized value of each bin represents the strength of gradient orientation in that direction. From the PHOG, three classes of image statistic can be derived:
Self-similarity: The HOG features of the whole image (Level 0) are compared with the HOG features of each sub-image at Level 3. A high degree of similarity between the two indicates higher self-similarity in the image.
Complexity: The mean norm of the gradient across all orientations is calculated. Image gradients represent changes in lightness in an image. Therefore, the higher the mean gradient is, the more complex the image is.
Anisotropy: The variance of gradient strength in the HOG features is calculated across all bin entries at Level 3. A higher value represents higher anisotropy (a larger variation in gradient strength across orientations in the image).
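The three measures described above can be illustrated with a short NumPy sketch. This is not the implementation used in the study: the number of orientation bins (16), the use of histogram intersection as the similarity measure, the median across Level 3 sub-images, and the treatment of flat cells are all assumptions made for illustration, and all function names are hypothetical.

```python
import numpy as np

N_BINS = 16  # assumption: number of orientation bins over the full circle


def gradients(img):
    """Gradient magnitude and orientation (0..2*pi) of a grayscale image."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy), np.mod(np.arctan2(gy, gx), 2 * np.pi)


def hog(mag, ori):
    """Orientation histogram weighted by gradient magnitude, normalized to sum to 1."""
    hist, _ = np.histogram(ori, bins=N_BINS, range=(0.0, 2 * np.pi), weights=mag)
    total = hist.sum()
    # Assumption: a cell with no gradient at all gets a flat histogram.
    return hist / total if total > 0 else np.full(N_BINS, 1.0 / N_BINS)


def level_hogs(mag, ori, level):
    """HOG features for the 2**level x 2**level sub-images of one pyramid level."""
    n = 2 ** level
    rows = np.array_split(np.arange(mag.shape[0]), n)
    cols = np.array_split(np.arange(mag.shape[1]), n)
    return np.array([hog(mag[np.ix_(r, c)], ori[np.ix_(r, c)])
                     for r in rows for c in cols])


def phog_stats(img, level=3):
    """Self-similarity, complexity and anisotropy of one grayscale image."""
    mag, ori = gradients(img)
    whole = hog(mag, ori)                 # Level 0: the whole image
    cells = level_hogs(mag, ori, level)   # Level 3: 64 sub-images
    # Histogram intersection of each sub-image with the whole image (assumed metric).
    similarity = np.minimum(cells, whole).sum(axis=1)
    return {
        "self_similarity": float(np.median(similarity)),
        "complexity": float(mag.mean()),   # mean gradient magnitude
        "anisotropy": float(cells.var()),  # variance over all Level 3 bin entries
    }


# Usage with a synthetic image; a real stimulus would first be converted to grayscale.
rng = np.random.default_rng(0)
print(phog_stats(rng.random((64, 64))))
```

Under these assumptions, an image whose gradients are concentrated in a few orientations (for example, parallel stripes) yields a higher anisotropy value than an image of random noise.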
Role of Image Statistics in Attractiveness Ratings
PHOG derived image statistics (self-similarity, complexity, anisotropy) were analysed across image categories (see Table 4).
From Figure 3 it can be seen that there were differences in image statistic values across the image categories. The two graffiti image categories (tags and murals) and the two painting image categories (abstract and representational) appeared to have broadly similar image statistical properties. On the other hand, the text-based images (ornate text, initiums and calligraphy) showed differences across the different image statistical measures.
Self-Similarity
There was a significant difference in self-similarity across the image types, F(6, 133) = 32.08, p < .001, ηp² = 0.59. Planned post-hoc t-tests (Supplemental Table 2) broadly reveal differences between the text-based images (with graffiti murals and initiums having higher self-similarity than graffiti tags and calligraphy) but similar values for ornate text and paintings, which possessed higher self-similarity than the other text-based artworks (Figure 3; top left).
Complexity
There were significant differences in complexity across the image types, F(6, 133) = 18.82, p < .001, ηp² = 0.46. Planned post-hoc t-tests (Supplemental Table 3) revealed that the graffiti murals and tags, calligraphy, and abstract and representational paintings had significantly lower complexity in comparison with initiums. Calligraphy and representational paintings also had significantly lower complexity than ornate text images. All other comparisons were not significant after correction (Figure 3; top right).
Anisotropy
There were significant differences in anisotropy across the image types, F (6,133) = 29.42, p < .001, ηp2 = 0.57. Post-hoc comparisons (Supplemental Table 4) revealed that paintings (abstract and representational) were significantly less anisotropic (i.e. they had lower variation in gradient orientations) than all other image types except graffiti tags and murals. Furthermore, graffiti tags and murals had significantly less anisotropy than calligraphy (Figure 3; bottom).
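As a consistency check, the effect sizes reported for the three image-statistic ANOVAs can be recovered from the F values and degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below assumes that identity applies to these one-way analyses; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
reported = {
    "self-similarity": (32.08, 6, 133),
    "complexity": (18.82, 6, 133),
    "anisotropy": (29.42, 6, 133),
}
for name, (f_value, df1, df2) in reported.items():
    print(name, round(partial_eta_squared(f_value, df1, df2), 2))
# self-similarity 0.59
# complexity 0.46
# anisotropy 0.57
```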
Correlations Between Image Statistics and Attractiveness Ratings
Correlations were performed between the image statistics for each image type and participants’ attractiveness ratings (see Table 5). It can be seen that the patterns of correlations vary by stimulus type. The strongest correlations were between preference and degree of self-similarity and complexity in the calligraphy images (participants favoured less complexity and less self-similarity), and between preference and complexity in the abstract paintings (participants favoured more complexity); however, no correlations reached significance after correction for multiple comparisons. Strikingly, correlations between attractiveness ratings and image statistics were very different for graffiti stimuli and calligraphy images, with preference for graffiti images showing the opposite direction of effect for all three image statistic types.
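A per-category analysis of this kind can be sketched as follows. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here; the function names and the toy values are hypothetical and are not data from the study.

```python
import numpy as np


def pearson_r(x, y) -> float:
    """Pearson correlation between two equal-length sequences."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])


def correlate_by_category(image_stat: dict, mean_rating: dict) -> dict:
    """Correlate one image statistic with mean attractiveness ratings, per category.

    Both arguments map a category name to one value per image in that category.
    """
    return {cat: pearson_r(image_stat[cat], mean_rating[cat]) for cat in image_stat}


# Toy example with made-up values for two categories of four images each.
complexity = {"GrafTag": [1.0, 2.0, 3.0, 4.0], "LetCal": [1.0, 2.0, 3.0, 4.0]}
ratings = {"GrafTag": [2.1, 2.9, 3.8, 4.2], "LetCal": [4.5, 3.9, 3.1, 2.2]}
print(correlate_by_category(complexity, ratings))
```

With only 20 images per category in the actual stimulus set, each such coefficient rests on few observations, which is why the corrected threshold is hard to reach.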
Analysis of Free Response Data
The free-responses (n = 89) provided at the end of the study were qualitatively analysed. Key phrases and words were extracted from the free responses and then placed into themes and subthemes. The overarching themes that observers used to inform their categorisation decisions were: colour, content, production, and structure. The number of times each keyword was used per participant was calculated and the data are presented in Supplemental Table 5. The most common justification for attractiveness ratings was use of colour, particularly those stimuli that looked colourful, or used saturated colour palettes. Participants also expressed a preference for realism, or the presence of identifiable objects. Participants found unattractive those images that contained text and monochromatic images. Several participants made specific reference to their dislike for graffiti art. Interestingly, participants made a number of references to identifiable means of production (via brushstrokes and dripping paint) and to skill or effort, the latter of which they equated less to graffiti art.
Discussion
The current study aimed to characterize attractiveness ratings of graffiti art, taking into account both low-level (image statistics) and high-level factors (expertise and social status of graffiti as a criminal act). The findings revealed an apparent bias against graffiti art, with ratings lower than other text-based art (initiums and calligraphy) and artworks (abstract and representational paintings), although graffiti art was preferred to images of ornate text. However, individual differences were revealed based on interest in art, with participants who were more interested in art showing a stronger preference for graffiti tags compared with participants with less interest in art, but also showing a stronger preference for calligraphy and abstract art.
Differences in image statistical properties were found between the image categories. Broadly speaking, the images from the painting categories possessed similar properties, with relatively high self-similarity, and low complexity and anisotropy, which is in line with previous research (Braun et al., 2013; Redies et al., 2007, 2012). The images from the graffiti categories (murals and tags) also possessed similar image statistical properties, lying in the middle of the range of values of the whole image set for self-similarity, complexity and anisotropy. However, the text-based image categories behaved quite differently to one another, with calligraphy in particular showing relatively low levels of complexity and high anisotropy (high distribution of oriented gradients), in line with the findings of Melmer et al. (2013). Interestingly, despite historical associations between the development and style of calligraphy and graffiti, the two types of images had very different image statistical properties. Graffiti images (tags and murals) were significantly more complex and less anisotropic than calligraphy.
Finally, correlations between image statistics and preference for individual images revealed differing patterns of correlations dependent on image type. While somewhat limited by low statistical power (only 20 stimuli in each category), results suggest that image statistics drive preference for text-based stimuli differently to graffiti stimuli. Text-based images that were liked the most were low in self-similarity and complexity, and high in anisotropy. By contrast, graffiti images that were preferred were high in self-similarity and complexity, and low in anisotropy, a pattern which is partially reflected in the correlations with painted artworks. For example, a negative correlation between anisotropy and preference was found both for graffiti and artworks, but is reversed in the case of text. This suggests that artworks and graffiti with clear orientations (possibly vertical and horizontal elements) are rated as more attractive, whereas text with more distributed orientations is preferred.
To summarise, the results demonstrate an aesthetic bias against graffiti, yet image statistical properties suggest that graffiti has low-level properties more similar to those of non-text-based artworks than to those of its text-based counterparts such as calligraphy and initiums. In this way, graffiti art has similar image features to conventional artworks, and these features may drive preference in the same way as in these artworks, but attractiveness ratings are likely hampered by the cultural and social associations of graffiti art. However, this result should be interpreted with caution because the aforementioned small number of image stimuli per category leads to somewhat unreliable statistical effects. Melmer et al. (2013) suggest that, ‘with increasing artistic claim, images of text acquire specific statistical properties that are similar to those of visual art’ (p. 13). If we interpret ‘increasing artistic claim’ as those images that stimulate a more positive aesthetic response, the attractiveness ratings of our observers do not support this claim. The graffiti category of images was the least liked by participants (aside from ornate text), and yet these stimuli possessed statistical properties that more closely resembled those of painted artworks than those of text-based calligraphy and initiums.
Subtle effects of expertise were found in the current study. We numerically replicated the effect of expertise on preference for representational and abstract artworks (van Paasschen et al., 2015). Artists showed no difference between abstract and representational artworks, whereas non-artists showed a preference for representational artworks. We also found that artists gave higher ratings than non-artists to both calligraphic works and graffiti tags. Thus, there is some evidence that expertise overrides a bias against graffiti art, but this effect is modest in size, likely due to the heterogeneity of the artist sample. Future research specifically measuring the aesthetic experience of graffiti writers would be worthwhile to further explore expertise effects. The current research also does not consider the context in which these different artworks are made. A previous study showed interactions between expertise and preference for graffiti art and modern art, based on whether they appeared in congruent or incongruent contexts (Gartus et al., 2015). The effect of context on appreciation of formal properties of graffiti would be a worthwhile avenue of exploration.
The current study takes an initial step to investigate the formal qualities that may drive appreciation for particular works of graffiti art, by using image statistical analysis. However, it would be valuable in future work to explore the mid-level visual properties discussed in the introduction, namely qualities like balance and rhythm. Semantic differential scales may also prove useful in investigating the semantic associations of different letter forms. In recent research exploring the evolution of particular styles of graffiti (Arte, 2015), more curved graffiti styles such as bubble letters are described as ‘groovy, organic, interlaced or funky’ (p. 78) while wild-style forms are characterized by being ‘machine-like and mechanical’ (p. 82). It is unclear how these visual elements (particularly in the case of more complex mural designs) have an impact on the aesthetic evaluation of these stimuli. Moreover, it would be of great interest to understand how graffiti writers transform letter-based stimuli into images. Is there a characterizable process of transformation in the development of a graffiti tag or mural that preserves certain aspects of the original letterform while forgoing others? Answering questions about the formal properties of graffiti and how they interact with aesthetic impressions could have wider ramifications for our understanding and the development of all forms of text-based design.
Supplemental Material
Supplemental material, sj-pdf-1-art-10.1177_0276237420951415 for Aesthetics of Graffiti: Comparison to Text-Based and Pictorial Artforms by Rebecca Chamberlain, Caitlin Mullin, Daniel Berio, Frederic Fol Leymarie and Johan Wagemans in Empirical Studies of the Arts
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Fonds Wetenschappelijk Onderzoek.
Supplemental material
Supplemental material for this article is available online.
