After writing up Art as Data as Art, I kept digging into the Tate metadata - there’s a lot of interesting stuff there, and that means a lot of pretty pictures.
Art as averages
One subcategory of ‘interesting stuff’ is the base URLs for each piece of artwork. A lot of the pieces have images on the Tate website(s) - high, medium, low and thumbnail resolution pictures of the painting or sculpture itself. The format for the files goes “[artwork-specific base URL]_[size value].jpg” - in other words, once you have the base URLs, you can retrieve the high-resolution images.
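As a rough sketch of that filename pattern in Python: the size codes and the example base URL below are placeholders I've invented for illustration, not the values the Tate actually uses, so treat `SIZE_CODES` as something to fill in from the metadata.

```python
# Hypothetical size codes: the real values have to come from the Tate metadata.
SIZE_CODES = {"high": "10", "medium": "9", "low": "8", "thumbnail": "7"}


def image_url(base_url: str, size: str = "high") -> str:
    """Build "[artwork-specific base URL]_[size value].jpg" for one artwork."""
    return f"{base_url}_{SIZE_CODES[size]}.jpg"


# Made-up base URL, purely to show the shape of the result.
print(image_url("http://example.org/images/work/T00001", "thumbnail"))
```

Keeping the size codes in one lookup table means the rest of the code only ever talks about "high" or "thumbnail", whatever the real values turn out to be.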
Images in electronic formats are ultimately just numbers - a JPEG, once decoded, is just an array of height by width by 3, with the 3 being cells for the red, green and blue colour values - and numbers are ultimately vulnerable to computation. As a result, it was trivial for me to average out the RGB values of a set of pieces, generating a single, composite image. I divided the dataset by artistic movement and by artist and did precisely that: some of the results are quite pretty.
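To make the averaging concrete, here is a minimal sketch in plain Python. It is not the code from the repository: it assumes the images have already been decoded and resized to identical dimensions (real artworks vary in size, so that step would have to happen first), and it represents each image as nested lists of (red, green, blue) tuples rather than using an imaging library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255
Image = List[List[Pixel]]     # rows of pixels: height x width x 3


def average_images(images: List[Image]) -> Image:
    """Average the RGB values of same-sized images, pixel by pixel."""
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must share the same dimensions")
    n = len(images)
    # For every pixel position and colour channel, take the mean across images.
    return [
        [
            tuple(round(sum(img[y][x][c] for img in images) / n) for c in range(3))
            for x in range(width)
        ]
        for y in range(height)
    ]


# Two tiny 1x2 "paintings": the composite is the channel-wise mean of both.
dark = [[(10, 20, 30), (0, 0, 0)]]
light = [[(30, 40, 50), (200, 100, 0)]]
print(average_images([dark, light]))
```

Because every output pixel is a mean, areas where the source works disagree wash out towards a muddy mid-tone, while areas they share stay recognisable.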
In order, those are averaged paintings by/from:
Each one is composed of between 2 and 5 works by the artist/from the movement (5 wherever possible). As well as being aesthetically pleasing, these “averaged” works also let us see what areas of the canvas particular artists focus on.
The above image is an averaged work by Vincent van Gogh, composed of the 4 of his works that the Tate has in their collection. Based on what elements of the paintings are distinguishable it becomes pretty clear that he tended away from the top left corner of the canvas. Other examples of averaged images can be found in the GitHub repository (as can everything else). Some of them are obviously composites, some of them are obviously terrible, and some of them are incredibly trippy.
Art as subjects
In the previous post I looked at the gender breakdown in the Tate collection. It was mostly really basic stuff - how many female artists are represented? What percentage of artists is that, over time? - but the metadata also includes broken-down descriptions of each piece of artwork, which allows me to ask a really interesting question: what does the gender breakdown look like when you examine not the artist, but the subject?
Using a series of regular expressions (hunting for things like “man”, “woman”, “wife”, “son”, so on and so forth), I broke down the art by decade and artist gender, and looked at whether it represented men, women or both.
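A sketch of what that kind of matching can look like in Python. The word lists are illustrative only (a few of the terms are my own additions, and these are not the patterns actually used), and the `(year, description)` input shape is an assumption. The breakdown by artist gender is left out to keep it short.

```python
import re
from collections import Counter

# Illustrative word lists, not the real patterns. The \b word boundaries
# matter: without them, "man" would also match inside "woman".
MALE = re.compile(r"\b(man|men|husband|son|boy)\b", re.IGNORECASE)
FEMALE = re.compile(r"\b(woman|women|wife|daughter|girl)\b", re.IGNORECASE)


def classify_subject(description: str) -> str:
    """Label a description as depicting 'men', 'women', 'both' or 'neither'."""
    has_male = bool(MALE.search(description))
    has_female = bool(FEMALE.search(description))
    if has_male and has_female:
        return "both"
    if has_male:
        return "men"
    if has_female:
        return "women"
    return "neither"


def tally_by_decade(works):
    """works: iterable of (year, description) pairs -> {decade: Counter}."""
    tallies = {}
    for year, description in works:
        decade = (year // 10) * 10
        tallies.setdefault(decade, Counter())[classify_subject(description)] += 1
    return tallies


# Made-up descriptions, just to exercise the functions.
sample = [
    (1853, "portrait of a woman seated"),
    (1857, "man and wife in a garden"),
    (1912, "landscape with trees"),
]
print(tally_by_decade(sample))
```

Keyword matching like this is crude: it will miss subjects described only by name or title, so any counts are best read as rough estimates.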
When we look at all art, regardless of the gender of the artist, there’s a pretty big disconnect between men and women, namely that women are far less likely to be the subject of a piece of artwork than men. Weirdly, the percentage of works involving female subjects is still higher than the percentage of works by female artists, which means that despite the disconnect women are more likely to be painted than painting.
There are a lot of odd null-points in the graphic for art by female artists (ggplot2 does not like zero values), but the picture is pretty clear: with both male and female artists, women are usually less likely to be represented in art than men. With female artists, though, it tends to zig-zag up and down (either there’s more variety in the artwork or it’s just a far smaller sample size), while there’s a fairly constant gender buffer for male artists.
What’s also interesting is using the data to understand the place that humans have in artwork. For most of the 19th and 20th centuries, for example, we saw a consistent decrease in art with humans as the subject - presumably, artists were more interested in environmental or abstract works. Despite the stereotypes associated with modern art, representation of people actually got a substantial boost from the 1970s onwards.
This is probably the last work I’ll do in this dataset for a while - unless, of course, people have interesting suggestions :). As always, they can be directed to scire.facias@gmail.com
Text licensed under the Creative Commons Attribution Share-Alike (CC-BY-SA) 3.0 license. Images dual-licensed under CC-BY-SA 3.0 and the MIT license. Code licensed under the MIT license