12 Tones of London

London's demographics explored by processing 2011 Census data with a statistical sorting method called cluster analysis. Attention is then focused on the sounds of the 12 archetypal council wards emerging from the study.


What does London really sound like?

THE AIM OF this project is to describe variations in sounds across London within the limits of having only enough time to record in a few of the city's innumerable streets, parks and other public places. 12 Tones of London uses a statistical analysis to select 12 out of London's 623 council wards (not counting the City of London) in the hope that their sound profiles can be generalised across relatively large swathes of the capital. It makes demographic factors such as class, ethnicity and age central to the investigation.


The colourful map above shows what happens when you take 2011 Census data from over 40 topics for all London’s 620-odd council wards, and put them through a statistical sorting technique called cluster analysis. The software used was instructed to arrange the wards into 12 clusters by weighing up combinations of topic data.

For example, the red cluster found exclusively in west London is formed from a mix of a large Indian and Other Asian population, a high proportion of semi-detached housing and of land space devoted to domestic gardens, among other factors. The analysis also identified which council ward was the most typical member in each cluster – they're the ones marked by circles.

Some market research companies compile consumer profile databases, like Experian’s Mosaic, which sort households into categories. You can buy the data in map form to find out the proportion of different household types by area. But it costs thousands of pounds to access what is, after all, the work of professional statisticians and market researchers. A cheaper method is to download Census data, knock it into shape using OpenOffice Calc and buy a fairly inexpensive stats program like NCSS 8.

The Census topics chosen include the average age of each ward’s residents, the percentage of households not owning a car, the percentage of households in which English is the main language, unemployment rate, percentages for different occupational and ethnic groups, population density per hectare, and basic land use data.

Allowing for how some people inexplicably find statistics boring, a brief explanation of what cluster analysis is and the precise method used for 12 Tones of London has been confined to the column on the right.


When the appreciation of sound is treated as an end in itself, syncretic descriptions of how a city sounds seem to follow readily. The pioneering field recordist Ludwig Koch claimed: 'There is an atmosphere in sound that belongs only to Paris.'

More recently, the musician David Byrne made some field recordings in London and declared he'd discovered the city converges to a 'common rhythm' of 122.86 beats per minute. Professor Tod Machover, a composer based at MIT's Media Lab, reckons: 'There’s a sound to the city of Edinburgh, and I think of it as like bagpipes meets Beethoven.'

Perhaps more research is needed. Such middlebrow observations by themselves are like responses to a journalist asking 'So, what is the sound of London?' It’s an ill-posed question for two reasons. First, any aggregate measure along a single dimension, such as London’s average sound frequency being x number of hertz, involves discarding a great deal of information, and it's not obvious what understanding such a fact could lead to. Second, differences in what’s sampled and how will produce wildly variable results. You can’t record everything.


One of the goals of the London Sound Survey is to treat sound as a means to an end, that of knowing more about the past and present nature of the city and what it's like to live here. London is now more ethnically varied than at any time in its history, with high population churn and rates of income inequality not seen since the early 20th century.

So, despite the homogenising effects of modernity on how public spaces sound, more insights should arise from examining likely patterns of difference around the city than seeking commonalities. These are the assumptions and hypotheses informing the project:

1. There is significant geographic and demographic variation across London. These differences exist at many scales, but the administrative level of council ward represents for the researcher a reasonable trade-off between precision, availability of data, and feasibility of sampling.

2. In public spaces such as streets, parks and elsewhere, council wards will sound different to each other according to the demographics of who lives in them and the geographic features of population density, housing type, or what proportion of a ward is taken up by roads, housing and gardens.

3. There is enough structure in the geographic and demographic differences between council wards to allow them to be sorted into clusters according to similarity.

4. Within each cluster, a single ward can be identified as the one which is the least dissimilar to all other cluster members and so be treated as if it were the most representative. What’s heard in that archetypal ward should predict what’s heard in the cluster’s other wards at a level significantly higher than chance.

5. The contents of ward recordings can themselves be quantified and the resulting data subjected to a further cluster analysis. If the desired number of clusters for sound data is set to be the same as that for the analysis of Census data, then there should be an overlap between the two sets of cluster membership at a level significantly higher than chance.
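One way the overlap described in hypothesis 5 might be measured is the adjusted Rand index, which scores the agreement between two groupings of the same wards at around 0 when it's no better than chance and at 1 when the groupings are identical. The sketch below is an illustration only. The project doesn't say which measure will be used, and the function name and the made-up cluster labels are assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two groupings of the same items."""
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree about

    # Pairs of items that share a cluster in both groupings, in A, and in B.
    together_in_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_in_a * together_in_b / total_pairs
    maximum = (together_in_a + together_in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both groupings are one big cluster
    return (together_in_both - expected) / (maximum - expected)


# Made-up cluster labels for six imaginary wards (not project data).
census_clusters = [0, 0, 1, 1, 2, 2]
sound_clusters = [5, 5, 3, 3, 9, 9]  # same grouping, different label names

print(adjusted_rand_index(census_clusters, sound_clusters))
```

Cluster numbers don't need to match between the two analyses, since only which wards end up grouped together is compared. Judging whether a score is significantly above chance would need a further step, such as comparing it against scores from randomly shuffled labels.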

Over the next couple of years it's hoped that each ward will end up being represented by several dozen recordings, with attempts to balance time of day, day of the week and annual season between them. You'll see these appear as dated batches displayed on sound maps for each of the 12 wards.

Eventually it should be possible to put on a more rigorous footing the sense that different sorts of neighbourhood, be they rich or poor, suburban or inner city, must sound distinct from one another in varied yet often predictable ways.

Highbury Park near junction with Aubert Road (Cluster 7).

Warwick Road in Enfield (Cluster 2).

Abingdon Villas in Kensington (Cluster 8).


Cluster analysis comprises several related statistical sorting techniques. It can explore patterns in large data sets, particularly where there are no prior assumptions about classification. It has uses in areas such as the social sciences, medicine and market research.

Cluster analysis attempts to arrange data objects into groups or clusters according to similarity. The assumption is that the objects in a cluster will be similar to one another and different from the objects in other clusters. The greater the similarity or homogeneity within a cluster, and the greater the difference between clusters, the better or more distinct the clustering.

The cluster analysis technique used here is called partitioning around medoids. The medoid is that cluster member which is the least far away on average from its fellow members along the dimensions in the data set. Imagine the medoid as something like the sun in a solar system governed by the force of similarity rather than gravity. Other wards are arranged around it like planets at varying distances of similarity.
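As a rough illustration of how a medoid can be picked out once a cluster's members are known, here is a minimal Python sketch. It isn't the NCSS procedure, which also has to work out which wards belong in which cluster. The use of straight-line (Euclidean) distance, the function name and the ward scores are all assumptions made for the example.

```python
from math import dist


def medoid_index(points):
    """Return the index of the point with the smallest mean distance to the rest."""
    if len(points) == 1:
        return 0

    def mean_distance(i):
        others = [dist(points[i], q) for j, q in enumerate(points) if j != i]
        return sum(others) / len(others)

    return min(range(len(points)), key=mean_distance)


# Made-up scores on two standardised Census topics for five imaginary wards.
ward_names = ["A", "B", "C", "D", "E"]
ward_scores = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]

archetype = ward_names[medoid_index(ward_scores)]
print(archetype)  # prints "C", the ward sitting in the middle of the other four
```

Unlike an average, the medoid is always a real member of the cluster, which is what makes it possible to go there and record.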

An alternative way of expressing within-cluster similarity and between-cluster dissimilarity is through a single measure called a silhouette statistic. Sometimes the medoid is the ward with the highest silhouette value in the cluster, and sometimes it isn't. But, like the mean and median of a set of values, the two aren't independent of one another.

Silhouettes are convenient as a quick and easy way for displaying which wards fit well into their clusters and which don't. On the cluster maps a rough idea of a ward's silhouette value is shown by how rich and saturated its colour is. Less typical wards are shown as progressively more washed-out. Some wards have such low silhouette values that I have decided to remove them altogether from their clusters. They're coloured dark grey on the map and treated as uncategorised.
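For anyone curious about the arithmetic, a silhouette value compares a, the ward's mean distance to the other members of its own cluster, with b, its mean distance to the members of the nearest other cluster, as (b - a) / max(a, b). Values near 1 mean a snug fit and values near 0 or below mean a poor one. The Python sketch below follows that standard definition. Euclidean distance, the function name and the toy numbers are assumptions, and it isn't the code NCSS runs.

```python
from math import dist


def silhouette(i, points, labels):
    """Silhouette value of point i. Assumes there are at least two clusters."""

    def mean_distance_to(cluster):
        ds = [
            dist(points[i], points[j])
            for j in range(len(points))
            if j != i and labels[j] == cluster
        ]
        return sum(ds) / len(ds) if ds else None

    a = mean_distance_to(labels[i])
    if a is None:
        return 0.0  # a cluster of one is conventionally given a value of 0
    b = min(mean_distance_to(c) for c in set(labels) if c != labels[i])
    if max(a, b) == 0:
        return 0.0  # all points coincide, so there is no structure to measure
    return (b - a) / max(a, b)


# Made-up positions for five imaginary wards; the third sits between two groups.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
clusters = [0, 0, 0, 1, 1]

values = [silhouette(i, positions, clusters) for i in range(len(positions))]
for value in values:
    print(round(value, 2))
```

In this toy example the third ward gets a much lower value than its neighbours, so on the map it would be among the more washed-out ones.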

The analysis cannot by itself give a definitive answer as to how many clusters the data set should be partitioned into. In this instance, the software was instructed to produce 12 clusters as a trade-off: few enough to leave a manageable number of wards to record in, but not so few that each cluster would be too heterogeneous to be meaningful. Also, 12 is an attractive number, as in Jonathan Prior's 12 Gates to the City. Whether the right balance has been struck remains to be seen.