Rabia Butt, one of our summer Q-Step interns, explains the process she went through to calculate and map the Carstairs deprivation index using UK Census data.
After downloading the data needed to measure deprivation, we identified some issues and had to make some changes.
Matching our variables with those used in published research papers was quite a challenge. The papers used the Carstairs method, in which the census variables by definition refer to households. However, the social class variable had to be changed from head of household in a lower social class to a per-person measure, because the data from Scotland did not include household references for social class.
The per-person variable was employed so that we could make comparisons across the UK and replicate the calculations in the research papers. No other research paper had calculated scores for the whole of the UK. Most used the Scotland census only, and some covered England and Wales together, but not the other countries in the UK.
For our investigation, the Carstairs scores were calculated in R, which required importing a dataset with the relevant variables. We chose R because of the large quantity of data we were handling and the number of calculations we needed to make. R can produce results for large quantities of data, which makes it invaluable for this sort of research.
The variables required for the Carstairs score had different names in the dataset for each census year. For example, for the overcrowding variable, total households was named ‘all household’ in 2011, whereas in 2001 it was given the codename ‘cs0520001’. As a result, we decided to change the names so that they were consistent throughout our calculations.
The geographical variables also had different headings and codes. For example, what 2011 called ‘GEO_CODE’, 2001 called ‘Zone. Code’. The script was altered to accommodate these differences. Although these are minor adjustments, they were necessary for the R scripts to run without difficulty.
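The renaming step can be illustrated with a small lookup table per census year. This is a minimal Python sketch rather than the R script used in the project. The source names are the ones quoted above, with ‘GEO_CODE’ assumed to belong to 2011 and ‘Zone. Code’ to 2001. The common names (`area_code`, `total_households`), the `harmonise` helper and the sample rows are invented for illustration.

```python
# Hypothetical lookup: census-specific column name -> common name.
# The left-hand names are quoted in the text; the right-hand names are invented.
RENAME = {
    2011: {"GEO_CODE": "area_code", "all household": "total_households"},
    2001: {"Zone. Code": "area_code", "cs0520001": "total_households"},
}


def harmonise(rows, year):
    """Return copies of the rows with census-specific names replaced by common ones."""
    mapping = RENAME[year]
    return [
        {mapping.get(name, name): value for name, value in row.items()}
        for row in rows
    ]


# Made-up rows, one per census, describing the same (fictional) area.
rows_2011 = [{"GEO_CODE": "A1", "all household": 120}]
rows_2001 = [{"Zone. Code": "A1", "cs0520001": 110}]

print(harmonise(rows_2011, 2011))
print(harmonise(rows_2001, 2001))
```

Once every census year shares the same column names, the same calculation code can be run on each year without further changes.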
I have learnt to use R to calculate proportions, means, standard deviations and Z-scores for my project, and to perform many other functions as well.
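To show how these pieces fit together, here is a minimal Python sketch of an unweighted Carstairs-style score: each indicator is turned into a proportion, standardised to a Z-score across areas, and the Z-scores are summed per area. It is not the project's R script. The column names and counts are made up, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values to mean 0 and (sample) standard deviation 1."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD; some published versions may differ
    return [(v - m) / sd for v in values]


def carstairs_scores(areas, indicators):
    """Sum the Z-scores of each indicator's proportion, area by area."""
    columns = []
    for numerator, denominator in indicators:
        proportions = [area[numerator] / area[denominator] for area in areas]
        columns.append(z_scores(proportions))
    return [sum(per_area) for per_area in zip(*columns)]


# Hypothetical (numerator, denominator) column pairs for the four indicators.
INDICATORS = [
    ("unemployed_men", "active_men"),
    ("overcrowded", "households"),
    ("no_car", "households"),
    ("low_class", "persons"),
]

# Made-up counts for three fictional areas, ordered from least to most deprived.
areas = [
    {"unemployed_men": 5, "active_men": 100, "overcrowded": 2,
     "no_car": 10, "households": 50, "low_class": 20, "persons": 200},
    {"unemployed_men": 10, "active_men": 100, "overcrowded": 5,
     "no_car": 20, "households": 50, "low_class": 50, "persons": 200},
    {"unemployed_men": 20, "active_men": 100, "overcrowded": 10,
     "no_car": 30, "households": 50, "low_class": 90, "persons": 200},
]

scores = carstairs_scores(areas, INDICATORS)
print(scores)
```

A higher score means a more deprived area, and because every indicator is centred on zero, the scores across all areas sum to approximately zero.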
Once the calculations for the project were completed, the next step was to match our Z-scores with published results, both to check that the variables used in calculating the scores were correct and to support our research project. We found that the variables and geographical areas we used for the different censuses were correct, but some research papers’ Z-scores differed from ours because the papers did not explain how they weighted the population count. After reading many research papers and trying to match their results with ours, we finally found one paper whose scores matched ours very closely.
The Carstairs Z-scores were split into quintiles, ranging from
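To illustrate why weighting matters, the Python sketch below standardises a variable using a population-weighted mean and standard deviation. This is one possible weighting scheme, shown only as an assumption. It is not necessarily the one used in any of the papers we compared against, and the proportions and population counts are made up.

```python
from math import sqrt


def weighted_z_scores(values, weights):
    """Standardise values using a weighted mean and weighted standard deviation."""
    total = sum(weights)
    w_mean = sum(w * v for v, w in zip(values, weights)) / total
    w_var = sum(w * (v - w_mean) ** 2 for v, w in zip(values, weights)) / total
    w_sd = sqrt(w_var)
    return [(v - w_mean) / w_sd for v in values]


# Made-up proportions and population counts for four fictional areas.
proportions = [0.05, 0.10, 0.20, 0.40]
population = [4000, 3000, 2000, 1000]

weighted = weighted_z_scores(proportions, population)
print(weighted)
```

With these numbers the large areas pull the weighted mean towards their low proportions, so the small, high-proportion area receives a larger Z-score than it would under an unweighted calculation. Differences of this kind are enough to make two sets of published scores disagree.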
- 1 least deprived to
- 5 most deprived
A quintile is a statistical value of a data set that signifies 20% of a given population, so the first quintile represents the lowest fifth of the data, the second quintile the second fifth, and so on. It should be noted that the quintiles in this project are based on area, meaning that 20% of all areas fall into each quintile. We calculated the quintiles in R, and the formula divided the fifth quintile’s sum by the data-set sum. The quintiles were created for map visualisation, which allowed us to observe which areas are most and least deprived across the selected geographical areas of the UK. Quintiles are very convenient for plotting maps, as they offer a geographical perspective on the spread of deprivation across the UK.
QGIS is software that allows users to examine and edit spatial information, as well as compose and export graphical maps. For our research, this software was used to generate 3D maps of the Carstairs scores for data visualisation.
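The idea of area-based quintiles, where each quintile holds 20% of the areas, can be sketched in Python as follows. This is a simple rank-based illustration with made-up scores, not the R formula used in the project. The `area_quintiles` helper is hypothetical, and ties are broken by the original order of the areas.

```python
def area_quintiles(scores):
    """Assign each area a quintile from 1 (least deprived) to 5 (most deprived)."""
    n = len(scores)
    # Indices of the areas, ordered from the lowest score to the highest.
    order = sorted(range(n), key=lambda i: scores[i])
    quintiles = [0] * n
    for rank, i in enumerate(order):
        # Ranks 0..n-1 are cut into five equal-sized bands.
        quintiles[i] = rank * 5 // n + 1
    return quintiles


# Made-up Carstairs scores for ten fictional areas.
example_scores = [3.2, -1.5, 0.0, 4.8, -2.7, 1.1, -0.4, 2.5, -3.9, 0.6]
example_quintiles = area_quintiles(example_scores)
print(example_quintiles)  # [5, 2, 3, 5, 1, 4, 2, 4, 1, 3]
```

With ten areas, exactly two fall into each quintile, regardless of how many people live in each area.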
At first, I practised in QGIS using census data that interested me. I chose the Pakistan census data of 1998 and 2017 to look at gender differences in Pakistan in 3D maps. From the UK census of 2011, I explored how proficiency in English could influence general health. The 3D maps of my data showed me which places had the highest peaks and which had the lowest.
For the main project, we created 3D maps of the whole of the UK at ward, Output Area and local authority level from 1991 to 2011. 3D maps of Great Britain (i.e. England, Wales and Scotland) were also created at District level.
Northern Ireland could not be included when producing Carstairs scores from the first census of 1971 to the most recent one of 2011, as no census data are available for Northern Ireland for 1971 and 1981. As a result, Northern Ireland was excluded from this analysis.
For a detailed examination of deprivation, 3D maps of the capital cities of the UK and of Greater Manchester were created at Output Area level as well. The boundary data for the maps were downloaded from Casweb and borders from the UK Data Service. The boundary data were simplified in mapshaper and the quintiles were dissolved into 5 layers, which made the 3D maps easier to produce and easier for people to understand.
The main findings of this research are shown in Tables 1 and 2. Table 1 shows the most deprived areas in the UK, while Table 2 shows the least deprived areas, based on each area’s total score. The results are listed for all the years and output levels for which data were available. The least deprived areas, as can be seen in Table 2, are mostly located around London, in the South of England. This finding is consistent from 1981 to 2011; in 1971, however, the least deprived area was Bearsden, which is in Scotland.
To understand the overall change in the level of deprivation across Great Britain, maps for 1981, 1991, 2001 and 2011 were created using quintiles. Lighter colours represent less deprived areas and darker colours more deprived ones.
Based on these scores, deprivation in Great Britain decreased greatly between 1981 and 2011. The largest change occurred in Scotland, compared with the rest of Great Britain. However, Scotland is still more deprived than England and Wales.
There has been a positive change for England as well; however, it has not been to the same extent as for the other countries. The north of England and Cornwall have improved their deprivation scores the most.
Nevertheless, cities in the North, and areas in Birmingham and London, continue to score highly on deprivation. A trend emerges across the four censuses for GB: the South has generally been less deprived than the North.
Great Britain 1981
Great Britain 1991
Great Britain 2001
Great Britain 2011
Below are maps of Manchester and Greater Manchester showing deprivation at Output Area level.
Darker colours represent more deprived areas (higher quintile). Most of Greater Manchester, and especially Manchester, is greatly deprived, with some exceptions. There is also a pattern: the inner parts of all towns in Greater Manchester are deprived, while the outskirts are lighter, which means less deprived.
Manchester by Output Area 2011
Greater Manchester by OA 2011
It needs to be taken into consideration that one of the indicators of deprivation in the Carstairs score is ‘Car ownership’. Since owning a car in the city centre might not be convenient, it is to be expected that city centre areas may score higher on this variable, resulting in a higher overall score.
My 3D maps above also correspond with the results of the Index of Multiple Deprivation scores, which have stated that “Manchester is one of the local authority districts which has the largest proportions of highly deprived neighbourhoods in England” (The English Indices of Deprivation 2015, Baljit Gill).
The next stage is to present the 3D maps we produced in VR. Virtual Reality is defined as “the use of computer technology to create a simulated environment”. Although VR is still in its developing stages, it is being used across various platforms, and presenting data is one of its uses.