Go back to the projects page.
Summary
This analysis uses public EV chargepoint and plug-in vehicle (PiV) registration data to understand, by local authority district (LAD), which areas have the best and worst EV infrastructure.
What data is included?
First, four features are calculated to allow for initial exploration of the data. Details of these are as follows:
Count of PiVs
Source: DVLA and DfT dataset
- A simple count of registered PiVs by LAD is calculated.
- This includes all PiVs with
Private
keepership. This excludes PiVs where the keepership isCompany
to avoid skewing the analysis with large quantities of fleet vehicles registered to a single address. - Where data has been suppressed in the raw file, the
[c]
value has been replaced withNaN
Count of EV chargepoints
Source: Public EV chargepoint registry
- A simple count of public EV chargepoints by LAD is calculated.
- There has been no cleaning of this data beyond converting the
latitude
andlongitude
into apoint
. This is a public dataset and further accuracy has not been verified.
Ratio of PiVs to EV chargepoints
Source: Author’s calculations
- Taking the count of PiVs in each LAD and dividing this by the count of public EV chargepoints in each LAD.
- A higher ratio means there are more PiVs to each chargepoint and therefore public infrastructure may be lacking.
Average distance to nearest EV chargepoint
Source: Geographies from ONS via Open Geography Portal
- Using the population-weighted centroids (PWCs) for each output area (OA) within an LAD, the nearest public EV chargepoint is calculated. OA is the smallest census geography and contains between 40 and 250 households.
- Then this is aggregated to LAD level by taking the average distance of all OAs within the LAD to obtain the average distance to the nearest public EV chargepoint by LAD.
What do these features show?
The following choropleths map these four features. The absolute number of PiVs varies across England and Wales, with LADs across Wales, Lincolnshire and Cumbria having particularly low numbers. By contrast, the highest number of public EV chargepoints by LAD is concentrated in Greater London, with hotspots dotted elsewhere across the country. If we instead look at this as the ratio of PiVs to chargepoints, we again see the patterns change. Hotspots around the southeast of England highlight that there is potentially infrastructure lacking in these areas compared to PiV numbers. Lastly, we can see the average distance to the nearest chargepoint is perhaps unsurprisingly highest in rural areas, particularly in the north of England and Wales.
Code to create an interactive version of this map can be found below.
What is the relationship between the features?
The correlation matrix heatmap below shows the relationship between each feature. There is some positive correlation between chargepoints and PiVs, which is good as this suggests that where there are more PiVs there tend to be more chargepoints. There is also some positive correlation between average distance to the nearest chargepoint and the ratio of PiVs to chargepoints. This means that if the distance is greater, there are also more PiVs to chargepoints perhaps suggesting these areas lack infrastructure.
The following scatter plots illustrate this further, adding a regression line. We can see that there are outliers across both, with a cluster of LADs on the lower end.
So what does this mean for public EV infrastructure?
Taking the ratio of PiVs to public EV chargepoints and the average distance to the nearest chargepoint, a simple ranking is created combining the two. Areas with a high ratio of PiVs to chargepoints and a large distance to the nearest chargepoint could be lacking in public EV infrastructure.
Top 10 The best performing LADs were all in London, suggesting infrastructure in the capital is well established.
LAD22CD | LAD22NM | combined_rank | |
---|---|---|---|
288 | E09000013 | Hammersmith and Fulham | 1.0 |
308 | E09000033 | Westminster | 2.0 |
303 | E09000028 | Southwark | 3.0 |
276 | E09000001 | City of London | 4.0 |
295 | E09000020 | Kensington and Chelsea | 4.0 |
299 | E09000024 | Merton | 6.0 |
307 | E09000032 | Wandsworth | 6.0 |
282 | E09000007 | Camden | 8.0 |
287 | E09000012 | Hackney | 9.0 |
294 | E09000019 | Islington | 9.0 |
Bottom 10 The worst performing LADs are more dispersed. Most are more rural areas, or include only one or two larger towns.
LAD22CD | LAD22NM | combined_rank | |
---|---|---|---|
99 | E07000074 | Maldon | 330.0 |
89 | E07000064 | Rother | 329.0 |
235 | E07000242 | East Hertfordshire | 328.0 |
16 | E06000017 | Rutland | 327.0 |
100 | E07000075 | Rochford | 326.0 |
85 | E07000047 | West Devon | 325.0 |
177 | E07000169 | Selby | 324.0 |
160 | E07000139 | North Kesteven | 323.0 |
82 | E07000044 | South Hams | 321.0 |
199 | E07000198 | Staffordshire Moorlands | 321.0 |
This information could be used by both public and private organisations alike to understand where public EV infrastructure is lacking and therefore incentivise installation of new infrastructure.
Considerations and future work
This analysis has demonstrated that current public EV structure may be lacking in some areas based on the current number of PiVs. This does not include:
- A temporal view of how PiV ownership has changed over time - there may be areas experiencing higher growth, with could arguably justify increased investment over other areas.
- EV infrastructure on private property (e.g. at home) - areas with a higher number of terraced houses or flats may require more public infrastructure as it can often be more challenging to install at these types of properties. Areas with higher rates of rented accomodation over private ownership could also face challenges as it would be the responsibility of the landlord to install EV chargepoints. Again, these areas may require more public infrastructure and so overlaying this data could be an interesting investigation to see how it affects the ranking.
The public EV chargepoint registry has also not been quality checked. Further investigation into this dataset and wrangling/cleaning as appropriate may impact results.
Coding
Import libraries
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import numpy as np
import seaborn as sns
from scipy.stats import zscore
import matplotlib.pyplot as plt
import contextily as ctx
import folium
Data wrangling
EV Chargepoint data
- Public EV chargepoint registry, available from: https://www.gov.uk/guidance/find-and-use-data-on-public-electric-vehicle-chargepoints
# Import csv
= pd.read_csv('Data/national-charge-point-registry_230524.csv', low_memory=False) chargepoints_RAW
# Reduce to just required columns
= chargepoints_RAW[['chargeDeviceID','reference','name','latitude','longitude']] chargepoints_TRIM
# Check length of df
len(chargepoints_TRIM)
40580
# Visual check
chargepoints_TRIM.head()
chargeDeviceID | reference | name | latitude | longitude | |
---|---|---|---|---|---|
0 | b86a77a42bb68c81946ec50cfc95e89d | 11172306P | Network Rail Westwood Centre 1 | 52.386590 | -1.587384 |
1 | dc1c347d471f68e41ad2a9a1145941d6 | APT-0296-0015/13P | Brindley Drive Car Park Birmingham - 70524 | 52.480918 | -1.907710 |
2 | 7d545ad9367ccb8a80c94a953314ae71 | CM123 | Renault Liverpool | 53.383579 | -2.977230 |
3 | 68c7fca1e3bba5e49ec90847dcdd456b | CM164 | NCP Portman Square | 51.516201 | -0.157996 |
4 | ac7a21c48f5833b33a5b606b2089e6a9 | CM167 | NCP Prince Street Car Park | 51.450340 | -2.596704 |
# Geocode locations using lat and long, set crs to 4326 and then convert to 27700 (British National Grid), drop lat and long
= gpd.GeoDataFrame(chargepoints_TRIM,
chargepoints_gdf =gpd.points_from_xy(chargepoints_TRIM.longitude, chargepoints_TRIM.latitude)
geometry=4326, inplace=True).to_crs(epsg=27700).drop(columns=['latitude','longitude']) ).set_crs(epsg
# Visual check
chargepoints_gdf.head()
chargeDeviceID | reference | name | geometry | |
---|---|---|---|---|
0 | b86a77a42bb68c81946ec50cfc95e89d | 11172306P | Network Rail Westwood Centre 1 | POINT (428179.411 276586.797) |
1 | dc1c347d471f68e41ad2a9a1145941d6 | APT-0296-0015/13P | Brindley Drive Car Park Birmingham - 70524 | POINT (406364.849 287003.294) |
2 | 7d545ad9367ccb8a80c94a953314ae71 | CM123 | Renault Liverpool | POINT (335096.815 387859.900) |
3 | 68c7fca1e3bba5e49ec90847dcdd456b | CM164 | NCP Portman Square | POINT (527908.652 181305.630) |
4 | ac7a21c48f5833b33a5b606b2089e6a9 | CM167 | NCP Prince Street Car Park | POINT (358631.700 172541.813) |
EV Cars data
- df_VEH0145: Licensed plug-in vehicles (PiVs) at the end of the quarter by fuel type and lower super output area (LSOA): United Kingdom, available from: https://www.gov.uk/government/statistical-data-sets/vehicle-licensing-statistics-data-files
# Import csv
= pd.read_csv('Data/df_VEH0145.csv', low_memory=False) PiVs_RAW
# Reduce to just required columns
= PiVs_RAW[['LSOA11CD','LSOA11NM','Fuel','Keepership','2023 Q4']] PiVs_TRIM
# Visual check
PiVs_TRIM.head()
LSOA11CD | LSOA11NM | Fuel | Keepership | 2023 Q4 | |
---|---|---|---|---|---|
0 | 95AA01S1 | Aldergrove 1 | Battery electric | Company | [c] |
1 | 95AA01S2 | Aldergrove 2 | Battery electric | Company | 9 |
2 | 95AA01S3 | Aldergrove 3 | Battery electric | Company | [c] |
3 | 95AA02W1 | Balloo | Battery electric | Company | 5 |
4 | 95AA03W1 | Ballycraigy | Battery electric | Company | [c] |
# Filter to all Fuel and private Keepership types, then drop those columns
= PiVs_TRIM[(PiVs_TRIM['Fuel'] == 'Total') &
PiVs_FILTERED 'Keepership'] == 'Private')].drop(columns=['Fuel', 'Keepership']) (PiVs_TRIM[
# Replace the suppressed [c] data with NaN
'2023 Q4'] = PiVs_FILTERED['2023 Q4'].replace('[c]', np.nan)
PiVs_FILTERED[
# Convert the '2023 Q4' column to numeric data type
'2023 Q4'] = pd.to_numeric(PiVs_FILTERED['2023 Q4']) PiVs_FILTERED[
# Visual check
PiVs_FILTERED.head()
LSOA11CD | LSOA11NM | 2023 Q4 | |
---|---|---|---|
189862 | 95AA01S1 | Aldergrove 1 | NaN |
189863 | 95AA01S2 | Aldergrove 2 | 15.0 |
189864 | 95AA01S3 | Aldergrove 3 | 19.0 |
189865 | 95AA02W1 | Balloo | 10.0 |
189866 | 95AA03W1 | Ballycraigy | NaN |
LSOA lookup
The PiV data uses the LSOA codes from 2011 and so this will need to be changed to the LSOA 2021 codes.
- LSOA best fit lookup 2011 to 2021, available from: https://geoportal.statistics.gov.uk/datasets/b14d449ba10a48508bd05cd4a9775e2b_0/explore
# Import csv
= pd.read_csv(
LSOA_lookup 'Data/LSOA_(2011)_to_LSOA_(2021)_to_Local_Authority_District_(2022)_Best_Fit_Lookup_for_EW_(V2).csv',
=False) low_memory
# Reduce to just required columns
= LSOA_lookup[['LSOA11CD','LSOA21CD','LAD22CD','LAD22NM']].copy() LSOA_lookup
# Match 2021 to 2011 codes using lookup
= pd.merge(PiVs_FILTERED, LSOA_lookup, on='LSOA11CD') PiVs_FILTERED
# Visual check
PiVs_FILTERED.head()
LSOA11CD | LSOA11NM | 2023 Q4 | LSOA21CD | LAD22CD | LAD22NM | |
---|---|---|---|---|---|---|
0 | E01000001 | City of London 001A | 28.0 | E01000001 | E09000001 | City of London |
1 | E01000002 | City of London 001B | 30.0 | E01000002 | E09000001 | City of London |
2 | E01000003 | City of London 001C | 15.0 | E01000003 | E09000001 | City of London |
3 | E01000005 | City of London 001E | NaN | E01000005 | E09000001 | City of London |
4 | E01000006 | Barking and Dagenham 016A | 18.0 | E01000006 | E09000002 | Barking and Dagenham |
LADs
- LADs 2021 polygons, available from: https://geoportal.statistics.gov.uk/datasets/305779d69bf44feea05eeaa78ca26b5f_0/explore
# Import shp file
= gpd.read_file('Data/Local_Authority_Districts_December_2022_UK_BFC_V2_-177113771882051469/LAD_DEC_2022_UK_BFC_V2.shp') LADs
# Reduce to just required columns
= LADs[['LAD22CD','LAD22NM','geometry']].copy() LADs
# Visual check
LADs.head()
LAD22CD | LAD22NM | geometry | |
---|---|---|---|
0 | E06000001 | Hartlepool | MULTIPOLYGON (((450154.599 525938.201, 450140…. |
1 | E06000002 | Middlesbrough | MULTIPOLYGON (((446854.700 517192.700, 446854…. |
2 | E06000003 | Redcar and Cleveland | MULTIPOLYGON (((451747.397 520561.100, 451792…. |
3 | E06000004 | Stockton-on-Tees | MULTIPOLYGON (((447177.704 517811.797, 447176…. |
4 | E06000005 | Darlington | POLYGON ((423496.602 524724.299, 423497.204 52… |
len(LADs)
374
OAs
This analysis will use the population-weighted centroids (PWCs) to find the nearest EV chargepoints. This will then be aggregated to LAD level, so the OA to LAD lookup is required.
- OA 2021 PWCs, available from: https://geoportal.statistics.gov.uk/datasets/b9b2b2440af240ce9d30a1d39a7507c2_0/explore
- OA to LAD lookup, available from: https://geoportal.statistics.gov.uk/datasets/b9ca90c10aaa4b8d9791e9859a38ca67_0/explore
OA PWCs
# Import shp file
= gpd.read_file('Data/Output_Areas_2021_PWC_V3_-1981902074309169314/PopCentroids_EW_2021_V3.shp') OA_PWC
# Reduce to just required columns
= OA_PWC[['OA21CD','geometry']].copy() OA_PWC
OA to LAD lookup
# Import csv
= pd.read_csv(
OA_lookup 'Data/Output_Area_to_Lower_layer_Super_Output_Area_to_Middle_layer_Super_Output_Area_to_Local_Authority_District_(December_2021)_Lookup_in_England_and_Wales_v3.csv',
=False) low_memory
# Keep only required columns
= OA_lookup[['OA21CD','LSOA21CD','LAD22CD']].copy() OA_lookup
# Visual check
OA_lookup.head()
OA21CD | LSOA21CD | LAD22CD | |
---|---|---|---|
0 | E00060358 | E01011968 | E06000001 |
1 | E00060359 | E01011968 | E06000001 |
2 | E00060360 | E01011968 | E06000001 |
3 | E00060361 | E01011968 | E06000001 |
4 | E00060362 | E01011970 | E06000001 |
Feature engineering
Aggregate EV car ownership data to LAD
# Aggregate to LAD and calculate total number of PiVs
= PiVs_FILTERED.groupby('LAD22CD')['2023 Q4'].agg('sum').reset_index()
PiVs_LAD
# Rename column to PiVs
={'2023 Q4': 'PiVs'}, inplace=True) PiVs_LAD.rename(columns
# Visual check
PiVs_LAD.head()
LAD22CD | PiVs | |
---|---|---|
0 | E06000001 | 410.0 |
1 | E06000002 | 391.0 |
2 | E06000003 | 580.0 |
3 | E06000004 | 1308.0 |
4 | E06000005 | 719.0 |
len(PiVs_LAD)
331
Calculate distance to nearest chargepoint
- Using the OA PWCs, calculate the nearest EV chargepoint.
- Aggregate this to LAD level to get the average distance to the nearest chargepoint by LAD.
# Find the nearest chargepoint to each OA PWC and calculate distance in metres
= OA_PWC.sjoin_nearest(chargepoints_gdf, distance_col='distance', how='left') OA_nearest_chargepoint
# Merge on OA_lookup_Leeds to get the LAD22CDs
= pd.merge(OA_nearest_chargepoint, OA_lookup, on='OA21CD', how='inner') OA_nearest_chargepoint
# Keep only required columns
= OA_nearest_chargepoint[['OA21CD','geometry','distance','LAD22CD']].copy() OA_nearest_chargepoint
# Aggregate to LAD and calculate average distance to nearest chargepoint
= OA_nearest_chargepoint.groupby('LAD22CD')['distance'].agg('mean').reset_index()
avg_dist_nearest_chargepoint_LAD
# Rename column to avg_distance
={'distance': 'avg_distance'}, inplace=True) avg_dist_nearest_chargepoint_LAD.rename(columns
# Visual check
avg_dist_nearest_chargepoint_LAD.head()
LAD22CD | avg_distance | |
---|---|---|
0 | E06000001 | 1649.498055 |
1 | E06000002 | 1088.305461 |
2 | E06000003 | 1454.585351 |
3 | E06000004 | 911.259807 |
4 | E06000005 | 1188.923394 |
len(avg_dist_nearest_chargepoint_LAD)
331
Calculate number of EV chargepoints in each LAD
- Aggregate the EV chargepoint data to LAD level to get total number of chargepoints in each LAD.
# Spatially match
= gpd.sjoin(LADs, chargepoints_gdf, predicate='intersects') chargepoints_LAD
# Aggregate to LAD and calculate total number of chargepoints
= chargepoints_LAD.groupby('LAD22CD')['index_right'].agg('count').reset_index()
chargepoints_LAD
# Rename column to total_chargepoints
={'index_right': 'total_chargepoints'}, inplace=True) chargepoints_LAD.rename(columns
# Visual check
chargepoints_LAD.head()
LAD22CD | total_chargepoints | |
---|---|---|
0 | E06000001 | 42 |
1 | E06000002 | 51 |
2 | E06000003 | 46 |
3 | E06000004 | 167 |
4 | E06000005 | 76 |
len(chargepoints_LAD)
373
Calculate ratio of EV cars to chargepoints in each LAD
# Merge PiVs and chargepoint count dataframes
= pd.merge(PiVs_LAD, chargepoints_LAD, on='LAD22CD', how='left')
ratio_PiVs_to_chargepoints
# Replace NaN values with 0 where there are no chargepoints in an MSOA
'total_chargepoints': 0}, inplace=True) ratio_PiVs_to_chargepoints.fillna({
# Calculate ratio
'ratio_PiVs_to_chargepoints'] = (
ratio_PiVs_to_chargepoints['PiVs'] / ratio_PiVs_to_chargepoints['total_chargepoints'])
ratio_PiVs_to_chargepoints[
# Replace 'inf' with NaN
'ratio_PiVs_to_chargepoints'] = ratio_PiVs_to_chargepoints[
ratio_PiVs_to_chargepoints['ratio_PiVs_to_chargepoints'].replace([np.inf, -np.inf], np.nan)
Add average distance and LAD polygon to final dataframe
# Add nearest chargepoint
= pd.merge(ratio_PiVs_to_chargepoints,avg_dist_nearest_chargepoint_LAD, on='LAD22CD')
final_df
# Add LAD polygons
= pd.merge(LADs, final_df, on='LAD22CD') final_df
# Visual check
final_df.head()
LAD22CD | LAD22NM | geometry | PiVs | total_chargepoints | ratio_PiVs_to_chargepoints | avg_distance | |
---|---|---|---|---|---|---|---|
0 | E06000001 | Hartlepool | MULTIPOLYGON (((450154.599 525938.201, 450140…. | 410.0 | 42.0 | 9.761905 | 1649.498055 |
1 | E06000002 | Middlesbrough | MULTIPOLYGON (((446854.700 517192.700, 446854…. | 391.0 | 51.0 | 7.666667 | 1088.305461 |
2 | E06000003 | Redcar and Cleveland | MULTIPOLYGON (((451747.397 520561.100, 451792…. | 580.0 | 46.0 | 12.608696 | 1454.585351 |
3 | E06000004 | Stockton-on-Tees | MULTIPOLYGON (((447177.704 517811.797, 447176…. | 1308.0 | 167.0 | 7.832335 | 911.259807 |
4 | E06000005 | Darlington | POLYGON ((423496.602 524724.299, 423497.204 52… | 719.0 | 76.0 | 9.460526 | 1188.923394 |
Handle outliers
There are some outliers on the top end, so anything outside 2sd will be amended.
# Loop through each variable and calculate mean, std, and threshold
for variable in ['PiVs', 'total_chargepoints', 'ratio_PiVs_to_chargepoints', 'avg_distance']:
= final_df[variable].mean()
mean_var = final_df[variable].std()
std_var = mean_var + 2 * std_var
threshold_var
# Create new column based on outlier condition
f'amended_{variable}'] = final_df[variable].apply(lambda x: threshold_var if x > threshold_var else x) final_df[
# Visual check
final_df.head()
LAD22CD | LAD22NM | geometry | PiVs | total_chargepoints | ratio_PiVs_to_chargepoints | avg_distance | amended_PiVs | amended_total_chargepoints | amended_ratio_PiVs_to_chargepoints | amended_avg_distance | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | E06000001 | Hartlepool | MULTIPOLYGON (((450154.599 525938.201, 450140…. | 410.0 | 42.0 | 9.761905 | 1649.498055 | 410.0 | 42.0 | 9.761905 | 1649.498055 |
1 | E06000002 | Middlesbrough | MULTIPOLYGON (((446854.700 517192.700, 446854…. | 391.0 | 51.0 | 7.666667 | 1088.305461 | 391.0 | 51.0 | 7.666667 | 1088.305461 |
2 | E06000003 | Redcar and Cleveland | MULTIPOLYGON (((451747.397 520561.100, 451792…. | 580.0 | 46.0 | 12.608696 | 1454.585351 | 580.0 | 46.0 | 12.608696 | 1454.585351 |
3 | E06000004 | Stockton-on-Tees | MULTIPOLYGON (((447177.704 517811.797, 447176…. | 1308.0 | 167.0 | 7.832335 | 911.259807 | 1308.0 | 167.0 | 7.832335 | 911.259807 |
4 | E06000005 | Darlington | POLYGON ((423496.602 524724.299, 423497.204 52… | 719.0 | 76.0 | 9.460526 | 1188.923394 | 719.0 | 76.0 | 9.460526 | 1188.923394 |
Visualising data
The following code builds an interactive map of the features in final_df
. This is saved down into an Outputs/
folder as an HTML file.
# Create the individual layers
= final_df.explore(
m ="amended_PiVs",
column="naturalbreaks",
scheme='YlOrRd',
cmap=False,
legend=10,
k=["LAD22CD", "LAD22NM", "amended_PiVs"],
tooltip=dict(color='black', weight=0.5, fillOpacity=0.8),
style_kwds=dict(fillOpacity=1),
highlight_kwds="Plug-in Vehicles",
name=True)
show
final_df.explore(=m,
m="amended_total_chargepoints",
column="naturalbreaks",
scheme='YlOrRd',
cmap=False,
legend=10,
k=["LAD22CD", "LAD22NM", "amended_total_chargepoints"],
tooltip=dict(color='black', weight=0.5, fillOpacity=0.8),
style_kwds=dict(fillOpacity=1),
highlight_kwds="Total EV Chargepoints",
name=False)
show
final_df.explore(=m,
m="amended_ratio_PiVs_to_chargepoints",
column="naturalbreaks",
scheme='YlOrRd',
cmap=False,
legend=10,
k=["LAD22CD", "LAD22NM", "amended_ratio_PiVs_to_chargepoints"],
tooltip=dict(color='black', weight=0.5, fillOpacity=0.8),
style_kwds=dict(fillOpacity=1),
highlight_kwds="Ratio of Plug-in Vehicles to EV Chargepoints",
name=False)
show
final_df.explore(=m,
m="amended_avg_distance",
column="naturalbreaks",
scheme='YlOrRd',
cmap=False,
legend=10,
k=["LAD22CD", "LAD22NM", "amended_avg_distance"],
tooltip=dict(color='black', weight=0.5, fillOpacity=0.8),
style_kwds=dict(fillOpacity=1),
highlight_kwds="Average distance to nearest EV Chargepoint by OA",
name=False)
show
# Add the map base
"CartoDB positron", show=True).add_to(m)
folium.TileLayer(
# Add layer control to map
folium.LayerControl().add_to(m)
# Save the map to an HTML file
"Outputs/Interactive_map.html") m.save(
The following code builds four choropleth maps of the features in final_df
. This is saved down into an Outputs/
folder as a .png file.
# Create 2x2 subplots
= plt.subplots(2, 2, figsize=(12, 12))
fig, axs
# Define aliases for column names
= {
column_aliases 'amended_PiVs': 'Plug-in Vehicles (PiVs)',
'amended_total_chargepoints': 'Total EV Chargepoints (EVCs)',
'amended_ratio_PiVs_to_chargepoints': 'Ratio of PiVs to EVCs',
'amended_avg_distance': 'Average distance to nearest EVC by OA (metres)'
}
# Loop through each variable and corresponding subplot
= list(column_aliases.keys())
variables for variable, ax in zip(variables, axs.flatten()):
=variable, cmap='YlOrRd', ax=ax, legend=True)
final_df.plot(columnf'{column_aliases[variable]}') # Set the title using the alias
ax.set_title(# Turn off axis
ax.set_axis_off()
=0.05, hspace=0.05) # Adjust space between subplots
plt.subplots_adjust(wspace# Adjust layout to prevent overlap
plt.tight_layout()
# Export to Outputs/ folder
'Outputs/Feature_choropleths.png', dpi=300)
plt.savefig(
# Close figure
plt.close(fig)
The following code creates two scatterplots with regression lines for selected features from the final_df
. This is saved down into an Outputs/
folder as a .png file.
# Create a subplot grid with 1 row and 2 columns
= plt.subplots(1, 2, figsize=(15, 6))
fig, axs
# Plot the first scatter plot
='amended_PiVs', y='amended_total_chargepoints', data=final_df, ax=axs[0])
sns.scatterplot(x='amended_PiVs', y='amended_total_chargepoints', data=final_df, scatter=False, ax=axs[0])
sns.regplot(x0].set_xlabel(column_aliases['amended_PiVs']) # Set x-axis label using alias
axs[0].set_ylabel(column_aliases['amended_total_chargepoints']) # Set y-axis label using alias
axs[0].set_title('Plug-in Vehicles vs. Total EV Chargepoints') # Set the title
axs[
# Plot the second scatter plot
='amended_ratio_PiVs_to_chargepoints', y='amended_avg_distance', data=final_df, ax=axs[1])
sns.scatterplot(x='amended_ratio_PiVs_to_chargepoints', y='amended_avg_distance', data=final_df, scatter=False, ax=axs[1])
sns.regplot(x1].set_xlabel(column_aliases['amended_ratio_PiVs_to_chargepoints']) # Set x-axis label using alias
axs[1].set_ylabel(column_aliases['amended_avg_distance']) # Set y-axis label using alias
axs[1].set_title('Ratio vs. Distance') # Set the title
axs[
# Adjust layout
plt.tight_layout()
# Export to Outputs/ folder
'Outputs/Scatter_Reg_Plots.png', dpi=300)
plt.savefig(
# Close figure
plt.close(fig)
The following code creates a correlation matrix heatmap of the features in final_df
. This is saved down into an Outputs/
folder as a .png file.
# Calculate the correlation matrix
= final_df[[
correlation_matrix 'amended_PiVs','amended_total_chargepoints',
'amended_ratio_PiVs_to_chargepoints','amended_avg_distance']].corr()
# Generate a mask for the diagonal cells
= np.triu(np.ones_like(correlation_matrix, dtype=bool))
mask
# Generate a heatmap
=(8, 7))
plt.figure(figsize='RdYlGn', annot=True,
sns.heatmap(correlation_matrix, cmap=".2f", mask=mask, vmin=-1, vmax=1, cbar_kws={"shrink": 0.75}, linewidth=.5)
fmt
# Rotate axis labels
=45)
plt.xticks(rotation=45)
plt.yticks(rotation
# Set axis labels using aliases and wrap text
=range(len(correlation_matrix.columns)),
plt.xticks(ticks=['Plug-in Vehicles',
labels'Total EV Chargepoints',
'Ratio of PiVs to EVCs',
'Avg dist to nearest EVC'])
=np.arange(len(correlation_matrix.columns))+0.5,
plt.yticks(ticks=['Plug-in Vehicles',
labels'Total EV Chargepoints',
'Ratio of PiVs to EVCs',
'Avg dist to nearest EVC'])
'Correlation Matrix Heatmap', fontweight='bold')
plt.title(
plt.tight_layout()
# Export to Outputs/ folder
'Outputs/CorrelationMatrix.png', dpi=300)
plt.savefig(
# Close figure
plt.close()
Creating a ranking
The following code creates a simple ranking of LADs based on:
- Average distance to the nearest EV chargepoint
- Ratio of Plug-in vehicles to EV chargepoints
The idea is that if the average distance is high, and the ratio is high, then these areas may be lacking public EV infrastructure.
# Create a new dataframe containing just the amended features for the ranking
= final_df.drop(final_df.columns[3:9], axis=1) ranking
# Rank the values in each column
'rank_ratio'] = ranking['amended_ratio_PiVs_to_chargepoints'].rank(ascending=True, method='min')
ranking['rank_distance'] = ranking['amended_avg_distance'].rank(ascending=True, method='min') ranking[
# Compute the combined ranking
'combined_rank'] = (ranking['rank_ratio'] + ranking['rank_distance']) / 2
ranking['combined_rank'] = ranking['combined_rank'].rank(ascending=True, method='min') ranking[
The following code creates a map of the top 10 combined_rank
LADs in ranking
. This is saved down into an Outputs/
folder as a .png file.
# Get top 10 areas
'LAD22CD','LAD22NM','combined_rank']].sort_values(by='combined_rank', ascending=True).head(10) ranking[[
LAD22CD | LAD22NM | combined_rank | |
---|---|---|---|
288 | E09000013 | Hammersmith and Fulham | 1.0 |
308 | E09000033 | Westminster | 2.0 |
303 | E09000028 | Southwark | 3.0 |
276 | E09000001 | City of London | 4.0 |
295 | E09000020 | Kensington and Chelsea | 4.0 |
299 | E09000024 | Merton | 6.0 |
307 | E09000032 | Wandsworth | 6.0 |
282 | E09000007 | Camden | 8.0 |
287 | E09000012 | Hackney | 9.0 |
294 | E09000019 | Islington | 9.0 |
# Sort and select the top 10 areas
= ranking.sort_values(by='combined_rank', ascending=True).head(10)
top_10_areas
# Plot only the top 10 areas
= plt.subplots(1, 1, figsize=(8, 10))
fig, ax =ax, color='#88C88E', edgecolor='black', linewidth=1, legend=True, alpha=0.5)
top_10_areas.plot(ax
# Annotate the top 10 areas with their rank
for idx, row in top_10_areas.iterrows():
=int(row['combined_rank']), xy=(row.geometry.centroid.x, row.geometry.centroid.y),
plt.annotate(text='center', fontsize=12, weight='bold', color='black')
horizontalalignment
# Plot basemap
=top_10_areas.crs.to_string(), source=ctx.providers.CartoDB.Positron)
ctx.add_basemap(ax, crs
# Set title and remove axis
'Top 10 Areas by Combined Rank', fontsize=12, fontweight='bold')
ax.set_title(
ax.set_axis_off()
# Adjust layout
plt.tight_layout()
# Export to Outputs/ folder
'Outputs/Map_of_Top_10.png', dpi=300)
plt.savefig(
# Close figure
plt.close(fig)
The following code creates a map of the bottom 10 combined_rank
LADs in ranking
. This is saved down into an Outputs/
folder as a .png file.
# Show bottom 10 areas
'LAD22CD','LAD22NM','combined_rank']].sort_values(by='combined_rank', ascending=False).head(10) ranking[[
LAD22CD | LAD22NM | combined_rank | |
---|---|---|---|
99 | E07000074 | Maldon | 330.0 |
89 | E07000064 | Rother | 329.0 |
235 | E07000242 | East Hertfordshire | 328.0 |
16 | E06000017 | Rutland | 327.0 |
100 | E07000075 | Rochford | 326.0 |
85 | E07000047 | West Devon | 325.0 |
177 | E07000169 | Selby | 324.0 |
160 | E07000139 | North Kesteven | 323.0 |
82 | E07000044 | South Hams | 321.0 |
199 | E07000198 | Staffordshire Moorlands | 321.0 |
# Sort and select the bottom 10 areas
= ranking.sort_values(by='combined_rank', ascending=False).head(10)
bottom_10_areas
# Plot only the top 10 areas
= plt.subplots(1, 1, figsize=(10, 10))
fig, ax =ax, color='#DE6464', edgecolor='black', linewidth=1, legend=True, alpha=0.5)
bottom_10_areas.plot(ax
# Annotate the top 10 areas with their rank
for idx, row in bottom_10_areas.iterrows():
=int(row['combined_rank']), xy=(row.geometry.centroid.x, row.geometry.centroid.y),
plt.annotate(text='center', fontsize=12, weight='bold', color='black')
horizontalalignment
# Plot basemap
=bottom_10_areas.crs.to_string(), source=ctx.providers.CartoDB.Positron)
ctx.add_basemap(ax, crs
# Set title and remove axis
'Bottom 10 Areas by Combined Rank', fontsize=12, fontweight='bold')
ax.set_title(
ax.set_axis_off()
# Adjust layout
plt.tight_layout()
# Export to Outputs/ folder
'Outputs/Map_of_Bottom_10.png', dpi=300)
plt.savefig(
# Close figure
plt.close(fig)