Graph here, graph there
It is time to work with spatial weights matrices by yourself.
Zones of suburbanisation
You are familiar with Prague from the last section, so let’s zoom out to zones of suburbanisation around Prague and other Czech cities. Head over to the DataHub of the Faculty of Science and download the dataset called “Zóny rezidenční suburbanizace 2008-2016” containing the zones of residential suburbanisation outlined by Ouřednı́ček, Klsák, and Špačková (2019). Download the dataset and open it with geopandas
. Use "OBJECTID"
column as an index (other feasible columns are not unique - they contain duplicated entries).
Interaction with Graphs
- Create a contiguity matrix using the Queen criterion
- Let’s focus on Prague (ID
891
in the table). How many neighbours does it have? - Reproduce the previous section’s zoom plot with Prague and its neighbours. Can you make that plot as both static and interactive maps?
- Create a block spatial weights matrix where every geometry is connected to other geometries in the NUTS2 region.
- Create KNN weights with 5 neighbours. Remember that KNN expects point geometry.
- Compare the number of neighbours by geometry for the three weights matrices. Which one has more? Why? Can you compare distributions of number of neighbors as kde plots?
Spatial lag
Let’s have a look at spatial lag. Before proceeding, you will probably need to pre-process the column with the population ("obyv_31122"
) since it comes as string
. Assuming the GeoDataFrame
is called suburbanisation
, you can do the following to cast it to float
.
"obyv_31122"] = (
suburbanisation["obyv_31122"].replace("?", None).astype(float)
suburbanisation[ )
- Measure spatial lag (as mean, so don’t forget to standardise your weights) of the
"obyv_31122"
column using all weights matrices you have created. - What is the difference in results for Prague? Can you explain why?