Spatial weights in PySAL

In this session, you will be learning the ins and outs of one of the key pieces in spatial analysis: spatial weights matrices. These are structured sets of numbers that formalise geographical relationships between the observations in a dataset. Essentially, a spatial weights matrix of a given geography is a positive definite matrix of dimensions \(N\) by \(N\), where \(N\) is the total number of observations:

\[ W = \left(\begin{array}{cccc} 0 & w_{12} & \dots & w_{1N} \\ w_{21} & \ddots & w_{ij} & \vdots \\ \vdots & w_{ji} & 0 & \vdots \\ w_{N1} & \dots & \dots & 0 \end{array} \right) \]

where each cell \(w_{ij}\) contains a value that represents the degree of spatial contact or interaction between observations \(i\) and \(j\). A fundamental concept in this context is that of neighbour and neighbourhood. By convention, elements in the diagonal (\(w_{ii}\)) are set to zero. A neighbour of a given observation \(i\) is another observation with which \(i\) has some degree of connection. In terms of \(W\), \(i\) and \(j\) are thus neighbors if \(w_{ij} > 0\). Following this logic, the neighbourhood of \(i\) will be the set of observations in the system with which it has a certain connection or those observations with a weight greater than zero.

There are several ways to create such matrices and many more to transform them so they contain an accurate representation that aligns with the way you understand spatial interactions between the elements of a system. This session will introduce the most commonly used ones and show how to compute them with PySAL.

1import matplotlib.pyplot as plt
import contextily
import geopandas as gpd
import numpy as np
import pandas as pd
import seaborn as sns

from libpysal import graph
1
A common standard is to do all the imports on top of your notebook.

Data

For this tutorial, you will use a dataset of Basic Settlement Units (ZSJ) in Prague for 2021. The table is available as part of this course, so it can be accessed remotely through the web. If you want to see how the table was created, a notebook is available here.

To make things easier, you will read data from a file posted online so, for now, you do not need to download any dataset:

prague = gpd.read_file(
    "https://martinfleischmann.net/sds/spatial_graphs/data/zsj_prague_2021.gpkg"
)
1prague = prague.set_index("NAZ_ZSJ")
prague.explore()
1
Use the name of each observation as an index. It will help you link them to the weights matrix.
Make this Notebook Trusted to load map: File -> Trust Notebook

Instead of reading the file directly off the web, it is possible to download it manually, store it on your computer, and read it locally. To do that, you can follow these steps:

  1. Download the file by right-clicking on this link and saving the file
  2. Place the file in the same folder as the notebook where you intend to read it
  3. Replace the code in the cell above with:
prague = gpd.read_file(
    "zsj_prague_2021.gpkg",
)

Building spatial weights in PySAL

Contiguity

Contiguity weights matrices define spatial connections through the existence of shared boundaries. This makes it directly suitable to use with polygons: if two polygons share boundaries to some degree, they will be labelled as neighbours under these kinds of weights. Exactly how much they need to share is what differentiates the two approaches we will learn: Queen and Rook.

Queen

Under the Queen criteria, two observations only need to share a vertex (a single point) of their boundaries to be considered neighbours. Constructing a weights matrix under these principles can be done by running:

1queen = graph.Graph.build_contiguity(prague, rook=False)
queen
1
The main class that represents a weights matrix is Graph. rook=False specifies to use Queen contiguity, while rook=True would create Rook contiguity.
<Graph of 953 nodes and 5670 nonzero edges indexed by
 ['Běchovice', 'Nová Dubeč', 'Benice', 'Březiněves', 'Dolní Černošice', ...]>

The command above creates an object queen of the class Graph. This is the format in which spatial weights matrices are stored in PySAL. By default, the weights builder (Graph.build_contiguity() will use the index of the table, which is useful so you can keep everything in line easily.

New Graph and old W

The graph module of libpysal is an implementation of spatial weights matrices released in September 2023. In the older resources, you will find the weights module and the W class instead. Graph will eventually replace W. Their API is similar, but there are some differences. Pay attention to the documentation when porting code from weights-based resources to graph-based implementation. Or use Graph.to_W() and Graph.from_W() to convert one to the other.

A Graph object can be queried to find out about the contiguity relations it contains. For example, if you would like to know who is a neighbour of observation Albertov:

queen['Albertov']
neighbor
Vodičkova                 1
Vojtěšský obvod           1
Štěpánský obvod-západ     1
Nuselské údolí            1
Vyšehrad                  1
Podskalí                  1
Zderaz                    1
Štěpánský obvod-východ    1
Folimanka-západ           1
Name: weight, dtype: int64

This returns a pandas.Series containing each neighbour’s ID codes as an index, with the weights assigned as values. Since you are looking at a raw Queen contiguity matrix, every neighbour gets a weight of one. If you want to access the weight of a specific neighbour, Zderaz for example, you can do recursive querying:

queen['Albertov']['Zderaz']
np.int64(1)

All of these relations can be easily visualised using Graph.explore() method you know from GeoPandas.

1m = prague.explore()
queen.explore(
2    prague, m=m, edge_kws=dict(style_kwds=dict(weight=1)), nodes=False
)
1
Use (optionally) polygons as the underlying layer providing context.
2
Graph itself represents only topological relations between geometries. To plot the graph on a map, you need to pass geometries representing graph nodes.
Make this Notebook Trusted to load map: File -> Trust Notebook