Clustergram: visualisation of cluster analysis

In this post, I introduce a new Python package to generate clustergrams from clustering solutions. The library has been developed as part of the Urban Grammar research project, and it is compatible with scikit-learn and GPU-enabled libraries such as cuML or cuDF within RAPIDS.AI. When we want to do some cluster analysis to identify groups in our data, we often use algorithms like K-Means, which require the specification of a number of clusters....

April 27, 2021 · 7 min

3 - 10 = 65529. What?

Yes, the formula above is correct. Well, it depends on what we mean by correct. NDVI does not make sense. Imagine the following situation. We have fetched a cloud-free mosaic of Sentinel-2 satellite data and want to measure NDVI (Normalised Difference Vegetation Index), which uses the red and near-infrared bands within this simple formula: NDVI = (NIR - Red) / (NIR + Red). The results are normalised, which in this case means that they lie between -1 and 1....
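The arithmetic in the title can be reproduced in a few lines. This is a minimal sketch, assuming the bands are stored as unsigned 16-bit integers (the excerpt does not say so, but 65529 = 65536 - 7 is what a uint16 wrap-around would give). The pixel values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical single-pixel band values; the uint16 dtype is an assumption.
red = np.array([10], dtype=np.uint16)
nir = np.array([3], dtype=np.uint16)

# Unsigned subtraction cannot go negative, so it wraps around modulo 2**16.
wrapped = nir - red
print(wrapped)  # [65529]

# Casting to float before the arithmetic keeps the sign,
# so NDVI stays within the expected range of -1 to 1.
nir_f = nir.astype(np.float64)
red_f = red.astype(np.float64)
ndvi = (nir_f - red_f) / (nir_f + red_f)
print(ndvi)
```

Casting to a floating-point type is one possible way to avoid the wrap-around; a signed integer type wide enough for the data would also keep the difference negative.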

January 16, 2021 · 2 min

The journey of an algorithm from QGIS to GeoPandas

This is a short story of one open-source algorithm and its journey from QGIS to mapclassify, to be used within GeoPandas. I am writing it to illustrate the flow within the open-source community because even though this happens all the time, we normally don’t talk about it. And we should. The story. Sometime last year, I asked myself a question. How hard would it be to port the topological colouring tool from QGIS to be used with GeoPandas?...

June 21, 2020 · 4 min

Line simplification algorithms

Sometimes our lines and polygons are way too complicated for our purpose. Let’s say that we have a beautiful shape of Europe, and we want to make an interactive online map using that shape. Soon we’ll figure out that the polygon has too many points, it takes ages to load, it consumes a lot of memory and, in the end, we don’t even see the full detail. To make things easier, we decide to simplify our polygon....

April 27, 2020 · 5 min