Introducing Dask-GeoPandas for scalable spatial analysis in Python

Using Python for data science is usually a great experience, but if you’ve ever worked with pandas or GeoPandas, you may have noticed that they use only a single core of your processor. Especially on larger machines, that is a bit of a sad situation. Developers came up with many solutions to scale pandas, but the one that seems to take the lead is Dask. Dask (specifically dask.dataframe as Dask can do much more) creates a partitioned data frame, where each partition is a single pandas....

March 31, 2022 · 3 min

Capturing the Structure of Cities with Data Science

During the Spatial Data Science Conference 2021, I had a chance to deliver a workshop illustrating the application of PySAL and momepy in understanding the structure of cities. The recording is now available for everyone. The materials are available on my GitHub and you can even run the whole notebook in your browser using the MyBinder service.

November 2, 2021 · 1 min

xyzservices: a unified source of XYZ tile providers in Python

A Python ecosystem offers numerous tools for the visualisation of data on a map. A lot of them depend on XYZ tiles, providing a base map layer, either from OpenStreetMap, satellite or other sources. The issue is that each package that offers XYZ support manages its own list of supported providers. We have built xyzservices package to support any Python library making use of XYZ tiles. I’ll try to explain the rationale why we did that, without going into the details of the package....

August 3, 2021 · 4 min

Evolution of Urban Patterns: Urban Morphology as an Open Reproducible Data Science

We have a new paper published in the Geographical Analysis on the opportunities current developments in geographic data science within the Python ecosystem offer to urban morphology. To sum up - there’s a lot to play with and if you’re interested in the quantification of urban form, there’s no better choice for you at the moment. Urban morphology (study of urban form) is historically a qualitative discipline that only recently expands into more data science-ish waters....

July 15, 2021 · 4 min

Clustergam: visualisation of cluster analysis

In this post, I introduce a new Python package to generate clustergrams from clustering solutions. The library has been developed as part of the Urban Grammar research project, and it is compatible with scikit-learn and GPU-enabled libraries such as cuML or cuDF within RAPIDS.AI. When we want to do some cluster analysis to identify groups in our data, we often use algorithms like K-Means, which require the specification of a number of clusters....

April 27, 2021 · 7 min

3 - 10 = 65529. What?

Yes, the formula above is correct. Well, it depends on what we mean by correct. NDVI does not make sense Imagine the following situation. We have fetched a cloud-free mosaic of Sentinel 2 satellite data and want to measure NDVI (Normalised difference vegetation index), which uses red and near-infrared bands within this simple formula. NDVI = (NIR - Red) / (NIR + Red) The results are normalised, which in this case means that they lie between -1 and 1....

January 16, 2021 · 2 min

The journey of an algorithm from QGIS to GeoPandas

This is a short story of one open-source algorithm and its journey from QGIS to mapclassify, to be used within GeoPandas. I am writing it to illustrate the flow within the open-source community because even though this happens all the time, we normally don’t talk about it. And we should. The story Sometimes last year, I asked myself a question. How hard would it be to port topological colouring tool from QGIS to be used with GeoPandas?...

June 21, 2020 · 4 min

Line simplification algorithms

Sometimes our lines and polygons are way too complicated for the purpose. Let’s say that we have a beautiful shape of Europe, and we want to make an interactive online map using that shape. Soon we’ll figure out that the polygon has too many points, it takes ages to load, it consumes a lot of memory and, in the end, we don’t even see the full detail. To make things easier, we decide to simplify my polygon....

April 27, 2020 · 5 min