A note on Spatial Data Science across Languages, vol.1

I am sitting on a train back to Prague after two days of discussing tooling for spatial data science available in the Python, R and Julia ecosystems, with occasional excursions to the worlds of Rust, JavaScript or ESRI. I am coming back from the Spatial Data Science across Languages (SDSL) workshop and I’d like to share a few thoughts while they’re fresh. Different maturity of ecosystems: As a Python developer, I must admit that what the R-Spatial community managed to create is impressive and is in some aspects further than where we are....

September 20, 2023 · 4 min

Writing an efficient code for GeoPandas and Shapely in 2023

With the release of Shapely 2.0, GeoPandas-based code that was optimised years ago may no longer provide the best performance. The workshop organised during GeoPython 2023 together with Joris van den Bossche showed how to change that and write efficient and convenient GeoPandas code that takes advantage of the latest developments in the Python geospatial ecosystem. Workshop resources are available on GitHub. Annotation: The Python geospatial ecosystem is constantly evolving, rushing towards better usability, new features, fewer bugs and increasing performance....

May 2, 2023 · 2 min

Introduction to GeoPandas and its Python ecosystem

A talk from the OpenGeoHub Summer School 2022. Workshop materials and a recording are available. The ecosystem of packages for spatial data handling and analysis in Python is extensive and covers both vector and raster analytics from small to large distributed data. This talk covers only a small part, focusing on vector data processing with GeoPandas at its core. First, it covers what GeoPandas is and how it relates to other packages and combines them into a user-friendly API....

October 20, 2022 · 1 min

Understanding the structure of cities through the lens of data

The workshop organised together with James D. Gaboardi during the Spatial Data Science Symposium 2022 is now available online. See the recording below and access the workshop material on GitHub, from which you can even run the code online in your browser. Annotation: Martin & James will walk you through the fundamentals of analysing the structure of cities. You will learn what can be measured, how to do that in Python, and how to use those numbers to capture the patterns that make up cities....

September 29, 2022 · 1 min

Scaling up vector analysis with Dask-GeoPandas

The workshop organised during GeoPython 2022 together with Joris van den Bossche introduces the Dask-GeoPandas library and walks you through its key components, allowing you to take a GeoPandas workflow and run it in parallel, out-of-core and even distributed on a remote cluster. Workshop resources are available on GitHub. Annotation: The geospatial Python ecosystem provides a nice set of tools for working with vector data, including Shapely for geometry operations and GeoPandas to work with tabular data (and many other packages for IO, visualization, domain specific processing, …)....

June 22, 2022 · 2 min

Introducing Dask-GeoPandas for scalable spatial analysis in Python

Using Python for data science is usually a great experience, but if you’ve ever worked with pandas or GeoPandas, you may have noticed that they use only a single core of your processor. Especially on larger machines, that is a bit of a sad situation. Developers came up with many solutions to scale pandas, but the one that seems to take the lead is Dask. Dask (specifically dask.dataframe as Dask can do much more) creates a partitioned data frame, where each partition is a single pandas....

March 31, 2022 · 3 min

Capturing the Structure of Cities with Data Science

During the Spatial Data Science Conference 2021, I had a chance to deliver a workshop illustrating the application of PySAL and momepy in understanding the structure of cities. The recording is now available for everyone. The materials are available on my GitHub and you can even run the whole notebook in your browser using the MyBinder service.

November 2, 2021 · 1 min

xyzservices: a unified source of XYZ tile providers in Python

The Python ecosystem offers numerous tools for the visualisation of data on a map. A lot of them depend on XYZ tiles, providing a base map layer, either from OpenStreetMap, satellite or other sources. The issue is that each package that offers XYZ support manages its own list of supported providers. We have built the xyzservices package to support any Python library making use of XYZ tiles. I’ll try to explain the rationale behind it, without going into the details of the package....

August 3, 2021 · 4 min

Evolution of Urban Patterns: Urban Morphology as an Open Reproducible Data Science

We have a new paper published in Geographical Analysis on the opportunities that current developments in geographic data science within the Python ecosystem offer to urban morphology. To sum up: there’s a lot to play with and if you’re interested in the quantification of urban form, there’s no better choice for you at the moment. Urban morphology (the study of urban form) is historically a qualitative discipline that has only recently expanded into more data science-ish waters....

July 15, 2021 · 4 min

Clustergram: visualisation of cluster analysis

In this post, I introduce a new Python package to generate clustergrams from clustering solutions. The library has been developed as part of the Urban Grammar research project, and it is compatible with scikit-learn and GPU-enabled libraries such as cuML or cuDF within RAPIDS.AI. When we want to do some cluster analysis to identify groups in our data, we often use algorithms like K-Means, which require the specification of a number of clusters....

April 27, 2021 · 7 min