Learning GeoPandas
This section is about playing with geopandas
by yourself. You will zoom at Prague and the distribution of land price, provided by the Institute of Planning and Development.
Data Preparation
You will load two datasets. One with the price data and the other with the boundaries of municipal districts. Below are the links to both. Your first task is to figure out how to load them as two GeoDataFrames
, one called price
and the other called districts
.
You have two options: either download and read from disk or pass the URL of the actual file to geopandas.read_file
.
Once you have your price
dataset loaded, there is one cleaning step you need to do. The column with the price called "CENA"
is encoded as strings, not numbers. You need to replace the "N"
string with None
and convert the column to float
numbers.
"CENA"] = price["CENA"].replace("N", None).astype('float') price[
The rest is up to you!
Map Making
Create a map of price distribution with the overlay of the district boundaries. Start with static or interactive, and try replicating the other once you’re done if the time permits. A few requirements:
- Plot the price.
- Plot boundaries on top. Try different colours to get a nice combination of colours.
- Show a legend.
- Use CartoDB Voyager or CartoDB Dark Matter basemap.
- Can you figure out how to change the colormap?
- Can you change the transparency of polygons?
- Colormap is controlled via the
cmap=
keyword. - To plot a boundary, it may be easier to work with LineStrings than Polygons.
- Check the additional reading materials linked in the previous section to figure out how to add basemap to static plots.
- Transparency is different in static and interactive maps. Check the documentation!
Measuring
Practice using geometric methods and properties GeoDataFrame
offers.
- Create a convex hull around each polygon in
price
. - Calculate the area of these convex hulls.
- Find the convex hulls with an area between 40th a 60th percentile in the GeoDataFrame. Create a new object (e.g.
average
) only with them. - Create a multi-layer map of Prague where the smallest areas are coloured in one colour, and the rest appear in another. Try making it legible enough.
Joining
Join the two GeoDataFrame
using .sjoin()
or .overlay()
methods to get data per district.
- Is the mean price higher in Praha 3 or Praha 6?
- Which district is the cheapest?
- What is the difference between the cheapest and the most expensive one?
You may need to use .groupby()
after joining.
For those of you who don’t speak Czech, the district names are encoded in the "NAZEV_1"
column.