Jupyter

Welcome to the first hands-on section of the course. You will familiarise yourself with the tools you will be using, ensure that all works as it should, and prepare for some Python code.

Do you have Python ready?

Before you start, ensure you have carefully read and followed the instructions outlined in the Infrastructure section and have a working installation of Python and some Jupyter Notebook IDE, either via Jupyter Lab (recommended) or on Google Colab.

Shell

Spatial Data Science depends on code, and coding environments can be unfriendly to an average user. People designing the tools are often computer scientists or have a strong knowledge of CS-related environments. It means we sometimes need to deal with tools that look a bit scary, like a Terminal or a Command line. Below is a brief introduction to the tools you will need for this course.

Terminal and Command line

Depending on your operating system, you will have either Terminal (macOS, Linux) or Command Prompt application installed. It will look like this:

Terminal on macOS

Terminal (and Command Prompt or Command line, but we will refer to all as the terminal for simplicity) is used to interact with applications that do not have any graphic interface or with the apps that do have one, but you want to use them programmatically. The terminal usage is straightforward. Let’s start with a few examples.

  1. Create a folder (or maybe you already have) to store files for this course.
  2. Download this notebook by clicking on the Jupyter option on the right side of this page and move the notebook to the folder.

You want to see a list of files and folders you have in the folder. First, you need to navigate to the folder. For that, you can use the cd command, which stands for change directory.

cd courses/sds/

Let’s assume that you have the folder with course material in the folder called sds in another folder called courses. The full command is then composed of the cd part, saying set the current directory to… and waits for the parameter, which is a path to the folder in this case - courses/sds/.

Once in the correct folder, you can use another command, ls, which stands for list and allows you to list the contents of the current directory.

ls

The output would look like this:

Output of the ls command.

You can also pass a parameter -l, specifying that you want a long listing including attributes.

ls -l

That changes the output to this:

Output of the ls -l command.

The syntax is always the same, starting with the app name and then followed by parameters.

Jupyter Notebook and Jupyter Lab

While you can interact with Python from the terminal, it is inconvenient. Instead, you will use Jypyter Notebooks and Jupyter Lab. Jupyter Notebooks are documents that allow you to mix text and code, execute small pieces of code one at a time and include graphical outputs of your code. Jupyter Lab is a handy interface that allows you to work with multiple notebooks and switch between your Python environments created with Pixi.

It is time to say goodbye to the terminal and start Jupyter Lab.

First, avigate to the template folder. Then you can start Lab using:

pixi run jupyter lab

This command should open your browser and load the Jupyter Lab interface.

Jupyter Lab interface

In the launcher, we can create a new Notebook by clicking on the Python logo representing our current environment. If you have more of them, you will see them there, as well as other environments using different programming languages like R or Julia.

The notebook is composed of cells. This is a cell:

Jupyter Notebook cell

Cells can contain either code or text. A typical notebook is then a series of cells where some include text describing what is happening while others contain the code, either waiting for execution or already executed. The cells with the executed code may also contain outputs.

You can start with simple math that Python can do natively. Run the following code cell. To do that, you can either click the “play” button on top or hit Shift + Enter:

1 + 1
2

You now have a code cell with the output. Jupyter Lab automatically created a new cell. Change its type to Markdown and write a short text describing what the cell above does.

Markdown

Text cells can be formatted using the Markdown syntax. Markdown is a way of using plain text to write documents that can later be formatted by a rendering engine (Jupyter Lab is the rendering engine in this case). Unlike \(\LaTeX\), it is very lightweight.

In a cell of a Markdown type, you can:

- create
- simple
- lists
  • create
  • simple
  • lists

If you need numbering, you can:

1. create
2. numbered
3. lists
    1. or nested
    2. lists within lists
4. that may continue below
  1. create
  2. numbered
  3. lists
    1. or nested
    2. lists within lists
  4. that may continue below

You can also include links:

Wrap a word that shall be a [link](https://github.com) in square brackets followed by
round brackets with the URL.

Wrap a word that shall be a link in square brackets followed by round brackets with the URL.

Or include images, using a very similar syntax.

![Alt text describing the figure (optional)](https://imgs.xkcd.com/comics/standards.png)

Alt text describing the figure (optional)

Markdown has a few flavours, which are often compatible but some marks are unique to distinct flavours. You can therefore write italic using underscores _italic_ or using stars *italic* around the text you want to adapt. The same applies to bold, with two underscores or stars **bold**.

You can further include mathematical formulas using \(\LaTeX\). Either using the inline syntax $e = mc^2$ which renders as \(e = mc^2\) or as a proper formula:

$$
I = \frac N W \frac {\sum_{i=1}^N \sum_{j=1}^N w_{ij}(x_i-\bar x) (x_j-\bar x)} {\sum_{i=1}^N (x_i-\bar x)^2}
$$

\[ I = \frac N W \frac {\sum_{i=1}^N \sum_{j=1}^N w_{ij}(x_i-\bar x) (x_j-\bar x)} {\sum_{i=1}^N (x_i-\bar x)^2} \]

Markdown can also do tables, like this one:

| Language | Year |
| -------- | ---- |
| Python   | 1991 |
| R        | 1993 |
| Julia    | 2012 |
| Rust     | 2015 |
Language Year
Python 1991
R 1993
Julia 2012
Rust 2015

For structuring the documents, Markdown offers up to six levels of headings.

# Heading level 1

### Heading level 3

##### Heading level 5

Heading level 1

Heading level 3

Heading level 5

While you can also use basic HTML within Markdown (the cell is rendered in the browser anyway), please try to avoid doing so and stick to vanilla Markdown if possible. The inclusion of HTML tends to break some rendering engines and alter the expected visual style of the resulting document.

Notebook essentials

Jupyter Notebooks are great and have a lot of built-in features that can improve your quality of life.

Execution

While you can run the code cell by clicking on the ▶ button on the top, try creating muscle memory for Shift + Enter instead. You’ll be much faster.

Tab completion

Learn to use Tab. It provides you with all possible actions you can do after loading in a library and it is used for automatic autocompletion. Try hitting Tab while writing the code below.

import math

math.pi
3.141592653589793

As you have noticed, Jupyter gives you options on what you can import or use. It also remembers all the variables so you don’t have to type them next time.

variable_name_that_got_super_long = 42
1 + variable_name_that_got_super_long
43

Contextual help

Each Python function comes with documentation. You can fetch and display it directly in the Notebook using Shift + Tab. Try using it in the cell below.

math.sin(math.pi)
1.2246467991473532e-16

Cell mode

Each cell can be either in the edit mode or in the command mode. Edit mode allows you to change the contents of the cell. In the case of code cells, it looks nearly the same as the command mode but in the case of Markdown cells, edit mode shows the plain text with the Markdown syntax while the command mode shows the rendered text. You can enter the edit mode by hitting Enter.

Command mode moves you away from editing individual cells and allows you to use keyboard shortcuts to manipulate the notebook. You can enter the command mode by hitting Esc.

Shortcuts

When you are in command mode, you can use a wide range of keyboard shortcuts. For example:

  • A creates a new cell above the selected one.
  • B creates a new cell below the selected one.
  • C copies the selected cell, while V pastes it below.
  • D + D deletes the cell.
  • M changes the cell type to Markdown.
  • Y changes the cell type to code.
  • Ctrl + Shift + C / Cmd + Shift + C opens a command palette.

I am stuck!

Things may get stuck or break completely. When that happens:

  • first, try Kernel > Interrupt -> your cell should stop running
  • if that does not help -> Kernel > Restart -> restart your notebook
Python basics

This course won’t cover the very basics of Python. You may be able to catch it along the way but if you prefer more explicit material, have a look at A taste of Python section of the Geo-Python course by D. Whipp, H. Tenkanen, V. Heikinheimo, H. Aagesen, and C. Fink.

Acknowledgements

The Jupyter Notebook section is inspired by Data manipulation, analysis and visualisation in Python by Joris Van den Bossche and Stijn Van Hoey licensed under CC-BY 4.0.