Is there space in machine learning?

Subsets of data science

Supervised

Classification problems

Regression problems

Unsupervised

Clustering

Dimensionality reduction
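
The two unsupervised tasks above can be sketched in a few lines. This is a minimal illustration, assuming scikit-learn and synthetic blob data; `KMeans` and `PCA` are picked here only as common examples of clustering and dimensionality reduction, not as methods named in the slides.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic data (assumption): 150 points in 5 dimensions around 3 centres.
X, _ = make_blobs(n_samples=150, centers=3, n_features=5, random_state=0)

# Clustering: assign each observation to one of three groups, without labels.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project the 5 features onto 2 components.
X_2d = PCA(n_components=2).fit_transform(X)

print(labels[:10], X_2d.shape)
```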

Models

Linear regression

Logistic regression

Decision trees

Random forest

Gradient-boosted trees

Neural networks
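
As a rough sketch, the models listed above all share the same `fit` / `predict` / `score` interface in scikit-learn. The estimator classes and settings below are one possible mapping chosen for illustration (an assumption, not taken from the slides), and the data are synthetic.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption), one regression and one classification problem.
X_reg, y_reg = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
X_clf, y_clf = make_classification(n_samples=200, n_features=5, random_state=0)

# Linear regression predicts a continuous target.
linear = LinearRegression().fit(X_reg, y_reg)

# The remaining models are shown here as classifiers.
classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient-boosted trees": GradientBoostingClassifier(random_state=0),
    "neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
}

# Training accuracy only; an honest evaluation needs held-out data.
train_accuracy = {name: model.fit(X_clf, y_clf).score(X_clf, y_clf)
                  for name, model in classifiers.items()}
print(train_accuracy)
```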

Workflow

Split into training and test sets

Fit the model

Evaluate

(data standardisation)
Hyper-parameter tuning
Data augmentation
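
The workflow steps above can be sketched end to end as follows. This is a minimal example under stated assumptions: scikit-learn, synthetic regression data, a random forest as the model, and a deliberately tiny parameter grid. Data augmentation is left out because it depends on the kind of data at hand.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# 1. Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 2. Chain the optional standardisation with the model, so the scaler
#    is fitted on training data only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestRegressor(n_estimators=20, random_state=0)),
])

# 3. Hyper-parameter tuning: cross-validated grid search on the training set,
#    which also fits the best model.
search = GridSearchCV(pipeline, {"model__max_depth": [3, None]}, cv=3)
search.fit(X_train, y_train)

# 4. Evaluate on the untouched test set (R^2 for regressors).
test_r2 = search.score(X_test, y_test)
print(search.best_params_, round(test_r2, 3))
```

Keeping the scaler inside the pipeline means each cross-validation fold is standardised using its own training part, which avoids leaking information from the validation part.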

A visual explanation

By R2D3 (Stephanie Yee and Tony Chu)

Evaluation methods

Regression

Residuals

\(R^2\)

Mean absolute error

Mean squared error
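
A small worked example of the regression metrics above, computed by hand with NumPy and checked against `sklearn.metrics`. The four observed and predicted values are made up for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical observed and predicted values.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 11.0])

# Residuals: observed minus predicted.
residuals = y_true - y_pred                      # [ 1.  0. -1. -2.]

# Mean absolute error and mean squared error.
mae = np.mean(np.abs(residuals))                 # 1.0
mse = np.mean(residuals ** 2)                    # 1.5

# R^2: 1 minus residual sum of squares over total sum of squares.
ss_res = np.sum(residuals ** 2)                  # 6.0
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # 20.0
r2 = 1 - ss_res / ss_tot                         # 0.7

# The scikit-learn functions give the same numbers.
assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
assert np.isclose(mse, mean_squared_error(y_true, y_pred))
assert np.isclose(r2, r2_score(y_true, y_pred))
print(mae, mse, r2)
```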

Spatial?

Explicitly spatial ML is rare.

The spatial dimension is usually squeezed into
non-spatial models and methods.

Data leakage

Spatial cross-validation
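
One common way to approximate spatial cross-validation with non-spatial tooling is to group observations into spatial blocks and keep each block entirely in either the training or the test fold. The sketch below assumes scikit-learn's `GroupKFold`, synthetic point coordinates, and a regular 4 x 4 grid of blocks; the block size and the use of coordinates as features are illustrative choices, not a recommendation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic points in a 10 x 10 area with a spatially smooth target (assumption).
n = 400
coords = rng.uniform(0, 10, size=(n, 2))
y = np.sin(coords[:, 0]) + np.cos(coords[:, 1]) + rng.normal(0, 0.1, n)
X = coords

# Assign every point to one of 16 square blocks (2.5 x 2.5 units each).
blocks = (coords[:, 0] // 2.5).astype(int) * 4 + (coords[:, 1] // 2.5).astype(int)

model = RandomForestRegressor(n_estimators=50, random_state=0)

# Random folds: neighbouring points end up on both sides of the split.
random_scores = cross_val_score(
    model, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0)
)

# Spatial folds: whole blocks are held out together.
spatial_cv = GroupKFold(n_splits=5)
spatial_scores = cross_val_score(model, X, y, groups=blocks, cv=spatial_cv)

print("random folds :", random_scores.mean())
print("spatial folds:", spatial_scores.mean())
```

With spatially autocorrelated data, random folds tend to give more optimistic scores than blocked folds, because nearby training points leak information about the test points.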

Spatial evaluation

Spatial patterns of errors
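
One possible way to check whether errors show a spatial pattern is to measure the spatial autocorrelation of the residuals, for example with Moran's I. The sketch below is a plain NumPy implementation with binary distance-band weights; the helper `morans_i`, the grid layout, and the two residual fields are hypothetical and only illustrate the idea.

```python
import numpy as np

def morans_i(values, coords, threshold):
    """Moran's I with binary weights: 1 for pairs within `threshold`, else 0."""
    z = values - values.mean()
    # Pairwise Euclidean distances between all points.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Neighbours are distinct points no farther apart than the threshold.
    w = ((d > 0) & (d <= threshold)).astype(float)
    return len(values) / w.sum() * (z @ w @ z) / (z @ z)

# Points on a regular 10 x 10 grid (assumption).
xs, ys = np.meshgrid(np.arange(10.0), np.arange(10.0))
coords = np.column_stack([xs.ravel(), ys.ravel()])

# Residuals with a west-east trend: the model is systematically off by location.
trend_residuals = coords[:, 0] - 4.5

# Residuals with no spatial structure.
rng = np.random.default_rng(0)
random_residuals = rng.normal(size=len(coords))

trend_i = morans_i(trend_residuals, coords, threshold=1.0)
random_i = morans_i(random_residuals, coords, threshold=1.0)
print(trend_i, random_i)
```

Values well above zero suggest that similar errors cluster in space, while values near zero suggest no clear spatial pattern. Mapping the residuals is the natural visual counterpart of this check.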

import sklearn