Quiz on classification

Check how much you remember from previous sections by answering the questions below.

Which of the following statements best describes the difference between a Random Forest Regressor and a Random Forest Classifier?

A Random Forest Regressor predicts continuous values, whereas a Random Forest Classifier predicts discrete categories.

A Random Forest Regressor predicts discrete categories, while a Random Forest Classifier predicts continuous values.

Both the Random Forest Regressor and the Random Forest Classifier handle categorical data only.

A Random Forest Classifier only outputs probability distributions.

Cohen’s Kappa is used to measure the agreement between predicted and actual classifications. What is a major advantage of using Kappa over simple accuracy?

Kappa corrects for the agreement expected by chance, so it is less inflated than accuracy when class distributions are imbalanced.

Kappa only focuses on true positives and ignores false negatives.

Kappa automatically adjusts for overfitting in the model.

Kappa increases as the number of classes increases, regardless of model performance.
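
For reference, the relationship between accuracy and Kappa can be checked by hand. The sketch below is a minimal pure-Python version of the usual formula, Kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function names and the labels are made up for illustration and are not taken from the course material.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    # Agreement expected by chance, from the marginal class frequencies.
    p_e = sum((true_counts[c] / n) * (pred_counts[c] / n) for c in labels)
    if p_e == 1.0:
        return float("nan")  # undefined when only one class ever occurs
    return (p_o - p_e) / (1 - p_e)


# Made-up, imbalanced labels: 90 "no" and 10 "yes".
y_true = ["no"] * 90 + ["yes"] * 10
# A trivial model that always predicts the majority class.
y_pred = ["no"] * 100

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

On this toy data the majority-class model reaches an accuracy of 0.90 but a Kappa of 0.00, because all of its agreement with the true labels is what chance alone would produce.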

Which of the following is a correct method for performing spatial cross-validation in a spatial machine learning model?

Randomly split the dataset into training and test sets without considering the geographic locations of data points.

Use K-fold cross-validation by randomly assigning data points to folds, regardless of spatial proximity.

Ensure that training and test sets are drawn from spatially distinct regions, avoiding overlap between geographically close observations.

Use a single train-test split based on temporal ordering rather than spatial separation.
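
As an illustration of spatially separated folds, the sketch below groups observations into square grid blocks and holds out whole blocks at a time. It is one simple blocking scheme among several, not a definitive implementation: the function name, block size, and coordinates are assumptions, and no buffer is applied between neighbouring blocks, so points near a block edge can still be close to training data.

```python
import math


def spatial_block_folds(coords, block_size, n_folds):
    """Yield (train_idx, test_idx) pairs where whole grid blocks are held out.

    coords: list of (x, y) tuples in projected units (e.g. metres).
    block_size: side length of the square blocks, in the same units.
    """
    # Map every observation to the grid block that contains it.
    block_of = [
        (math.floor(x / block_size), math.floor(y / block_size))
        for x, y in coords
    ]
    # Assign blocks (not individual points) to folds, round-robin.
    blocks = sorted(set(block_of))
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}

    for fold in range(n_folds):
        test_idx = [i for i, b in enumerate(block_of) if fold_of_block[b] == fold]
        train_idx = [i for i, b in enumerate(block_of) if fold_of_block[b] != fold]
        yield train_idx, test_idx


# Made-up coordinates: a 6 x 6 lattice of points spaced 10 units apart.
coords = [(10 * i, 10 * j) for i in range(6) for j in range(6)]
folds = list(spatial_block_folds(coords, block_size=20, n_folds=3))
for k, (train_idx, test_idx) in enumerate(folds):
    print(f"fold {k}: {len(train_idx)} train, {len(test_idx)} test")
```

Because folds are built from blocks rather than from individual points, no block contributes observations to both the training and the test set of the same fold.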

Which of the following is an example of feature engineering in spatial data analysis?

Using train_test_split to split your dataset.

Visualizing the distribution of data points on a map.

Reducing the number of independent variables to improve model training speed.

Measuring the distance between geographic points to generate a proximity variable.
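
A proximity variable of this kind can be sketched in a few lines. The example below uses the haversine formula to compute, for each observation, the great-circle distance to the nearest of a set of reference locations. The function names, the "facilities" list, and all coordinates are hypothetical, and the Earth is treated as a sphere with a mean radius of 6371 km, which is an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest_km(points, facilities):
    """For each point, return the distance to its closest facility."""
    return [
        min(haversine_km(lat, lon, f_lat, f_lon) for f_lat, f_lon in facilities)
        for lat, lon in points
    ]


# Hypothetical observations and reference locations, as (lat, lon) in degrees.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
facilities = [(0.0, 0.0), (2.0, 2.0)]

# The new feature: one proximity value per observation.
proximity = distance_to_nearest_km(points, facilities)
print([round(d, 1) for d in proximity])
```

The resulting list can be added as a new column alongside the existing independent variables before the model is trained.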

What does spatial dependence in a dataset imply?

The model predictions in one region depend on the values from distant, unrelated regions.

The dataset is temporally autocorrelated.

Observations that are closer together tend to have similar values due to their spatial proximity.

The closer two points are, the less likely they are to have similar values.
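
One common way to quantify whether nearby observations have similar values is Moran's I, which is not named in this quiz and is used here only as an illustration. The sketch below assumes binary weights (1 for neighbours, 0 otherwise) and a made-up one-dimensional chain of locations in which each location neighbours the one before and after it. All names and values are hypothetical.

```python
def morans_i(values, neighbours):
    """Moran's I with binary weights.

    values: list of observations, one per location.
    neighbours: dict mapping each location index to a list of neighbour indices.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of cross-products of deviations over all neighbouring pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    # Total weight: number of (ordered) neighbour pairs.
    total_weight = sum(len(js) for js in neighbours.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


def chain_neighbours(n):
    """Neighbours on a line: each location touches the previous and next one."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


# Made-up values: a smooth gradient and a strictly alternating pattern.
smooth = [1, 2, 3, 4, 5]
alternating = [1, -1, 1, -1]

i_smooth = morans_i(smooth, chain_neighbours(len(smooth)))
i_alternating = morans_i(alternating, chain_neighbours(len(alternating)))
print(f"smooth: {i_smooth:.2f}, alternating: {i_alternating:.2f}")
```

On these toy inputs the smooth gradient gives a positive value (0.50), meaning neighbours resemble each other, while the alternating pattern gives a negative value (-1.00), meaning neighbours differ.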

Which of the following best defines spatial heterogeneity?

The tendency for variables to be uniformly distributed across space.

Variation in the relationships between variables across different locations.

The spatial uniformity in the feature importance of a model.

The lack of correlation between variables at different locations.