Quiz on classification

Check how much you remember from previous sections by answering the questions below.

Which of the following statements best describes the difference between a Random Forest Regressor and a Random Forest Classifier?

✓A Random Forest Regressor predicts continuous values, whereas a Random Forest Classifier predicts discrete categories.

✗A Random Forest Regressor predicts discrete categories, while a Random Forest Classifier predicts continuous values.

✗Both Random Forest Regressor and Classifier handle categorical data only.

✗A Random Forest Classifier only outputs probability distributions.

Cohen’s Kappa is used to measure the agreement between predicted and actual classifications. What is a major advantage of using Kappa over simple accuracy?

✓Kappa corrects for the agreement expected by chance, so it is not inflated by imbalanced class distributions.

✗Kappa only focuses on true positives and ignores false negatives.

✗Kappa automatically adjusts for overfitting in the model.

✗Kappa increases as the number of classes increases, regardless of model performance.
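
To make the chance-agreement idea concrete, here is a minimal sketch that computes Cohen's Kappa by hand as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The labels and the helper name `cohen_kappa` are invented for illustration, and the function assumes p_e is below 1.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e). Assumes p_e < 1."""
    n = len(y_true)
    # Observed agreement: share of positions where the labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label counts, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical imbalanced labels: 90 "no" and 10 "yes".
y_true = ["no"] * 90 + ["yes"] * 10
# A model that always predicts the majority class.
y_pred = ["no"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
print(accuracy)  # 0.9
print(kappa)     # 0.0
```

The majority-class model looks strong on accuracy but scores zero on Kappa, because all of its agreement is what chance alone would produce.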

Which of the following is a correct method for performing spatial cross-validation in a spatial machine learning model?

✗Randomly split the dataset into training and test sets without considering the geographic locations of data points.

✗Use K-fold cross-validation by randomly assigning data points to folds, regardless of spatial proximity.

✓Ensure that training and test sets are drawn from spatially distinct regions, avoiding overlap between geographically close observations.

✗Use a single train-test split based on temporal ordering rather than spatial separation.
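
One simple way to keep training and test sets spatially distinct is to group observations into grid cells and hold out one cell at a time. The sketch below assumes projected (x, y) coordinates and an arbitrary cell size; the function name `spatial_block_folds` and the sample points are hypothetical.

```python
from collections import defaultdict

def spatial_block_folds(coords, cell_size):
    """Yield (train_idx, test_idx) pairs, holding out one grid cell per fold."""
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        # Points that fall in the same grid cell share a block id.
        blocks[(int(x // cell_size), int(y // cell_size))].append(i)
    for block_id in sorted(blocks):
        test_idx = blocks[block_id]
        # Train on every point that lies outside the held-out cell.
        train_idx = sorted(
            i for b, idx in blocks.items() if b != block_id for i in idx
        )
        yield train_idx, test_idx

# Hypothetical point coordinates in projected units.
coords = [(0.5, 0.5), (0.8, 0.2), (5.1, 0.3), (5.9, 0.9), (0.2, 5.5), (5.5, 5.5)]
folds = list(spatial_block_folds(coords, cell_size=5.0))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Grid blocking does not separate points that sit just across a cell boundary, so a buffer or larger cells may still be needed. With scikit-learn, the same block ids can be passed as `groups` to `GroupKFold`.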

Which of the following is an example of feature engineering in spatial data analysis?

✗Using train_test_split to split your dataset.

✗Visualizing the distribution of data points on a map.

✗Reducing the number of independent variables to improve model training speed.

✓Measuring the distance between geographic points to generate a proximity variable.
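
As an illustration of a proximity variable, the sketch below derives a "distance to nearest station" feature from latitude and longitude using the haversine formula. The house and station coordinates are made up, and the Earth radius of 6371 km is the usual spherical approximation.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * r * math.asin(min(1.0, math.sqrt(a)))

# Hypothetical (lat, lon) locations.
houses = [(50.08, 14.43), (50.10, 14.39), (50.05, 14.47)]
stations = [(50.083, 14.435), (50.109, 14.440)]

# New feature: distance from each house to its nearest station.
dist_to_nearest_station = [
    min(haversine_km(h_lat, h_lon, s_lat, s_lon) for s_lat, s_lon in stations)
    for h_lat, h_lon in houses
]
print(dist_to_nearest_station)
```

The resulting list can be attached to the dataset as an additional column and passed to the model alongside the original variables.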

What does spatial dependence in a dataset imply?

✗The model predictions in one region depend on the values from distant, unrelated regions.

✗Spatial dependence means that the dataset is temporally autocorrelated.

✓Observations that are closer together tend to have similar values due to their spatial proximity.

✗The closer two points are, the less likely they are to have similar values.
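
Spatial dependence is often quantified with Moran's I, a standard measure of spatial autocorrelation that the questions above do not name. The sketch below is a hedged illustration on a toy chain of five locations, where each location neighbours only the ones next to it; the helper names and values are invented for the example.

```python
def morans_i(values, weights):
    """Global Moran's I for values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # Sum of all weights.
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in z)
    return (n / s0) * (num / den)

def chain_weights(n):
    """Binary weights for locations on a line: 1 for adjacent pairs, else 0."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = chain_weights(5)
smooth = [1, 2, 3, 4, 5]        # Neighbours have similar values.
alternating = [1, 5, 1, 5, 1]   # Neighbours have dissimilar values.

i_smooth = morans_i(smooth, w)
i_alternating = morans_i(alternating, w)
print(i_smooth)       # Positive: nearby values are alike.
print(i_alternating)  # Negative: nearby values differ.
```

A positive value corresponds to the situation described in the correct answer, where closer observations tend to have similar values.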

Which of the following best defines spatial heterogeneity?

✗The tendency for variables to be uniformly distributed across space.

✓Variation in the relationships between variables across different locations.

✗The spatial uniformity in the feature importance of a model.

✗The lack of correlation between variables at different locations.