Quiz on classification
Check how much you remember from previous sections by answering the questions below.
Which of the following statements best describes the difference between a Random Forest Regressor and a Random Forest Classifier?
✓A Random Forest Regressor predicts continuous values, whereas a Random Forest Classifier predicts discrete categories.
✗A Random Forest Regressor predicts discrete categories, while a Random Forest Classifier predicts continuous values.
✗Both Random Forest Regressor and Classifier handle categorical data only.
✗A Random Forest Classifier only outputs probability distributions.
Cohen’s Kappa is used to measure the agreement between predicted and actual classifications. What is a major advantage of using Kappa over simple accuracy?
✓Kappa corrects for the agreement expected by chance, which makes it more informative than accuracy when class distributions are imbalanced.
✗Kappa only focuses on true positives and ignores false negatives.
✗Kappa automatically adjusts for overfitting in the model.
✗Kappa increases as the number of classes increases, regardless of model performance.
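To see why chance correction matters, here is a minimal sketch that computes accuracy and Cohen's Kappa by hand on a made-up, imbalanced toy example. The function names and the data are illustrative assumptions, not part of the course material; scikit-learn users would normally call `sklearn.metrics.cohen_kappa_score` instead.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Note: Kappa is undefined when expected == 1 (only one class on both sides).
    return (observed - expected) / (1 - expected)

# Hypothetical imbalanced data: 9 "no", 1 "yes"; the model always predicts "no".
y_true = ["no"] * 9 + ["yes"]
y_pred = ["no"] * 10

acc = accuracy(y_true, y_pred)      # 0.9
kappa = cohen_kappa(y_true, y_pred) # 0.0
print(acc, kappa)
```

Accuracy looks high because the majority class dominates, while Kappa is zero because the model does no better than chance agreement.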
Which of the following is a correct method for performing spatial cross-validation in a spatial machine learning model?
✗Randomly split the dataset into training and test sets without considering the geographic locations of data points.
✗Use K-fold cross-validation by randomly assigning data points to folds, regardless of spatial proximity.
✓Ensure that training and test sets are drawn from spatially distinct regions, avoiding overlap between geographically close observations.
✗Use a single train-test split based on temporal ordering rather than spatial separation.
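One simple way to keep training and test sets spatially distinct is to assign points to grid blocks and hold out one block at a time. The sketch below assumes projected x/y coordinates and an arbitrary block size; the function name and the sample points are hypothetical. Points near a block boundary can still be close to each other, so real workflows often add a buffer or use larger blocks.

```python
def spatial_block_folds(coords, block_size):
    """Yield (train_idx, test_idx) pairs, holding out one grid block per fold."""
    blocks = {}
    for i, (x, y) in enumerate(coords):
        # Points falling in the same grid cell share a block key.
        key = (int(x // block_size), int(y // block_size))
        blocks.setdefault(key, []).append(i)
    for key, test_idx in blocks.items():
        train_idx = [i for k, idx in blocks.items() if k != key for i in idx]
        yield train_idx, test_idx

# Hypothetical point locations forming three spatial clusters.
coords = [(0.5, 0.5), (1.0, 1.5), (5.2, 0.3), (5.9, 1.1), (0.4, 5.5), (1.8, 5.1)]
folds = list(spatial_block_folds(coords, block_size=5.0))

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

The same block labels could also be passed as the `groups` argument to scikit-learn's `GroupKFold`, which keeps every group entirely in either the training or the test set.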
Which of the following is an example of feature engineering in spatial data analysis?
✗Using train_test_split to split your dataset.
✗Visualizing the distribution of data points on a map.
✗Reducing the number of independent variables to improve model training speed.
✓Measuring the distance between geographic points to generate a proximity variable.
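As an illustration of a distance-based proximity variable, the sketch below computes the great-circle (haversine) distance from each observation to its nearest reference point. The coordinates, the helper names, and the choice of haversine distance with a mean Earth radius of 6371 km are assumptions made for this example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    r = 6371.0  # mean Earth radius in km (assumed)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1 to guard against tiny floating-point overshoot.
    return 2 * r * math.asin(min(1.0, math.sqrt(a)))

def distance_to_nearest(point, targets):
    """Proximity feature: distance from `point` to the closest target, in km."""
    return min(haversine_km(point[0], point[1], t[0], t[1]) for t in targets)

# Hypothetical observations and reference locations as (lat, lon).
observations = [(0.0, 0.0), (0.0, 2.0)]
stations = [(0.0, 1.0), (10.0, 10.0)]

# One new feature value per observation.
dist_to_station_km = [distance_to_nearest(p, stations) for p in observations]
print(dist_to_station_km)
```

The resulting list can be added as a new column alongside the other predictors before model training.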
What does spatial dependence in a dataset imply?
✗The model predictions in one region depend on the values from distant, unrelated regions.
✗Spatial dependence means that the dataset is temporally autocorrelated.
✓Observations that are closer together tend to have similar values due to their spatial proximity.
✗The closer two points are, the less likely they are to have similar values.
Which of the following best defines spatial heterogeneity?
✗The tendency for variables to be uniformly distributed across space.
✓The variation in the relationships between variables across different locations.
✗The spatial uniformity in the feature importance of a model.
✗The lack of correlation between variables at different locations.