See also Nan Nourn’s example code with expanded bioclim options for Lab 8 & Lab 9
This lab is a continuation of species distribution models (SDMs)
using the R package, dismo.
The package contains a great vignette and you worked through the first
part of it in Lab 8. You should complete Lab 8’s portion of the vignette
with your species of choice before starting on Lab 9.
For this lab, you will work through Part 2 (Chapters 5-7: Model fitting, prediction, and evaluation) of the dismo vignette. You can find the vignette here: https://rspatial.org/raster/sdm/index.html.
Note that if you want to use a species that does not occur in the dismo vignette bioclim variable range (Mexico, Central & South America), then you need to pull in environmental data from other regions. To do this please see the “Environmental Data” section of the additional code provided in Nan Nourn’s example code with expanded bioclim options for Lab 8 & Lab 9 - this is also linked in Lab 8.
In Ch. 4 of the vignette, you were introduced to environmental
predictors, and specifically bioclimatic variables. The dismo vignette
reads in WorldClim bioclimatic
variables. WorldClim interpolates temperature and precipitation
observations from weather stations to create a gridded, modeled
representation of these data. Note that you can create bioclimatic
variables from any gridded climate data source that provides monthly
precipitation and minimum and maximum temperature, using the biovars
command in dismo. For example, you could
use satellite remote sensing products (direct measurements instead of interpolated values),
such as NASA’s MODIS land
surface temperature and NASA’s GPM
(Global Precipitation Measurement). Or you could use PRISM data if you’re working
in the United States.
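As a minimal sketch of the biovars step, the block below computes the 19 bioclimatic variables for a single location from 12 monthly values. The monthly numbers are made-up illustration values, not real climate data. With gridded data you would pass raster objects of monthly precipitation, minimum temperature, and maximum temperature instead of vectors.

```r
library(dismo)

# Hypothetical monthly climate values for one location (January to December)
tmin <- c(10, 12, 14, 16, 18, 20, 22, 21, 19, 17, 15, 12)  # minimum temperature
tmax <- tmin + 5                                            # maximum temperature
prec <- c(0, 2, 10, 30, 80, 160, 80, 20, 40, 60, 20, 0)     # precipitation

# biovars() takes precipitation first, then tmin and tmax
bio <- biovars(prec, tmin, tmax)

# One row per location, one column per bioclimatic variable (bio1 ... bio19)
colnames(bio)
bio[, c("bio1", "bio12")]  # annual mean temperature and annual precipitation
```

Here bio1 is the mean of the monthly average temperatures, (tmin + tmax) / 2, and bio12 is the sum of the monthly precipitation values.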
The 19 Bioclimatic Variables are listed below. Metadata for these data can be found here.
UNITS: temperature values are stored as °C x 10 (an order of magnitude larger than the actual temperature; for example, a stored value of 250 means 25.0 °C), and precipitation is in mm.
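If you want to report temperatures in °C, you can rescale the stored values. The sketch below uses a small made-up data frame (the values and the object names env and env_c are hypothetical) and converts only bio1 and bio5. Check the metadata before rescaling any other variable, since not every temperature-derived variable is a plain temperature.

```r
# Hypothetical extracted values; temperatures stored as degrees C x 10
env <- data.frame(
  bio1  = c(231, 187, 264),   # annual mean temperature (degrees C x 10)
  bio5  = c(312, 298, 335),   # max temperature of warmest month (degrees C x 10)
  bio12 = c(1850, 920, 2410)  # annual precipitation (mm), left unchanged
)

# Divide the selected temperature columns by 10 to get degrees C
temp_vars <- c("bio1", "bio5")
env_c <- env
env_c[temp_vars] <- env[temp_vars] / 10
env_c
```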
Work through Part 2 (Ch. 5-7: Model fitting, prediction, and evaluation) of the vignette with the example dataset. Choose a different species than the one provided in the vignette (i.e., not the Bradypus species; it is easiest to use the species you chose in Lab 8, since you already completed that section for it). Note that the extraction of the environmental predictors occurs in Chapter 4 of the vignette, so if you choose a different species than you used in Lab 8, you will need to go back to earlier portions of the vignette. Re-create Ch. 5-7 of the vignette with your species of interest, constraining your study area to your species’ known range (look up the known range from an independent source, and remove outliers such as zoo records and incorrect locations).
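One simple way to constrain records to the known range is a bounding-box filter. The sketch below is only an illustration: the occurrence coordinates and the bounding box (range_box) are hypothetical, and you should take the box for your own species from an independent range map.

```r
# Hypothetical occurrence records (lon/lat in decimal degrees)
occ <- data.frame(
  lon = c(-75.2, -60.1, -84.5, 13.4, -70.0),
  lat = c(-5.3, -3.0, 10.2, 52.5, NA)
)

# Hypothetical bounding box for the known range, taken from an independent source
range_box <- c(xmin = -95, xmax = -35, ymin = -30, ymax = 20)

# Drop records with missing coordinates
occ <- occ[complete.cases(occ[, c("lon", "lat")]), ]

# Keep only records that fall inside the bounding box
in_range <- occ$lon >= range_box[["xmin"]] & occ$lon <= range_box[["xmax"]] &
            occ$lat >= range_box[["ymin"]] & occ$lat <= range_box[["ymax"]]
occ_clean <- occ[in_range, ]
nrow(occ_clean)
```

A bounding box is a coarse filter. Records inside the box but outside the true range (for example, in a zoo) still need to be checked by eye on a map.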
For more background information on AUC and Spatial Sorting Bias, see the additional text below the QUESTIONS.
NOTE 1: A reminder that if you want to use data from WorldClim outside the dismo vignette data, you should download it; see the “Environmental Data” section of the additional code provided in Nan Nourn’s example code with expanded bioclim options for Lab 8 & Lab 9 - this is also linked in Lab 8.
The dismo vignette may be using an older version of the WorldClim data (version 1.4). The version 1.4 data are outdated, so you wouldn’t want to publish anything with them. For the lab exercises, it’s ok to use version 1.4. To obtain the updated data (version 2.1), including for other regions, see NOTE 1 above; you can also download version 2.1 data like this:
# Create the folders (directories) "data" and "lab9" - If they exist already, this command won't over-write them.
dir.create("data/lab9", recursive = TRUE, showWarnings = FALSE)
library(geodata) # package to access geographic data
?geodata
# Use the worldclim_global command
# World-wide, all bioclim variables, 10 minutes of a degree resolution
# (path is the folder where the downloaded files are stored)
w_data_world <- geodata::worldclim_global(var = "bio", res = 10, path = "data/lab9", version = "2.1")
plot(w_data_world)

# Region-specific (lon and lat are centered on the area); 0.5 minutes of a degree resolution
w_data_europe <- geodata::worldclim_tile(var = "bio", lon = 5, lat = 45, res = 0.5, path = "data/lab9", version = "2.1")
plot(w_data_europe)

NOTE 2: In Ch. 5-7 of the vignette, when creating the reduced model with a subset of bioclimatic variables (e.g., bio1, bio5, bio12), you can choose to select different bioclimatic variables - see the list of bioclimatic variables above. You should choose a different subset only if the results of your full model (containing the fuller set of bioclimatic variables) suggest a different set, or if you have a priori knowledge of the species-climate relationship.
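To illustrate the idea of a full versus a reduced model, here is a minimal sketch with simulated data. The data frame envtrain and its values are made up for illustration (they are not the vignette's data), and the particular variables in each model are just examples.

```r
set.seed(1)
n <- 300

# Simulated (hypothetical) predictor values standing in for extracted bioclim data
envtrain <- data.frame(
  bio1  = rnorm(n, 240, 30),
  bio5  = rnorm(n, 310, 25),
  bio6  = rnorm(n, 170, 35),
  bio12 = rnorm(n, 1800, 500),
  bio16 = rnorm(n, 700, 200)
)

# Simulated presence (1) / background (0) response driven by bio1 and bio12
lp <- 0.03 * (envtrain$bio1 - 240) + 0.002 * (envtrain$bio12 - 1800)
envtrain$pa <- rbinom(n, 1, plogis(lp))

# Full model: all available predictors
full <- glm(pa ~ bio1 + bio5 + bio6 + bio12 + bio16,
            family = binomial(link = "logit"), data = envtrain)

# Reduced model: a subset of predictors
reduced <- glm(pa ~ bio1 + bio5 + bio12,
               family = binomial(link = "logit"), data = envtrain)

# Coefficient table of the full model can suggest which variables matter
summary(full)$coefficients

# Lower AIC indicates a better trade-off between fit and complexity
AIC(full, reduced)
```

Both models are fit to the same rows of data, which is what makes comparing their statistics meaningful.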
Show your work (code and output) for this portion of the vignette (Chapters 5-7). You can either add on to Lab 8’s .Rmd file of the vignette that you already produced to create a longer PDF or HTML, or you can hand in a new PDF or HTML with just Lab 9’s section included. Either way, please hand in a PDF or HTML produced from your .Rmd file to show all your work for Ch. 5-7.
After completing the vignette with your species of choice, answer the following QUESTIONS:
Describe the differences you observe in the mapped prediction between the full model and the reduced model. Which mapped prediction is closer to the known distribution of the species? Cite an independent source showing the range map of the species and include an image (e.g., a screenshot) of it.
Which model, the full or reduced model, performs better? Why? Include plots and model statistics to help with your explanation.
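In the vignette, the evaluate command in dismo reports AUC among its model statistics. To show what that number means, the sketch below computes a rank-based AUC by hand in base R. The function name auc_rank and the simulated prediction values are hypothetical and only for illustration.

```r
# Rank-based AUC: the probability that a randomly chosen presence point
# gets a higher predicted value than a randomly chosen absence point
auc_rank <- function(p, a) {
  n_p <- length(p)
  n_a <- length(a)
  r <- rank(c(p, a))  # tied values receive average ranks
  (sum(r[seq_len(n_p)]) - n_p * (n_p + 1) / 2) / (n_p * n_a)
}

set.seed(1)
# Simulated (hypothetical) predictions from two models at the same test points
p_full    <- rnorm(50, mean = 0.7, sd = 0.3)  # full model, presence points
a_full    <- rnorm(50, mean = 0.4, sd = 0.4)  # full model, absence points
p_reduced <- rnorm(50, mean = 0.6, sd = 0.3)  # reduced model, presence points
a_reduced <- rnorm(50, mean = 0.4, sd = 0.4)  # reduced model, absence points

auc_rank(p_full, a_full)
auc_rank(p_reduced, a_reduced)
```

An AUC of 0.5 means the model ranks presences no better than chance, and 1 means every presence point scores higher than every absence point.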
Extra information on AUC and Spatial Sorting Bias (SSB):
In general, AUC increases with larger geographic extents.
As with most of these metrics, you should only compare AUC across
models that have the same input data (meaning the same occurrence
records; the same number of rows of unique presence and absence
locations). Splitting data into training and testing sets makes the
model evaluation susceptible to being influenced by the actual distances
in space between the testing and training data. To remove this bias, you
can subset the data into training and testing sets with pairwise distance
sampling (pwdSample in the dismo package).
Type the following to see the full details on the
pwdSample command:
library(dismo)
?pwdSample
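The sketch below follows the pattern used in the dismo vignette for checking and reducing spatial sorting bias, but with simulated coordinates. All point sets here are made up for illustration; in the lab you would use your own training presences, testing presences, and testing background points.

```r
library(dismo)
set.seed(1)

# Simulated (hypothetical) lon/lat coordinates
pres_train <- cbind(lon = runif(80, -80, -60),  lat = runif(80, -10, 10))
pres_test  <- cbind(lon = runif(20, -80, -60),  lat = runif(20, -10, 10))
backg_test <- cbind(lon = runif(200, -90, -40), lat = runif(200, -30, 20))

# Spatial sorting bias: ratio of the two average distances returned by ssb().
# A ratio near 1 suggests little bias; a ratio near 0 suggests strong bias.
sb <- ssb(pres_test, backg_test, pres_train)
sb[, 1] / sb[, 2]

# Pairwise distance sampling: for each testing presence, find a background
# point at a similar distance from the training presences (NA if none found)
i <- pwdSample(pres_test, backg_test, pres_train, n = 1)
pres_test_pwd  <- pres_test[!is.na(i[, 1]), , drop = FALSE]
backg_test_pwd <- backg_test[na.omit(as.vector(i)), , drop = FALSE]

# Re-check the bias on the matched subsets
sb2 <- ssb(pres_test_pwd, backg_test_pwd, pres_train)
sb2[, 1] / sb2[, 2]
```

After sampling, evaluate the model on pres_test_pwd and backg_test_pwd rather than on the full testing sets.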
Licensed under CC-BY 4.0 2025 by Phoebe Zarnetske.