In this lab, the focus is on open access data (also known as publicly available data). Open access data are vital to moving both basic and applied science forward. Open data are also essential for making decisions based on the “best available science”, including decisions that inform policy and management.
In Lab 1, you were introduced to the FAIR Data Principles: the idea that data should be Findable, Accessible, Interoperable, and Reusable.
Wilkinson et al. (2016) outline these standards:
Wilkinson, M. D., M. Dumontier, Ij. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L. B. da Silva Santos, P. E. Bourne, J. Bouwman, A. J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C. T. Evelo, R. Finkers, A. Gonzalez-Beltran, A. J. G. Gray, P. Groth, C. Goble, J. S. Grethe, J. Heringa, P. A. C. ’t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S. J. Lusher, M. E. Martone, A. Mons, A. L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S.-A. Sansone, E. Schultes, T. Sengstag, T. Slater, G. Strawn, M. A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft, J. Zhao, and B. Mons. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3:160018. DOI:10.1038/sdata.2016.18.
Please review these principles as you will need to address them in
your write-up.
You can also review the FAIR Principles at this website: https://www.go-fair.org/fair-principles/
Interested in contributing to open access data?
Citizen science is another area that has been growing
in popularity and utility for collecting data across space and time.
Because these data are not systematically collected, they come with
their own considerations.
Check out:
These are two well-established citizen science datasets. Consider joining as a user! The records contributed by citizen scientists that are considered “research grade” are incorporated into GBIF.
For this lab assignment, you will summarize a publicly available, spatially explicit dataset (i.e., the data are geographically referenced) using the R Markdown template below.
We suggest you find a dataset related to your graduate work and planned paper for the course as this will help you explore potential data for your analyses.
You will need to hand in a PDF produced from R Markdown, including any images, plots, code, and text for answers to the questions.
Please find an open access dataset different from the GBIF example used in the template.
Our bioXgeo
research group has also compiled a list of abiotic satellite
remotely-sensed data that could be useful for your
research:
https://bioxgeo.github.io/bioXgeo_ProductsTable/
1. Data Category:
Species Occurrences
2. Data Description:
GBIF, the Global Biodiversity Information Facility, provides free
and open access to biodiversity data.
The spatial and temporal scale of occurrences varies depending on the
species of interest.
3. Data Link:
http://www.gbif.org/
4. Image/icon/logo for these data:
5. Data Use Policy:
https://www.gbif.org/data-use
6. How these Data align with the FAIR Data
Principles:
[Insert a few sentences evaluating how well these data meet the
Findable, Accessible, Interoperable, and Reusable (FAIR)
Principles.]
7. Use with R:
GBIF data on species occurrences can be downloaded directly for certain
species using the R package dismo, as
shown below for Indri indri (the largest living lemur).
For use with software other than R, select filters to download
occurrence data here:
http://www.gbif.org/occurrence/search
Indri (Indri indri), Analamazaotra Special Reserve, Madagascar. Phoebe L. Zarnetske 2012.
# install.packages(c("dismo", "geodata")) # install packages locally before knitting
library(dismo) # package to easily download gbif occurrence data
library(geodata) # package to access geographic data
?geodata # open the geodata package help page
# Load any additional packages needed for plotting.
indri.gbif <- gbif("Indri", "indri", geo = TRUE)
## 2648 records found
## 0-300-600-900-1200-1500-1800-2100-2400-2648 records downloaded
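# Administrative boundaries for Madagascar (MDG):
# gadm function using geodata package. Also see: http://www.gadm.org/
# path = tempdir() is an assumed download location; geodata needs a path
# unless one was set with geodata_path().
mada <- gadm(country = "MDG", level = 0, path = tempdir())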
plot(mada, main = "Indri indri GBIF occurrence records")
points(indri.gbif$lon, indri.gbif$lat, pch = 19, cex = 0.5, col = "blue")
# Continue plotting to add a north arrow, scalebar, and axes for the plot.
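Before summarizing the spatial and temporal scale of an occurrence dataset, it can help to drop records without usable coordinates. The sketch below is a minimal base R illustration, not part of the original template. It uses a small made-up data frame, `occ`, as a stand-in for `indri.gbif`, and it assumes the columns `lon`, `lat`, and `year` are present. The values are invented for illustration and are not real GBIF records.

```r
# Hypothetical stand-in for a few rows of indri.gbif.
# All values are invented for illustration only.
occ <- data.frame(
  lon  = c(48.43, 48.45, NA, 0, 49.10),
  lat  = c(-18.93, -18.95, -18.90, 0, NA),
  year = c(2012, 2015, 2018, NA, 2020)
)

# Keep only rows where both coordinates are present.
has_coords <- !is.na(occ$lon) & !is.na(occ$lat)

# Drop records at exactly (0, 0), which is often a placeholder
# rather than a true location.
not_zero <- !(occ$lon == 0 & occ$lat == 0)

occ_clean <- occ[has_coords & not_zero, ]

# Summarize the spatial extent and temporal range of the remaining records.
lon_range  <- range(occ_clean$lon)
lat_range  <- range(occ_clean$lat)
year_range <- range(occ_clean$year, na.rm = TRUE)

cat("Records kept:", nrow(occ_clean), "of", nrow(occ), "\n")
cat("Longitude range:", lon_range, "\n")
cat("Latitude range:", lat_range, "\n")
cat("Year range:", year_range, "\n")
```

To try this on the downloaded records, replace `occ` with `indri.gbif` after checking with `names(indri.gbif)` that the assumed columns exist.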
This work is licensed under CC-BY 4.0, 2025, by Phoebe Zarnetske.