- Provided by: Geospatial Research Working Group
Learning objectives
This session includes tutorials on handling geospatial data, performing geospatial calculations, and applying parallel processing to geospatial workflows in Python. JupyterLab via Open OnDemand (see Session 4) will be used for a portion of the tutorials.
- Read in and manipulate raster data with the rioxarray package
- Read in and manipulate vector data with the geopandas package
- Time chunks of code in your Python script
- Identify package functions with parallelization options built-in
- Parallelize Python code that runs many independent geospatial tasks
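As a rough illustration of the last three objectives, the sketch below times a chunk of code and then spreads many independent tasks across a pool of workers using only the Python standard library. It is not taken from the workshop notebooks: `tile_mean`, `timed`, `run_serial`, and `run_parallel` are hypothetical names, and the arithmetic inside `tile_mean` is only a stand-in for real per-tile geospatial work.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def tile_mean(tile_id):
    """Hypothetical stand-in for one independent geospatial task."""
    values = [(tile_id * 31 + i) % 255 for i in range(20_000)]
    return sum(values) / len(values)


def timed(func, *args, **kwargs):
    """Run func and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return time.perf_counter() - start, result


def run_serial(tile_ids):
    """Process every tile one after another."""
    return [tile_mean(t) for t in tile_ids]


def run_parallel(tile_ids, workers=4, executor_cls=ProcessPoolExecutor):
    """Process tiles concurrently; results come back in input order."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(tile_mean, tile_ids))


if __name__ == "__main__":
    tile_ids = list(range(16))
    serial_s, serial = timed(run_serial, tile_ids)
    parallel_s, parallel = timed(run_parallel, tile_ids, workers=4)
    print(f"serial:   {serial_s:.3f} s")
    print(f"parallel: {parallel_s:.3f} s")
    print("same results:", serial == parallel)
```

Because each task depends only on its own input, the serial and parallel versions return the same list. Whether the parallel run is actually faster depends on how much work each task does relative to the cost of starting worker processes, which is why timing both versions is worthwhile.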
Agenda
This session will be an interactive tutorial:
- Geospatial packages
- Parallel processing packages
- Vector tutorial
- Raster tutorial
- Vector-raster tutorial
Tutorial material
Watch a recording of this tutorial
Written versions of these tutorials, modified to be accessible to any SCINet user, are available on the Geospatial Workbook
The workshop-specific instructions are kept below.
Steps to prepare for the tutorial:
- Log in to Ceres Open OnDemand at https://ceres-ood.scinet.usda.gov. Your username is typically firstname.lastname. For the password, enter your SCINet account password immediately followed by the 6-digit verification code (e.g. from a Google Authenticator app on your phone), with no spaces. Do not add a ‘+’ between your password and the code.
- Copy the Session 6-8 material from the workshop project space to your temporary workshop folder. The contents of this session have been added to the Session 6 folder since they share the same data. To get a shell for the copying, use the Clusters tab at the top of your Open OnDemand page and select ‘Ceres Shell Access’ (if prompted for a password, enter your SCINet account password without the verification code). If you are comfortable connecting over SSH from a terminal or PowerShell instead, feel free to do so.
If you have already made your workshop folder in previous sessions, you will only need to run the following commands, replacing firstname.lastname with your actual name:
    cd /90daydata/shared/firstname.lastname
    cp -r /project/geospatialworkshop/session6/ .
    module load miniconda
    source activate /project/geospatialworkshop/gwenv
    ipython kernel install --user --name=grwg_workshop
If you have not created your workshop folder yet, run these commands instead, replacing firstname.lastname with your actual name:
    cd /90daydata/shared
    mkdir firstname.lastname
    cd firstname.lastname
    cp -r /project/geospatialworkshop/session6/ .
    module load miniconda
    source activate /project/geospatialworkshop/gwenv
    ipython kernel install --user --name=grwg_workshop
- Launch a JupyterLab session. Choose the following values from the menu:
- Account: geospatialworkshop
- Slurm Partition: workshop
- Number of hours: 3
- Number of cores: 16
- Jupyter Notebook vs Lab: Lab
- Working Directory: /90daydata/shared/firstname.lastname
Click Launch.
- The tutorials: Two of the tutorials follow Python notebooks in JupyterLab. For the third tutorial, we will submit a job to Slurm directly. If your shell from Step 2 has expired by the time we start that tutorial, please reconnect and change directory to your session 6 folder:
cd /90daydata/shared/firstname.lastname/session6