Session 8: Geospatial analyses and how to parallelize them in Python
- Green Light, Red Light - Use the Zoom participant feedback indicators to show us whether you are following along successfully or need help. To access participant feedback, click the “Participants” icon to open the participants pane/window. Click the green “yes” to indicate that you are following along successfully, or the red “no” to indicate that you need help. Ideally, you will have either the red or green indicator displayed for yourself throughout the entire tutorial. We will pause every so often to work through solutions for participants displaying a red light.
- Chat questions/comments take first priority - Send your questions/comments in the chat either to everyone (preferred) or privately to the chat moderator (Ryan Lucas) to have them read out loud anonymously. We will answer chat questions first and call on people who have written in the chat before we take questions from raised hands.
- Share your video when speaking - If your internet plan/connectivity allows, please share your video when speaking.
- Keep yourself on mute - Please mute yourself when not speaking.
This session will include tutorials on handling geospatial data, performing geospatial calculations, and applying parallel processing approaches to geospatial processing workflows in Python. JupyterLab via Open OnDemand (see Session 4) will be used for a portion of the tutorials.
- Read in and manipulate raster data with the rioxarray package
- Read in and manipulate vector data with the geopandas package
- Time chunks of code in your Python script
- Identify package functions with built-in parallelization options
- Parallelize Python code that performs many independent geospatial tasks
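As a rough illustration of the last three objectives, the sketch below times a serial loop and a parallel run over the same set of independent tasks. It uses only the Python standard library (`time.perf_counter` and `concurrent.futures.ProcessPoolExecutor`), which is an assumption made for portability; the tutorials themselves may rely on other packages. The function `summarize_tile` and the tile IDs are hypothetical stand-ins for a real per-file or per-polygon geospatial computation.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def summarize_tile(tile_id):
    """Hypothetical stand-in for one independent geospatial task.

    A real task might open one raster tile and compute a statistic;
    here we just average some synthetic numbers.
    """
    values = [(tile_id * 31 + i) % 97 for i in range(50_000)]
    return tile_id, sum(values) / len(values)


def run_serial(tile_ids):
    """Process the tasks one after another."""
    return [summarize_tile(t) for t in tile_ids]


def run_parallel(tile_ids, max_workers=4):
    """Process the tasks in separate worker processes.

    This only works because the tasks do not depend on each other.
    """
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(summarize_tile, tile_ids))


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    tile_ids = list(range(16))  # hypothetical task list

    serial_result, serial_seconds = timed(run_serial, tile_ids)
    parallel_result, parallel_seconds = timed(run_parallel, tile_ids)

    # Both approaches should produce identical results in the same order.
    print("Results match:", serial_result == parallel_result)
    print(f"Serial:   {serial_seconds:.3f} s")
    print(f"Parallel: {parallel_seconds:.3f} s")
```

For small tasks like these, the overhead of starting worker processes can outweigh the gain, so timing both versions is the only reliable way to tell whether parallelizing helps.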
This session will be an interactive tutorial:
- Geospatial packages
- Parallel processing packages
- Vector tutorial
- Raster tutorial
- Vector-raster tutorial
Written versions of these tutorials, modified to be accessible to any SCINet user, are available on the Geospatial Workbook.
The workshop-specific instructions are kept below.
Steps to prepare for the tutorial:
Log in to Ceres Open OnDemand at https://ceres-ood.scinet.usda.gov. Your username is typically firstname.lastname. For the password, enter your SCINet account password followed immediately, with no spaces, by the 6-digit verification code (e.g. from a Google Authenticator app on your phone). Do not add a ‘+’ between your password and code.
Copy the Session 6-8 material from the workshop project space to your temporary workshop folder. The contents of this session have been added to the Session 6 folder since they share the same data. To get a shell for the copying, use the Clusters tab at the top of your Open OnDemand page and select ‘Ceres Shell Access’ (if prompted for a password, enter your SCINet account password without the verification code). If you are comfortable connecting via ssh from a terminal or PowerShell instead, feel free to do so.
If you have already made your workshop folder in previous sessions, you will only need to run the following commands, replacing firstname.lastname with your actual name:
    cd /90daydata/shared/firstname.lastname
    cp -r /project/geospatialworkshop/session6/ .
    module load miniconda
    source activate /project/geospatialworkshop/gwenv
    ipython kernel install --user --name=grwg_workshop
If you have not created your workshop folder yet, run these commands instead, replacing firstname.lastname with your actual name:
    cd /90daydata/shared
    mkdir firstname.lastname
    cd firstname.lastname
    cp -r /project/geospatialworkshop/session6/ .
    module load miniconda
    source activate /project/geospatialworkshop/gwenv
    ipython kernel install --user --name=grwg_workshop
Launch a JupyterLab session. Choose the following values from the menu:
- Account: geospatialworkshop
- Slurm Partition: workshop
- Number of hours: 3
- Number of cores: 16
- Jupyter Notebook vs Lab: Lab
- Working Directory: /90daydata/shared/firstname.lastname
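Once the JupyterLab session is running, it can be useful to confirm how many cores Python can actually use before choosing a number of parallel workers. The following is a minimal sketch using only the standard library; the helper name `available_cores` is hypothetical, and whether the reported count matches the 16 cores requested above depends on how the cluster scheduler is configured.

```python
import os


def available_cores():
    """Return the number of CPU cores this process is allowed to use."""
    if hasattr(os, "sched_getaffinity"):
        # On Linux, this is the set of cores the process may run on,
        # which can be smaller than the total cores on the node.
        return len(os.sched_getaffinity(0))
    # Fallback for other platforms: total cores on the machine,
    # which may exceed what the session was actually granted.
    return os.cpu_count() or 1


print(f"Cores available to this session: {available_cores()}")
```

Using this value (or something smaller) as the worker count helps avoid starting more processes than the session has cores.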
The tutorials: Two tutorials will follow Python notebooks in JupyterLab. For the third tutorial, we will submit a job to Slurm directly. If your shell from Step 2 has expired when we start this tutorial, please reconnect and change directory to your session 6 folder: