In this workshop, participants will learn how to use cutting-edge, AI-based tools for analyzing protein structure and function.
The workshop will start by exploring 3D protein structure prediction using AlphaFold for alignment-based structure prediction and ESMFold for single-sequence structure prediction. Participants will then learn how to use FoldSeek for structure-based protein similarity search. The last part of the workshop will bring all of these concepts together by using PanEffect to explore how genetic variations in protein sequence can influence an organism’s phenotype.
Tutorial Setup Instructions
Steps to prepare for the tutorial session:
- Log in to Atlas Open OnDemand at https://atlas-ood.hpc.msstate.edu/. For more information on login procedures for web-based SCINet access, see the SCINet access user guide.
- Open a command-line session by clicking on “Clusters” -> “Atlas Shell Access” on the top menu. This will open a new tab with a command-line session on Atlas’ login node.
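- Request resources on a compute node by running the salloc command below. Once salloc reports that the allocation has been granted, start a shell on the compute node with srun, replacing <job-id> with the job ID that salloc printed (the node name in the output may differ):

      salloc --reservation=forum-gpu -A scinet_workshop1 -p gpu-a100-mig7 -n1 --gres=gpu:1 -t 3:00:00
      salloc: Granted job allocation <job-id>
      salloc: Nodes atlas-0245 are ready for job
      srun --jobid=<job-id> --pty bash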
- Create a workshop working directory and copy the workshop materials into it by running the following commands. Note: you do not need to edit the commands to include your username; the shell fills it in from the $USER variable.

      mkdir -p /90daydata/shared/$USER/
      cd /90daydata/shared/$USER/
      cp -r /project/ai_forum/protein_structure .
- Stop the interactive job on the compute node by running the command exit.
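
Before running exit in the last step, it can be worth confirming that the materials were actually copied. The following is a minimal sketch and not part of the official workshop materials: the function name check_workshop_dir is hypothetical, and it assumes only the directory layout produced by the mkdir and cp commands above.

```shell
# Hypothetical helper (not part of the workshop materials): report whether
# the protein_structure directory exists under the given working directory.
check_workshop_dir() {
  workdir="$1"
  if [ -d "$workdir/protein_structure" ]; then
    echo "found: $workdir/protein_structure"
  else
    echo "missing: $workdir/protein_structure"
    return 1
  fi
}

# Example call on Atlas, using the path from the setup steps:
#   check_workshop_dir "/90daydata/shared/$USER"
```

If the function reports the directory as missing, rerunning the cp command from the working directory is the likely fix.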
Schedule
| Materials | Start | Est. minutes | Topic | Presenter |
|---|---|---|---|---|
| Introduction | 1:30 PM | 10 minutes | Introduction | Hye-Seon Kim & Carson Andorf |
| Protein Structure Prediction | 1:40 PM | 30 minutes | AlphaFold 2 & 3 | Hye-Seon Kim |
| | | | AlphaFold online | Hye-Seon Kim |
| | 2:10 PM | 30 minutes | ESMFold | Carson Andorf |
| | | | ESMFold online | Carson Andorf |
| | 2:40 PM | 20 minutes | OmegaFold | Stephen Harding |
| Protein Structure Search | 3:00 PM | 30 minutes | FoldSeek | Olivia Haley |
| | | | FoldSeek Online | Stephen Harding |
| Missense Variant Effect Predictions | 3:30 PM | 30 minutes | ESM-variant | Carson Andorf |
| | | | PanEffect (Fusarium) | Hye-Seon Kim |
| | | | PanEffect (Maize) | Carson Andorf |
| Protein Binder Predictions | 4:00 PM | 30 minutes | RFdiffusion | Olivia Haley |
| | | | RFdiffusion online | Olivia Haley |
Additional Resources: