Large language models (LLMs), a key technology behind well-known tools like ChatGPT and Microsoft Copilot, have a multitude of applications in agricultural research. SCINet’s high-performance computing resources provide an excellent environment for research use of LLMs, and this workshop will equip you with the knowledge and tools you need to take advantage of LLM capabilities in your research.
We will explore a variety of use cases, from basic “chat” interactions to more advanced applications, including information retrieval and summarization and high-throughput automation. Along the way, we will learn about how LLMs and related technologies actually work, which will help you make well-informed decisions about how to best use them for your research.
Ultimately, this workshop will put you in the driver’s seat: you will be able to decide which models are most appropriate and how to use them, all while ensuring that you remain in full control of your data and results and in compliance with research security requirements.
There are no prerequisites for the first day of the workshop, when we will cover introductory concepts and graphical user interfaces. The second day, when we will cover automating LLM queries, does have prerequisites. You may attend only the first day if you wish. If you attend the second day as well, the prerequisites are:
- A basic understanding of Python.
- An understanding of file system organization and file paths.
- Basic command-line skills, e.g., knowing how to use ls and cd and how to run programs from the command line.
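If you are unsure whether your command-line skills meet the second-day prerequisites, the following short self-check may help. It is only a sketch: the scratch directory created with mktemp and the file names notes.txt and data.csv are illustrative assumptions, not part of the workshop materials. If each line makes sense to you, you likely have the command-line background you need.

```shell
# Create a throwaway directory so none of your real files are touched.
scratch_dir="$(mktemp -d)"

# cd: move into the scratch directory.
cd "$scratch_dir"

# Create two empty example files, then list the directory contents with ls.
touch notes.txt data.csv
listing="$(ls)"
echo "$listing"

# Run a program from the command line: wc counts the lines of the listing.
file_count="$(echo "$listing" | wc -l | tr -d ' ')"
echo "Number of files: $file_count"

# Return to the previous directory and remove the scratch directory.
cd - > /dev/null
rm -r "$scratch_dir"
```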
Tutorial setup instructions
Steps to prepare for the tutorial:
Day 1
- Log in to Atlas Open OnDemand at https://atlas-ood.hpc.msstate.edu/. For more information on login procedures for web-based SCINet access, see the SCINet access user guide.
- Open a command-line session by clicking on “Clusters” -> “Atlas Shell Access” on the top menu. This will open a new tab with a command-line session on Atlas’s login node.
- Request resources on a compute node by running the following command:
    srun -A scinet_workshop2 -t 00:30:00 -n 1 --mem 8G --pty bash
  - For those accessing post-workshop: Change -A scinet_workshop2 to your own project account. For additional information, see our SLURM guide.
- Create your workshop working directory and copy the tutorial materials into it by running the following commands. Note: you do not have to edit the commands with your username, as it is determined by the $USER variable.
    mkdir -p /90daydata/shared/$USER/llms_on_scinet
    cd /90daydata/shared/$USER/llms_on_scinet
    cp /project/scinet_workshop2/llms_on_scinet/coding_assistance_jupyterlab.ipynb .
- Set up the kernel for JupyterLab. You will create a kernel called venv_llms to access from JupyterLab Server. Run the following commands to activate the workshop’s virtual environment and create a new kernelspec from it:
    source /project/scinet_workshop2/llms_on_scinet/venv_llms/bin/activate
    ipython kernel install --name "venv_llms" --user
- Stop the interactive job on the compute node by running the command:
    exit
- Launch a JupyterLab Server – AI Workshop session. Under the Interactive Apps menu, select JupyterLab Server – AI Workshop.
- Specify the following input values on the page:
  - Partition: atlas
  - Account: scinet_workshop2
  - QoS: normal
  - Number of hours: 3
  - Number of nodes: 1
  - Number of tasks: 1
  - Additional Slurm Parameters: --reservation=workshop_llm1
    - For those accessing post-workshop: Remove --reservation=workshop_llm1. For additional information, see our SLURM guide.
  - Working Directory: /90daydata/shared/${USER}/llms_on_scinet
- Click Launch. The screen will update to the Interactive Sessions page. When your Jupyter session is ready, the top card will update from Queued to Running and a Connect to JupyterLab Server button will appear. Click Connect to JupyterLab Server.
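If the venv_llms kernel or the notebook does not show up in JupyterLab, a quick check from any Atlas shell can help narrow down which setup step was missed. The sketch below is not part of the official workshop materials. The helper names check_file and check_day1_setup are hypothetical, and the kernelspec location ($HOME/.local/share/jupyter/kernels) is an assumption based on the usual default for --user installs on Linux; it may differ if your Jupyter data directory has been customized.

```shell
# Hypothetical helper: report whether a path exists.
check_file() {
    if [ -e "$1" ]; then
        echo "OK: $1"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Hypothetical helper: check the results of the Day 1 setup steps.
#   $1: workshop working directory
#   $2: directory holding per-user kernelspecs (assumed default location)
check_day1_setup() {
    result=0
    # The notebook copied from the workshop project directory.
    check_file "$1/coding_assistance_jupyterlab.ipynb" || result=1
    # The kernelspec created by "ipython kernel install --name venv_llms --user".
    check_file "$2/venv_llms/kernel.json" || result=1
    return "$result"
}

# Example call using the paths from the Day 1 steps.
check_day1_setup "/90daydata/shared/$USER/llms_on_scinet" \
    "$HOME/.local/share/jupyter/kernels" \
    || echo "Setup looks incomplete; revisit the Day 1 steps."
```

A MISSING line for the notebook points to the copy step, while a MISSING line for kernel.json points to the kernel setup step. With the workshop's virtual environment activated, running jupyter kernelspec list should also show where Jupyter looks for kernels on your account.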
Day 2
- Log in to Atlas Open OnDemand at https://atlas-ood.hpc.msstate.edu/. For more information on login procedures for web-based SCINet access, see the SCINet access user guide.
- Launch a JupyterLab Server – AI Workshop session. Under the Interactive Apps menu, select JupyterLab Server – AI Workshop.
- Specify the following input values on the page:
  - Partition: gpu-a100-mig7
  - Account: scinet_workshop2
  - QoS: normal
  - Number of hours: 4
  - Number of nodes: 1
  - Number of tasks: 3
  - Additional Slurm Parameters: --reservation=workshop_llm2 --mem=32GB --gres=gpu:1
    - For those accessing post-workshop: Remove --reservation=workshop_llm2. For additional information, see our SLURM guide.
  - Working Directory: /90daydata/shared/${USER}/llms_on_scinet
- Click Launch. The screen will update to the Interactive Sessions page. When your Jupyter session is ready, the top card will update from Queued to Running and a Connect to JupyterLab Server button will appear. Click Connect to JupyterLab Server.
- Copy the tutorial materials by opening a terminal within JupyterLab (open the “File” menu, then “New” -> “Terminal”) and running the following commands. Note: you do not have to edit the commands with your username, as it is determined by the $USER variable.
    cd /90daydata/shared/$USER/llms_on_scinet
    cp /project/scinet_workshop2/llms_on_scinet/automating_llm_queries.ipynb .
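For those accessing post-workshop, the only change to the Additional Slurm Parameters field is dropping the reservation option; the memory and GPU requests stay as they are. The sketch below illustrates that edit on the Day 2 value. The helper name strip_reservation is hypothetical and is shown only to make the before and after explicit; editing the field by hand works just as well.

```shell
# Additional Slurm Parameters used during the workshop (Day 2 settings).
workshop_params="--reservation=workshop_llm2 --mem=32GB --gres=gpu:1"

# Hypothetical helper: drop any --reservation=... option and keep the rest.
strip_reservation() {
    kept=""
    # Intentionally unquoted so the string splits into individual options.
    for param in $1; do
        case "$param" in
            --reservation=*) ;;                 # skip the workshop reservation
            *) kept="${kept:+$kept }$param" ;;  # keep every other option
        esac
    done
    echo "$kept"
}

post_workshop_params="$(strip_reservation "$workshop_params")"
echo "$post_workshop_params"
```

This prints --mem=32GB --gres=gpu:1, which is the value to enter in the Additional Slurm Parameters field once the workshop reservation no longer exists.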