Research Spotlight
Machine learning-based prediction of cereal rye cover crop biomass across diverse agroecosystems
Utsab Ghimire¹ and Eunjin Han²
¹Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
²Adaptive Cropping Systems Laboratory, U.S. Department of Agriculture - ARS, Beltsville, MD, USA
Cereal rye is a widely used cover crop because it reduces soil erosion, scavenges excess nutrients, and suppresses weeds, all of which support soil health and regenerative agriculture. These benefits depend on the crop’s ability to produce substantial biomass. Accurately predicting rye biomass before termination can help growers plan fertilizer and herbicide applications and make informed decisions about planting the next cash crop.
Researchers at the USDA-ARS Adaptive Cropping Systems Laboratory are developing computer models to predict cereal rye biomass and plant quality, including the nitrogen:carbon ratio that influences nitrogen release to subsequent crops. A process-based model developed by Wang et al. (2021) simulates rye growth on an hourly basis, including phenology, photosynthesis, and soil water and nutrient uptake. While this model provides detailed insight into crop responses to environmental conditions, its computational requirements limit operational use for real-time decision making.
To complement this approach, the team is now creating more computationally efficient, data-driven machine learning (ML) models for practical use. Utsab Ghimire, a 2025 ARS AI-COE/SCINet graduate student intern from the University of Florida, made substantial contributions to accelerating development of these ML-based biomass prediction models. One key challenge was the limited field biomass data available for training, since biomass is typically measured only once or twice per season. However, a new dataset published by Huddell et al. in 2024 greatly improved this situation by providing 5,695 biomass measurements across 208 site-years (2001–2022) from the eastern U.S. After quality control, aggregation of field replicates, and removal of anomalies, 645 observations were available for modeling.
To capture the main drivers of rye biomass, the team evaluated climate variables, soil properties, initial soil fertility, and multiple global satellite-based datasets. These included gross primary production estimated from solar-induced chlorophyll fluorescence (NASA OCO-2), leaf area index from NASA MODIS, and root-zone soil moisture from the Global Land Evaporation Amsterdam Model. Correlation analysis was used to select the most informative features and avoid multicollinearity. SCINet’s Ceres supercomputer was used to process the global datasets.
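As a rough illustration of correlation-based feature screening, the sketch below ranks candidate predictors by their absolute Pearson correlation with biomass, then skips any predictor that is nearly collinear with one already kept. It is not the team's actual pipeline: the feature names, the toy values, and the 0.9 collinearity threshold are all assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def select_features(features, target, max_pairwise=0.9):
    """Greedy filter: strongest correlation with the target first,
    dropping any feature too correlated with one already kept."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(
            abs(pearson(features[name], features[other])) < max_pairwise
            for other in kept
        ):
            kept.append(name)
    return kept


# Invented toy data (not from the study), one value per observation.
biomass = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
candidates = {
    "growing_degree_days": [1.0, 2.0, 3.0, 4.0, 5.0, 7.0],
    # Exactly twice the feature above, so the two are perfectly collinear.
    "leaf_area_index": [2.0, 4.0, 6.0, 8.0, 10.0, 14.0],
    "soil_clay": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}

kept = select_features(candidates, biomass)
print(kept)
```

With these toy inputs, only one of the two collinear features survives, while the weakly related soil feature is kept because it carries independent information.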
Using these inputs, the research team developed ML biomass prediction models with CatBoost and XGBoost, two advanced gradient-boosted tree algorithms. These models predicted early-season biomass (from tillering to booting) with moderate accuracy. Later-season predictions were less accurate due to fewer observations and greater variability in spring biomass, as cover crops receive less controlled management than cash crops. To better represent uncertainty, the CatBoost model was extended with quantile regression, allowing predictions to be expressed as intervals (10th to 90th percentile) rather than single values. For example, instead of reporting “biomass is 4,000 kg ha⁻¹,” model results can be reported as “the median is 4,000, with most outcomes between 3,200 and 4,800 kg ha⁻¹.” This provides growers with more informative, decision-relevant guidance.
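The idea behind quantile regression can be sketched with the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically so that minimizing it yields a chosen quantile instead of the mean. The example below is a minimal, library-free illustration and not the team's CatBoost model: the biomass values are invented, and the search over a grid of constant predictions is a simplification used only to show how the 10th, 50th, and 90th percentiles fall out of the loss.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction costs q per unit; over-prediction costs (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def best_constant(y_true, q, grid):
    """Return the grid value that minimizes the pinball loss for level q."""
    return min(grid, key=lambda c: pinball_loss(y_true, [c] * len(y_true), q))


# Invented biomass observations in kg/ha (not data from the study).
observed = [3000, 3200, 3500, 3800, 4000, 4100, 4300, 4600, 4800, 5200, 5500]
grid = range(2500, 6001, 100)  # candidate predictions in 100 kg/ha steps

lower = best_constant(observed, 0.1, grid)
median = best_constant(observed, 0.5, grid)
upper = best_constant(observed, 0.9, grid)

print(f"median {median} kg/ha, 10th to 90th percentile: {lower} to {upper} kg/ha")
```

In a gradient-boosted model, the same loss is minimized separately for each quantile level, so the lower and upper bounds can vary with the input features instead of being constants.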
Overall, the study shows that publicly available soil, weather, and satellite datasets—combined with limited management information—can support interpretable, uncertainty-aware predictions suitable for improving cereal rye cover crop management. For more information, see our open‑access article in Agricultural and Environmental Letters: https://doi.org/10.1002/ael2.70055.
News
New large language model (LLM) AI infrastructure and tools on SCINet’s supercomputers
Over the last few months, we’ve launched new LLM-related infrastructure and tools on SCINet’s supercomputers! These resources offer new ways for ARS scientists to leverage LLMs for their research projects.
- SCINet Chat is a new, web-based chat interface for interacting with LLMs on SCINet that is available on Open OnDemand (OOD) on Atlas. Beyond answering general research questions, SCINet Chat also has access to extensive SCINet user documentation and can provide custom answers to questions about using SCINet’s supercomputers.
- Jupyter-ai, an LLM-powered coding assistant extension for JupyterLab, is also available through OOD on Atlas.
- An LLM API that supports scripting, automation, and connections with other software is accessible on Atlas.
- Llama.cpp is now available as a module on both Atlas and Ceres. Llama.cpp supports querying many different open-weight LLMs on either GPUs or CPUs.
- A curated library of U.S.-made open LLMs is available on both Atlas and Ceres in /reference/llms. Model weights are provided in both safetensors and GGUF formats and can be used with llama.cpp as well as other LLM inference packages.
All of these new resources are built upon open-source software and open LLMs running on SCINet hardware. This promotes reproducibility and also ensures that LLM inputs and outputs never leave SCINet boundaries. For more information, please see the new LLM guide on the SCINet website, and stay tuned for further developments and training opportunities!
Welcome, new SAC members!
SCINet’s Scientific Advisory Committee (SAC) provides guidance and feedback from the SCINet user community to help inform SCINet’s infrastructure development, education and training, communication, research support, and technical user support activities. In short, the SAC helps ensure that SCINet meets the needs of as many ARS researchers as possible. SAC members serve three-year terms and represent the many scientific research programs across ARS.
This spring, the SAC held elections to choose four new members. The newly elected members are:
- Dr. Amitava Chatterjee (Soil, Water, and Air Resources Research Unit, National Laboratory for Agriculture and the Environment, Ames, IA)
- Dr. Amy Hudson (Arthropod-borne Animal Diseases Research Unit, Center for Grain and Animal Health Research, Manhattan, KS)
- Dr. Yulin Jia (Dale Bumpers National Rice Research Center, Stuttgart, AR)
- Dr. Brian Mack (Food and Feed Safety Research Unit, New Orleans, LA)
Many thanks to each of you for your willingness to serve on the SAC and help make SCINet better! Officers for this year are Dr. Hye-Seon Kim (SAC chairperson), Dr. Michael Branstetter (vice-chairperson), and Dr. Peter Olsoy (secretary).
We also want to thank the SAC members whose terms ended this year: Dr. Jeremy Edwards (former chairperson), Dr. Jason Fiedler (former vice-chairperson), Dr. Maggie Woodhouse (former secretary), and Dr. Geoff Waldbieser. Thank you for serving on the SAC, and we will miss your input!
Protein Science Working Group interest survey
The Protein Science Working Group (PSWG) would like to hear from you! The PSWG helps ARS researchers learn about and use exciting new computational methods for understanding protein structure, protein function, and the relationship between proteins and phenotype. If you’d like to learn more, please consider completing the PSWG’s survey to indicate your interest in the working group and to provide your feedback and ideas on how the working group could best support your research needs.
AI-COE/SCINet Graduate Student Internships update
We are pleased to announce that we have matched 21 graduate student interns with ARS AI-COE/SCINet internship opportunities based on the mutual interests of each student and their prospective mentor. These interns will work on (or are already working on) a wide variety of research projects, all of which include significant artificial intelligence/machine learning or data science components. Nineteen of these internships will begin this summer; two interns are already working with ARS mentors in spring internships. Many thanks to the 46 ARS scientists who applied to serve as AI-COE/SCINet graduate student internship mentors in 2026!
We are again planning a virtual internships research symposium in the fall, and we expect to announce details this summer!
FY26 SCINet/AI-COE postdoctoral fellowship mentor and AI Innovation Fund proposals update
We are still reviewing SCINet/AI-COE postdoctoral fellowship mentor proposals and AI Innovation Fund proposals and are not yet ready to announce the FY2026 awardees. This year, we received more proposals for each program than ever before! We will complete the review process over the next few weeks, and all applicants will be notified of the results. We will formally announce the winners in the next edition of the SCINet newsletter.
MATLAB on SCINet’s supercomputers
Recently, a number of ARS researchers have inquired about using MATLAB on SCINet’s supercomputers. We are pleased to announce that it is now possible to use MATLAB on Ceres! However, it does require that each MATLAB user obtain their own MATLAB license. That is, SCINet cannot purchase MATLAB licenses at this time. If you already have a MATLAB license that you would like to use on Ceres, please contact SCINet technical support at scinet_vrsc@usda.gov for assistance. If you are interested in purchasing a license for use on Ceres, please also contact scinet_vrsc@usda.gov for help to make sure that the license you purchase will work with SCINet systems.
Reminder: Geneious Prime license change
If you are still using Geneious Prime floating licenses from SCINet’s Geneious Prime license server, please remember that these floating licenses will expire on May 28, 2026. Biomatters, the company that produces Geneious Prime, has decided to no longer support floating licenses. This means that after May 28, all ARS Geneious Prime users will need to have their own licenses, which are available at a substantial discount through the SLIM process. If you are among the more than 200 Geneious users who requested a license during the SLIM ordering window, you will receive additional guidance once your new license is available. If you missed the ordering window, a limited number of additional licenses will be available before the window opens again next year. Licenses purchased through the SLIM process can be used either on SCINet’s clusters or on local hardware.
SCINet Working Groups
SCINet working groups (WGs) support ARS researchers and their collaborators in using scientific computing methods and SCINet computational resources in their research. Common WG activities include hosting recurring virtual meetings and webinars, organizing training events, and participating in collaborative research or software development projects.
The March 26, 2026 SCINet Corner featured short presentations from four SCINet working groups. Check out the session recording to learn more about these working groups and how you can get involved!
Current Working Groups
- Ag100Pest Initiative (subgroup of AGR)
- Animal Behaviour AI Working Group
- Arthropod Genomics Research (AGR) Working Group
- Breeding AI and ML Working Group
- Geospatial Research Working Group
- Microbiome Working Group
- SCINet-Long-Term Agroecosystem Research (LTAR) Phenology Working Group
- Protein Science Working Group
- Translational Omics Working Group
Geospatial Working Group
Former postdoc Dr. Jiawei Li and advisors Drs. Huihui Zhang and David Barnard in the Water Management and Systems Research Unit in Fort Collins, CO used SCINet resources to fine-tune deep learning models to improve the detection of shrub crown boundaries in complex semi-arid environments through UAS imagery. This study clarifies the technological steps needed to advance shrub-level fuel mapping in data-scarce semi-arid landscapes. Improving UAS workflows, such as optimizing flight altitude, seasonal data acquisition, and multispectral integration, will be essential for producing more accurate, high-resolution shrub structure maps that directly support future wildfire research, including fine-scale fuel continuity assessments, fire behavior modeling, and post-fire vegetation recovery monitoring (Li et al. 2025, Remote Sensing https://doi.org/10.3390/rs17132275).
Dr. Alex Hernandez in the Forage and Range Research Laboratory in Logan, UT worked with collaborators to map the usable space of wild horses and burros across horse management areas spanning 31.4 million acres in the western US and identify changes through time. Resultant maps can be used for monitoring changes in habitat and detection of degradation/enhancement for each individual horse management area on an annual basis. The horse management areas were ranked based on their resilience to changes in precipitation and, therefore, the ability to maintain greater proportions of high‐quality habitats through time (Hernandez et al. 2025 Wildlife Society Bulletin https://doi.org/10.1002/wsb.70015).
Protein Science Working Group
Several USDA-ARS-supported crop database projects were recently featured in a special issue of the journal GENETICS on “Knowledgebase and Database Resources,” highlighting how public agricultural databases are advancing genomics, breeding, and data-driven discovery across major crops. Featured resources included MaizeGDB, GrainGenes, SorghumBase, SoyBase, the Legume Information System, PeanutBase, and CottonGen. Collectively, these projects showcase the growing role of SCINet-enabled research and infrastructure in agriculture, from building and serving pangenomes and large-scale variant datasets to supporting AI-ready genomics, reproducible workflows, interactive analysis tools, and community-accessible breeding resources. They demonstrate how USDA crop databases are turning complex biological data into practical resources that accelerate gene discovery, trait analysis, and crop improvement for U.S. agriculture.

GENETICS, Volume 232, Issue 4, April 2026
Creating a working group
If you are interested in creating a working group, please compile the following:
- The working group’s name
- A description of the working group including its purpose and goals
- Contact information for people to reach out to if they want to learn more about or join the working group.
Send this information to the SCINet office at ARS-SCINet-Office@usda.gov.
Training
Training workshops
Automating bioinformatics workflows workshop series
Leads: Genome Informatics Facility at Iowa State University and the SCINet Office
This workshop series explores tools and platforms used for automating and streamlining bioinformatics analyses. Participants will learn how to build reproducible and efficient workflows for managing biological data and computational workflows.
Series Outline:
- RNAseq and variant calling pipelines in Galaxy: June 22, 24-25, 1-5 PM ET
- Automating bioinformatics pipelines with Nextflow: July 7 & 9, 1-5 PM ET
- Introduction to Snakemake: July 21 & 23, 1-5 PM ET
To register, please fill out this registration form.
Foundations in bioinformatics workshop series
Leads: Genome Informatics Facility at Iowa State University, ARS researchers (Sheina Sim, Craig Carlson, and Haley Arnold), and the SCINet Office
The SCINet Office is offering a series of workshops to help ARS researchers develop practical skills for using bioinformatics in their research. The workshops in this series are designed to provide a thorough introduction to modern bioinformatics concepts, techniques, and best practices.
Series outline:
- Introduction to modern bioinformatics: April 13, 15-16, 2026, 1-5 PM ET
- Genome assembly: April 20, 22-23, 2026, 1-5 PM ET
- Introduction to RNA-seq analysis: May 4, 6-7, 2026, 1-5 PM ET
- Genome annotation: May 11 & 13, 2026, 1-5 PM ET
- From reads to variants: GATK & DeepVariant: May 19 & 20, 2026, 2-5 PM ET
At this time, registration is closed as we have reached maximum capacity for the workshops. However, you may complete the registration form to be added to our waitlist for future offerings.
The Carpentries instructor training
SCINet is collaborating with The Carpentries to offer The Carpentries’ Instructor Training Course for ARS scientists. In this course, you will learn about evidence-based practices for effective and inclusive teaching, with a particular focus on teaching computational skills. There is no fee charged to course participants, but seats are limited.
If you are interested in becoming a Carpentries-certified instructor, please complete this form.
Coursera
The SCINet Office and the AI-COE are excited to provide training opportunities through Coursera. Coursera licenses are available to ARS scientists and support staff for training focused on scientific computing, data science, artificial intelligence, and related topics. Successful completion of courses and specializations results in widely recognized certificates and credentials.
Please visit the SCINet Coursera Training Page to request a license. Licenses will be assigned on a rolling basis and are active for three months. Users may be able to extend their licenses upon request.
Workshop Reports
RNA-seq analysis with Galaxy
Leads: Genome Informatics Facility at Iowa State University and SCINet Office
The SCINet Office, in collaboration with the Genome Informatics Facility at Iowa State University, hosted a workshop on RNA-seq data analysis with Galaxy on January 26 & 28, 2026.
This hands-on workshop introduced participants to RNA-seq data analysis using the Galaxy platform. Participants learned how to navigate the Galaxy interface and create an RNA-seq analysis pipeline including data processing, quality control, read alignment, and exploration of gene expression results. We also showed participants how to build and share workflows in Galaxy, and highlighted the benefits of collaborating through the Galaxy platform.
This workshop filled to the 40-participant capacity, and we are offering this workshop again on June 22, 24-25, 2026 as part of the “automating bioinformatics workflows” series. To register, please fill out this registration form and see the announcement above for more information about this upcoming workshop series.
Carpentries: Unix, Git, and Python
Leads: Keo Corak (ARS Computational Biologist), Amisha Poret-Peterson (ARS Research Microbiologist), and Steven Schroeder (ARS Computational Biologist)
The SCINet Office, in collaboration with Carpentries-certified ARS instructors, held a workshop on the Unix command line, version control with Git, and Python programming. The workshop spanned two weeks:
- Unix command line and version control with Git: February 3 & 5, 2026, 1-5 PM ET
- Programming with Python: February 11 & 13, 2026, 1-5 PM ET
This hands-on workshop provided participants with foundational skills in the Unix command line, version control with Git, and data analysis and visualization with Python. The workshop was capped at 30 participants and will be offered again on a recurring basis.
Transfer learning
Leads: Research Computing team at the University of Florida
In collaboration with the Research Computing team at the University of Florida, we offered a transfer learning workshop as part of the Practicum AI workshop series on February 24 & 26, 2026.
This workshop covered the foundational concepts and practical applications of transfer learning, a powerful technique in deep learning that allows AI models to leverage pretrained knowledge to improve performance on new tasks. Participants learned about different types of transfer learning techniques, such as feature extraction and fine-tuning.
The course recordings and tutorial instructions are available on the SCINet website.
Using large language models (LLMs) on SCINet’s supercomputers
Leads: SCINet Office
The SCINet Office offered a workshop on using large language models (LLMs) on SCINet’s supercomputers on February 18 & 20, 2026 as well as April 29 & May 1, 2026. This two-day workshop covered key concepts for understanding how LLMs work and introduced participants to a variety of use cases, from basic “chat” interactions to more advanced applications, including retrieval and summarization and high-throughput automation.
To sign up for the waitlist for future offerings, fill out this form.
Please help us improve our training offerings!
What scientific computing training do you need? The SCINet Office’s goal is to provide training opportunities and resources that meet the needs of ARS researchers, so we would be grateful if you could complete our short training request form and let us know how we can best help you learn the computing skills you need. Your feedback will help us decide where we should focus our efforts over the next year and beyond.
Training opportunities are continually being updated on the SCINet Upcoming Events webpage. For more information on any of the above trainings, registration questions, or suggestions, please email SCINet-training@usda.gov.
Support
Getting Started with SCINet is as easy as 1, 2, 3
If you do not already have a SCINet account, we hope you will consider joining the 2,300+ researchers who do. Follow the steps below to get started with SCINet.
- Request a SCINet account to gain access to computational and training resources.
- Read the SCINet FAQs covering helpful topics such as account management, accessing and installing software, obtaining storage space for your project(s), and how to get technical help.
- Visit the SCINet Forum to connect to other users, ask questions, and learn how SCINet can enable your research.
P.S. Don’t forget to complete your annual USDA information security awareness training! This is required to maintain your account. For technical assistance with your SCINet account, please email scinet_vrsc@usda.gov.
Support email addresses
All requests for help with user accounts, login problems, resource requests, or support for the Ceres HPC cluster should be sent to the SCINet Virtual Research Support Core (VRSC) at scinet_vrsc@usda.gov. Help requests specific to the Atlas HPC cluster should be sent to help-usda@hpc.msstate.edu.
Many support emails are currently being sent to other SCINet inboxes. For the fastest response to your support requests, be sure to send them to scinet_vrsc@usda.gov or, for Atlas-specific requests, to help-usda@hpc.msstate.edu.
SCINet User Tip
Using BioContainers to run BioConda packages
Do you use BioConda packages for your bioinformatics workflows on SCINet? If so, you might consider using BioContainers! BioContainers make it easy to use BioConda packages without having to create and manage conda environments.
All packages available via the BioConda conda channel are automatically containerized and available through the BioContainers registry. (The actual container images are hosted in both Quay and Galaxy Depot.)
To use BioContainers on either Ceres or Atlas, you will first need to load the apptainer module. Then, go to the BioContainers Registry, search for the package you’d like to use, and follow the link to the package page. From there, scroll down to the “Singularity Installation” section, where you will find the command to use for downloading and running the container. You will need to make two changes to the command:
- Instead of using singularity run, you must use apptainer run.
- Add the name of the target software program at the end of the command to automatically launch it.
And that’s it! As an example, suppose you would like to run minimap2 using BioContainers. After searching for “minimap2” in the BioContainers Registry, you will find the page for the minimap2 package. The minimap2 package page suggests running singularity run https://depot.galaxyproject.org/singularity/minimap2:2.28--h577a1d6_4, which can be converted for use on either Ceres or Atlas as follows.
module load apptainer
apptainer run https://depot.galaxyproject.org/singularity/minimap2:2.28--h577a1d6_4 minimap2
If you need a specific version of the application, click on the “Packages and Containers” tab and find the singularity command for the specific version you are looking for.
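If you convert these commands often, the two changes described above can be scripted. The following is a minimal sketch and not an official SCINet tool: the helper name to_apptainer is hypothetical, and it assumes the registry command has the simple form "singularity run <image URL>".

```python
import shlex


def to_apptainer(singularity_cmd, program):
    """Rewrite a 'singularity run <image>' command for use with Apptainer
    and append the program to launch inside the container."""
    parts = shlex.split(singularity_cmd)
    if parts[:2] != ["singularity", "run"]:
        raise ValueError("expected a command starting with 'singularity run'")
    # Change 1: use 'apptainer run' in place of 'singularity run'.
    # Change 2: add the target program name at the end of the command.
    return shlex.join(["apptainer", "run", *parts[2:], program])


registry_cmd = (
    "singularity run "
    "https://depot.galaxyproject.org/singularity/minimap2:2.28--h577a1d6_4"
)
converted = to_apptainer(registry_cmd, "minimap2")
print(converted)
```

The printed command can be run after module load apptainer, as in the example above.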
Although there are many cases in which managing a conda environment might still be preferred, BioContainers can be a handy alternative!
Do you have tips to share? Email them to ARS-SCINet-Office@usda.gov to be included in future newsletters.
SCINet Corner
SCINet Corner is a VRSC-moderated virtual space for people to share knowledge, discuss best practices, learn about new opportunities, and explore resources to support progress on their projects.
The next SCINet Corner will be held on May 28, 2026, from 1-2 PM ET. May’s event will continue April’s topic on working with Slurm by covering monitoring and analyzing Slurm jobs.
You can register for this and future SCINet Corners here.
Have a question that just can’t wait? Want to see what other users are doing? Reach out to the ever-expanding SCINet Forum community for ideas, support, or just someone to bounce ideas off of at https://forum.scinet.usda.gov/.
Connect
The SCINet Community
To see all the SCINet community updates and review past newsletters, visit the Newsletter Archive.
Contribute
Do you use SCINet for your research? We would love to share your story! Email ARS-SCINet-Office@usda.gov to contribute content, ask questions, or provide feedback on the SCINet newsletter or website.
SCINet Office
Haitao Huang, Computational Biologist
Moe Richert, Web Developer
Lavida Rogers, Training Coordinator
Heather Savoy, Computational Biologist
Brian Stucky, Computational Biologist, Acting Chief Scientific Information Officer
SCINet Leadership Team
Brian Stucky, Acting Chief Scientific Information Officer
Rob Butler, SCINet Program Manager
Hye-Seon Kim, Scientific Advisory Committee (SAC) Chair
Jeff Silverstein, Associate Administrator