Onboarding Videos
Users who are new to the HPC environment may benefit from the following Ceres onboarding video, which covers much of the material contained in this guide plus some Unix basics.
Ceres Onboarding (Intro to SCINet Ceres HPC) (length 42:13)
Note: /KEEP storage discussed in the video at 16:20 is no longer available. Instead, data that cannot be easily reproduced should be manually backed up to Juno. The instructional video at https://www.youtube.com/watch?v=I3lnsCAfx3Q demonstrates how to transfer files between a local computer, Ceres, Atlas, and Juno using Globus.
The video includes:
- logging on to Ceres
- changing your password
- home and project directories
- data transfer to/from SCINet clusters
- basic SLURM job scheduler commands
- computing in interactive mode with salloc
- accessing Ceres software modules
- computing in batch mode with a batch script
Technical Overview
Ceres is the dedicated high-performance computing (HPC) infrastructure for ARS researchers on SCINet. Ceres is designed to enable large-scale computing and large-scale storage. Currently, the following compute nodes are available on the Ceres cluster.
Number of Nodes | Processors per Node | Logical Cores per Node | Memory per Node | Local Storage | Constraint Flags |
---|---|---|---|---|---|
100 | Two 18-core Intel Xeon 6240 | 72 | 381 GB DDR3 ECC | 1.5 TB SSD | AVX, AVX2, AVX512, INTEL, CASCADELAKE, CERES19 |
76 | Two 24-core Intel Xeon 6240R | 96 | 381 GB DDR3 ECC | 1.5 TB SSD | AVX, AVX2, AVX512, INTEL, CASCADELAKE, CERES20 |
20 | One 128-core AMD Epyc 9754 | 256 | 2,305 GB DDR5 ECC | 2.9 TB SSD | AVX, AVX2, AVX512, AMD, EPYC9754, BERGAMO, CERES24 |
11 | Two 24-core Intel Xeon 6248R | 96 | 1,546 GB DDR3 ECC | 1.5 TB SSD | AVX, AVX2, AVX512, INTEL, CASCADELAKE, CERES20 |
6 | Two 20-core Intel Xeon 6248 | 80 | 1,546 GB DDR3 ECC | 1.5 TB SSD | AVX, AVX2, AVX512, INTEL, CASCADELAKE, CERES19 |
2 | Two 20-core Intel Xeon 6248 | 80 | 772 GB DDR3 ECC | 1.5 TB SSD | AVX, AVX2, AVX512, INTEL, CASCADELAKE, CERES19 |
For details on how to request a node with specific hardware, see the SLURM Resource Manager guide; a brief example is sketched at the end of this overview.
In addition, the cluster includes a specialized data transfer node and several service nodes.
In aggregate, there are more than 10,500 compute cores (21,000 logical cores) with 138 terabytes (TB) of total RAM, 350 TB of total local storage, and 5.5 petabytes (PB) of shared storage.
Shared storage consists of 5.5 PB of high-performance BeeGFS space and 300 TB of backed-up ZFS space.
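As a brief illustration, the constraint flags in the table above can be passed to SLURM's `--constraint` option to request specific hardware. The following is a minimal sketch only: the job name, task count, and time limit are placeholders, and a real job may also need site-specific options such as a partition or account (see the SLURM Resource Manager guide).

```bash
#!/bin/bash
#SBATCH --job-name=constraint-demo   # illustrative job name
#SBATCH --ntasks=4                   # placeholder task count
#SBATCH --time=01:00:00              # placeholder time limit
#SBATCH --constraint=CERES24         # land on one of the AMD Bergamo nodes from the table
# Flags can also be combined, e.g. --constraint="AVX512&INTEL"

lscpu                                # report CPU details of the node the job received
```

The same `--constraint` option can be given to `salloc` or `srun` for interactive work.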
System Configuration
Since most HPC compute nodes are dedicated to running HPC cluster jobs, direct access to the nodes is discouraged. The established HPC best practice is to provide login nodes. Users access a login node to submit jobs to the cluster's resource manager (SLURM) and to access other cluster console functions. All nodes run Red Hat Enterprise Linux.
Domain | Software |
---|---|
Operating System | Red Hat Enterprise Linux |
Scheduler | SLURM |
Software | For the full list of installed scientific software refer to the Preinstalled Software List page or issue the `module spider` command on the Ceres login node. |
Modeling | BeoPEST, EPIC, KINEROS2, MED-FOES, SWAT, h2o |
Compilers | GNU (C, C++, Fortran), clang, llvm, Intel Parallel Studio |
Languages | Java 6, Java 7, Java 8, Python, Python 3, R, Perl 5, Julia, Node |
Tools and Libraries | tmux, Eigen, Boost, GDAL, HDF5, NetCDF, TBB, Metis, PROJ4, OpenBLAS, jemalloc |
MPI libraries | MPICH, OpenMPI |
Profiling and debugging | PAPI |
For more information on available software and software installation, refer to our guides on Modules, Singularity Containers, and Installing R, Python, and Perl Packages.
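As a quick illustration of the module system, a session on a login node might look like the sketch below; the package name is only an example, so substitute one reported by `module spider` or the Preinstalled Software List.

```bash
module spider samtools    # search for a package and its available versions (name is illustrative)
module load samtools      # load the default version into the current shell environment
module list               # confirm which modules are loaded
module unload samtools    # remove the module when finished
```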
Additional Guides for Ceres:
- Logging In: If you have received your login credentials in an email, this guide will help you get connected to SCINet. Otherwise, please email the Virtual Research Support Core at scinet_vrsc@USDA.GOV for assistance.
- Data Transfer: Globus Online is the recommended method for transferring data to and from the HPC clusters; see this guide for data transfer best practices.
- Modules: The Environment Modules package provides dynamic modification of your shell environment. It allows a single system to accommodate multiple versions of the same software application and lets users select the version they want to use. Module commands set, change, or delete environment variables, typically in support of a particular application.
- Quotas: Each file on a Linux system is associated with one user and one group. On Ceres, files in a user's home directory are by default associated with the user's primary group, which has the same name as the user's SCINet account. Files in the project directories are by default associated with the project groups. Group quotas that control the amount of data stored are enabled on both home and project directories. At login, current usage and quotas are displayed for all groups that a user belongs to; the `my_quotas` command provides the same output.
- SLURM Resource Manager: Ceres uses the Simple Linux Utility for Resource Management (SLURM) to submit interactive and batch jobs to the compute nodes. Requested resources can be specified either within the job script or using options with the `salloc`, `srun`, or `sbatch` commands; a minimal batch script sketch is shown after this list.
- Compiling Software: The login node provides access to a wide variety of scientific software tools that users can access via the module system. These tools were compiled and optimized for use on SCINet by members of the Virtual Research Support Core (VRSC) team. Most users will find the software they need for their research among the provided packages and thus will not need to compile their own. To learn more about graphical software such as Galaxy, CSC, Geneious, RStudio, and Jupyter, please see the Software Preinstalled on Ceres guide.
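The batch script sketch referenced in the SLURM Resource Manager item above could look roughly like the following; the module name, script name, and resource values are hypothetical placeholders, and the SLURM Resource Manager guide documents the options actually available on Ceres.

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis   # illustrative job name
#SBATCH --ntasks=8               # placeholder task count
#SBATCH --time=02:00:00          # placeholder wall-time limit

module load python_3             # hypothetical module name; check `module spider python` first
python my_script.py              # hypothetical analysis script
```

Submit with `sbatch my_job.sh` (the file name is arbitrary) and monitor the job with `squeue -u $USER`.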
Citation/Acknowledgment
Add the following sentence as an acknowledgment for using Ceres as a resource in manuscripts intended for publication:
“This research used resources provided by the SCINet project and/or the AI Center of Excellence of the USDA Agricultural Research Service, ARS project numbers 0201-88888-003-000D and 0201-88888-002-000D.”