SCINet Past Scheduled Outages
The table below lists information about past SCINet outages. See the SCINet Forum Announcements page (a SCINet account is required for access) for communications about emergency outages.
Software Update · Ceres - All - Monday, October 9 · 2023
Ceres cluster maintenance is scheduled for October 9-10, 2023 (Indigenous Peoples Day, and the following day), to update system software.
During the maintenance we will also upgrade Open OnDemand to version 3 and BeeGFS file system to version 7.4.
Queued jobs will not start if they cannot complete by 6AM October 9. In the output of the squeue command, the reason for those jobs will state (ReqNodeNotAvail, Reserved for maintenance). The jobs will start after the scheduled outage completes.
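The rule above comes down to simple date arithmetic. The sketch below is only an illustration, not Slurm's actual scheduling logic: the helper name `fits_before_maintenance` and the sample submission time are hypothetical, and only the 6AM October 9, 2023 start time comes from the announcement.

```python
from datetime import datetime, timedelta

# Maintenance start from the announcement: 6AM on October 9, 2023 (cluster local time).
MAINTENANCE_START = datetime(2023, 10, 9, 6, 0)


def fits_before_maintenance(now, time_limit, maintenance_start=MAINTENANCE_START):
    """Return True if a job started at `now` with the requested `time_limit`
    would reach its limit no later than `maintenance_start`.

    Hypothetical helper for illustration; the real decision is made by Slurm.
    """
    return now + time_limit <= maintenance_start


# Sample start time (an assumption for this example).
now = datetime(2023, 10, 6, 12, 0)

# A 2-day job ends October 8 at noon, before the window opens.
print(fits_before_maintenance(now, timedelta(days=2)))  # True
# A 3-day job would end October 9 at noon, so it would stay pending.
print(fits_before_maintenance(now, timedelta(days=3)))  # False
```

The comparison uses the requested time limit rather than the expected run time, which is why a job with a generous limit may wait even if it would actually finish sooner.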
Atlas cluster will be available during the Ceres maintenance. Make sure to copy data from Ceres to Atlas prior to the maintenance if needed.
Please submit any questions you may have via email to scinet_vrsc@usda.gov.
Maintenance · Site Service - Ames - Friday, September 29 · 2023
ARS SCINet Site Service Ames will be unavailable while Internet2 circuit vendor Lumen performs circuit maintenance. The entire window is reserved.
Maintenance · Site Service NAL - Beltsville · 2023
Site Service NAL (Beltsville) will be unavailable while Fiberlight engineers perform maintenance. Outages are expected. The entire maintenance window is reserved.
Maintenance · Backbone - NAL · 2023
Backbone NAL-NAL will be unavailable while Fiberlight engineers perform maintenance. Outages are expected. The entire maintenance window is reserved.
Maintenance · Site Service - AMES, NAL - Tuesday, August 29 · 2023
Site Service at AMES and NAL will be impacted while Internet2 performs maintenance to upgrade core nodes. Outages are expected and the entire window is reserved.
Emergency Maintenance · Site Service - Ames - Friday, July 7 · 2023
The listed asset will be unavailable while vendor Internet2 performs software maintenance and troubleshooting tasks on core1.eqch. Multiple 20-minute hard-down events are expected. The entire window is reserved.
This will not affect the Ceres cluster or the jobs running on it.
Maintenance · Site Service - Beltsville - Friday, June 23 · 2023
Site Service Beltsville will be unavailable while Fiberlight engineers perform maintenance. Outages are expected. The entire maintenance window is reserved.
Maintenance · Site Service - Stoneville - Wednesday, June 21 · 2023
Site Service at Stoneville will be impacted while Internet2 performs maintenance to upgrade core nodes. Outages are expected and the entire window is reserved.
Maintenance · Site Service - Multiple locations - Tuesday, June 20 · 2023
Site Service at Fort Collins, Albany & Clay Center will be impacted while Internet2 performs maintenance to upgrade core nodes. Outages are expected and the entire window is reserved.
Maintenance · Site Service - Multiple - Monday, June 19 · 2023
Site Service at Ames & Beltsville will be impacted while Internet2 performs maintenance to upgrade core nodes. Outages are expected and the entire window is reserved.
System Update · Ceres - All · 2023
Ceres cluster maintenance is scheduled for the week of June 19, to update system software. The cluster will be down for several days.
Maintenance · Site Service - Beltsville - Sunday, June 18 · 2023
Site Service Beltsville will be unavailable while Fiberlight engineers perform maintenance. Outages are expected. The entire maintenance window is reserved.
Maintenance · Juno - All - Tuesday, June 13 · 2023
A planned maintenance evolution will occur on Tuesday, June 13th, 2023, between 6am and 5pm ET at the National Agricultural Library (NAL).
This maintenance is necessary to transfer core network equipment at NAL onto newer and more reliable backup power which will promote future stability and reliability for services at this site.
During this time, access to Juno storage will be disrupted. We apologize in advance for any inconvenience this may cause.
We will be working closely with our partners to minimize the impact of this maintenance and hope to complete the work early. We will provide updates on the status of the maintenance [on the SCINet Forum](https://forum.scinet.usda.gov/t/access-to-juno-storage-disrupted-on-june-13-2023).
Maintenance · Site Service - Ames - Monday, May 22 · 2023
The listed assets may become unavailable due to scheduled maintenance being performed by Internet2 vendor Lumen. Outages are expected. The entire window is reserved.
Maintenance · Site Service - Ames - Thursday, May 18 · 2023
Site Service at Ames will be impacted while Lumen performs maintenance.
Outages are expected and the entire window is reserved.
Maintenance · Site Service - Stoneville - Thursday, May 18 · 2023
The listed assets will be unavailable while Internet2 engineers perform core node maintenance. Outages are expected. The entire window is reserved.
Maintenance · Juno - all - Wednesday, May 10 · 2023
At 6:00 PM Eastern on May 10th, the Juno long term storage system at Beltsville will be unmounted from SCINet DTNs and become inaccessible.
This is being done in preparation for network maintenance to be performed after hours.
The storage will be remounted, and access restored, the following morning.
Maintenance · Site Service - Fort Collins - Tuesday, May 2 · 2023
Site Service Fort Collins will be unavailable while BISON engineers perform maintenance. Outages are expected. The entire maintenance window is reserved.
Maintenance · Atlas - all - Monday, May 1 · 2023
To replace a valve in the cooling loop supply for the Atlas cluster, a reservation has been made for Monday, May 1, beginning at 3:00am CST.
- No running jobs will be killed.
- All jobs that cannot complete before the maintenance start time will be held and started once the system has returned to operation.
Maintenance · Site Service - Ames - Wednesday, April 26 · 2023
Site Service at Ames will be impacted while Lumen performs maintenance. Outages are expected and the entire window is reserved.
Maintenance · Ceres - All · 2023
The data center that hosts Ceres cluster will have reduced cooling capacity starting the morning of April 12 and lasting through the end of the week.
To reduce the heat generated by Ceres compute nodes during this maintenance, a reservation has been created. New jobs will not start if they cannot complete by 6:00AM on April 12, 2023.
In the output of the squeue command, the reason for those jobs will state (ReqNodeNotAvail, Reserved for maintenance). The jobs will start after the scheduled outage completes.
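To see which of your own pending jobs are waiting on the reservation, the reason text can be filtered programmatically. The following is a minimal sketch, assuming squeue is available on the login node; the function names and the sample job IDs are made up for illustration. It uses squeue's `-h` (no header), `-u` (user), `-t` (state) and `-o` (format) options with the `%i` (job ID) and `%r` (reason) fields, which print the reason without the surrounding parentheses.

```python
import subprocess

# The reason text quoted in the announcement above.
MAINTENANCE_REASON = "ReqNodeNotAvail"


def jobs_held_for_maintenance(squeue_text):
    """Return the job IDs whose reason column mentions ReqNodeNotAvail.

    Expects one job per line in the form "<jobid>|<reason>".
    """
    held = []
    for line in squeue_text.splitlines():
        if "|" not in line:
            continue  # skip blank or unexpected lines
        job_id, reason = line.split("|", 1)
        if MAINTENANCE_REASON in reason:
            held.append(job_id.strip())
    return held


def query_squeue(user):
    """Run squeue for one user's pending jobs (requires a Slurm login node)."""
    result = subprocess.run(
        ["squeue", "-h", "-u", user, "-t", "PENDING", "-o", "%i|%r"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Made-up sample output, so the parsing can be tried without a cluster.
sample = (
    "1001|ReqNodeNotAvail, Reserved for maintenance\n"
    "1002|Priority\n"
    "1003|ReqNodeNotAvail, Reserved for maintenance\n"
)
print(jobs_held_for_maintenance(sample))  # ['1001', '1003']
```

On a cluster, `jobs_held_for_maintenance(query_squeue("your.username"))` would apply the same filter to live output; the user name here is a placeholder.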
Idle nodes will be turned off. Running jobs that had started prior to reservation will be allowed to continue running as long as the temperature in the data center does not exceed the set threshold.
The login and DTN nodes, as well as storage, are scheduled to stay up.
More nodes may be turned back on and be available for jobs on Thursday and Friday.
The Ceres cluster is expected to run at full capacity starting Monday, April 18.
Maintenance · Atlas - All (Atlas offline) - Tuesday, April 4 · 2023
The Mississippi State University High Performance Computing Collaboratory’s (MSU/HPC2) Computing Office has scheduled maintenance for the Atlas cluster.
During this maintenance window, the compute nodes and all support nodes (login, devel, dtn, ood, etc.), along with their services such as cron, Globus, and login, will be shut down and unavailable.
Helpdesk tickets should be submitted for any associated problems.
Maintenance · SCINet - Albany - Thursday, March 2 · 2023
The Albany site location will experience loss of connectivity to SCINet intermittently during the hours of 4:00 pm to 6:00 pm EST on March 2, 2023.
Maintenance · Ceres - All (Ceres offline) · 2023
Maintenance · Ceres - All (/project) - Thursday, October 27 · 2022
Due to recent issues with Ceres’ /project storage hardware, it needs to be replaced. The replacement hardware is expected to be delivered by the end of the day on 10/26/2022, and the work will probably be done on 10/27/2022.
Before replacing the hardware, we will post on the SCINet Forum and update the message of the day displayed at login to Ceres.
While replacing the hardware, Ceres’ /project will not be accessible. We plan to suspend all running jobs before unmounting /project and resume the jobs once the maintenance completes.
While we expect this will not affect running jobs, we recommend submitting new jobs to run on /90daydata to minimize the risk of the job dying due to this maintenance.
Maintenance · Ceres - All (Ceres offline) · 2022
Maintenance · Ceres - All (Ceres offline) · 2022
Maintenance · Atlas - All (connections to Atlas) - Tuesday, May 17 · 2022
Maintenance · Ceres - All (Ceres offline) - Monday, February 21 · 2022
Maintenance · SCINet - Stoneville - Thursday, January 20 · 2022
The maintenance window is one (1) hour in duration. This will impact service to the Stoneville site only.
Full cluster Maintenance · Atlas - All (connections to Atlas) - Wednesday, December 8 · 2021
On Wednesday, December 8, beginning at 8am CST, the HPC2 Computing Office has scheduled maintenance for the Atlas compute cluster. During this maintenance window, the login, devel, dtn, ood, and compute nodes for Atlas will be unavailable and all associated cron jobs will be disabled.
Downtime is expected to last most of the day. For any associated problems, submit a help desk ticket:
- help-usda@hpc.msstate.edu - specific atlas issues
- scinet_vrsc@usda.gov - general operational issues
Network Maintenance in Ames · SCINet - All (connections to Ceres) - Thursday, November 18 · 2021
SCINet network maintenance has been scheduled for Ames, IA. The maintenance window is from 8:30 to 10:30 Central Time (1430-1630 UTC) on 18 November 2021. Connectivity to SCINet will be sporadic during the maintenance window.
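The Central-to-UTC conversion quoted above can be checked with the standard library. This is a small sketch that uses a fixed UTC-6 offset for Central Standard Time, which is what applied on 18 November 2021 because daylight saving time had already ended; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Central Standard Time is UTC-6; it applies on 18 November 2021
# because daylight saving time had already ended.
CST = timezone(timedelta(hours=-6), "CST")

window_start = datetime(2021, 11, 18, 8, 30, tzinfo=CST)
window_end = datetime(2021, 11, 18, 10, 30, tzinfo=CST)

# Convert both ends of the window to UTC.
start_utc = window_start.astimezone(timezone.utc)
end_utc = window_end.astimezone(timezone.utc)

print(start_utc.strftime("%H%M"), "-", end_utc.strftime("%H%M"), "UTC")  # 1430 - 1630 UTC
```

A fixed offset is enough for a single known date; for dates on either side of a daylight saving change, a time zone database lookup would be the safer choice.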
Network Maintenance in Ames · SCINet - All (connections to Ceres) - Tuesday, November 16 · 2021
Connectivity to SCINet will be sporadic during the maintenance window.
Network Maintenance in Albany · SCINet - Albany - Monday, November 15 · 2021
Local connectivity to SCINet will be sporadic during the maintenance window.
Maintenance · Ceres - All (Ceres offline) - Thursday, November 11 · 2021
Ceres maintenance is scheduled for Thursday, November 11, 2021, to upgrade the internal cluster network.
Queued jobs will not start if they cannot complete by 6AM November 11. These include jobs submitted to the long partition with the default 3-week time limit. In the output of the squeue command, the reason for those jobs will state (ReqNodeNotAvail, Reserved for maintenance). The jobs will start after the scheduled outage completes.
The Atlas cluster will stay up and running during Ceres downtime. All Ceres users can run jobs on Atlas and use /90daydata that has no quotas.
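For the November 11 window, a job can only start beforehand if its time limit ends by 6AM that day. The sketch below estimates the longest limit that would still fit and formats it in the days-hours:minutes:seconds form that Slurm accepts for time limits. The helper name `longest_fitting_limit` and the sample time are assumptions for illustration; a queued job rarely starts immediately, so a real request would need some margin.

```python
from datetime import datetime, timedelta

MAINTENANCE_START = datetime(2021, 11, 11, 6, 0)  # 6AM November 11, from the announcement
DEFAULT_LONG_LIMIT = timedelta(weeks=3)           # default limit of the long partition


def longest_fitting_limit(now, maintenance_start=MAINTENANCE_START):
    """Return the longest time limit, as days-hours:minutes:seconds, that
    ends by `maintenance_start` for a job starting at `now`."""
    remaining = maintenance_start - now
    if remaining <= timedelta(0):
        return None  # the window has already begun
    total_seconds = int(remaining.total_seconds())
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


now = datetime(2021, 11, 1, 9, 0)  # sample time, an assumption
print(longest_fitting_limit(now))                     # 9-21:00:00
print(now + DEFAULT_LONG_LIMIT <= MAINTENANCE_START)  # False: the 3-week default does not fit
```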
Fiber relocation · Ceres - All (connections to Ceres) · 2021
The listed asset will be unavailable while Lumen engineers perform preventative fiber relocation work. Outage is expected to be two hours each day, but up to 5 hours is possible. The entire window is reserved.
Network update · Ceres, Juno - All (connections to Ceres, Juno) - Thursday, October 28 · 2021
A maintenance window has been scheduled for 28 October 2021 from 1530 - 1730 UTC (10:30am to 12:30pm Central time) to stabilize router (Albany MX480 RE Downgrade).
Periodic outages will be experienced as equipment is rebooted. Connectivity to Ceres and Juno cannot be guaranteed during the maintenance window.
Network update · Ceres, Juno - All (connections to Ceres, Juno) - Tuesday, October 26 · 2021
A maintenance window has been scheduled for 26 October 2021 from 4:30pm to 8:30pm Central time to stabilize the SCINet Network. Periodic outages will be experienced as equipment is rebooted. Connectivity to Ceres and Juno cannot be guaranteed during the maintenance window.
Router update · Ceres - All (connections to Ceres) - Tuesday, October 19 · 2021
The router at Ames will be rebooted at or about 4:30 CT. The reboot should take about 15 minutes. After that, the router will be upgraded to the latest OS. Outages may occur during that process.
Router update · SCINet - various · 2021
More SCINet network hardware OS updates. Check the announcement page for more details.
OS Upgrade · SCINet - various · 2021
GNOC plans to upgrade the OS on the SCINet gear at the 6 locations. This will result in connectivity interruptions during the upgrade. The upgrade schedule is the following:
- Albany - 9/16 8AM PST
- Clay Center - 9/16 4PM CST
- Ames - 9/17 8AM CST
- Stoneville - 9/20 8AM CST
- NAL - 9/20 3PM CST
- CSU - 9/21 9AM CST
Maintenance · Ceres - All (connections to Ceres) · 2021
This maintenance window will be longer than normal as there are several important hardware upgrades occurring during this window to enhance the overall power and capacity of the Ceres HPC cluster. These upgrades include the remaining new priority nodes, sixty-eight additional compute nodes, two additional high memory compute nodes, six management nodes, and faster InfiniBand switching technology used by the HPC nodes to access storage. VRSC will re-rack and re-wire the whole cluster to accommodate additional hardware while adhering to power and cooling limits.
Queued jobs will not start if they cannot complete by 7AM August 23. These include jobs submitted to the long partition with the default 3-week time limit. In the output of the squeue command, the reason for those jobs will state (ReqNodeNotAvail, Reserved for maintenance). The jobs will start after the scheduled outage completes.
The Atlas cluster will stay up and running during Ceres downtime. All Ceres users can run jobs on Atlas. If you don’t have a large enough project quota on Atlas, remember that you can use /90daydata on Atlas, which has no quotas.
Outage · Ceres · 2021
Connection Restored on 07-21-2021
Maintenance · Ceres - All (connections to Ceres) · 2021
The listed assets will be unavailable while contractors perform testing on the electrical service switchgear, generators, and turbine. Outages throughout the window are expected. The entire window is reserved.
Maintenance · Atlas - All (connections to Atlas) - Tuesday, February 23 · 2021
The HPC2 Computing Office has scheduled maintenance for its core networking services. During this time, all network connectivity both inside and outside the HPC2 will be unavailable, including access to the Atlas cluster systems.
Maintenance · Ceres - All (Ceres offline) - Tuesday, February 16 · 2021
Maintenance · Ceres - All (Ceres offline) - Monday, February 15 · 2021
Maintenance · Ceres - All (Ceres offline) - Monday, October 12 · 2020
UPS Maintenance · SCINet - Stoneville - Tuesday, August 25 · 2020
SCINet equipment will be shut down in order to perform maintenance on the UPS. SCINet connectivity at the Stoneville location will be impacted. The maintenance window is reserved from 0700 to 1600 Central Time.
Maintenance · Ceres - All (Ceres offline) · 2020
Planned power outage · SCINet & AWS - Multiple locations · 2020
SCINet equipment at the National Agricultural Library will be powered down in advance of a planned power outage to the NAL building. The outage is expected to last for 24 hrs or less. We expect that normal access to SCINet resources will be restored on or before Monday, April 20.
Please check Basecamp during the outage period for updates.
Router migration · SCINet - Ft Collins - Thursday, March 19 · 2020
Router replacement · SCINet - Ft Collins - Thursday, March 12 · 2020
Router replacement · SCINet - Clay Center - Monday, March 2 · 2020
Maintenance · Ceres - All (Ceres offline) - Monday, February 17 · 2020
Upgrades/expansion · Ceres - All (Ceres offline) · 2019
Ceres downtime is scheduled for Monday, December 2 - Friday, December 6. This downtime is to rewire both power and networking on Ceres to add compute nodes and to ready it for storage expansion.
We do not anticipate any further extended downtimes for rewiring, as this should allow us to maximize the size of Ceres simply by adding compute nodes.
Since this affects authentication for SCINet, it will also affect logins to the data transfer nodes at Ames, Stoneville, Fort Collins, Clay Center, Albany CA, and Beltsville.
GlobalNOC will also be upgrading software on the SCINet network infrastructure during this time.