Affected services:

  • Crane

Crane: Network outage

Opened on Wednesday 23rd December 2020, last updated

Resolved

No further issues with the switch have been noted. We will investigate further during the upcoming downtime.

Posted by John Thiltges

Resolved

Switch replaced, all is well.

Posted by Garhan Attebury

Monitoring

Crane is back in service. Jobs that were running during the outage (Dec 23 at 5:00pm to Dec 24 at 10:45am) may not have completed successfully, and we encourage you to check their results for consistency.
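As a rough aid for the check suggested above, here is a minimal Python sketch that flags whether a job's run interval overlaps the stated outage window. The function name and the example timestamps are hypothetical, the times are assumed to be in the cluster's local time, and how you obtain job start and end times from the scheduler is left out.

```python
from datetime import datetime

# Outage window as stated in this notice (assumed local cluster time).
OUTAGE_START = datetime(2020, 12, 23, 17, 0)
OUTAGE_END = datetime(2020, 12, 24, 10, 45)


def overlaps_outage(job_start, job_end=None):
    """Return True if the job's run interval intersects the outage window.

    job_end=None means no end time was recorded; the job is then treated
    as having run from job_start onward.
    """
    if job_end is None:
        return job_start < OUTAGE_END
    return job_start < OUTAGE_END and job_end > OUTAGE_START


# Hypothetical job that started before the outage and ended during it.
print(overlaps_outage(datetime(2020, 12, 23, 12, 0), datetime(2020, 12, 23, 18, 0)))
```

A job flagged this way is only a candidate for review. Whether it actually failed still has to be confirmed from its output and logs.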

The network switch connecting the Crane head and login node stopped responding yesterday afternoon. The switch is back in service after a reboot. We will continue to monitor it and replace hardware if necessary.

Posted by John Thiltges

Identified

Crane will remain offline until further notice: a networking component critical to the cluster has failed, and we are unable to work around it remotely.

Because of hazardous weather conditions, HCC staff will not visit the datacenter tonight and will attempt to remedy the situation tomorrow (December 24th), when travel is safer.

Posted by Garhan Attebury

Investigating

Crane is currently unreachable via the network. HCC staff are investigating the issue and will work to resolve it as soon as possible.

Posted by Adam Caprez