kebnekaise

2024-06-12 15.00 Compute clusters and access nodes down for power maintenance

  • Posted on: 3 June 2024
  • By: brorerik

Wednesday 2024-06-12 15.00 - Thursday 2024-06-13 12.00

Akademiska Hus is performing a periodic revision of the power supply to the MIT-building.

This affects the power to the compute clusters at HPC2N including the access nodes.

The work requires the power to be turned off and thus all clusters will need to be drained and taken down.

We have therefore set a reservation on the clusters, preventing any jobs from running during that time frame.
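As a rough illustration of what the reservation means for job planning: a job can normally only start ahead of a maintenance reservation if its requested walltime ends before the window begins. The sketch below uses the window from this announcement; the function names are hypothetical helpers, not part of Slurm or any HPC2N tool, and the actual scheduling decision is always made by the batch system.

```python
from datetime import datetime, timedelta

# Maintenance window taken from the announcement above.
MAINT_START = datetime(2024, 6, 12, 15, 0)
MAINT_END = datetime(2024, 6, 13, 12, 0)


def fits_before_maintenance(start: datetime, walltime: timedelta) -> bool:
    """Return True if a job starting at `start` with the requested
    walltime would end no later than the start of the window."""
    return start + walltime <= MAINT_START


def longest_walltime(start: datetime) -> timedelta:
    """Longest walltime that still ends before the window begins
    (zero if `start` is already at or past the window start)."""
    return max(MAINT_START - start, timedelta(0))
```

For example, a job that could start at 09.00 on 2024-06-12 would fit with a 6 hour walltime but not with 7 hours; longer jobs would be expected to wait in the queue until the window has passed.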

 

  • Posted on: 16 October 2023
  • By: bbrydsoe

2023-06-09 A mishap with Slurm caused a loss of the job accounting data for Kebnekaise jobs today between 00:00 and 16:40

  • Posted on: 9 June 2023
  • By: brorerik

We can see no other effect on running jobs, and the job queues are now open again after having been DOWN for 1 hour.

If you see any other negative effects, send us a support case and we'll help solve the issue.

Sorry for the inconvenience that this may have caused.

Best regards,

/Support

2023-01-30 07:00 Planned maintenance of the cooling systems and central file system (FINISHED 2023-02-02 20:30)

  • Posted on: 20 January 2023
  • By: brorerik

Akademiska Hus has planned maintenance of the cooling systems for the HPC2N infrastructure computer hall on 2023-02-01.

We'll coordinate an upgrade of the central file system around their maintenance to minimize the time the cluster is draining jobs.

The combined maintenance window will therefore start on 2023-01-30 07:00 and, according to our planning, end on 2023-02-03 16:00.

All Kebnekaise nodes, central storage and the login nodes will be unavailable during this time.

2022-12-05 File system down, login not working (SOLVED 23:58)

  • Posted on: 5 December 2022
  • By: brorerik

We are currently experiencing file system server problems.

This is blocking logins and is also affecting running jobs.

We're working to get it back online but currently have no ETA for this.

UPDATE 23:58

The issues have now been resolved. All systems are working normally and the job queues are active.

UPDATE 17:30

The file system verification work continues. The job queues will not be up until late this evening or around 09.00 tomorrow.

UPDATE 13:10

2022-08-05 08:32 File system server problems, logins affected (SOLVED 2022-08-05 10:20)

  • Posted on: 5 August 2022
  • By: brorerik

We are currently experiencing file system server problems.

This causes problems with logins and any access to the file system.

We're working to get it back online but currently have no ETA for this.

UPDATE 2022-08-05 10:20

The problem has been solved and the system is back online and working normally.

Sorry for the problems this has caused.

/Support

Updated: 2024-11-01, 13:56