[mephi-hpc] no access to rhic yet again

Alexandra Freidzon freidzon.sanya at gmail.com
Wed Nov 22 17:24:39 MSK 2017


Good afternoon,

It is good that Cherenkov is working again. But /mnt/pool/3 and
/mnt/pool/4 are unbearably slow, and running dmesg | less shows
roughly this horror:

[2558309.519504] nfs: server 192.168.137.252 not responding, timed out
[2558615.677485] nfs: server 192.168.137.252 not responding, timed out
[2558921.835238] nfs: server 192.168.137.252 not responding, timed out
[2559227.993087] nfs: server 192.168.137.252 not responding, timed out
[2559534.151014] nfs: server 192.168.137.252 not responding, timed out
[2559834.677126] nfs: server 192.168.137.252 not responding, timed out
[2560065.063502] nfs: server 192.168.137.252 not responding, timed out
[2560065.063773] nfs: server 192.168.137.252 not responding, timed out
[2560263.195780] nfs: server 192.168.137.252 not responding, timed out
[2560309.287821] nfs: server 192.168.140.251 OK
[2570882.022947] nfs: server 192.168.150.2 not responding, timed out
[2621162.733726] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621166.741974] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621171.749697] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621176.757392] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621182.764574] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621186.772782] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621191.780521] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621197.787515] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621201.795920] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621206.803630] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621211.811286] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621216.818989] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621221.826723] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621227.833869] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621231.842167] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621236.849847] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621242.857010] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621246.865210] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621251.872960] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621256.880622] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621261.888341] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621266.896063] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621272.903409] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621277.919369] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621282.927086] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
[2621287.934786] NFS: state manager: check lease failed on NFSv4 server 192.168.140.251 with error 13
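
To make a log like the one above easier to report, the messages can be tallied per server. The following is only a rough sketch: the function name summarize and both regular expressions are my own assumptions, written against the two message types visible in this excerpt, and they may not match other kernel versions.

```python
import re
from collections import Counter

# Patterns are assumptions derived only from the lines quoted above.
TIMEOUT_RE = re.compile(r"nfs: server (\S+) not responding, timed out")
# \s+ tolerates the line wrap that mail clients insert before "server".
LEASE_RE = re.compile(
    r"check lease failed on NFSv4\s+server (\S+) with error (\d+)"
)


def summarize(log_text):
    """Count timeout messages per server and lease failures per (server, error)."""
    timeouts = Counter(TIMEOUT_RE.findall(log_text))
    lease_failures = Counter(LEASE_RE.findall(log_text))
    return timeouts, lease_failures
```

For example, after saving the output of dmesg to a file (dmesg.txt is a hypothetical name), summarize(open("dmesg.txt").read()) returns the two counters. On Linux, errno 13 is EACCES (permission denied); that the number printed by the NFS state manager is a plain errno value is my reading, not something the log itself confirms.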

It is really, reeeeally hard to work like this.

On 22 November 2017 at 11:54, anikeev <anikeev at ut.mephi.ru> wrote:
> On Tue, 2017-11-14 at 11:06 +0300, anikeev wrote:
>> On Sat, 2017-11-11 at 22:37 +0300, Alexandra Freidzon wrote:
>> > and all 4 pools have dropped off too
>
> Good afternoon!
>
> The /mnt/pool/3 and /mnt/pool/4 storage systems and the Cherenkov
> cluster have been officially returned to service. The problems were
> caused by a CPU microcode bug that showed up infrequently on the new
> OS kernel. The microcode has been updated. Statistics collected after
> the update revealed no anomalies. In practice, experimental operation
> has been running since 14 November.
>
>> Good afternoon!
>>
>> A new scheme for connecting /mnt/pool/1, /mnt/pool/2 and
>> /mnt/pool/rhic to the University cluster and the Basov cluster has
>> been put into trial operation. Work on the /mnt/pool/3 and
>> /mnt/pool/4 storage systems and the Cherenkov cluster continues.
>> Further information will be posted as the work progresses.
>>
>> > On 11 Nov 2017 at 13:38, "Олегъ" <oleg.golosov at gmail.com>
>> > wrote:
>> > > you don't even need a fortune teller...
>> > > as soon as the weekend comes, rhic drops off...
>> > > it is very hard to work like this...
>> > > _______________________________________________
>> > > hpc mailing list
>> > > hpc at lists.mephi.ru
>> > > https://lists.mephi.ru/listinfo/hpc
>> > >
>> >
> --
> Best regards,
> Engineer, Unix Technologies Department, MEPhI,
> Artem Anikeev.
> Tel.: 8 (495) 788-56-99, ext. 8998

