[mephi-hpc] error in calculation

Rynat Bahdanovich RBBogdanovich at mephi.ru
Mon Dec 25 21:26:12 MSK 2017


You need to run run11.sh or run10.sh in the MCUPTR_11 or MCUPTR_10 directory.

-----Original Message-----
From: hpc [mailto:hpc-bounces at lists.mephi.ru] On Behalf Of anikeev
Sent: Monday, December 25, 2017 5:51 PM
To: NRNU MEPhI HPC discussion list <hpc at lists.mephi.ru>
Subject: Re: [mephi-hpc] error in calculation

On Mon, 2017-12-25 at 12:18 +0000, Rynat Bahdanovich wrote:
> Good afternoon, errors are occurring in the calculation (everything had
> computed normally until Thursday, throughout the past year).

Good evening!

> Could you tell me, please, whether this is a temporary problem?

Could you tell me how I can reproduce the error without damaging your data?

> Error 1.
>  
> MCU Step: state input
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 25327 on node n121
> exiting improperly. There are three reasons this could occur:
>  
> 1. this process did not call "init" before exiting, but others in the 
> job did. This can cause a job to hang indefinitely while it waits for 
> all processes to call "init". By rule, if one process calls "init", 
> then ALL processes must call "init" prior to termination.
>  
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to 
> exiting or it will be considered an "abnormal termination"
>  
> 3. this process called "MPI_Abort" or "orte_abort" and the mca 
> parameter orte_create_session_dirs is set to false. In this case, the 
> run-time cannot detect that the abort call was an abnormal 
> termination. Hence, the only error message you will receive is this 
> one.
>  
> This may have caused other processes in the application to be 
> terminated by signals sent by mpirun (as reported here).
>  
> You can avoid this message by specifying -quiet on the mpirun command 
> line.
>  
>  
> Error 2.
>  
>  
>  MCU Step: state input
> Warning: state input has already been finished. Restored.
>  
>  MCU Step: state calculation
>  
>   WARNINGS in initial data of MCU:           0
>   ERRORS   in initial data of MCU:           0
>  
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned a non-zero
> exit code. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero 
> status, thus causing the job to be terminated. The first process to do 
> so was:
>  
>   Process name: [[54185,1],31]
>   Exit code:    2
>  
>  
>  
> [n121][[54185,1],15][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv]
> mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104) 
> [n108][[54185,1],95][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv]
> mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104) 
> [n113][[54185,1],63][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv]
> mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> At line 17112 of file MCUmpi.F90 (unit = 20, file =
> '/mnt/pool/2/rynatb/MCUPTR_10/PIN-GAP_BASOV/c2m6_62.039--16-BASOV_PG.MCU_P31')
> Fortran runtime error: Operation now in progress
>  
>  
> Best regards,
> Rynat
> 
> --
> Rynat Bahdanovich
>  
> Postgraduate student, assistant
> National Research Nuclear University "MEPhI"
> Department of Theoretical and Experimental Physics of Nuclear Reactors (№5)
> Moscow, Russia, +7 (495) 788 56 99 (ext. 9364), +7 (925) 846 28 14
> RBBogdanovich at mephi.ru
>  
>  
> _______________________________________________
> hpc mailing list
> hpc at lists.mephi.ru
> https://lists.mephi.ru/listinfo/hpc
--
Best regards,
Artem Anikeev,
engineer, Unix Technologies Department, MEPhI
Tel.: 8 (495) 788-56-99, ext. 8998

