In moderately sized PALS jobs, the tool sometimes attaches to a process that is already past the startup barrier, even though cti_releaseAppBarrier has not yet been called.
Attaching at the barrier, as expected (the process is stopped in pals_start_barrier):
[57.938218490]info stack
&"info stack\n"
~"#0 0x00002b99d2b92707 in kill () from /lib64/libc.so.6\n"
~"#1 0x00002b99d4971123 in pals_start_barrier (state=state@entry=0x2b99d3a0db00) at /workspace/rpmbuild/BUILD/cray-pals-1.1.3/src/libpals/libpals.c:843\n"
~"#2 0x00002b99d37ff64e in _pmi_pals_sync () at /workspace/src/pals/pals_utils.c:408\n"
~"#3 0x00002b99d37f6ab4 in _pmi_init (spawned=spawned@entry=0x7ffd06de0c1c) at /workspace/src/pmi_core/_pmi_init.c:1431\n"
~"#4 0x00002b99d37f74f4 in _pmi_constructor () at /workspace/src/pmi_core/_pmi_init.c:366\n"
~"#5 0x00002b99cf708aba in call_init.part () from /lib64/ld-linux-x86-64.so.2\n"
~"#6 0x00002b99cf708bc6 in _dl_init () from /lib64/ld-linux-x86-64.so.2\n"
~"#7 0x00002b99cf6f9eda in _dl_start_user () from /lib64/ld-linux-x86-64.so.2\n"
~"#8 0x0000000000000002 in ?? ()\n"
~"#9 0x00007ffd06de261e in ?? ()\n"
~"#10 0x00007ffd06de263e in ?? ()\n"
~"#11 0x0000000000000000 in ?? ()\n"
Attaching inside MPI_Init, after the barrier (no pals_start_barrier frame; the process is already in main):
[58.538222668]info stack
&"info stack\n"
~"#0 0x00002b10b6ff64eb in _pmi_smp_barrier_join (smp_bar=0x2b10b7218310, restrict_to_app=restrict_to_app@entry=0) at /workspace/src/pmi_core/smp_barrier.c:81\n"
~"#1 0x00002b10b6fee137 in _pmi_barrier (bar_tag=bar_tag@entry=BARRIER_PACKET, restrict_to_app=restrict_to_app@entry=0) at /workspace/src/pmi_core/_pmi_barrier.c:50\n"
~"#2 0x00002b10b6ff90d1 in PMI_Barrier () at /workspace/src/api/coll/pmi_barrier.c:27\n"
~"#3 0x00002b10b6ff9977 in PMI2_Init (spawned=0x7ffc4ecea6a0, size=0x7ffc4ecea6a8, rank=0x7ffc4ecea6a4, appnum=0x7ffc4ecea6ac) at /workspace/src/api/misc/pmi_init.c:182\n"
~"#4 0x00002b10b577bd41 in MPIR_pmi_init () from /opt/cray/pe/lib64/libmpi_gnu_91.so.12\n"
~"#5 0x00002b10b5780f76 in MPID_Init () from /opt/cray/pe/lib64/libmpi_gnu_91.so.12\n"
~"#6 0x00002b10b3cec96d in MPIR_Init_thread () from /opt/cray/pe/lib64/libmpi_gnu_91.so.12\n"
~"#7 0x00002b10b3cec744 in PMPI_Init () from /opt/cray/pe/lib64/libmpi_gnu_91.so.12\n"
~"#8 0x000000000040138d in main (argc=2, argv=0x7ffc4ecea908)\n"
The call to cti_releaseAppBarrier occurred at 59.749728688 in this run.
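To check many ranks for this condition without reading each trace by hand, the two cases above can be told apart by whether a pals_start_barrier frame appears in the stack. The following is a minimal sketch, not part of the tool: the helper names frame_functions and held_at_barrier are hypothetical, and it assumes the input is the GDB/MI console-stream records (lines starting with ~") exactly as they appear in the logs above.

```python
import re

# Matches GDB/MI console-stream frame records such as:
#   ~"#1 0x00002b99d4971123 in pals_start_barrier (state=...) at ...\n"
# Group 1 is the frame number, group 2 the function name.
FRAME_RE = re.compile(r'^~"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(')


def frame_functions(mi_lines):
    """Return the function names in a backtrace, innermost frame first."""
    names = []
    for line in mi_lines:
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def held_at_barrier(mi_lines, barrier_func="pals_start_barrier"):
    """True if the barrier function is somewhere on the stack.

    Assumption: a process still held at the PALS startup barrier has a
    pals_start_barrier frame, as in the first trace of this report.
    """
    return barrier_func in frame_functions(mi_lines)


if __name__ == "__main__":
    # First two frames of each trace from this report, copied verbatim.
    expected = [
        r'~"#0 0x00002b99d2b92707 in kill () from /lib64/libc.so.6\n"',
        r'~"#1 0x00002b99d4971123 in pals_start_barrier (state=state@entry=0x2b99d3a0db00) at /workspace/rpmbuild/BUILD/cray-pals-1.1.3/src/libpals/libpals.c:843\n"',
    ]
    unexpected = [
        r'~"#0 0x00002b10b6ff64eb in _pmi_smp_barrier_join (smp_bar=0x2b10b7218310, restrict_to_app=restrict_to_app@entry=0) at /workspace/src/pmi_core/smp_barrier.c:81\n"',
        r'~"#1 0x00002b10b6fee137 in _pmi_barrier (bar_tag=bar_tag@entry=BARRIER_PACKET, restrict_to_app=restrict_to_app@entry=0) at /workspace/src/pmi_core/_pmi_barrier.c:50\n"',
    ]
    print("expected trace held at barrier:", held_at_barrier(expected))
    print("unexpected trace held at barrier:", held_at_barrier(unexpected))
```

Matching on the function name rather than on addresses or source line numbers keeps the check independent of the cray-pals build and of per-process load addresses, both of which differ between the two traces above.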