Co_broadcast_cptr#2

Closed
bonachea wants to merge 6 commits into main from co_broadcast_cptr

@bonachea
Owner

@bonachea bonachea force-pushed the co_broadcast_cptr branch from 1ade4f5 to 1a27941 Compare March 18, 2026 21:23
@bonachea bonachea changed the base branch from co_reduce_pdt to main March 19, 2026 03:07
@bonachea bonachea force-pushed the co_broadcast_cptr branch from 1a27941 to 37aa64a Compare March 21, 2026 03:55
@bonachea bonachea force-pushed the co_broadcast_cptr branch 2 times, most recently from 3c116f1 to ad1daf7 Compare April 20, 2026 03:49
@bonachea
Owner Author

@certik while you're fixing LFortran bugs impacting Caffeine, here's another branch for an orthogonal forthcoming Caffeine feature that also fails (at runtime) with LFortran 0.62 and latest. It's getting a SEGV inside broadcast_derived_type() in test/prif_co_broadcast_test.F90

Here is a crash stack from LFortran 0.62:

[0] #13 0x000077552cc288ff in __GI_abort () at ./stdlib/abort.c:79
[0] #14 0x000077552cc297b6 in __libc_message_impl (fmt=fmt@entry=0x77552cdce8d7 "%s\n") at ../sysdeps/posix/libc_fatal.c:134
[0] #15 0x000077552cca8ff5 in malloc_printerr (str=str@entry=0x77552cdd1ac0 "double free or corruption (out)") at ./malloc/malloc.c:5775
[0] #16 0x000077552ccab120 in _int_free_merge_chunk (av=0x77552ce03ac0 <main_arena>, p=0x60297a7c4040, size=105731264887424) at ./malloc/malloc.c:4676
[0] #17 0x000077552ccaddce in __GI___libc_free (mem=0x60297a7c4050) at ./malloc/malloc.c:3398
[0] #18 0x0000602961e898f5 in finalize_StructType__object_t_of_prif_co_broadcast_test_m ()
[0] #19 0x0000602961e87527 in __module_prif_co_broadcast_test_m_broadcast_derived_type ()
[0] #20 0x0000602962011872 in __module_julienne_test_description_m_run ()
[0] #21 0x0000602962036ba2 in __module_julienne_test_m_run ()
[0] #22 0x0000602961e87cba in __module_prif_co_broadcast_test_m_results ()
[0] #23 0x0000602962033c98 in __module_julienne_test_m_report ()
[0] #24 0x000060296202d93e in __module_julienne_test_fixture_m_report ()
[0] #25 0x000060296202e82b in __module_julienne_test_harness_m_report_results ()
[0] #26 0x0000602961e69e7b in main ()

@certik

certik commented Apr 20, 2026

I can confirm that the previous main (328531f62f3d5c50b575003f936d0853712d8c1b) indeed failed. For me it fails with:

PRIF Coarray Inquiries
*** Caught a fatal signal (proc 0): SIGABRT(6)
*** NOTICE (proc 0): Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
*** NOTICE (proc 0): We recommend linking the debug version of GASNet to assist you in resolving this application issue.
<ERROR> Execution for object " julienne-driver " returned exit code  6
<ERROR> *cmd_run*:stopping due to failed executions
STOP 6

The latest LFortran main (e9c464d2521a3bb5225c47e7752d91b65c2fdec6) compiles and runs all tests for this.

So I think this is now fixed thanks to lfortran/lfortran#11181.

@bonachea
Owner Author

bonachea commented Apr 20, 2026

> The latest LFortran main (e9c464d2521a3bb5225c47e7752d91b65c2fdec6) compiles and runs all tests for this.
>
> So I think this is now fixed thanks to lfortran/lfortran#11181.

@certik The failing CI run I linked above used lfortran:latest from about 30 minutes ago, which I think already includes your changes from lfortran/lfortran#11181. I'm re-running it now to be sure.

@certik

certik commented Apr 20, 2026

OK, we'll have to figure out how to extract an MRE for this. I opened lfortran/lfortran#11191 to track it.

@bonachea
Owner Author

@certik: Failed again just now with lfortran:latest:

  Digest: sha256:33b503f927b869a889d9885be8330ba8485082861fa51c22b503e33ff27f7040
  Status: Downloaded newer image for ghcr.io/lfortran/lfortran:latest
  ghcr.io/lfortran/lfortran:latest
...
+ lfortran --version
LFortran version: 0.62.0-234-ge9c464d25
Status: alpha (expected to fail on third-party codes)
Platform: Linux
LLVM: 11.1.0
Default target: x86_64-unknown-linux-gnu
LSP Version: 3.17.0
JSON_RPC Version: 2.0
...
prif_co_broadcast
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
*** Caught a fatal signal (proc 6): SIGABRT(6)

@bonachea bonachea force-pushed the co_broadcast_cptr branch 6 times, most recently from 6858cf0 to a0ee388 Compare April 24, 2026 04:37
@bonachea
Owner Author

bonachea commented Apr 29, 2026

The original code where the LFortran defect arose is archived in this branch:

https://github.com/bonachea/caffeine/tree/refs/heads/co_broadcast_cptr-lfortran1

This pull request will advance by deploying a workaround that skips the failing test on LFortran, at least for now.

@bonachea bonachea force-pushed the co_broadcast_cptr branch 3 times, most recently from 882af9b to 434ba42 Compare May 2, 2026 03:37
bonachea added 6 commits May 4, 2026 19:28
This particular part of the test code is unreachable in older
versions of GFortran.
Add `sequence` to derived types used in PRIF's contiguous communication
calls, to ensure a flat linear storage layout for use in communicating the
raw storage sequence.
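
For illustration only, here is a minimal sketch of what adding `sequence` to a derived type looks like. The type `payload_t` and its components are hypothetical and do not come from this PR or from Caffeine's test suite. The standard requires a sequence type's components to be of intrinsic or sequence type, and a sequence type cannot have type-bound procedures or be extended, so this only suits plain data types.

```fortran
program sequence_layout_sketch
  implicit none

  ! Hypothetical type for illustration; not taken from the PR.
  type :: payload_t
    sequence                  ! components are stored in declaration order
    integer :: id
    real    :: values(3)
    logical :: flag
  end type

  type(payload_t) :: p

  p = payload_t(1, [1.0, 2.0, 3.0], .true.)

  ! storage_size reports the size in bits; a raw-storage transfer would
  ! move this many bytes. The exact value is compiler-dependent.
  print *, "bytes in payload_t:", storage_size(p) / 8
end program sequence_layout_sketch
```

Without `sequence` (or `bind(C)`), the standard does not pin down the component storage order, which is presumably why the commit adds it before handing the raw storage to contiguous communication calls.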
@bonachea bonachea force-pushed the co_broadcast_cptr branch from 434ba42 to 61121bc Compare May 5, 2026 02:28
@bonachea
Owner Author

bonachea commented May 5, 2026

Superseded by BerkeleyLab#319

@bonachea bonachea closed this May 5, 2026