Merged
12 changes: 8 additions & 4 deletions xprof/xprof.rb.in
@@ -251,11 +251,15 @@ module MPITopo
   number_hostname, hostname_index = if hostfile.nil?
     [10_000, rank_id / local_size]
   else
-    # splitting by "." in case the hostfile contains full hostnames with address
-    # (e.g. x4117c4s4b0n0.hsn.cm.aurora.alcf.anl.gov) and the Socket.gethostname is just x4117c4s4b0n0
-    hostnames = File.readlines(hostfile).map { |el| el.split('.', 2).first }
+    hostnames = File.readlines(hostfile)
     # find index of hostname_string in list_hostnames
-    hostname_id = hostnames.find_index(Socket.gethostname)
+    hostname_id = hostnames.find_index{ |host| host.start_with?(Socket.gethostname) }
Collaborator

So we can do it in one line now :D
File.readlines(hostfile).find_index{ |hostname| hostname.start_with?(Socket.gethostname) }

Contributor Author

Ah, we still need the hostnames list to get number_hostname (it is hostnames.length). We could make this change, but then we'd have to read the hostfile a second time to get the number of hosts.

Collaborator

@TApplencourt TApplencourt Feb 16, 2026

Oh true, my bad then! Didn't read enough of the diff :D

+    # without "start_with" this could fail if something happens like on sunspot
+    # where Socket.gethostname returns x1921c1s2b0n0 but the hostfile contains
+    # x1921c1s2b0n0-hsn0.hsn.cm.sunspot.alcf.anl.gov.
+    if hostname_id.nil?
+      raise "Trying to find the index of each hostname in the hostfile list did not work."
+    end
     # compute the offset, based on splitting MAX_UINT64_VALUE by the number of hosts
     [hostnames.length, hostname_id]
   end
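
To make the behaviour discussed in this thread concrete, here is a minimal standalone sketch of the prefix-matching lookup. It is not the code from xprof.rb.in: the helper name host_count_and_index and its arguments are hypothetical, and the hostname is passed in explicitly (instead of calling Socket.gethostname) so that the example can run on any machine. The second hostfile entry is made up for illustration. The sketch reads the list once and returns both the number of hosts and the index of the current host.

```ruby
# Hypothetical helper (not part of xprof): returns [number_of_hosts, index_of_this_host].
# hostfile_lines is what File.readlines(hostfile) would return;
# short_hostname stands in for Socket.gethostname.
def host_count_and_index(hostfile_lines, short_hostname)
  # File.readlines keeps the trailing newline on each entry, so strip it.
  hostnames = hostfile_lines.map(&:strip)
  # Prefix match: a hostfile entry may carry a suffix such as "-hsn0.hsn.cm...".
  index = hostnames.find_index { |host| host.start_with?(short_hostname) }
  raise "hostname #{short_hostname} not found in hostfile" if index.nil?

  [hostnames.length, index]
end

lines = [
  "x1921c1s2b0n0-hsn0.hsn.cm.sunspot.alcf.anl.gov\n",
  "x1921c1s5b0n0-hsn0.hsn.cm.sunspot.alcf.anl.gov\n",
]

# An exact comparison finds nothing, because no entry equals the short name.
exact_index = lines.map(&:strip).find_index("x1921c1s5b0n0")
p exact_index                                      # prints nil

# The prefix match finds the second entry and reports two hosts in total.
p host_count_and_index(lines, "x1921c1s5b0n0")     # prints [2, 1]
```

One caveat of any prefix match: if one short hostname is itself a prefix of another (for example node1 and node10), find_index returns the first entry that matches, which may not be the intended host.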