BUG: Fix locale-dependent parsing in NRRD reader causing metadata corruption #5684

Copilot · 2025-12-12T17:38:28Z

Added Google Test for NumericLocale per @blowekamp and @dzenanz (commit 0fc849d):

Test coverage:

Verifies temporary setting to C locale
Tests float parsing with dot decimal separator
Tests nesting behavior
Tests with German locale (de_DE.UTF-8) if available
Tests sequential uses
Tests basic RAII cleanup

All 6 tests pass.

Original prompt

This section details the original issue you should resolve

<issue_title>Locale-Dependent Parsing in ITK NRRD Reader Causes Silent Metadata Corruption</issue_title>
<issue_description>### Description
ITK’s NRRD reader parses floating-point metadata (such as spacing, direction vectors, and other numeric header fields) using locale-dependent number parsing (strtod).
In numeric locales where the decimal separator is a comma (for example de_DE.UTF-8, common in many European countries), values containing a dot such as 0.878906 are parsed incorrectly. In such locales, strtod("0.878906") yields 0.0.

This problem leads to two kinds of failures:

Silent metadata corruption (no error raised)
Values greater than 1 that have a fractional part (for example 3.5 or 2.2) may be misparsed, with the fractional part ignored, without causing an error. This can silently corrupt spacing, orientation, or other critical metadata. The image loads and all downstream computations use incorrect metadata.
Hard errors when spacing becomes 0
When a fractional spacing less than 1 (for example 0.878906 or 0.8) is parsed as 0.0, ITK sometimes throws:
Zero-valued spacing is not supported.
This error exposes the bug, but only for particular values. For many other metadata fields and values the corruption is completely silent.

The same issue was already reported here, but was never resolved:
#3375

A similar issue was previously identified and fixed for VTK files:
#2297

Impact

This issue can silently corrupt metadata when reading NRRD files on systems with non-English numeric locales. This includes:

space directions
space origin
spacing
measurements encoded in metadata
values in DICOM-derived metadata fields stored in NRRD
any numeric field parsed through locale-dependent routines

This is particularly problematic in medical imaging, where spacing, orientation, and geometric metadata directly affect:

registration
segmentation
dose calculation
physical measurement interpretation
reconstruction algorithms

The most serious aspect is that metadata can be corrupted without any warning or error message. The bug was only discovered because in some cases spacing becomes exactly zero, triggering ITK’s Zero-valued spacing is not supported check. In many other cases (for example when only the fractional part is lost, or when values are truncated but remain positive) the corruption is completely silent and can remain undetected.

The issue is typically triggered only when the host application explicitly applies the system locale, which is common in GUI frameworks such as Qt. This is why the bug appears in some environments (for example napari or other Qt-based tools) while plain C++ programs often appear unaffected.

Root Cause

Many GUI frameworks, such as Qt, call:

setlocale(LC_ALL, "");

to apply the system locale. If the system uses a comma as the decimal separator (as is standard in many European countries), then functions like strtod only parse floats correctly when they are written with a comma as the decimal separator.

Example:

In C locale: strtod("0.878906") → 0.878906
In de_DE.UTF-8 locale: strtod("0.878906") → 0.0

Thus, a valid NRRD header field such as:

space directions: (0.878906,0,0) (0,3,0) (0,0,3)

may be parsed by ITK as something like:

(0.0, 0, 0)
(0, 3, 0)
(0, 0, 3)

If the corrupted value results in spacing zero, ITK throws an error.
If the corrupted value remains positive (for example if only the fractional part is dropped), the metadata is accepted but wrong, and no error is raised.
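
The behaviour described in this section can be reproduced outside ITK. The following is a minimal sketch, not code from ITK or NrrdIO: `ParseUnderLocale` is a hypothetical helper that switches `LC_NUMERIC` with `std::setlocale`, calls `std::strtod`, and restores the previous setting. It assumes C++17 and that the named locale is installed; when the locale is missing it returns an empty optional. Because it changes the process-wide locale, it is only suitable for a single-threaded demonstration.

```cpp
#include <clocale>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper for illustration only; not part of ITK or NrrdIO.
// Parses `text` with std::strtod while LC_NUMERIC is set to `localeName`.
// Returns std::nullopt if the requested locale cannot be selected.
// Not thread-safe: std::setlocale changes process-wide state.
std::optional<double>
ParseUnderLocale(const char * text, const char * localeName)
{
  // Copy the current setting first; the returned pointer may be invalidated
  // by the next call to std::setlocale.
  const char *      current = std::setlocale(LC_NUMERIC, nullptr);
  const std::string saved = current ? current : "C";

  if (std::setlocale(LC_NUMERIC, localeName) == nullptr)
  {
    return std::nullopt; // locale unavailable, nothing was changed
  }

  const double value = std::strtod(text, nullptr);

  // Restore whatever was active before the call.
  std::setlocale(LC_NUMERIC, saved.c_str());
  return value;
}
```

On a system where `de_DE.UTF-8` is installed and uses a comma as the decimal separator, `ParseUnderLocale("0.878906", "de_DE.UTF-8")` would be expected to stop at the dot and return 0.0, and `ParseUnderLocale("3.5", "de_DE.UTF-8")` to return 3.0, which correspond to the zero-spacing and silent-truncation cases above.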

Minimal Reproducible Examples (Python, self-contained)

The following examples assume that the de_DE.UTF-8 locale is installed on the system.
They demonstrate both the silent corruption case and the zero-spacing error case, using only Python and ITK. Each example:

Creates a random 3D numpy array.
Writes it to NRRD using ITK under a safe numeric locale (C).
Switches the numeric locale to de_DE.UTF-8.
Reads the same NRRD with ITK and inspects the spacing.

Depending on the actual parsing behavior, the spacing may be silently wrong or may cause an exception.

Note: depending on how the ITK Python wrappers format spacing, component order in printing may appear as (z, y, x) or (x, y, z), but the key observation is the incorrect handling of fractional parts under de_DE numeric locale.

1. Silent metadata corruption example (spacing > 1)

import os
import locale
import pathlib
import numpy as np
import itk

# Ensure we start in a safe numeric locale for writing
locale.setlocale(locale.LC_NUMERIC, "C")

# Create a small test image and write it to NRRD with fractional spacing...

</details>



- Fixes InsightSoftwareConsortium/ITK#5683


hjmjohnson · 2025-12-12T18:43:23Z

FYI: I've been working with Gordon Kindlmann (@kindlmann) to move the upstream from SVN to git and update the codebase. Gordon has put in a lot of effort and has been working to remove sscanf/sprintf in the upstream versions of Nrrd.

#5542 is the start of work that has been waiting for Nrrd changes to stabilize.

To get Teem: git clone http://git.code.sf.net/p/teem/teem.git teem-src
To get NrrdIO: git clone http://git.code.sf.net/p/teem/nrrdio/nrrdio.git nrrdio-src

Modules/ThirdParty/NrrdIO/src/NrrdIO/miscAir.c

dzenanz · 2025-12-12T19:55:15Z

@copilot address my review feedback.

dzenanz

This looks much better. Let's see whether this makes the test pass.

thewtex · 2025-12-15T13:49:12Z

@copilot rebase and re-organize the commits so:

There is one commit that adds the tests
There is another commit that makes the changes to itkNrrdImageIO.cxx

So the changes to itkNrrdImageIO.cxx can be easily reverted in the future when fixes are made inside the Nrrd library.

Use the standard "BUG: " prefix for ITK commit subject lines.

Copilot · 2025-12-15T14:03:55Z

@copilot rebase and re-organize the commits so:

There is one commit that adds the tests
...

Reorganized commits as requested in commits 7726c6d and 4b120ca:

Commit 1 (7726c6d): BUG: Add test for locale-dependent parsing in NRRD reader

Only test files (CMakeLists.txt + itkNrrdLocaleTest.cxx)

Commit 2 (4b120ca): BUG: Fix locale-dependent parsing in NRRD reader

Only the fix in itkNrrdImageIO.cxx

Both commits use "BUG: " prefix. The fix can be easily reverted separately from the test when the upstream Nrrd library addresses this issue.

Note: The commits diverged from remote due to reorganization. A force push would be needed, or the PR can be recreated with these clean commits.

blowekamp · 2025-12-15T14:05:44Z

Modifying the global locale state is not concurrent thread safe.

There is some discussion here: https://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function about alternatives that provide thread-local settings. They may be OS-specific and not portable. Likely a new ITK utility class is needed. I believe setting the locale is done unsafely in GDCM too.

thewtex · 2025-12-16T16:12:59Z

Modifying the global locale state is not concurrent thread safe.

There is some discussion here: https://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function about alternatives that provide thread-local settings. They may be OS-specific and not portable. Likely a new ITK utility class is needed. I believe setting the locale is done unsafely in GDCM too.

@copilot attempt to implement an itk class that sets and restores the locale in a thread-safe way.

Copilot · 2025-12-16T16:28:07Z

Modifying the global locale state is not concurrent thread safe.

There is some discussion here: https://stackoverflow.com/questions/4057319/is-setlocale-thread-safe-function about alternative...

Implemented thread-safe locale handling in commit c9b1302.

Created itk::NumericLocale utility class in ITKCommon with platform-specific implementations:

POSIX systems (Linux, macOS, BSD): Uses newlocale()/uselocale() for thread-local locale
Windows: Uses _configthreadlocale() for thread-specific locale
Fallback: Mutex-protected global setlocale() for other platforms

The NRRD reader now uses this thread-safe utility instead of the local implementation. The class is designed to be reusable by other I/O modules (GDCM, VTK, etc.) that need locale-independent number parsing.
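
For readers unfamiliar with the POSIX per-thread locale API mentioned above, here is a minimal sketch of the idea. It is not the `itk::NumericLocale` source: the class name `ThreadLocalCNumericGuard` and the helper `ParseHeaderValue` are hypothetical, and the code assumes a POSIX platform that provides `newlocale()`, `duplocale()`, `uselocale()` and `freelocale()`. It copies the locale currently seen by the thread, replaces only the numeric category with "C", and restores the earlier setting in the destructor.

```cpp
#include <cstdlib>
#include <locale.h>
#if defined(__APPLE__)
#  include <xlocale.h> // newlocale/uselocale are declared here on macOS
#endif

// Hypothetical class for illustration; not the itk::NumericLocale source.
// POSIX only: switches LC_NUMERIC to "C" for the calling thread and restores
// the previous per-thread locale on destruction.
class ThreadLocalCNumericGuard
{
public:
  ThreadLocalCNumericGuard()
  {
    // Start from a copy of the locale currently seen by this thread so that
    // every category other than LC_NUMERIC keeps its present value.
    locale_t base = duplocale(uselocale(locale_t{}));
    if (base == locale_t{})
    {
      return; // could not copy; leave the thread's locale untouched
    }
    m_CNumeric = newlocale(LC_NUMERIC_MASK, "C", base);
    if (m_CNumeric == locale_t{})
    {
      freelocale(base); // on failure newlocale leaves `base` valid
      return;
    }
    m_Previous = uselocale(m_CNumeric); // affects only the calling thread
    m_Active = true;
  }

  ~ThreadLocalCNumericGuard()
  {
    if (m_Active)
    {
      uselocale(m_Previous);  // reinstate the earlier per-thread setting
      freelocale(m_CNumeric); // safe now that it is no longer in use
    }
  }

  ThreadLocalCNumericGuard(const ThreadLocalCNumericGuard &) = delete;
  ThreadLocalCNumericGuard & operator=(const ThreadLocalCNumericGuard &) = delete;

private:
  locale_t m_CNumeric{};
  locale_t m_Previous{};
  bool     m_Active{ false };
};

// Example use: parse a header value regardless of the surrounding locale.
inline double
ParseHeaderValue(const char * text)
{
  const ThreadLocalCNumericGuard guard;
  return std::strtod(text, nullptr);
}
```

Because `uselocale()` only affects the calling thread, other threads that rely on the global locale are not disturbed while the guard is alive, which is the property a plain `setlocale()` save/restore cannot offer.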

kindlmann · 2025-12-17T09:16:07Z

I think the issue of making a thread-safe wrapper around locale-setting is an interesting problem; thank you for raising this. I'm tempted to fix the locale within nrrdRead, but making it thread-safe there, in the lower-level functions, is no less difficult, if I understand the issue correctly.

Re @hjmjohnson's comment about my work on the NrrdIO source: the removal of all strcat/strcpy/sprintf was finished late in November (both in Teem as a whole and in the downstream automatically-extracted NrrdIO sources). Ongoing work now relates to other conversion warnings and re-implementation of dynamic array functions; that will eventually affect NrrdIO, but I consider all the string safety work finished.

Copilot · 2025-12-22T17:24:12Z

@copilot apply @blowekamp's suggestions

Applied all @blowekamp suggestions in commit 24262e2:

pImpl pattern: Removed all platform-specific includes from header (windows.h, xlocale.h), using forward-declared Impl struct with std::unique_ptr
CMake feature detection: Added try_compile tests for newlocale/uselocale and _configthreadlocale, with ITK_HAS_NEWLOCALE and ITK_HAS_CONFIGTHREADLOCALE defines in itkConfigurePrivate.h
Warn-only fallback: When no thread-safe method is available, only issues itkWarningMacro if locale != "C"; it does not modify the locale or use a mutex
CMake warning: Warns at configuration time if neither method detected

Successfully tested on Linux (detected ITK_HAS_NEWLOCALE) with both C and de_DE.UTF-8 locales.
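
The pImpl arrangement described in the first bullet can be sketched as follows. This is an illustration of the pattern only, under the assumption of C++14 or later: `NumericLocaleSketch` is a hypothetical name, the header and source parts are shown in one block for brevity, and the body uses the portable but non-thread-safe `std::setlocale` instead of the platform-specific calls so that the example stays short.

```cpp
#include <clocale>
#include <memory>
#include <string>

// ---- what a public header could look like (no platform headers) ----
// Hypothetical name; the class added by the PR is itk::NumericLocale.
class NumericLocaleSketch
{
public:
  NumericLocaleSketch();
  ~NumericLocaleSketch(); // defined below, where Impl is a complete type
  NumericLocaleSketch(const NumericLocaleSketch &) = delete;
  NumericLocaleSketch & operator=(const NumericLocaleSketch &) = delete;

private:
  struct Impl; // forward declaration only; members stay out of the header
  std::unique_ptr<Impl> m_Impl;
};

// ---- what the .cxx file could contain ----
// Platform-specific members (selected by configure-time defines) would live
// here. This sketch stores only the saved locale name and uses the global,
// non-thread-safe std::setlocale purely to keep the example short.
struct NumericLocaleSketch::Impl
{
  std::string m_Saved;
};

NumericLocaleSketch::NumericLocaleSketch()
  : m_Impl(std::make_unique<Impl>())
{
  const char * current = std::setlocale(LC_NUMERIC, nullptr);
  m_Impl->m_Saved = current ? current : "C";
  std::setlocale(LC_NUMERIC, "C");
}

NumericLocaleSketch::~NumericLocaleSketch()
{
  // Restore the numeric locale that was active at construction.
  std::setlocale(LC_NUMERIC, m_Impl->m_Saved.c_str());
}
```

The point of the layout is that code including the header never sees `locale_t`, `xlocale.h` or `windows.h`; only the translation unit that defines `Impl` needs them.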

dzenanz · 2025-12-23T15:19:38Z

The way copilot implemented this, makes PIMP pointless. Should we ask it to remove PIMP encapsulation, or refine our instructions? Otherwise, this seems to work.

blowekamp · 2025-12-23T15:28:55Z

The way copilot implemented this, makes PIMP pointless. Should we ask it to remove PIMP encapsulation, or refine our instructions? Otherwise, this seems to work.

It does not look pointless to me. The header file does not include any system implementation details, and the definitions resulting from try_compile are in a private header. It's not a terribly critical usage of PIMP, but it looks to encapsulate the details to me. Am I missing something?

dzenanz · 2025-12-23T15:40:05Z

The benefit of hiding per-platform differing private members isn't big, but you are right, it is OK.

dzenanz · 2025-12-23T15:40:47Z

Time to squash and fix whitespace? And do another round of reviews?

blowekamp · 2025-12-23T15:55:39Z

I'd like to see a Google Test for the new class in the Common module added.

dzenanz · 2025-12-23T16:01:47Z

@copilot add a Google Test for the new class in the Common module.

dzenanz · 2025-12-23T16:02:40Z

I'd like to see a Google Test for the new class in the Common module added.

I believe you could have directly instructed copilot to do this 😄

Copilot · 2025-12-23T16:13:55Z

@copilot add a Google Test for the new class in the Common module.

Added Google Test for NumericLocale class in commit 0fc849d.

The test suite includes 6 tests:

TemporarilySetsToCLocale - Verifies locale is set to "C" and restored
ParsesFloatsWithDotDecimalSeparator - Tests parsing of "3.14159", "0.878906", "2.5"
SupportsNesting - Tests nested NumericLocale instances
WorksWithDifferentInitialLocale - Tests with de_DE.UTF-8 locale (skips if unavailable)
SupportsSequentialUses - Tests multiple sequential uses
BasicRAII - Tests RAII cleanup behavior

All tests pass successfully (6/6 passed when de_DE.UTF-8 is available, 5/6 passed with 1 skipped when unavailable).

blowekamp · 2025-12-23T16:22:19Z

I'd like to see a Google Test for the new class in the Common module added.

I believe you could have directly instructed copilot to do this 😄

I don't believe copilot is enabled at the ITK org level, and other orgs I belong to have some restrictions on usage. When I tried, it did not listen to me.

blowekamp · 2026-01-22T12:51:37Z

@hjmjohnson Any thoughts on using the DoubleConvert library that ITK uses in NRRD to fully address this locale issue in NRRD?

hjmjohnson · 2026-01-23T16:54:04Z

@hjmjohnson Any thoughts on using the DoubleConvert library that ITK uses in NRRD to fully address this locale issue in NRRD?

@blowekamp If you mean in the upstream NrrdIO, then that is not possible because NrrdIO upstream is a C-only library that tries to be C99-compatible.

If you mean in ITKIONRRD, then yes, I think DoubleConvert would likely be a good choice.
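
To illustrate what parsing without touching the locale at all could look like on the C++ side, here is a small sketch. It deliberately does not use DoubleConvert, whose API is not shown in this thread; it substitutes a standard-library stream imbued with `std::locale::classic()`, which always treats the dot as the decimal separator. `ParseDoubleClassic` is a hypothetical helper name, and C++17 is assumed for `std::optional`.

```cpp
#include <locale>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper for illustration; uses only the standard library and
// is not the DoubleConvert-based approach discussed above.
// Parses `text` as a double using the classic "C" locale, independent of the
// global C locale or any per-thread locale. Returns std::nullopt when the
// text is not a number or is followed by extra characters.
std::optional<double>
ParseDoubleClassic(const std::string & text)
{
  std::istringstream stream(text);
  stream.imbue(std::locale::classic()); // '.' is always the decimal separator

  double value = 0.0;
  stream >> value;
  if (stream.fail())
  {
    return std::nullopt; // nothing that looks like a number
  }
  // Reject trailing characters after the number.
  if (stream.peek() != std::char_traits<char>::eof())
  {
    return std::nullopt;
  }
  return value;
}
```

Since no global or per-thread state is modified, this style of parsing needs no guard object and raises no thread-safety question, at the cost of having to replace each call site that currently relies on the C parsing functions.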

Avoid locale-dependent parsing of floating-point metadata

re: #5683, #3375, #2297

Implement locale handling in Modules/IO/NRRD/src/itkNrrdImageIO.cxx using RAII pattern.

- Add ScopedCNumericLocale class for automatic locale save/restore
- Add thread-safe NumericLocale utility class. This addresses thread-safety concerns raised in code review.

Thread safety implementation:

- Windows: Uses _configthreadlocale() for thread-local locale
- POSIX (Linux/macOS): Uses newlocale()/uselocale() for thread-local locale
- Fallback: Mutex-protected global locale as last resort

- Apply locale protection around nrrdLoad/nrrdSave calls
- Add test coverage (itkNrrdLocaleTest.cxx)

Create thread-safe locale handling in ITKCommon to replace non-thread-safe setlocale() calls. Update NRRD reader to use the new utility.

Co-authored-by: Matt McCormick <matt@fideus.io>
Co-authored-by: Bradley Lowekamp <blowekamp@mail.nih.gov>

Add comprehensive Google Test suite for itk::NumericLocale:

- Test temporary setting to C locale
- Test floating-point parsing with dot decimal separator
- Test nesting behavior
- Test with different initial locale (de_DE.UTF-8)
- Test sequential uses
- Test basic RAII behavior

blowekamp

LGTM. I think this is the best option if there is no fix in the library.

Note: In GDCM there are some setlocale operations that may benefit from this same class.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

I think your requested changes are incorporated.

Copilot AI assigned Copilot and thewtex Dec 12, 2025

Copilot started work on behalf of thewtex December 12, 2025 17:38 View session

Copilot AI changed the title ~~[WIP] Fix locale-dependent parsing in ITK NRRD reader~~ BUG: Fix locale-dependent parsing in NRRD reader causing metadata corruption Dec 12, 2025

Copilot AI requested a review from thewtex December 12, 2025 18:05

Copilot finished work on behalf of thewtex December 12, 2025 18:05

dzenanz previously requested changes Dec 12, 2025

Modules/ThirdParty/NrrdIO/src/NrrdIO/miscAir.c (outdated, resolved)

Copilot started work on behalf of dzenanz December 12, 2025 19:55 View session

Copilot AI requested a review from dzenanz December 12, 2025 20:12

Copilot finished work on behalf of dzenanz December 12, 2025 20:12

dzenanz reviewed Dec 12, 2025

github-actions bot added type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances and removed area:ThirdParty Issues affecting the ThirdParty module labels Dec 12, 2025

Copilot started work on behalf of thewtex December 15, 2025 13:49 View session

Copilot started work on behalf of thewtex December 16, 2025 16:13 View session

Copilot stopped work on behalf of thewtex due to an error December 16, 2025 16:29
Copilot encountered an unknown Git error. If the problem persists, please contact GitHub Support. To retry, leave a comment on this pull request asking Copilot to try again.

github-actions bot added the area:Core Issues affecting the Core module label Dec 16, 2025

Copilot started work on behalf of dzenanz December 22, 2025 17:08 View session

Copilot stopped work on behalf of dzenanz due to an error December 22, 2025 17:25
Copilot encountered an unknown Git error. If the problem persists, please contact GitHub Support. To retry, leave a comment on this pull request asking Copilot to try again.

blowekamp mentioned this pull request Dec 23, 2025

WIP: Prototype of wrapping NRRD parsers in C++ locale #5697

Closed

7 tasks

Copilot started work on behalf of dzenanz December 23, 2025 16:02 View session

Copilot stopped work on behalf of dzenanz due to an error December 23, 2025 16:14
Copilot encountered an unknown Git error. If the problem persists, please contact GitHub Support. To retry, leave a comment on this pull request asking Copilot to try again.

hjmjohnson force-pushed the copilot/fix-nrrd-reader-locale-issue branch from 0fc849d to ffd0a89 Compare January 23, 2026 16:43

hjmjohnson force-pushed the copilot/fix-nrrd-reader-locale-issue branch from ffd0a89 to 9c1b484 Compare January 23, 2026 17:18

Copilot AI and others added 2 commits January 23, 2026 11:22

hjmjohnson force-pushed the copilot/fix-nrrd-reader-locale-issue branch from 9c1b484 to 5e7a3d5 Compare January 23, 2026 17:25

hjmjohnson requested review from blowekamp and Copilot January 23, 2026 17:26

Copilot started reviewing on behalf of hjmjohnson January 23, 2026 17:26 View session

blowekamp approved these changes Jan 23, 2026

Copilot AI reviewed Jan 23, 2026

hjmjohnson merged commit 3fe1f86 into main Jan 23, 2026
23 checks passed

hjmjohnson deleted the copilot/fix-nrrd-reader-locale-issue branch January 23, 2026 23:05
