Add fltflt rounding and fmod functions #1129

Merged
tbensonatl merged 6 commits into main from feature/add-fltflt-round-fmod
Mar 2, 2026
@tbensonatl
Collaborator

Add support for the following float-float (fltflt) functions:

  • Round toward nearest, with ties toward even
  • Truncate toward zero
  • Truncate toward negative infinity
  • fmod (floating point remainder)
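As a rough illustration of how two of these operations can work on a high/low float pair, the sketch below implements floor and truncate-toward-zero. It is not the code from this PR: `FloatPair`, `two_sum`, `pair_floor`, and `pair_trunc` are hypothetical names, and the sketch assumes a normalized pair (the low word is at most half a unit in the last place of the high word). Round-to-nearest-even and fmod are not shown.

```cpp
#include <cmath>

// Hypothetical stand-in for a float-float value: value = hi + lo.
// Assumes |lo| is at most half a unit in the last place of hi.
struct FloatPair {
  float hi;
  float lo;
};

// Knuth's error-free two-sum: result.hi + result.lo equals a + b exactly
// (barring overflow).
inline FloatPair two_sum(float a, float b) {
  const float s = a + b;
  const float bb = s - a;
  const float e = (a - (s - bb)) + (b - bb);
  return {s, e};
}

// Floor of a normalized pair (round toward negative infinity).
inline FloatPair pair_floor(FloatPair x) {
  if (!std::isfinite(x.hi)) {
    return x;  // pass infinities and NaN through unchanged
  }
  const float fh = std::floor(x.hi);
  if (fh != x.hi) {
    // hi has a fractional part, so it sits at least one ulp away from the
    // nearest integers; lo is too small to move the value across either one.
    return {fh, 0.0f};
  }
  // hi is already an integer, so floor(hi + lo) = hi + floor(lo).
  return two_sum(x.hi, std::floor(x.lo));
}

// Truncate toward zero: floor for non-negative values, negated floor of the
// negated value otherwise.
inline FloatPair pair_trunc(FloatPair x) {
  const bool negative = x.hi < 0.0f || (x.hi == 0.0f && x.lo < 0.0f);
  if (!negative) {
    return pair_floor(x);
  }
  const FloatPair f = pair_floor({-x.hi, -x.lo});
  return {-f.hi, -f.lo};
}
```

The case that a single-float floor would get wrong is a pair such as {3.0f, -1e-10f}: the high word is an integer, but the value is just below 3, so the floor is 2.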

Also included are new unit tests and benchmarks for the newly introduced functions.

Also add fltflt_add_same_sign(), which is more efficient than fltflt_add()
for the case where we know both inputs have the same sign.
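The description does not say how fltflt_add_same_sign() is implemented. One well-known way to make pair addition cheaper when the signs match is the "sloppy" double-word addition, which sums the low words with ordinary rounding instead of a second error-free transform. It stays accurate when the inputs share a sign because the high words cannot cancel. The sketch below shows that technique under the assumption that it is comparable to what the PR does; `FloatPair`, `two_sum`, `fast_two_sum`, and `add_same_sign_sketch` are hypothetical names.

```cpp
#include <cmath>

// Hypothetical stand-in for a float-float value: value = hi + lo.
struct FloatPair {
  float hi;
  float lo;
};

// Knuth's error-free two-sum: result.hi + result.lo equals a + b exactly.
inline FloatPair two_sum(float a, float b) {
  const float s = a + b;
  const float bb = s - a;
  const float e = (a - (s - bb)) + (b - bb);
  return {s, e};
}

// Dekker's fast two-sum: exact only when |a| >= |b|.
inline FloatPair fast_two_sum(float a, float b) {
  const float s = a + b;
  const float e = b - (s - a);
  return {s, e};
}

// "Sloppy" pair addition: one error-free sum of the high words, then the low
// words are folded in with plain rounded additions. Intended only for inputs
// of the same sign; with opposite signs the high words can cancel and the
// rounded tail can dominate the result.
inline FloatPair add_same_sign_sketch(FloatPair a, FloatPair b) {
  const FloatPair s = two_sum(a.hi, b.hi);
  const float tail = s.lo + (a.lo + b.lo);
  return fast_two_sum(s.hi, tail);
}
```

A general-purpose addition typically applies a second two_sum to the low words and renormalizes twice, so the same-sign variant saves several floating-point operations per call.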

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl tbensonatl self-assigned this Feb 27, 2026
@copy-pr-bot

copy-pr-bot Bot commented Feb 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps Bot commented Feb 27, 2026

Greptile Summary

This PR adds four new float-float arithmetic functions: round-to-nearest (with ties to even), truncate toward zero, floor (truncate toward negative infinity), and fmod (floating-point remainder). The implementation includes proper edge case handling with zero-division guards returning NaN, consistent use of fabsf for float operations, and comprehensive test coverage.

Key changes:

  • Added fltflt_round_to_nearest(), fltflt_round_toward_zero(), fltflt_floor(), and fltflt_fmod() functions
  • Added optimized fltflt_add_same_sign() for same-sign operands
  • Made fltflt constructors constexpr for compile-time evaluation
  • Comprehensive unit tests covering positive/negative values, zero divisors, and high-precision cases
  • Performance benchmarks for all new functions
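To illustrate what constexpr constructors make possible, the sketch below defines a hypothetical `FloatPair` type (not the actual fltflt struct, whose members and constructor set are not shown in this thread) and builds constants at compile time. In CUDA code such constructors would normally also carry host/device qualifiers, which are omitted here so the example compiles as plain C++.

```cpp
#include <cmath>

// Hypothetical high/low float pair with constexpr constructors.
struct FloatPair {
  float hi;
  float lo;

  constexpr FloatPair() : hi(0.0f), lo(0.0f) {}
  constexpr explicit FloatPair(float h) : hi(h), lo(0.0f) {}
  constexpr FloatPair(float h, float l) : hi(h), lo(l) {}

  // Split a double into a leading float and a float remainder.
  constexpr explicit FloatPair(double d)
      : hi(static_cast<float>(d)),
        lo(static_cast<float>(d - static_cast<double>(static_cast<float>(d)))) {}
};

// Both constants are evaluated by the compiler, not at run time.
constexpr FloatPair kHalf(0.5f);
constexpr FloatPair kOneThird(1.0 / 3.0);

static_assert(kHalf.hi == 0.5f && kHalf.lo == 0.0f,
              "a float-representable value needs no low word");
static_assert(kOneThird.lo != 0.0f,
              "1/3 does not fit in one float, so the low word holds the rest");
```

Because the constants are usable in static_assert, simple properties of a table of constants can be checked at build time rather than in a unit test.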

Previous feedback addressed:
All issues from previous review threads have been resolved - zero-division guards are in place, fabsf is used consistently, and comprehensive tests have been added for fmod.

Confidence Score: 5/5

  • This PR is safe to merge with no identified issues
  • All previous feedback has been addressed. The implementation includes proper edge case handling, comprehensive test coverage (12+ test cases for fmod alone), performance benchmarks, and follows consistent coding patterns. The code uses appropriate precision functions (fabsf), guards against division by zero, and includes detailed documentation.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| include/matx/kernels/fltflt.h | Adds four new rounding/fmod functions with proper zero-division guards and consistent use of fabsf. All previous feedback addressed. |
| test/00_misc/FloatFloatTests.cu | Comprehensive test coverage for all new functions including edge cases (negative values, zero divisor, high-precision cases). |
| bench/00_misc/fltflt_arithmetic.cu | Adds performance benchmarks for new rounding and fmod functions with appropriate test cases. |
| bench/scripts/run_fltflt_benchmarks.py | Updates benchmark list to include newly added functions (round, trunc, floor, fmod, cast operations). |

Last reviewed commit: 7a37881

Contributor

@greptile-apps greptile-apps Bot left a comment


4 files reviewed, 4 comments


Comment thread bench/00_misc/fltflt_arithmetic.cu Outdated
Comment thread include/matx/kernels/fltflt.h
@cliffburdick
Collaborator

/build

1 similar comment
@tbensonatl
Collaborator Author

/build

Comment thread include/matx/kernels/fltflt.h
Comment thread include/matx/kernels/fltflt.h
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread test/00_misc/FloatFloatTests.cu
- Add fltflt_fmod unit tests
- Updated fltflt_fmod to return {NaN, NaN} in the case of a zero divisor

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
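The zero-divisor convention from this commit can be sketched as follows. This is not the PR's fmod algorithm: the sketch uses a double-precision std::fmod as a stand-in for real float-float arithmetic (so it can lose low-order bits of the pair), and `FloatPair`, `from_double`, and `fmod_sketch` are hypothetical names. The part it is meant to show is the explicit guard that sets both words to NaN.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a float-float value: value = hi + lo.
struct FloatPair {
  float hi;
  float lo;
};

// Split a double into a leading float and a float remainder.
inline FloatPair from_double(double d) {
  const float hi = static_cast<float>(d);
  const float lo = static_cast<float>(d - static_cast<double>(hi));
  return {hi, lo};
}

// Remainder of x / y with the sign of x, returning {NaN, NaN} for a zero
// divisor. Assumption: double arithmetic replaces the float-float steps.
inline FloatPair fmod_sketch(FloatPair x, FloatPair y) {
  if (y.hi == 0.0f && y.lo == 0.0f) {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    return {nan, nan};  // both words are NaN, so either one flags the error
  }
  const double xd = static_cast<double>(x.hi) + static_cast<double>(x.lo);
  const double yd = static_cast<double>(y.hi) + static_cast<double>(y.lo);
  return from_double(std::fmod(xd, yd));
}
```

Setting both words to NaN means a caller that inspects only the high word, only the low word, or their sum sees NaN in every case.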
@tbensonatl
Collaborator Author

/build

@tbensonatl tbensonatl merged commit dd01c29 into main Mar 2, 2026
1 check passed
@tbensonatl tbensonatl deleted the feature/add-fltflt-round-fmod branch March 6, 2026 23:27
