Fix segmentation fault with complex Variable operations and assign_add (#105367)#9
Open
CodersAcademy006 wants to merge 9 commits intomasterfrom
Open
Fix segmentation fault with complex Variable operations and assign_add (#105367)#9CodersAcademy006 wants to merge 9 commits intomasterfrom
CodersAcademy006 wants to merge 9 commits intomasterfrom
Conversation
Co-authored-by: CodersAcademy006 <104912634+CodersAcademy006@users.noreply.github.com>
Co-authored-by: CodersAcademy006 <104912634+CodersAcademy006@users.noreply.github.com>
…ents and edge case handling Co-authored-by: CodersAcademy006 <104912634+CodersAcademy006@users.noreply.github.com>
Fixes tensorflow#105367 The issue was that complex types (complex64, complex128) were missing from: 1. GPU DenseUpdate functor template instantiations for ADD/SUB operations 2. GPU kernel registrations for AssignAddVariableOp and AssignSubVariableOp This caused a segmentation fault when using assign_add on complex Variables, particularly when combined with tf.raw_ops.Conj operations. Changes: - Added TF_CALL_COMPLEX_TYPES to dense_update_functor_gpu.cu.cc for ADD/SUB - Added TF_CALL_COMPLEX_TYPES to GPU kernel registrations in resource_variable_ops.cc - Added comprehensive test cases for complex variable assign_add operations
633e17d to
8bb30b1
Compare
Address code review feedback: limit changes only to the complex variable conj segfault fix. The cuDNN batch splitting code was unrelated to the initializers issue and has been removed to keep the PR focused.
…ble fix The complex variable conj segfault fix only requires changes to: - dense_update_functor_gpu.cu.cc (GPU kernel instantiation) - resource_variable_ops.cc (GPU kernel registration) - resource_variable_ops_test.py (test cases) The conv_ops_impl.h file is unrelated to this fix and should not be modified.
24d000a to
a7b6d38
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR fixes issue tensorflow#105367 - a segmentation fault that occurs when performing complex number operations involving:
Root Cause
The segmentation fault was caused by missing complex type support in:
dense_update_functor_gpu.cu.ccresource_variable_ops.ccWhen users tried to use
assign_addon complex variables (especially aftertf.raw_ops.Conj), the kernel was not properly instantiated for complex types on GPU, leading to undefined behavior and segfaults.Changes
tensorflow/core/kernels/dense_update_functor_gpu.cu.cc
TF_CALL_COMPLEX_TYPES(DEFINE_GPU_KERNELS)to instantiate DenseUpdate functors for complex64 and complex128 typestensorflow/core/kernels/resource_variable_ops.cc
TF_CALL_COMPLEX_TYPES(REGISTER_GPU_KERNELS)to register GPU kernels for AssignAddVariableOp and AssignSubVariableOp with complex typestensorflow/python/kernel_tests/variables/resource_variable_ops_test.py
testComplexVariableAssignAddWithConj: Tests GPU execution with Conj operation (the original issue)testComplexVariableAssignAddCPU: Tests CPU execution with complex typesTesting
The fix has been validated with:
Fixes
Closes tensorflow#105367