Use convert_to_tensor inside __init__ to avoid eager-only numpy() under XLA #2
Draft
CodersAcademy006 wants to merge 1,822 commits into master
Conversation
PiperOrigin-RevId: 834674902
Imported from GitHub PR openxla/xla#34103 📝 Summary of Changes "mandatory" compatible layouts have to be assigned to both operands and outputs simultaneously, such that subsequent layout propagation does not alter one of them, making the operation invalid. 🚀 Kind of Contribution 🐛 Bug Fix 🧪 Unit Tests: yes 🧪 Execution Tests: no Copybara import of the project: -- f0ff62e4bf031a3aebf4cdadb66139b3b1120307 by Ilia Sergachev <isergachev@nvidia.com>: [GPU] Fix layout assignment of bitcast-converts. "mandatory" compatible layouts have to be assigned to both operands and outputs simultaneously, such that subsequent layout propagation does not alter one of them, making the operation invalid. Merging this change closes tensorflow#34103 PiperOrigin-RevId: 834676734
…intrinsics. PiperOrigin-RevId: 834684558
…parameter` is true. Old GSPMD propagation needs them since it does not have the concept of open/closed shardings. In Shardy with sdy-round-trip, JAX creates the correct open/closed shardings for parameters and results. We do not need these vectors at all. Before this change, we always canonicalized layouts after Shardy propagation, which may be redundant. PiperOrigin-RevId: 834684933
PiperOrigin-RevId: 834691273
…tions Imported from GitHub PR openxla/xla#32053 📝 Summary of Changes Added CommandBuffer support for Convolution ops. Graph capture of convolutions is enabled only for convolution custom call targets explicitly added to the '--legacy_command_buffer_custom_call_targets' list: see command_buffer_scheduling_test.cc for an example. 🎯 Justification This op was missing for whatever reason: this results in graph fragmentation, especially for large models. Hence one gets several (sometimes many) execution graphs instead of just one. 🚀 Kind of Contribution ✨ New Feature 🧪 Unit Tests: Added new subtest to xla/service/gpu/transforms/command_buffer_scheduling_test.cc This is a split PR originating from openxla/xla#30855 @xla-rotation could you have a look please? Copybara import of the project: -- 5f5f5bc8ba8212ceb6afde6f9729ba4a951e4051 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: adding coll permute and convolution to command buffers added UTs and convolution command test fixes added rebase fixes capture only those convolution targets which are explicitly Revert "adding coll permute and convolution to command buffers" This reverts commit 75847e67261b4589162411c9846ed9c0b9fc1ed5. added conv to command buffers fixing build and test -- e8afa3296a4a8ad079cde2e84391c7e0006ddf52 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: fixing build -- b529288221708e59a04bfaacdfa5b7a1c25b091e by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: rewritten ConvolutionCmd, adapted command_buffer_conv_pass -- 3ecd3d0516a7766d70d636b2110b4b310a9be7b2 by Pavel Emeliyanenko <pavel.emeliyanenko@amd.com>: some cosmetics Merging this change closes tensorflow#32053 PiperOrigin-RevId: 834708288
PiperOrigin-RevId: 834762307
PiperOrigin-RevId: 834776465
…numbers from a structured MLIR attribute. The `ParseDimensionNumbers` function now expects a single `dimension_numbers` attribute in the composite attributes, structured as an array of arrays representing contracting and batch dimensions. This change simplifies the attribute parsing. Additionally, checks are added to ensure that the parsed dimension numbers are supported and to handle cases where scale factors are not divisible by 32 for non-BF16 types, preventing rewriting in those scenarios. The test case is updated to reflect the new attribute format. PiperOrigin-RevId: 834777398
…rror flakes. They may be causing permission errors on Windows when Bazel tries to access header files in Windows SDK/Clang while building @@bazel_tools targets. PiperOrigin-RevId: 834792776
In LiteRT, input and output names shouldn't be empty. Populate default names if tensors don't have names. PiperOrigin-RevId: 834808503
PiperOrigin-RevId: 834837889
PiperOrigin-RevId: 834839651
…ation model for collective-permute uses transfer size and communication pattern type as input key. The interpolation is implemented as follows: - `CollectivePermuteCostModelType`, which classifies communication patterns (e.g. one-way, two-way-all-mutual), is added to `ExactInterpolatorKey` for collective-permute instructions. - Interpolation for collective-permute uses only exact matching via `ExactInterpolator` based on transfer size, and does not use `FallbackInterpolator`. This is because the cost of collective-permute depends primarily on bytes transferred and communication pattern, not on the number of devices in the same way as other collectives. PiperOrigin-RevId: 834846468
PiperOrigin-RevId: 834846503
Updates LLVM usage to match [355e0f94af5a](llvm/llvm-project@355e0f94af5a) PiperOrigin-RevId: 834865231
…rvice. The new namespace is absl_testing:: PiperOrigin-RevId: 834872277
When the call splitter is called on a non-flat graph, we don't want to implicitly flatten it by creating new bodies at each callsite. PiperOrigin-RevId: 834875376
PiperOrigin-RevId: 834896529
PiperOrigin-RevId: 834972320
Fixes a problem where structured_tensor fails to import under Python 3.14 with the error:
```
File ".../tensorflow/python/ops/structured/structured_tensor.py", line 54, in <module>
class StructuredTensor(extension_type.BatchableExtensionType):
...<1135 lines>...
return self._ragged_shape.rank
File ".../tensorflow/python/framework/extension_type.py", line 90, in __init__
_check_field_annotations(cls)
~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
File ".../tensorflow/python/framework/extension_type.py", line 935, in _check_field_annotations
raise ValueError(
...<2 lines>...
)
ValueError: The field annotations for StructuredTensor are invalid. Field FieldName is missing a type annotation.
```
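As a minimal sketch of the rule the failing check enforces (the class and field names below are illustrative, not taken from the TensorFlow sources): every field of a `tf.experimental.ExtensionType` subclass, the public entry point for the `BatchableExtensionType` machinery, must carry a type annotation, or `_check_field_annotations` raises the `ValueError` quoted above.

```python
import tensorflow as tf

# Hypothetical example class; ExtensionType collects the annotated class
# attributes as fields and auto-generates __init__, __eq__, etc.
class Point2D(tf.experimental.ExtensionType):
    x: tf.Tensor  # annotated fields are accepted
    y: tf.Tensor  # a field without an annotation would fail the check

p = Point2D(x=tf.constant([1.0]), y=tf.constant([2.0]))
print(float(p.x[0]))  # fields are regular read-only attributes
```

The Python 3.14 failure above arises from how class annotations are surfaced to this check, not from a missing annotation in the user's code.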
PiperOrigin-RevId: 834982237
PiperOrigin-RevId: 834982495
PiperOrigin-RevId: 834997967
…host transfer extension. PiperOrigin-RevId: 835038444
PiperOrigin-RevId: 835066344
PiperOrigin-RevId: 835085920
PiperOrigin-RevId: 835096840
PiperOrigin-RevId: 835111311
PiperOrigin-RevId: 835111320
PiperOrigin-RevId: 835118587
PiperOrigin-RevId: 837297785
PiperOrigin-RevId: 837301178
…ABSL versions internally and in OSS. PiperOrigin-RevId: 837314334
…already writes the chosen node into `ready_chosen` using `memcpy`, so we don't need the conditionals in the print statement. Also, we need to save the "unchosen" node's information for printing purposes. PiperOrigin-RevId: 837321312
To avoid bringing GPU dependencies with the default FFI target, split GPU-specific context decoding into backends/gpu:ffi PiperOrigin-RevId: 837342185
This is expected to be a safe change. Calls to multimap::find() historically returned the first matching element, or end() if there is no match. However, this is not guaranteed, and recent changes in libc++ changed this to return an arbitrary element that matches. Using equal_range() is a safe replacement that will preserve the current behavior. PiperOrigin-RevId: 837348099
… id mapping PiperOrigin-RevId: 837366466
PiperOrigin-RevId: 837373785
PiperOrigin-RevId: 837386345
…s one was missed in previous consolidations. PiperOrigin-RevId: 837391515
PiperOrigin-RevId: 837393069
…tx backend Instead of relying on the is_autotuning_compilation boolean, it's now up to the backend runner to properly set --fail_ptx_compilation_on_register_spilling based on --xla_gpu_filter_kernels_spilling_registers_on_autotuning. Note that AutotunerCompileUtil::Compile now calls GpuCodegenBackend::AdjustDebugOptionsForAutotuning. That seems to also improve compile time of benchmarks. Dropped the gemm fusion autotuner tests about spilling, as they basically tested whether the backend respects debug option flags. With this change they would become tautological at best and would not exercise any behavior of gemm_fusion_autotuner. PiperOrigin-RevId: 837403708
…nalysis. PiperOrigin-RevId: 837407555
Prior to this change we would hit an assert when constructing the tensor::CollapseShapeOp in the 0D->0D case. PiperOrigin-RevId: 837419131
Can be enabled with XLA_FLAGS="--xla_backend_extra_options=xla_cpu_enable_tiled_emitter" (!warning! may not work as expected for now) Reverts 3a4bd44 PiperOrigin-RevId: 837419373
…er base class. PiperOrigin-RevId: 837424830
PiperOrigin-RevId: 837425105
PiperOrigin-RevId: 837425159
…uctions to tiled emitter. PiperOrigin-RevId: 837432186
PiperOrigin-RevId: 837446324
flag. The corresponding field is already removed from the proto. PiperOrigin-RevId: 837451384
PiperOrigin-RevId: 837453532
To get the behavior in line with internal ASSERT_OK_AND_ASSIGN. Unlike the non-TF_ counterpart, the macro expands to code that already includes the trailing semicolon so it happens to work. Attempts to replace it with an internal variant as part of b/444419873 make it not work anymore, until the missing semicolons are added. PiperOrigin-RevId: 837461910
Some arguments to kernels may not be managed by the buffer assignments and consequently have no buffer slices attached to them. Examples of these include scalars and arguments whose memory is managed by the runtime thunks [CollectiveKernelThunk]. This change introduces a new type for arguments to be passed in with a shape but without an associated slice. PiperOrigin-RevId: 837470085
This change replaces usages of tsl::errors::AlreadyExists with absl::AlreadyExistsError, wrapping arguments in absl::StrCat where necessary. This addresses deprecation warnings and moves towards standard Abseil error handling. Changes: - Replaced errors::AlreadyExists with absl::AlreadyExistsError. - Used absl::StrCat to construct error messages where necessary. Reverts 2fc3b48 PiperOrigin-RevId: 837473556
This change replaces usages of tsl::errors::ResourceExhausted with absl::ResourceExhaustedError, wrapping arguments in absl::StrCat where necessary. This addresses deprecation warnings and moves towards standard Abseil error handling. Changes: - Replaced errors::ResourceExhausted with absl::ResourceExhaustedError. - Used absl::StrCat to construct error messages where necessary. - Fixed missing dependencies in BUILD files using build_cleaner. - Reordered includes in windows_file_system.cc. PiperOrigin-RevId: 837490121
It's currently a cc_library target which breaks the layering_check which I'm trying to enable. Therefore this change introduces a new Bazel function `mkl_dep` which returns a select which resolves to a single target. This is based on the function `mkl_deps` which returns a select that resolves to a list of targets. But this list always has one element or fewer, which is why it's easy to make this an alias. PiperOrigin-RevId: 837493107
Imported from GitHub PR openxla/xla#34467 📝 Summary of Changes The AMD/ROCm Triton backend does not support warp specialization. `ThreadDims` are therefore calculated from module attributes and not retrieved from `nvvm.reqntid`. 🎯 Justification As warp specialization is not currently supported by the AMD/ROCm Triton backend, this backend ignores the `nvvm.reqntid` attribute. Therefore, this attribute does not contain a correct value, as the current Triton implementation assumes the number of threads per warp is always 32, which is not the case for some AMD targets (see https://github.com/triton-lang/triton/blob/49e174c6856aed1d36b85fb2b398ffaa32a80aa8/lib/Conversion/TritonGPUToLLVM/FuncOpToLLVM.cpp#L204C53-L204C68). Consequently, `ExtractThreadDims` has been adapted to calculate `ThreadDims` based only on attributes used and updated by the AMD Triton backend. 🚀 Kind of Contribution 🐛 Bug Fix 📊 Benchmark (for Performance Improvements) Not relevant 🧪 Unit Tests: Fixes failures of type: ``` xla/backends/gpu/codegen/triton/fusion_emitter_device_test.cc:4234: Failure Value of: RunAndCompareNoHloPasses(kHloText, ErrorSpecForDotAlgorithm(algorithm)) Actual: false (INTERNAL: Expected total threads as per reqntid attribute to be 32 but got 64 as per ttg.total-num-warps and tt.threads-per-warp attributes.) Expected: true ``` for Triton tests when targeting AMD GPUs. 🧪 Execution Tests: Not relevant Copybara import of the project: -- 24086e4e80223cdccd38c82af46f5bde96124b5a by Maxime France-Pillois <mfrancep@amd.com>: [ROCm] Fix ExtractThreadDims for AMD targets AMD/ROCm Triton backend does not support warp specialization. ThreadDims are therefore calculated from the Module attributes and not retrieved from `nvvm.reqntid`. Merging this change closes tensorflow#34467 PiperOrigin-RevId: 837500152
…o_tensor(...) to avoid .numpy() in graph/XLA mode
Replaces tf.constant(...) with tf.convert_to_tensor(...) inside `__init__` in example/test code so these classes work correctly with @tf.function(jit_compile=True). Fixes part of tensorflow#105151.
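A minimal sketch of the pattern (the `ScaledShift` class and its fields are illustrative names, not the classes touched by this PR): `tf.convert_to_tensor` works both eagerly and during graph/XLA tracing, whereas anything that calls `.numpy()` in `__init__` only works eagerly.

```python
import tensorflow as tf

class ScaledShift:
    """Illustrative class; names are hypothetical, not from the PR diff."""

    def __init__(self, scale, shift):
        # tf.convert_to_tensor succeeds under @tf.function(jit_compile=True)
        # tracing. By contrast, something like tf.constant(scale).numpy()
        # would raise while tracing, because a symbolic graph tensor has no
        # concrete value for .numpy() to return.
        self.scale = tf.convert_to_tensor(scale, dtype=tf.float32)
        self.shift = tf.convert_to_tensor(shift, dtype=tf.float32)

    def __call__(self, x):
        return x * self.scale + self.shift


@tf.function(jit_compile=True)
def apply(x):
    # Constructing the object inside a jit-compiled function exercises the
    # graph/XLA code path that an eager-only .numpy() call would break.
    return ScaledShift(2.0, 1.0)(x)
```

Calling `apply(tf.constant(3.0))` compiles the function with XLA and returns `3.0 * 2.0 + 1.0`; with a `.numpy()` call in `__init__`, the same trace would fail before reaching execution.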