
Conversation

Contributor

@marchioa marchioa commented Jan 10, 2026

Add integer MaxPool1D support for the Generic platform and RQSConv1D support for PULPOpen, with corresponding kernel tests.

Added

  • MaxPool1D fp32 kernel and template for Generic platform
  • Test for MaxPool1D fp32 and int8
  • RQSConv1D template and tiling constraints for the PULPOpen platform
  • Test for RQSConv1D int8

Changed

  • renamed MaxPool2D test from MaxPool to MaxPool/Regular_2D for both Integer and FP32

Fixed

  • im2col buffer size in Conv1d template

PR Merge Checklist

  1. The PR is rebased on the latest devel commit and pointing to devel.
  2. Your PR is reviewed and approved.
  3. All checks are passing.
  4. The CHANGELOG.md file has been updated.
  5. If the Docker image was modified, change its link back after review.

@marchioa marchioa changed the title from Titancfi to Support for MaxPool1D and RQSConv1D for PULPOpen on Jan 19, 2026
@marchioa marchioa marked this pull request as ready for review January 19, 2026 07:59
Contributor

coderabbitai bot commented Jan 19, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added 1D MaxPool support for integer and floating-point types; 1D tiling and mapping for conv/pool ops
    • Added parameterized RQS-style 1D convolution bindings
  • Bug Fixes

    • Fixed im2col transient buffer sizing for 1D convolutions
  • Chores

    • CI workflows updated to exercise specific MaxPool variants
    • Consolidated MaxPool kernel declarations and updated changelog

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds 1D MaxPool support across Generic and PULPOpen targets (kernels, templates, bindings, tiler, tile constraints, parser, and platform mapping) and updates CI workflows to reference the Regular_2D MaxPool variant for tests.

Changes

Cohort / File(s) Summary
CI Workflow Updates
`.github/workflows/ci-deeploy.yml`, `.github/workflows/ci-platform-cortexm.yml`, `.github/workflows/ci-platform-generic.yml`, `.github/workflows/ci-platform-mempool.yml`, `.github/workflows/ci-platform-siracusa-tiled.yml`, `.github/workflows/ci-platform-siracusa.yml`
Replaced Kernels/*/MaxPool test entries with Kernels/*/MaxPool/Regular_2D in CI matrices (test target string updates).
Changelog
`CHANGELOG.md`
Added entries documenting MaxPool1D and RQSConv1D additions and an im2col buffer-size fix.
Generic kernels & headers
`TargetLibraries/Generic/inc/kernel/MaxPool.h`, `TargetLibraries/Generic/src/MaxPool_fp32.c`, `TargetLibraries/Generic/src/MaxPool_s8.c`, removed `TargetLibraries/Generic/inc/kernel/MaxPool1d.h`, removed `TargetLibraries/Generic/src/MaxPool1D_fp32.c`, updated `TargetLibraries/Generic/inc/DeeployBasicMath.h`
Consolidated 1D declarations into MaxPool.h; added MaxPool1d_fp32_fp32 and MaxPool1d_s8_s8 implementations in existing sources; removed separate 1D header/source and dropped its include. Added input guards in 2D functions.
Generic bindings & templates
`Deeploy/Targets/Generic/Bindings.py`, `Deeploy/Targets/Generic/Templates/MaxPoolTemplate.py`
Added int8 1D MaxPool binding; renamed MaxPool2DTemplate to MaxPoolTemplate and added reference1DTemplate for 1D codegen.
PULPOpen bindings & platform mapping
`Deeploy/Targets/PULPOpen/Bindings.py`, `Deeploy/Targets/PULPOpen/Platform.py`
Renamed template import to MaxPoolTemplate; added PULPMaxPool1DBindings; replaced single Conv1D binding with parameterized RQSConv1D bindings; extended platform mapping to include both 1D and 2D MaxPool mappers and updated tiling-ready bindings/imports.
PULPOpen templates
`Deeploy/Targets/PULPOpen/Templates/MaxPoolTemplate.py`, `Deeploy/Targets/PULPOpen/Templates/ConvTemplate.py`
Added PULPMaxPool1D_8_Template; changed 1D im2col transient buffer size calculation and removed dynamic pointwise/conv selection so conv variant is always emitted.
PULPOpen tiler & tile constraints
`Deeploy/Targets/PULPOpen/Tiler.py`, `Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py`
Introduced 1D tiling-ready bindings (e.g., PULPRQSConv1DTilingReadyBindings, PULPMaxPool1DTilingReadyBindings); added RQConv1DTileConstraint and extensive 1D tiling geometry/serialization logic.
Parser validation
`Deeploy/Targets/Generic/Parsers.py`
Added stride > 0 validation in MaxPoolParser.parseNode.

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Parser as Parser\n(MaxPool1D/2D)
  participant Mapper as Mapper\n(NodeMapper)
  participant Tiler as Tiler\n(TileConstraint)
  participant CodeGen as Templates\n(Codegen)
  participant Kernel as Kernel\n(Target lib)

  Parser->>Mapper: produce operatorRepresentation (1D or 2D)
  Mapper->>Tiler: request tiling-ready bindings selection
  Tiler->>CodeGen: serialize tiling solution & provide variable replacements
  CodeGen->>Kernel: emit kernel call (MaxPool1d or MaxPool2d)
  Kernel-->>CodeGen: link to target implementation
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Feature, Bug

Suggested reviewers

  • Victor-Jung
  • Xeratec

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 11.11%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Title check (✅ Passed): The PR title accurately summarizes the main changes: adding MaxPool1D and RQSConv1D support for PULPOpen platform.
  • Description check (✅ Passed): The PR description is directly related to the changeset, detailing added features, changes, and fixes with a structured breakdown matching the actual modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@Deeploy/Targets/PULPOpen/Templates/MaxPoolTemplate.py`:
- Around line 34-46: The template PULPMaxPool1D_8_Template/PULPMaxPoolTemplate
currently passes ONNX 1D pads into the HEIGHT pad slots of pulp_nn_maxpool (so
padding_top/padding_bottom get ${padding_y}/${padding_y_right}); swap the pad
ordering so height pads are zeros and width pads receive the ONNX values: set
padding_top=0, padding_bottom=0 and padding_left=${padding_y},
padding_right=${padding_y_right} in the pulp_nn_maxpool${signatureString} call
(i.e. move the two zeros before the two ${padding_y} args).

In `@Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py`:
- Around line 840-844: The effectiveInputLengthVar calculation is wrong: it
subtracts (inputLengthVar - 1) for the tiled case but should subtract the kernel
span (weightLengthVar - 1); update the expression in ConvTileConstraint (the
effectiveInputLengthVar assignment) to replace the final term ((inputLengthVar -
1) * (inputLengthVar != inputBuffer.shape[1])) with ((weightLengthVar - 1) *
(inputLengthVar != inputBuffer.shape[1])) so the receptive field overlap uses
weightLengthVar consistently with the 2D implementation.
- Around line 871-873: The effectiveInputLength calculation in
addPolicyConstraint uses (inputLengthVar - 1) where it should subtract the
kernel length minus one (same bug as in addGeometricalConstraint); update the
ternary expression so the last term uses the kernel size (e.g., kernelLength or
kernelLengthVar) minus one instead of inputLengthVar - 1, i.e., change "-
(inputLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1])" to "-
(kernelLength - 1) * (inputLengthVar != inputBuffer.shape[1])" in the
effectiveInputLength assignment inside addPolicyConstraint.

In `@TargetLibraries/Generic/src/MaxPool_s8.c`:
- Around line 52-69: The MaxPool1d_s8_s8 function currently computes L_out = (L
- K) / S + 1 which can underflow when K > L and divide-by-zero when S == 0;
before calculating L_out, add guards: if S == 0 return (or otherwise abort) and
if K > L set L_out = 0 (or return) to avoid the underflow/OOB behavior; update
the function entry (MaxPool1d_s8_s8) to validate S and the relationship of K and
L and bail out or produce zero-length output before any loops that use L_out.
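For reference, a minimal sketch of how the suggested guards could sit at the top of such a 1D int8 kernel. The signature, the channel-major (C x L) data layout, and the omission of requantization/offset handling are assumptions for illustration, not the actual Deeploy kernel:

```c
#include <stdint.h>

// Illustrative sketch only; parameter names follow the review discussion
// (L = input length, K = kernel size, S = stride, C = channels).
void MaxPool1d_s8_s8_sketch(const int8_t *input, int8_t *output, uint32_t C,
                            uint32_t L, uint32_t K, uint32_t S) {
  // Guard invalid parameters: S == 0 would divide by zero, K == 0 would leave
  // the running max at its sentinel value, and K > L would underflow L_out.
  if (S == 0 || K == 0 || K > L) {
    return;
  }
  const uint32_t L_out = (L - K) / S + 1;
  for (uint32_t c = 0; c < C; ++c) {
    for (uint32_t o = 0; o < L_out; ++o) {
      int8_t max = INT8_MIN;
      for (uint32_t k = 0; k < K; ++k) {
        const int8_t v = input[c * L + o * S + k];
        if (v > max) {
          max = v;
        }
      }
      output[c * L_out + o] = max;
    }
  }
}
```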
🧹 Nitpick comments (5)
TargetLibraries/Generic/inc/kernel/MaxPool.h (1)

42-48: Consider aligning parameter naming between int8 and fp32 variants.

The MaxPool1d_s8_s8 function uses L for the spatial dimension while MaxPool1d_fp32_fp32 uses W. For consistency across the API, consider using the same parameter name (e.g., L for "Length" in 1D operations) in both declarations.
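For illustration, aligned declarations could look like the sketch below; the parameter lists are hypothetical and not the actual prototypes in MaxPool.h:

```c
#include <stdint.h>

// Hypothetical prototypes: both 1D variants use L for the spatial length.
void MaxPool1d_fp32_fp32(const float *input, float *output, uint32_t C,
                         uint32_t L, uint32_t K, uint32_t S);
void MaxPool1d_s8_s8(const int8_t *input, int8_t *output, uint32_t C,
                     uint32_t L, uint32_t K, uint32_t S);
```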

Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py (4)

618-618: Remove unused variable dilation.

The dilation variable is assigned but never used in addGeometricalConstraint. If dilation support is intended for future use, consider adding a TODO comment; otherwise, remove it.

♻️ Suggested fix
-        dilation = parseDict["dilations"][0]
+        # Note: dilation not currently used in 1D constraint; add if needed

651-658: Remove unused variable pads in addPolicyConstraint.

The pads variable is assigned on line 657 but never used in this method.

♻️ Suggested fix
         inputBuffer = ctxt.lookup(name = parseDict['data_in'])
         weightBuffer = ctxt.lookup(name = parseDict['weight'])

-        pads = parseDict["pads"]
         stride = parseDict["strides"][0]

768-769: Consider adding strict=True to zip() calls.

Using zip() without strict=True may silently truncate if the input lists have different lengths, potentially masking bugs.

♻️ Suggested fix
-        for a, b, add, mul in zip(inputInCubes, inputWeightCubes, inputAddCubes, inputMulCubes):
+        for a, b, add, mul in zip(inputInCubes, inputWeightCubes, inputAddCubes, inputMulCubes, strict=True):

1046-1056: Consider adding strict=True to zip() calls.

Same recommendation as in RQConv1DTileConstraint for defensive programming.

♻️ Suggested fix
         if varBias == "NULL":
-            for a, b in zip(inputInCubes, inputWeightCubes):
+            for a, b in zip(inputInCubes, inputWeightCubes, strict=True):
                 inputLoadSchedule.append({"data_in": a, "weight": b})
         else:
-            for a, b, c in zip(inputInCubes, inputWeightCubes, inputBiasCubes):
+            for a, b, c in zip(inputInCubes, inputWeightCubes, inputBiasCubes, strict=True):
                 inputLoadSchedule.append({"data_in": a, "weight": b, "bias": c})

Comment on lines +840 to +844
        effectiveInputLengthVar = inputLengthVar \
            + (sum(pads) * (inputLengthVar == inputBuffer.shape[1])) \
            - ((inputLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1]))

        _outputLengthVar = (effectiveInputLengthVar - dilation * (weightLengthVar - 1) - 1) // stride + 1

⚠️ Potential issue | 🟠 Major

Potential bug in effectiveInputLengthVar formula.

The formula subtracts (inputLengthVar - 1) when tiling, but comparing with Conv2DTileConstraint.addGeometricalConstraint (lines 305-308), the 2D version subtracts (weightHeightVar - 1). This appears to be a copy-paste error—it should likely be (weightLengthVar - 1) to account for the kernel receptive field overlap.
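For reference, the standard 1D convolution output length (which the constraint computes a few lines below) is

$$L_{\text{out}} = \left\lfloor \frac{L_{\text{eff}} - d\,(K - 1) - 1}{s} \right\rfloor + 1, \qquad L_{\text{eff}} = L_{\text{in}} + p_{\text{left}} + p_{\text{right}},$$

with $K$ the kernel (weight) length, $d$ the dilation, and $s$ the stride. A tile that does not span the whole input therefore only needs to be extended by the receptive-field overlap $d\,(K - 1)$, i.e. $K - 1$ elements for unit dilation, not by $L_{\text{in}} - 1$.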

🐛 Suggested fix
         #   Assume worst case scenario (data padding on all sides) when tiling on a ceratin dimension.
         effectiveInputLengthVar = inputLengthVar \
             + (sum(pads) * (inputLengthVar == inputBuffer.shape[1])) \
-            - ((inputLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1]))
+            - ((weightLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1]))
🤖 Prompt for AI Agents
In `@Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py` around lines
840 - 844, The effectiveInputLengthVar calculation is wrong: it subtracts
(inputLengthVar - 1) for the tiled case but should subtract the kernel span
(weightLengthVar - 1); update the expression in ConvTileConstraint (the
effectiveInputLengthVar assignment) to replace the final term ((inputLengthVar -
1) * (inputLengthVar != inputBuffer.shape[1])) with ((weightLengthVar - 1) *
(inputLengthVar != inputBuffer.shape[1])) so the receptive field overlap uses
weightLengthVar consistently with the 2D implementation.

Comment on lines +871 to +873
        effectiveInputLength = inputLengthVar \
            + (sum(pads) * (inputLengthVar == inputBuffer.shape[1])) \
            - (inputLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1])

⚠️ Potential issue | 🟠 Major

Same formula issue in addPolicyConstraint.

This has the same issue as addGeometricalConstraint—it subtracts (inputLengthVar - 1) instead of the kernel length minus one.

🐛 Suggested fix
         effectiveInputLength = inputLengthVar \
             + (sum(pads) * (inputLengthVar == inputBuffer.shape[1])) \
-            - (inputLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1])
+            - (weightLengthVar - 1) * (inputLengthVar != inputBuffer.shape[1])
🤖 Prompt for AI Agents
In `@Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py` around lines
871 - 873, The effectiveInputLength calculation in addPolicyConstraint uses
(inputLengthVar - 1) where it should subtract the kernel length minus one (same
bug as in addGeometricalConstraint); update the ternary expression so the last
term uses the kernel size (e.g., kernelLength or kernelLengthVar) minus one
instead of inputLengthVar - 1, i.e., change "- (inputLengthVar - 1) *
(inputLengthVar != inputBuffer.shape[1])" to "- (kernelLength - 1) *
(inputLengthVar != inputBuffer.shape[1])" in the effectiveInputLength assignment
inside addPolicyConstraint.

Member

@Victor-Jung Victor-Jung left a comment


Ciao Alex, I have a few minor comments but overall it looks good.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Deeploy/Targets/Generic/Parsers.py (1)

193-200: Potential KeyError when strides attribute is missing.

The stride validation on line 199 accesses node.attrs['strides'] unconditionally. Since this is a list literal passed to all(), Python evaluates all elements before calling all() — no short-circuit evaluation occurs. If 'strides' not in node.attrs, this will raise a KeyError.

🐛 Proposed fix: move stride validation after the attribute existence check
 def parseNode(self, node: gs.Node) -> bool:

     ret = all([
         'ceil_mode' in node.attrs, 'kernel_shape' in node.attrs, 'pads' in node.attrs, 'strides' in node.attrs,
         len(node.inputs) == 1,
         len(node.outputs) >= 1,
-        all([stride > 0 for stride in node.attrs['strides']]),
     ])

+    if ret:
+        ret = all(stride > 0 for stride in node.attrs['strides'])
+
     if ret:
         self.operatorRepresentation['ceil_mode'] = node.attrs['ceil_mode']
🤖 Fix all issues with AI agents
In `@TargetLibraries/Generic/src/MaxPool_s8.c`:
- Around line 55-58: The code lacks a guard for K == 0 which allows a
zero-length kernel and produces invalid outputs; update the input validation in
MaxPool_s8 (the block checking L, K, S) to treat K == 0 as invalid—mirror the
existing stride guard (S == 0) by returning (or asserting) when K == 0 (e.g.,
add K == 0 to the condition that currently checks L < K || S == 0) so the
computation of L_out is never performed with a zero kernel.

@marchioa
Contributor Author

I handled both conditions directly in the C kernel by early-returning when the parameters are invalid. I couldn’t find an existing pattern for this in other kernels, so this was a best-effort approach on my side — please let me know if there’s a preferred or more idiomatic way to handle it.

Should I apply the same kind of guard to MaxPool2d as well?

Additionally, since the stride value is available from ONNX, I added a guard in the parser to catch stride == 0.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Deeploy/Targets/Generic/Parsers.py (1)

195-211: KeyError when strides attribute is missing.

The stride validation on line 202 accesses node.attrs['strides'] unconditionally. Since all([...]) evaluates all list elements before calling all(), this will raise a KeyError if the 'strides' attribute is not present—even though the check 'strides' in node.attrs exists earlier in the list.

🐛 Proposed fix: Move stride validation after initial checks
     def parseNode(self, node: gs.Node) -> bool:

         ret = all([
             'ceil_mode' in node.attrs,
             'kernel_shape' in node.attrs,
             'pads' in node.attrs,
             'strides' in node.attrs,
             len(node.inputs) == 1,
             len(node.outputs) >= 1,
-            all([stride > 0 for stride in node.attrs['strides']]),
         ])

-        if ret:
+        if ret and not all(stride > 0 for stride in node.attrs['strides']):
+            return False
+
+        if ret:
             self.operatorRepresentation['ceil_mode'] = node.attrs['ceil_mode']
             self.operatorRepresentation['pads'] = node.attrs['pads']
             self.operatorRepresentation['kernel_shape'] = node.attrs['kernel_shape']
             self.operatorRepresentation['strides'] = node.attrs['strides']

         return ret
🤖 Fix all issues with AI agents
In `@TargetLibraries/Generic/src/MaxPool_fp32.c`:
- Around line 46-50: The guard that validates input parameters in MaxPool_fp32.c
should also treat a kernel size of zero as invalid; update the condition
currently checking "if (W < K || S == 0)" to include "K == 0" so the function
returns early for empty kernels (this affects the validation in the function
containing the W_out calculation and the loop "for (uint32_t c = 0; c < C;
++c)"). Ensure the check prevents proceeding when K == 0 to avoid leaving max
initialized at -inf and producing invalid outputs.
♻️ Duplicate comments (1)
TargetLibraries/Generic/src/MaxPool_s8.c (1)

55-58: Still missing guard for K == 0.

As flagged in a previous review, a zero-length kernel bypasses the current guard, causes the inner loop to never execute, and writes -128 + output_offset to all output positions. Add K == 0 to the validation.

Proposed fix
-  if (L < K || S == 0) {
+  if (K == 0 || S == 0 || L < K) {
     return;
   }

@Victor-Jung
Member

> I handled both conditions directly in the C kernel by early-returning when the parameters are invalid. I couldn’t find an existing pattern for this in other kernels, so this was a best-effort approach on my side — please let me know if there’s a preferred or more idiomatic way to handle it.
>
> Should I apply the same kind of guard to MaxPool2d as well?
>
> Additionally, since the stride value is available from ONNX, I added a guard in the parser to catch stride == 0.

Sorry, I missed the guard statement 😁 Sure, if you can add the guard to the other maxpool kernels, it would be amazing! Thanks for the contribution. I am approving this PR, and we can merge it soon.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@Deeploy/Targets/Generic/Parsers.py`:
- Around line 195-203: The current boolean expression assigned to ret eagerly
accesses node.attrs['strides'] and can raise KeyError; modify the condition so
the positivity check runs only when 'strides' is present (e.g., replace the
inner all([stride > 0 for stride in node.attrs['strides']]) with a
short-circuiting check like ('strides' in node.attrs and all(stride > 0 for
stride in node.attrs['strides'])) or restructure the overall predicate to use
generator expressions that only evaluate node.attrs['strides'] after confirming
'strides' in node.attrs); update the expression that sets ret (and reference
node.attrs, 'strides', and the ret assignment) accordingly so missing strides
yields False without an exception.

In `@TargetLibraries/Generic/src/MaxPool_fp32.c`:
- Around line 14-16: The early-return in MaxPool_fp32.c currently checks if (H <
P || W < Q || SP == 0 || SQ == 0) but misses guarding P == 0 || Q == 0, which
allows empty pooling windows to leave max as -inf; update the condition in the
MaxPool_fp32 function to also return when P == 0 or Q == 0 (i.e., include P == 0
|| Q == 0 alongside the existing H, W, SP, SQ checks) so no invalid -inf outputs
are written.
🧹 Nitpick comments (1)
TargetLibraries/Generic/src/MaxPool_s8.c (1)

13-15: Add guards for P == 0 and Q == 0.

For consistency with the 1D variant feedback, a zero kernel size (P == 0 or Q == 0) would cause the inner loops to never execute, leaving max = -128 and producing semantically incorrect output. Consider guarding these cases as well.

♻️ Proposed fix
-  if (H < P || W < Q || SP == 0 || SQ == 0) {
+  if (P == 0 || Q == 0 || SP == 0 || SQ == 0 || H < P || W < Q) {
     return;
   }

Comment on lines 195 to 203
         ret = all([
-            'ceil_mode' in node.attrs, 'kernel_shape' in node.attrs, 'pads' in node.attrs, 'strides' in node.attrs,
+            'ceil_mode' in node.attrs,
+            'kernel_shape' in node.attrs,
+            'pads' in node.attrs,
+            'strides' in node.attrs,
             len(node.inputs) == 1,
-            len(node.outputs) >= 1
+            len(node.outputs) >= 1,
+            all([stride > 0 for stride in node.attrs['strides']]),
         ])

⚠️ Potential issue | 🟠 Major

Avoid KeyError when strides is absent (Line 195–203).
all([...]) eagerly evaluates the list, so node.attrs['strides'] will throw if the attribute is missing; previously this would just return False. Guard the access and run the positivity check only when strides is present.

🛠️ Proposed fix
-        ret = all([
-            'ceil_mode' in node.attrs,
-            'kernel_shape' in node.attrs,
-            'pads' in node.attrs,
-            'strides' in node.attrs,
-            len(node.inputs) == 1,
-            len(node.outputs) >= 1,
-            all([stride > 0 for stride in node.attrs['strides']]),
-        ])
+        strides = node.attrs.get('strides')
+        ret = all([
+            'ceil_mode' in node.attrs,
+            'kernel_shape' in node.attrs,
+            'pads' in node.attrs,
+            strides is not None,
+            len(node.inputs) == 1,
+            len(node.outputs) >= 1,
+        ])
+        if ret and not all(stride > 0 for stride in strides):
+            return False
🤖 Prompt for AI Agents
In `@Deeploy/Targets/Generic/Parsers.py` around lines 195 - 203, The current
boolean expression assigned to ret eagerly accesses node.attrs['strides'] and
can raise KeyError; modify the condition so the positivity check runs only when
'strides' is present (e.g., replace the inner all([stride > 0 for stride in
node.attrs['strides']]) with a short-circuiting check like ('strides' in
node.attrs and all(stride > 0 for stride in node.attrs['strides'])) or
restructure the overall predicate to use generator expressions that only
evaluate node.attrs['strides'] after confirming 'strides' in node.attrs); update
the expression that sets ret (and reference node.attrs, 'strides', and the ret
assignment) accordingly so missing strides yields False without an exception.

Comment on lines +14 to +16
  if (H < P || W < Q || SP == 0 || SQ == 0) {
    return;
  }

⚠️ Potential issue | 🟡 Minor

Guard P == 0 || Q == 0 to avoid producing -inf outputs.

Consistent with the 1D guard feedback and the existing checks, if either pool dimension is zero the kernel window is empty and max remains -inf, writing invalid values. Add P == 0 || Q == 0 to the early-return condition.

🩹 Proposed fix
-  if (H < P || W < Q || SP == 0 || SQ == 0) {
+  if (H < P || W < Q || P == 0 || Q == 0 || SP == 0 || SQ == 0) {
     return;
   }
🤖 Prompt for AI Agents
In `@TargetLibraries/Generic/src/MaxPool_fp32.c` around lines 14 - 16, The
early-return in MaxPool_fp32.c currently checks if (H < P || W < Q || SP == 0 ||
SQ == 0) but misses guarding P == 0 || Q == 0, which allows empty pooling
windows to leave max as -inf; update the condition in the MaxPool_fp32 function
to also return when P == 0 or Q == 0 (i.e., include P == 0 || Q == 0 alongside
the existing H, W, SP, SQ checks) so no invalid -inf outputs are written.
