Skip to content

Make ParseDict call sites compatible with types-protobuf v7 stubs#638

Draft
kmontemayor2-sc wants to merge 4 commits into
mainfrom
kmonte/types-protobuf-v7-compat
Draft

Make ParseDict call sites compatible with types-protobuf v7 stubs#638
kmontemayor2-sc wants to merge 4 commits into
mainfrom
kmonte/types-protobuf-v7-compat

Conversation

@kmontemayor2-sc
Copy link
Copy Markdown
Collaborator

Summary

  • types-protobuf v7 tightened google.protobuf.json_format.ParseDict's js_dict parameter from an Any-compatible alias to a strict dict[str, Any].
  • The two ParseDict call sites in this repo pass the result of OmegaConf.to_object(...) / OmegaConf.to_container(...), which mypy infers as a broad union (dict[str | bytes | int | Enum | float | bool, Any] | list[Any] | str | Any | None). With v7 stubs, both fail mypy.
  • At each call site, narrow with an explicit isinstance(..., dict) guard that raises TypeError (per the "Fail fast on invalid state" coding standard in CLAUDE.md), then cast(dict[str, Any], ...) before handing the value to ParseDict. The isinstance narrowing alone is insufficient because
    dict is invariant in its key type — the cast refines the key type from the broader union down to str.
  • A grep -rn "ParseDict(" gigl snapchat --include="*.py" | grep -v _pb2.py returns exactly the two call sites fixed here, so the surface area is fully enumerated.

kmontemayor added 3 commits May 13, 2026 15:40
types-protobuf v7 tightened ParseDict's js_dict parameter from an
Any-compatible alias to dict[str, Any]. Validate that the YAML root
is a mapping and cast before passing to ParseDict so mypy is clean
under the new stubs.
types-protobuf v7 tightened ParseDict's js_dict parameter. Validate
that dataset.metadata resolves to a mapping and cast before handing
it to ParseDict so mypy is clean under the new stubs.
@kmontemayor2-sc
Copy link
Copy Markdown
Collaborator Author

/all_test

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 16:17:02UTC : 🔄 C++ Unit Test started.

@ 16:17:07UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 16:17:03UTC : 🔄 Lint Test started.

@ 16:21:30UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 16:17:08UTC : 🔄 E2E Test started.

@ 16:21:33UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 16:17:09UTC : 🔄 Python Unit Test started.

@ 17:24:55UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 16:17:10UTC : 🔄 Integration Test started.

@ 17:37:09UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 16:17:11UTC : 🔄 Scala Unit Test started.

@ 16:21:53UTC : ❌ Workflow failed.
Please check the logs for more details.

Comment on lines 79 to 82
pb = ParseDict(
js_dict=graph_metadata_dict, message=graph_schema_pb2.GraphMetadata()
js_dict=cast(dict[str, Any], graph_metadata_dict),
message=graph_schema_pb2.GraphMetadata(),
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity and reachable issue identified in your code:
Line 79 has a vulnerable usage of protobuf, introducing a high severity vulnerability.

ℹ️ Why this is reachable

A reachable issue is a real security risk because your project actually executes the vulnerable code. This issue is reachable because your code uses a certain version of protobuf.
Affected versions of protobuf are vulnerable to Uncontrolled Recursion. A denial-of-service vulnerability in the Python protobuf library's JSON parser allows deeply nested google.protobuf.Any messages to bypass the configured max_recursion_depth in json_format.ParseDict. Because the internal Any-handling logic does not update the recursion counter, an attacker supplying a JSON payload with repeatedly nested Any messages can exhaust Python's recursion stack (raising RecursionError) instead of a controlled ParseError, potentially crashing or disrupting services that parse untrusted JSON.

References: GHSA, CVE

To resolve this comment:
Upgrade this dependency to at least version 5.29.6 at uv.lock.

💬 Ignore this finding

To ignore this, reply with:

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

You can view more details on this finding in the Semgrep AppSec Platform here.

f"ProtoUtils.read_proto_from_yaml expected a mapping at the YAML root for "
f"{uri}, got {type(obj_dict).__name__}."
)
proto = ParseDict(js_dict=cast(dict[str, Any], obj_dict), message=proto_cls())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity and reachable issue identified in your code:
Line 36 has a vulnerable usage of protobuf, introducing a high severity vulnerability.

ℹ️ Why this is reachable

A reachable issue is a real security risk because your project actually executes the vulnerable code. This issue is reachable because your code uses a certain version of protobuf.
Affected versions of protobuf are vulnerable to Uncontrolled Recursion. A denial-of-service vulnerability in the Python protobuf library's JSON parser allows deeply nested google.protobuf.Any messages to bypass the configured max_recursion_depth in json_format.ParseDict. Because the internal Any-handling logic does not update the recursion counter, an attacker supplying a JSON payload with repeatedly nested Any messages can exhaust Python's recursion stack (raising RecursionError) instead of a controlled ParseError, potentially crashing or disrupting services that parse untrusted JSON.

References: GHSA, CVE

To resolve this comment:
Upgrade this dependency to at least version 5.29.6 at uv.lock.

💬 Ignore this finding

To ignore this, reply with:

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

You can view more details on this finding in the Semgrep AppSec Platform here.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmontemayor2-sc -worth upgrading?

@kmontemayor2-sc
Copy link
Copy Markdown
Collaborator Author

/all_test

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 17:43:04UTC : 🔄 Scala Unit Test started.

@ 17:53:17UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 17:43:10UTC : 🔄 Python Unit Test started.

@ 18:53:11UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 17:43:11UTC : 🔄 E2E Test started.

@ 19:12:34UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 17:43:15UTC : 🔄 C++ Unit Test started.

@ 17:45:45UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

GiGL Automation

@ 17:43:15UTC : 🔄 Integration Test started.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

GiGL Automation

@ 17:43:17UTC : 🔄 Lint Test started.

@ 17:52:20UTC : ✅ Workflow completed successfully.

Copy link
Copy Markdown
Collaborator

@svij-sc svij-sc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple qqs

f"ProtoUtils.read_proto_from_yaml expected a mapping at the YAML root for "
f"{uri}, got {type(obj_dict).__name__}."
)
proto = ParseDict(js_dict=cast(dict[str, Any], obj_dict), message=proto_cls())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmontemayor2-sc -worth upgrading?

f"ProtoUtils.read_proto_from_yaml expected a mapping at the YAML root for "
f"{uri}, got {type(obj_dict).__name__}."
)
proto = ParseDict(js_dict=cast(dict[str, Any], obj_dict), message=proto_cls())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are proto yaml representations guaranteed to be strings?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean the keys? I'm pretty sure but it's not a big deal here either way the important part is to ensure it's a dict.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right we are casting to dict[str, Any] should it just be cast(dict) ?

@kmontemayor2-sc
Copy link
Copy Markdown
Collaborator Author

@kmontemayor2-sc -worth upgrading?

Is what worth upgrading?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants