Skip to content

Conversation

@biplavbarua
Copy link

fix(importer): sanitize HTML anchor tags from vulnerability details

Description

Fixes #4237.

This PR implements sanitization for summary and details fields in vulnerability records. It uses a regex-based approach to strip HTML <a> tags while preserving the link text. This addresses the issue where some NuGet records (and potentially others) contain raw HTML anchor tags that are not rendered correctly in downstream consumers.

Changes

  • Added _sanitize_string helper function in osv/sources.py.
  • Applied sanitization in parse_vulnerability_from_dict.
  • Verified with valid and invalid inputs locally.

Verification

  • Created a reproduction test case with a mock NuGet record containing anchor tags.
  • Verified that the tags were present before the fix.
  • Verified that the tags were stripped (and text preserved) after the fix.

@ashmod
Copy link
Contributor

ashmod commented Dec 25, 2025

This should be fixed in #4431. The issue was probably just not autoclosed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Some NuGet records contain anchor tags

3 participants