Merge of V0.2 to integ branch #79

Merged: eaglei15 merged 23 commits into integ from v0.2 on May 11, 2026

Conversation

@eaglei15
Collaborator

No description provided.

saquibsaifee and others added 23 commits March 1, 2026 00:57
…hon-lib

Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…se-cyclonedx-python-lib

refactor: generate CycloneDX BOMs using cyclonedx-python-lib
…-bom-generation-to-use-cyclonedx-python-lib

Revert "refactor: generate CycloneDX BOMs using cyclonedx-python-lib"
Extract hyperparameters from safetensors repos by combining config.json
(using llama.cpp's find_hparam key fallback chains), tokenizer_config.json,
and safetensors tensor headers. Safetensors takes precedence over GGUF as
the original source format.
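
The key fallback chain idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from config_parsing.py: the table `EXAMPLE_HPARAM_KEYS`, the helper `parse_example_config`, and the specific key chains are assumptions chosen for the example, and the real `HPARAM_KEYS` and `parse_config()` may be organised differently.

```python
from typing import Any, Dict, Mapping, Optional, Sequence

# Hypothetical key chains for illustration only; the real HPARAM_KEYS table
# in config_parsing.py may use different names and orderings.
EXAMPLE_HPARAM_KEYS: Dict[str, Sequence[str]] = {
    "hidden_size": ("hidden_size", "n_embd", "d_model"),
    "num_layers": ("num_hidden_layers", "n_layer", "num_layers"),
    "num_attention_heads": ("num_attention_heads", "n_head"),
}


def find_hparam(config: Mapping[str, Any], keys: Sequence[str]) -> Optional[Any]:
    """Return the value of the first key in `keys` that is present and not None."""
    for key in keys:
        value = config.get(key)
        if value is not None:
            return value
    return None


def parse_example_config(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Resolve each canonical hyperparameter through its fallback chain."""
    found: Dict[str, Any] = {}
    for name, keys in EXAMPLE_HPARAM_KEYS.items():
        value = find_hparam(config, keys)
        if value is not None:
            found[name] = value
    return found
```

With this shape, a config.json that uses `n_embd` and one that uses `hidden_size` both resolve to the same canonical `hidden_size` entry, and earlier keys in a chain win when several are present.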

- Add config_parsing.py as canonical home for HPARAM_KEYS and parse_config()
- Add safetensors_metadata.py with SafetensorsModelInfo, map_to_metadata(),
  fetch_safetensors_metadata() (config.json + tokenizer + tensor headers)
- Add SafetensorsFileExtractor to model_file_extractors.py
- Wire hyperparameter extraction into EnhancedExtractor via _try_config_extraction
- Add safetensors>=0.4.0 as runtime dependency
- 83 tests covering config parsing, metadata mapping, tensor extraction,
  HF Hub integration, extractor wiring, precedence, and fixture end-to-end
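
For readers unfamiliar with what a safetensors "tensor header" contains, here is a small standard-library sketch. It relies only on the published file layout (an 8-byte little-endian length followed by a JSON header) and is not the code in safetensors_metadata.py, which may instead go through the `safetensors` package listed above. The function names `read_safetensors_header`, `tensor_shapes`, and `build_example_file` are hypothetical.

```python
import json
import struct
from typing import Any, Dict, Tuple


def read_safetensors_header(data: bytes) -> Dict[str, Any]:
    """Parse the JSON header at the start of a safetensors byte stream."""
    if len(data) < 8:
        raise ValueError("too short to contain a safetensors header length")
    # The first 8 bytes hold the header size as an unsigned little-endian u64.
    (header_len,) = struct.unpack("<Q", data[:8])
    header_bytes = data[8 : 8 + header_len]
    if len(header_bytes) != header_len:
        raise ValueError("truncated safetensors header")
    return json.loads(header_bytes.decode("utf-8"))


def tensor_shapes(header: Dict[str, Any]) -> Dict[str, Tuple[int, ...]]:
    """Map tensor names to shapes, skipping the optional __metadata__ entry."""
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


def build_example_file(header: Dict[str, Any]) -> bytes:
    """Build a header-only byte stream for demonstration (no tensor data)."""
    payload = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(payload)) + payload
```

Because only the header is needed, shapes and dtypes can in principle be recovered without loading any tensor data.
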
Replace all print() calls across CLI and web entry points with Python's
logging module using module-level loggers and lazy % formatting, as
recommended in CONTRIBUTING.md. Configure logging.basicConfig in CLI
entrypoint with --verbose flag setting DEBUG level. Also fix duplicate
import of calculate_completeness_score and correct --verbose to set
DEBUG (was INFO).

Closes #16

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
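
The logging pattern described in this commit (module-level logger, lazy % formatting, `--verbose` mapped to DEBUG) might look roughly like the sketch below. The argument name `model_id`, the helper names, and the INFO default when `--verbose` is absent are assumptions for illustration; the commit message does not state the default level.

```python
import argparse
import logging
from typing import List, Optional

# Module-level logger, named after the module rather than shared globally.
logger = logging.getLogger(__name__)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative CLI entry point")
    parser.add_argument("model_id")  # hypothetical positional argument
    parser.add_argument("--verbose", action="store_true", help="enable DEBUG logging")
    return parser


def resolve_level(verbose: bool) -> int:
    # Assumption: INFO is the non-verbose default; --verbose selects DEBUG.
    return logging.DEBUG if verbose else logging.INFO


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    level = resolve_level(args.verbose)
    logging.basicConfig(level=level, format="%(levelname)s %(name)s: %(message)s")
    # Lazy % formatting: arguments are interpolated only if the record is emitted.
    logger.info("Generating AIBOM for %s", args.model_id)
    logger.debug("Verbose logging enabled")
    return level
```

Passing the arguments separately, instead of pre-formatting with an f-string, avoids building the message when the level filters the record out.
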
…-lib JsonValidator

Remove the hand-rolled schema downloader and referencing.Registry bootstrap.
Delegate CycloneDX 1.6 validation entirely to JsonValidator(SchemaVersion.V1_6)
from cyclonedx-python-lib, which bundles the official SNAPSHOT schemas.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…-python-lib

Remove dead code: validate_spdx(), _validate_ai_requirements(), SPDX_LICENSES
loading block, and JSON_SCHEMA_REGISTRY bootstrap. Replace the hand-rolled
validator in validate_aibom() with JsonValidator(SchemaVersion.V1_6).
Remove unused os import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…x.spdx

- is_valid_spdx_license_id() now delegates to cyclonedx.spdx.is_supported_id()
- normalize_license_id() uses cyclonedx.spdx.fixup_id() for case normalisation
  ("apache-2.0" -> "Apache-2.0"), falling back to LICENSE_MAPPING for multi-word
  aliases that fixup_id cannot handle
- Consolidate LICENSE_MAPPING with entries previously duplicated in extractor.py
  (BSD 2/3-clause, GPL v2/v3 aliases, Apache variations)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
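
The two-stage normalisation described in this commit (case fix-up first, alias mapping second) can be sketched as below. To keep the example self-contained, `standin_fixup_id` is a simplified stand-in for `cyclonedx.spdx.fixup_id`, and `KNOWN_SPDX_IDS` and `EXAMPLE_LICENSE_MAPPING` are small hypothetical subsets, not the project's real tables. Returning `None` for unrecognised input is also an assumption.

```python
from typing import Callable, Dict, Optional, Tuple

# Illustrative subset of SPDX identifiers; the real list ships with the library.
KNOWN_SPDX_IDS: Tuple[str, ...] = ("Apache-2.0", "MIT", "BSD-3-Clause", "GPL-3.0-only")

# Hypothetical multi-word aliases that a pure case fix-up cannot resolve.
EXAMPLE_LICENSE_MAPPING: Dict[str, str] = {
    "apache license 2.0": "Apache-2.0",
    "bsd 3-clause": "BSD-3-Clause",
    "gpl v3": "GPL-3.0-only",
}


def standin_fixup_id(value: str) -> Optional[str]:
    """Stand-in for cyclonedx.spdx.fixup_id: case-insensitive ID lookup."""
    by_lower = {spdx_id.lower(): spdx_id for spdx_id in KNOWN_SPDX_IDS}
    return by_lower.get(value.lower())


def normalize_license_id(
    raw: str,
    fixup: Callable[[str], Optional[str]] = standin_fixup_id,
) -> Optional[str]:
    """Try the case fix-up first, then fall back to the alias mapping."""
    cleaned = raw.strip()
    if not cleaned:
        return None
    fixed = fixup(cleaned)
    if fixed is not None:
        return fixed
    return EXAMPLE_LICENSE_MAPPING.get(cleaned.lower())
```

Taking `fixup` as a parameter lets the library function be passed in where it is installed, while the alias table stays the single place for names such as "Apache License 2.0".
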
…te_purl typo

- Replace deprecated Tool class with ToolRepository(components=[...]) in
  _build_cyclonedx_metadata and _create_minimal_aibom
- Fix AttributeError: _create_minimal_aibom called _generate_hf_purl which
  did not exist; corrected to _generate_purl
- Replace manual DisjunctiveLicense construction in _process_licenses with
  cyclonedx.contrib.license.factories.LicenseFactory.make_from_string(),
  which auto-detects SPDX simple IDs, compound expressions, and custom names
- Remove redundant post-serialisation description workaround (library preserves
  description natively)
- Add comment explaining manual modelCard injection (library TODO since CDX1.5)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…_utils

Remove the LICENSE_MAPPINGS class dict (12 entries duplicating license_utils.py).
_detect_license_from_files() now iterates over LICENSE_MAPPING imported from
license_utils, making the consolidated mapping the single source of truth.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…ssion deps

Schemas (bom-1.6, bom-1.7, spdx) are now bundled inside cyclonedx-python-lib
and accessed via SchemaVersion — no local copies needed.

Remove from dependencies:
- jsonschema (replaced by cyclonedx-python-lib JsonValidator)
- license-expression (replaced by cyclonedx.spdx)
- nltk, python-dateutil, python-dotenv (unused in src/)

Add cyclonedx-python-lib[json-validation] extra to declare the JsonValidator
optional dependency explicitly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…usage

- README: add callout noting BOM generation, validation, and SPDX license
  handling are powered by cyclonedx-python-lib
- CONTRIBUTING: add CycloneDX Python Library key concept with link; remove
  schemas/ from architecture tree; update import example to use cyclonedx
  instead of requests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…hon-lib

Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…solution

The v0.2 merge left _create_aibom_structure with a duplicated aibom assignment:
the v0.2 manual JSON dict build followed immediately by the orphaned library
path, referencing undefined variables (bom, component_section). Restored the
correct cyclonedx-python-lib based implementation.

Also fix test_generate_aibom_version_truncation: dependencies[0]["ref"] is the
metadata component PURL (the scanner), not the model — only dependsOn contains
the model's versioned PURL.

Restore missing ExternalReferenceType import in test_service.py dropped by merge.
Restore dev dependencies in uv.lock dropped by v0.2 merge (pytest, pytest-cov, etc).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
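
The test fix above hinges on how a CycloneDX `dependencies` entry is laid out: `ref` names the component that has dependencies, and `dependsOn` lists what it depends on. The sketch below uses a trimmed, hypothetical document (the PURLs and the helper `model_purls` are made up for illustration, and the dict is not a complete valid BOM) to show why an assertion on the model's PURL should look in `dependsOn`.

```python
from typing import Any, Dict, List

# Hypothetical identifiers, for illustration only.
SCANNER_PURL = "pkg:generic/example-scanner@0.0.0"
MODEL_PURL = "pkg:huggingface/example-org/example-model@abc1234"

# Trimmed example document: only the fields relevant to the dependency graph.
example_aibom: Dict[str, Any] = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "metadata": {"component": {"bom-ref": SCANNER_PURL, "name": "example-scanner"}},
    "components": [{"bom-ref": MODEL_PURL, "name": "example-model"}],
    "dependencies": [{"ref": SCANNER_PURL, "dependsOn": [MODEL_PURL]}],
}


def model_purls(aibom: Dict[str, Any]) -> List[str]:
    """Collect the refs that the metadata component depends on."""
    root_ref = aibom["metadata"]["component"]["bom-ref"]
    purls: List[str] = []
    for dependency in aibom.get("dependencies", []):
        if dependency.get("ref") == root_ref:
            purls.extend(dependency.get("dependsOn", []))
    return purls
```

In this layout `dependencies[0]["ref"]` is the metadata component, so a test about the model's versioned PURL would inspect `model_purls(...)` (that is, `dependsOn`) instead.
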
Both are present in pyproject.toml but were missing from requirements.txt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
feat: safetensors hyperparameter extraction with GGUF parity
refactor(logging): replace print statements with structured logging
Merges commit b988d41 — adds safetensors metadata reading and
config_parsing module for structured hyperparameter extraction.

Resolved conflict in extractor.py by keeping both LICENSE_MAPPING
import and the new config_parsing import; removed duplicate __init__
introduced by prior bad merges; added missing `import json`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Saquib Saifee <saquibsaifee2@gmail.com>
…se-cyclonedx-python-lib

refactor: use cyclonedx-python-lib for BOM generation, validation, and license handling
eaglei15 merged commit c27768d into integ on May 11, 2026
1 check passed

4 participants