feat(init,ci): preserve .specify by default and harden ai-review #140

Merged
nsalvacao merged 10 commits into main from feat/72-smart-specify-detection-main on Feb 19, 2026

Conversation

@nsalvacao (Owner) commented Feb 19, 2026

Summary

  • preserve existing .specify/ content by default when running specify init in an existing project
  • add symlink-aware detection and block unsafe --force overwrite for symlinked .specify
  • keep explicit --force path for intentional reinitialization
  • keep and harden .github/workflows/ai-review.yml for large diffs:
    • deterministic review.md creation in full-diff mode before posting PR comments
    • append-only UTC timeline logging (review_timeline.md) for model attempts, HTTP status, retries, rate-limit headers, and chunk progress
    • timeline export to workflow run summary for easier incident/debug analysis
  • align CHANGELOG.md with the actual PR scope (init + ci)
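
The append-only UTC timeline described in the list above can be illustrated with a small sketch. The actual workflow implements this in shell inside ai-review.yml; the Python function below, its name, and its entry format are assumptions made only to show the idea of one timestamped line per event, never rewriting earlier lines.

```python
from datetime import datetime, timezone
from pathlib import Path


def timeline(log_path: Path, message: str) -> str:
    """Append one UTC-stamped entry to the timeline file and return it.

    Hypothetical helper: the entry format is an assumption, not the
    workflow's real format.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = f"- {stamp} {message}"
    # Mode "a" only ever appends, so earlier entries are never rewritten.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry + "\n")
    return entry
```

A run would call it once per event, for example `timeline(Path("review_timeline.md"), "chunk=01 http=200")`, and then copy the file into the workflow run summary at the end.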

Credit

Why this PR

This supersedes the conflicted intake lane in #72 with a clean branch based on main, so main remains the source of truth and CI/workflow behavior is consistent.

Validation

  • uv run pytest tests/test_specify_detection.py --tb=short
  • uv run pytest tests/ --tb=short
  • npx markdownlint-cli2 README.md CHANGELOG.md
  • python3 - <<'PY' ... yaml.safe_load('.github/workflows/ai-review.yml') ... PY
  • uv run specify check

Adds smart detection for existing .specify state during init in current directory, preserves by default, and hardens symlink handling for --force.
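
As a rough illustration of the detection described here, the sketch below classifies a project's .specify path. It is not the PR's detect_existing_specify_state: the function name, the string return values, and the exact ordering of checks are assumptions. It only mirrors behaviour mentioned in this conversation (missing, empty, non-empty, symlinked, and unreadable directories treated as existing).

```python
from pathlib import Path


def detect_specify_state_sketch(project_dir: Path) -> str:
    """Classify <project_dir>/.specify as missing, symlink, empty, or existing.

    Hypothetical stand-in for the real helper; the state names are assumed.
    """
    specify_root = project_dir / ".specify"
    # Check the link itself first: is_symlink() does not follow the link,
    # so a dangling or external symlink is still reported as "symlink".
    if specify_root.is_symlink():
        return "symlink"
    if not specify_root.exists():
        return "missing"
    if not specify_root.is_dir():
        return "existing"  # a regular file named .specify: treat as occupied
    try:
        has_entries = any(specify_root.iterdir())
    except OSError:
        # Unreadable directory: fail safe and treat it as an existing project.
        return "existing"
    return "existing" if has_entries else "empty"
```

Checking for the symlink before anything else is the design point: once a path is followed, the code can no longer tell whether it is still inside the project.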
@gemini-code-assist

Summary of Changes

Hello @nsalvacao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the specify init command to intelligently manage existing .specify directories. It introduces a default behavior to preserve existing project configurations, preventing unintended overwrites. Additionally, it adds robust detection for symlinked .specify directories, safeguarding against unsafe --force operations that could modify external paths. These changes improve the command's safety and user experience, particularly when reinitializing projects.

Highlights

  • Default Preservation of .specify: The specify init command now preserves existing .specify/ content by default when re-running in an existing project, preventing accidental data loss.
  • Symlink-Aware Detection and Protection: Implemented symlink-aware detection for .specify directories, which now blocks the --force option from overwriting symlinked .specify directories to prevent unsafe modifications outside the project path.
  • Explicit --force for Reinitialization: The --force flag retains its functionality for intentional reinitialization and overwriting of .specify content, but with the added symlink protection.
  • Documentation and Version Update: The README.md and CHANGELOG.md have been updated to reflect these changes, and the project version has been bumped to 0.0.36.
  • Regression Tests: New regression tests have been added to ensure the correct behavior of .specify state detection, including handling of symlinks.
Changelog
  • CHANGELOG.md
    • Upstream github/spec-kit#1506 ("Add smart detection to preserve .specify/ when adding agents") hardening: smart .specify detection now preserves existing project state by default when re-running specify init in current directory.
    • Added safe .specify detection helper (detect_existing_specify_state) with symlink-aware behavior.
    • Added explicit guard: --force refuses to overwrite symlinked .specify directories.
    • Added regression tests: tests/test_specify_detection.py.
Activity
  • Validated changes by running uv run pytest tests/test_specify_detection.py --tb=short.
  • Executed the full test suite with uv run pytest tests/ --tb=short.
  • Ensured documentation consistency using npx markdownlint-cli2 README.md CHANGELOG.md.
  • Performed a uv run specify check to verify overall functionality.

@github-actions

📝 AI PR Summary

Summary

Enhance specify init to smartly detect and preserve existing .specify project state by default, adding safeguards against overwriting symlinked .specify directories. Keep the --force flag as the explicit way to overwrite existing .specify content.

Changes

  • Added detect_existing_specify_state helper for .specify detection with symlink awareness.
  • Updated init command to preserve .specify directory by default when detected.
  • Added explicit refusal to overwrite symlinked .specify even with --force.
  • Updated CLI help and README to document new .specify preservation behavior and --force usage.
  • Added regression tests for .specify detection and preservation logic.
  • Refactored GitHub Actions workflow chunking script for robustness.
  • Bumped version to 0.0.36.

Impact

  • specify init command behavior when run in existing projects is safer and non-destructive by default.
  • Users must use --force to overwrite .specify content, preventing accidental data loss.
  • CI workflow chunking logic improved for error handling and clarity.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The pull request introduces significant improvements to the specify init command, enhancing its behavior when re-initializing projects. It now preserves existing .specify/ content by default, adds symlink-aware detection to prevent accidental overwrites, and refines the --force option for intentional reinitialization. The changes also include updating documentation, bumping the version, and adding regression tests for the new detection logic. The code is well-structured and addresses a critical usability and safety concern for existing projects.


Copilot AI left a comment

Pull request overview

This PR implements smart detection to preserve existing .specify/ project state when running specify init in a directory that already has a spec-kit project. The implementation adds safeguards against accidental overwrites while providing an explicit --force option for intentional reinitialization. This change supersedes PR #72 (intake lane) with a clean implementation based on main.

Changes:

  • Added detect_existing_specify_state() helper with symlink-aware detection
  • Modified init command to preserve .specify/ by default when it contains existing content
  • Added explicit safety guard that refuses --force overwrite of symlinked .specify directories
  • Updated CLI parameter descriptions and user-facing messages
  • Added comprehensive unit tests for detection logic
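
The interaction between the detected state and --force can be summarised as a small decision table. The sketch below is an assumption-laden illustration, not the code in src/specify_cli/__init__.py: the function name, the state strings ("missing", "empty", "existing", "symlink"), and the action names are all hypothetical.

```python
def plan_init_action(state: str, force: bool) -> str:
    """Map a detected .specify state and the --force flag to an action.

    Hypothetical sketch; state and action names are illustrative only.
    """
    if state == "symlink":
        # Overwriting through a symlink could modify files outside the project,
        # so --force is refused; without --force the link is left untouched.
        return "refuse" if force else "preserve"
    if state == "existing":
        return "overwrite" if force else "preserve"
    # "missing" or "empty": nothing to lose, so populate .specify normally.
    return "create"
```

For example, `plan_init_action("existing", force=False)` yields "preserve", while `plan_init_action("symlink", force=True)` yields "refuse".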

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/specify_cli/__init__.py | Added detect_existing_specify_state() helper, updated download_and_extract_template() signature, and integrated smart detection logic into init() command with clear user messaging |
| tests/test_specify_detection.py | Added comprehensive unit tests covering missing paths, empty directories, non-empty directories, and symlinks |
| pyproject.toml | Bumped version from 0.0.35 to 0.0.36 |
| uv.lock | Updated version lock to 0.0.36 |
| README.md | Updated documentation with examples of new preservation behavior and clarified --force flag purpose |
| CHANGELOG.md | Added entry in Unreleased section documenting the new feature |

@nsalvacao
Owner Author

@copilot
Review the entire PR: all changes, consistency with previous commits to main, and overall consistency.


Copilot AI commented Feb 19, 2026

@nsalvacao I've opened a new pull request, #141, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 3 commits February 19, 2026 19:23
…GELOG [Unreleased] to [0.0.49]

Co-authored-by: nsalvacao <181968287+nsalvacao@users.noreply.github.com>
…mment step

Co-authored-by: nsalvacao <181968287+nsalvacao@users.noreply.github.com>
fix(ci): ai-review always fails on full-diff path — missing review.md
@github-actions

📝 AI PR Summary

Summary

Add smart detection to preserve existing .specify directory state by default during specify init in the current directory, with explicit --force option to overwrite. Improve chunking logic in AI review workflow.

Changes

  • Introduced detect_existing_specify_state() to detect existing .specify directory and symlinks.
  • Preserve .specify content by default when re-running specify init unless --force is used.
  • Refuse to overwrite symlinked .specify directories with --force for safety.
  • Added user-facing console panels explaining preservation behavior.
  • Added regression tests for .specify detection.
  • Updated README with usage notes on .specify preservation and --force.
  • Bumped version to 0.0.49.
  • Refactored AI review GitHub Actions workflow chunking from Python to shell using dd for better performance and error handling.
  • Minor changelog updates reflecting new behavior and tests.

Impact

  • specify init command behavior when run in directories with existing .specify changes to preserve existing project state by default.
  • Users must use --force to overwrite .specify content explicitly.
  • CI workflow for AI review chunking is more robust and efficient.
  • Documentation and tests updated to reflect new behavior.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@github-actions

🔍 AI Code Review

Review Summary

1. Security Vulnerabilities

.github/workflows/ai-review.yml

  • 🔴 Critical: The chunking logic previously used an inline Python script, now replaced with Bash and dd. The new logic uses dd to split diff_full.txt into chunks. If diff_full.txt contains untrusted input, and if CHUNK_SIZE_CHARS or CHUNK_OVERLAP_CHARS are user-controlled, there is a risk of resource exhaustion (e.g., very large values could cause excessive file operations or disk usage). However, the script checks that STEP > 0, mitigating some risk.
    • Recommendation: Validate CHUNK_SIZE_CHARS and CHUNK_OVERLAP_CHARS to ensure they are within reasonable bounds before chunking.
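
A minimal sketch of the bounds check recommended above, written in Python for readability even though the workflow step is shell. The function name and the upper limit of 200,000 characters are assumptions chosen for illustration; the workflow's real limits are not shown in this conversation.

```python
def validate_chunk_config(size: int, overlap: int,
                          max_size: int = 200_000) -> int:
    """Return the step between chunk starts, or raise ValueError.

    Hypothetical helper; max_size is an arbitrary illustrative ceiling.
    """
    if not 0 < size <= max_size:
        raise ValueError(f"CHUNK_SIZE_CHARS={size} must be in 1..{max_size}")
    if not 0 <= overlap < size:
        # Including both values makes the failure actionable in CI logs.
        raise ValueError(
            f"CHUNK_OVERLAP_CHARS={overlap} must be in 0..{size - 1} "
            f"(CHUNK_SIZE_CHARS={size})"
        )
    return size - overlap
```

Rejecting an overlap that is not strictly smaller than the chunk size is equivalent to the existing STEP > 0 check; the upper bound is the extra guard against resource exhaustion.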

src/specify_cli/__init__.py

  • 🔴 Critical: The new detect_existing_specify_state function checks for symlinks and refuses to overwrite symlinked .specify directories with --force. This is a strong protection against directory traversal and symlink attacks, improving security.
  • 🟡 Warning: The --skip-tls option disables SSL/TLS verification. This is dangerous and should be discouraged or accompanied by strong warnings.

2. Bugs

.github/workflows/ai-review.yml

  • 🟡 Warning: The chunking logic uses dd with bs=1, which can be slow for large files. If diff_full.txt is very large, this could cause performance issues or incomplete chunking.
    • Recommendation: Consider using more efficient chunking methods for large files.

src/specify_cli/__init__.py

  • 🔵 Info: The detect_existing_specify_state function treats unreadable directories as "existing projects". This is a reasonable fail-safe, but could cause unexpected behavior if directory permissions are misconfigured.

3. Best Practice Violations

.github/workflows/ai-review.yml

  • 🔵 Info: Inline Bash for chunking is less maintainable than Python, especially for complex logic. Consider keeping chunking logic in Python for readability and maintainability.

src/specify_cli/__init__.py

  • 🔵 Info: The preserve_specify logic and messaging are clear and user-friendly. Good use of panels and explicit instructions.
  • 🔵 Info: The changelog and README are updated to reflect new behavior, which is best practice.

tests/test_specify_detection.py

  • 🔵 Info: Good test coverage for symlink and directory detection logic.

Recommendations

  • Security: Add explicit bounds checking for chunk sizes in workflow scripts. Strongly discourage --skip-tls usage.
  • Performance: Consider chunking optimizations for large files.
  • Maintainability: Prefer Python for complex file operations in CI scripts.
  • Documentation: Continue updating docs and changelogs as done.

Overall Assessment

  • Security: Improved by symlink protection, but chunking logic needs bounds checking.
  • Bugs: None critical, but performance concerns for large files.
  • Best Practices: Mostly followed, with minor maintainability concerns.

No showstopper issues, but address chunking bounds and discourage --skip-tls for production.


🤖 AI Review · openai/gpt-4.1 · 4535 tokens · GitHub Models free tier · 0 premium requests


Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 1 comment.

Comment thread .github/workflows/ai-review.yml Outdated
@nsalvacao changed the title from "feat(init): preserve existing .specify by default" to "feat(init,ci): preserve .specify by default and harden ai-review" on Feb 19, 2026
@github-actions

📝 AI PR Summary

Summary

Improved AI review GitHub Actions workflow with detailed append-only timeline logging for better debugging and traceability. Added safe .specify detection in CLI to preserve existing project state by default, keeping --force for intentional overwrites while refusing to overwrite symlinked .specify directories.

Changes

  • Enhanced .github/workflows/ai-review.yml:
    • Added UTC timestamped append-only timeline logs for model calls, HTTP responses, retries, rate-limit headers, chunking progress, and final results.
    • Exported timeline logs to the GitHub Actions workflow summary.
    • Refactored chunking logic to use shell commands with timeline logging.
    • Made full-diff review output deterministic by writing review.md before posting.
  • Added safe .specify detection helper in CLI (detect_existing_specify_state) with symlink awareness.
  • CLI specify init now preserves existing .specify directory by default; overwriting requires --force, and --force refuses a symlinked .specify.
  • Added regression tests for .specify detection (tests/test_specify_detection.py).
  • Updated CHANGELOG.md and README.md to document these changes.

Impact

  • AI review workflow logs and debugging are significantly improved for large diffs and retries.
  • CLI users benefit from safer project initialization that avoids accidental overwrites of .specify.
  • Developers gain better confidence and observability in AI-assisted code reviews and project setup.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@github-actions

🔍 AI Code Review

Full diff review

Review Summary

1. Security Vulnerabilities

🔴 Critical

  • Symlink Handling in .specify Directory (src/specify_cli/__init__.py):
    • The new logic refuses to overwrite a symlinked .specify directory with --force, which is good. However, the check only applies if force is set. If a symlink exists and force is not set, the code proceeds to preserve the symlinked directory and update agent-specific files outside .specify. There is a risk that subsequent code (not shown here) could inadvertently write outside the intended project directory if it does not consistently check for symlinks. Mitigation: Ensure all file operations involving .specify or its contents are symlink-aware and do not traverse outside the project root.

🟡 Warning

  • Use of dd for Chunking in Workflow (ai-review.yml):

    • The chunking logic uses dd to split the diff file. If diff_full.txt contains binary or unexpected content, this could result in malformed chunks. While not a direct vulnerability, it could cause unpredictable behavior. Mitigation: Validate diff_full.txt is always text and handle errors gracefully.
  • Potential Information Disclosure in Timeline Logging (ai-review.yml):

    • The timeline logs include model names, chunk IDs, HTTP codes, and rate-limit headers. If these logs are exposed in public CI summaries, they could leak internal model usage patterns or rate-limit information. Mitigation: Review workflow summary visibility and redact sensitive information if needed.

2. Bugs

🟡 Warning

  • Chunk Count Calculation (ai-review.yml):

    • The previous Python chunking logic counted chunks and printed the count. The new shell-based chunking increments CHUNK_COUNT in a loop, but does not check for partial last chunks (if DIFF_LEN is not a multiple of STEP). This could result in an off-by-one error or empty chunk files. Mitigation: After chunking, verify all chunk files are non-empty and handle edge cases.
  • Symlink Detection Logic (src/specify_cli/__init__.py):

    • The function detect_existing_specify_state treats unreadable directories as "existing projects." This is a safe default, but could cause false positives if directory permissions are misconfigured. Mitigation: Consider logging a warning if unreadable directories are detected.
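
For the chunk count concern raised above, the expected number of chunks can be computed up front and compared with the number of files actually written. The sketch below assumes a chunker that stops as soon as a chunk reaches the end of the diff; the workflow's real loop condition is not shown here, so treat both the function name and that stopping rule as assumptions.

```python
import math


def expected_chunk_count(total: int, size: int, overlap: int) -> int:
    """Number of chunks when chunking stops once a chunk reaches the end.

    Hypothetical helper for cross-checking a chunking loop.
    """
    step = size - overlap
    if step <= 0:
        raise ValueError("size must be greater than overlap")
    if total <= 0:
        return 0
    if total <= size:
        return 1
    # After the first chunk, each extra chunk covers `step` new characters.
    return 1 + math.ceil((total - size) / step)
```

With a 15-character input, size 8 and overlap 2, the chunks are [0:8], [6:14] and [12:15], which matches `expected_chunk_count(15, 8, 2) == 3`; a mismatch against the produced files would flag an off-by-one or an empty trailing chunk.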

🔵 Info

  • Backward Compatibility:

    • The new preserve_specify argument is added to download_and_extract_template, but only used in the new logic. Ensure all callers are updated to pass this argument as needed.
  • Workflow Timeline File Handling:

    • The timeline file is created and appended in the workflow. If multiple jobs run in parallel, there could be race conditions. However, in this context, jobs are sequential.

3. Best Practice Violations

🟡 Warning

  • Direct Use of dd for Text Chunking (ai-review.yml):

    • Using dd for text files is not portable across all platforms and may not handle multibyte characters correctly. Mitigation: Prefer a Python or POSIX-compliant approach for chunking text files.
  • Error Handling in Workflow Steps:

    • The workflow does not always check for errors after chunk creation (e.g., if dd fails). Mitigation: Add error checks after chunking.

🔵 Info

  • Logging and Debugging Improvements:

    • The append-only timeline logging is a good practice for traceability and debugging.
  • Symlink Safety:

    • The explicit refusal to overwrite symlinked .specify directories is a strong safety improvement.

Recommendations

  • Security: Review all file operations involving .specify for symlink traversal vulnerabilities.
  • Chunking: Consider reverting to a Python-based chunking approach for portability and correctness.
  • Logging: Ensure timeline logs do not expose sensitive information in public CI runs.
  • Testing: Add tests for edge cases in chunking and symlink detection.

Overall Assessment

  • Security: Mostly improved, but symlink traversal must be carefully audited.
  • Reliability: Chunking logic is simpler but may have edge case bugs.
  • Best Practices: Logging and symlink handling are improved; chunking could be more robust.

No critical vulnerabilities found, but some warnings and best practice issues should be addressed before merging.


🤖 AI Review · openai/gpt-4.1 · 6703 tokens · GitHub Models free tier · 0 premium requests

@github-actions

📝 AI PR Summary

Summary

Enhanced the AI review GitHub Actions workflow with detailed timeline logging for requests, responses, retries, and chunking steps, improving traceability and debugging. Also updated the preferred long-context AI model and added new tests for .specify detection improvements.

Changes

  • Switched AI review long-context model to ai21-labs/ai21-jamba-1.5-large.
  • Added append-only timeline logging with UTC timestamps for all major steps in the AI review workflow (requests, responses, retries, chunk creation, and completion).
  • Logged rate-limit and retry headers in timeline for better diagnostics.
  • Improved chunking logic with timeline events for chunk creation and processing.
  • Appended the timeline log to the GitHub Actions workflow summary for easier debugging.
  • Added safe .specify detection preserving existing project state and symlink-aware behavior.
  • Added regression tests for .specify detection and bootstrap behavior.
  • Updated changelog with upstream hardening details.

Impact

  • AI review workflow now provides detailed, timestamped logs of its internal operations, aiding maintainers in troubleshooting and performance tuning.
  • Users running specify init benefit from safer project state detection, avoiding accidental overwrites of symlinked .specify directories.
  • New tests improve reliability and prevent regressions in project bootstrap and detection logic.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@github-actions

🔍 AI Code Review

Full diff review

Review of Pull Request Diff

Security Vulnerabilities

  1. Hardcoded Model IDs

    REVIEW_LONG_CONTEXT_MODEL: "ai21-labs/ai21-jamba-1.5-large"

    🔴 Critical
    The model ID is hardcoded, which could lead to unintended usage if the model is deprecated or replaced. This should be configurable via environment variables or secrets to allow dynamic updates without code changes.

  2. Missing Validation for GH_MODELS_TOKEN

    -H "Authorization: Bearer $GH_MODELS_TOKEN"

    🔴 Critical
    The GH_MODELS_TOKEN is used directly without validation or sanitization. If the token is invalid or compromised, it could lead to unauthorized access or abuse. Add validation checks and ensure the token is securely stored using GitHub secrets.

  3. Symlink Handling

    if specify_root.is_symlink():
        ...

    🟡 Warning
    Symlink handling is implemented to prevent overwriting, but it does not account for malicious symlinks pointing to sensitive directories (e.g., /etc). Add checks to ensure the symlink target is within the expected project directory.

  4. HTTP Error Handling

    if [ "$HTTP_CODE" = "400" ] || [ "$HTTP_CODE" = "401" ] || [ "$HTTP_CODE" = "403" ] || [ "$HTTP_CODE" = "404" ]; then

    🟡 Warning
    The error handling for these HTTP codes does not appear to log sensitive details (e.g., the token or the request payload). Keep it that way: ensure sensitive information stays out of the logs to prevent leakage.
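
The symlink concern in item 3 above amounts to checking where a link resolves. A minimal sketch of such a containment check follows; the function name is hypothetical and the PR is not known to include this check, so it only illustrates the reviewer's suggestion.

```python
from pathlib import Path


def symlink_stays_in_project(link: Path, project_root: Path) -> bool:
    """True if `link` resolves to a location inside `project_root`.

    Hypothetical helper illustrating the suggested target check.
    """
    root = project_root.resolve()
    target = link.resolve()  # follows the whole symlink chain
    return target == root or root in target.parents
```

Resolving both paths before comparing matters: comparing the unresolved strings would accept a link such as `.specify -> ../outside`.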


Bugs

  1. Invalid Chunk Configuration

    if [ "$STEP" -le 0 ]; then
        echo "::error::Invalid chunk config: CHUNK_SIZE_CHARS must be greater than CHUNK_OVERLAP_CHARS"
        exit 1
    fi

    🔴 Critical
    The chunk configuration validation is correct, but the error message does not provide actionable guidance. Suggest logging the actual values of CHUNK_SIZE_CHARS and CHUNK_OVERLAP_CHARS for debugging.

  2. Retry Logic for HTTP 429

    RETRY_AFTER=$(echo "$BODY" | jq -r '.retry_after // empty' 2>/dev/null || true)

    🟡 Warning
    The retry logic does not account for exponential backoff, which is a best practice for handling rate limits. Implement exponential backoff to reduce the risk of further rate-limiting.

  3. Symlink Detection in Tests

    @pytest.mark.skipif(not hasattr(Path, "symlink_to"), reason="symlink support unavailable")

    🔵 Info
    The test skips symlink-related checks on unsupported platforms, but it does not log the skipped tests. Add logging to indicate skipped tests for better traceability.
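
The exponential backoff suggested in item 2 above could look like the sketch below. It is written in Python rather than the workflow's shell, and the function name, the base delay of 2 seconds, and the 60-second cap are assumptions; only the idea of preferring a server-provided retry_after value comes from the snippet quoted in the review.

```python
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 2.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    Hypothetical helper; base and cap are illustrative defaults.
    """
    if retry_after is not None:
        # A server-provided hint takes precedence over the computed delay.
        return min(float(retry_after), cap)
    # Delay doubles with each attempt: base, 2*base, 4*base, ... up to cap.
    return min(base * 2 ** (attempt - 1), cap)
```

With MAX_RETRY set to 3, the computed waits would be 2, 4 and 8 seconds unless the response supplies its own retry_after value.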


Best Practice Violations

  1. Timeline Logging

    timeline "AI review started full_diff_chars=${{ steps.diff.outputs.full_size }} chunk_required=$CHUNK_REQUIRED"

    🟡 Warning
    While timeline logging is useful, the logs are verbose and could clutter debugging. Consider adding log levels (e.g., INFO, WARNING, ERROR) to filter logs based on severity.

  2. Magic Numbers for Retry Attempts

    MAX_RETRY: "3"

    🟡 Warning
    The retry attempts are hardcoded as 3. Use environment variables or configuration files to allow dynamic adjustment.

  3. Error Message Length

    ERROR_MSG=$(echo "$BODY" | jq -r '.error.message // empty' 2>/dev/null | tr '\n' ' ' | cut -c1-180)

    🔵 Info
    Truncating error messages to 180 characters may omit critical details. Consider logging the full message with a warning about truncation.

  4. Version Management

    version = "0.0.52"

    🔵 Info
    The version bump is consistent, but it lacks a changelog entry explaining the rationale for the new version. Ensure all version changes are documented in the changelog.


Recommendations

  1. Security Enhancements

    • Validate and sanitize GH_MODELS_TOKEN.
    • Implement checks to prevent malicious symlinks.
    • Ensure sensitive data is excluded from logs.
  2. Bug Fixes

    • Improve error messages for chunk configuration validation.
    • Implement exponential backoff for retries.
  3. Best Practices

    • Use dynamic configuration for retry attempts and model IDs.
    • Add log levels to timeline logging for better debugging.

Overall, the pull request introduces useful features, but it requires critical fixes for security and reliability.


🤖 AI Review · openai/gpt-4o · 8422 tokens · GitHub Models free tier · 0 premium requests

@github-actions

📝 AI PR Summary

Summary

Enhanced the AI review GitHub workflow with detailed timeline logging, model catalog validation, improved retry/backoff logic, and safer environment variable handling for model IDs.

Changes

  • Added timeline logs with UTC timestamps for better debugging and audit trails.
  • Validated configured AI model IDs against the official runtime catalog, disabling invalid models.
  • Improved retry logic with detailed logging of HTTP responses, rate-limit headers, and error codes.
  • Switched model environment variables to use repository variables with fallbacks.
  • Added warnings and errors for missing secrets and invalid chunk configuration.
  • Refined chunking logic and model selection to avoid duplicates.
  • Updated documentation and changelog accordingly.
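
The catalog validation and de-duplication steps in the list above can be sketched as a single filter. This is an illustration in Python, not the workflow's shell logic; the function name is hypothetical, and the catalog is assumed to be available as a plain list of model IDs.

```python
from typing import Iterable, List


def select_models(configured: Iterable[str], catalog: Iterable[str]) -> List[str]:
    """Keep configured model IDs that exist in the catalog, without duplicates.

    Hypothetical helper; preserves the configured order of preference.
    """
    available = set(catalog)
    selected: List[str] = []
    for model_id in configured:
        model_id = model_id.strip()
        # Skip empty entries, models missing from the catalog, and repeats.
        if model_id and model_id in available and model_id not in selected:
            selected.append(model_id)
    return selected
```

An empty result corresponds to the case where every configured model is missing, in which the workflow is described as logging the situation and returning.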

Impact

  • AI review workflow is more robust, transparent, and easier to debug.
  • Model usage aligns with the official catalog, reducing runtime errors.
  • Users must configure MODELS_PAT secret and optionally repository variables for models.
  • Review logs now include detailed timeline files for each run.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@github-actions

🔍 AI Code Review

Chunk 01

Review of Diff

.github/workflows/ai-review.yml

Security Vulnerabilities

  • 🔴 Token Exposure Prevention
    The diff correctly masks the GH_MODELS_TOKEN using ::add-mask::. No direct exposure of secrets in logs or outputs.
    No critical vulnerabilities found.

  • 🔴 Model Catalog Pre-check
    The workflow fetches the model catalog using a token and checks model existence before inference.
    No critical vulnerabilities found.

Bugs

  • 🟡 Model Deduplication Logic
    The deduplication loop for MODELS in review_with_chain is correct, but if all configured models are missing or empty, the workflow logs and returns, which is good.
    No bugs found in this logic.

  • 🟡 Chunking Logic Change
    The previous Python chunking is replaced with shell logic using dd.

    • If diff_full.txt contains multibyte characters (e.g., UTF-8), dd with bs=1 may split characters, resulting in invalid chunks.
      Severity: 🟡 Warning
      Recommendation: Use a chunking method that respects character boundaries (e.g., Python with UTF-8 handling).
  • 🟡 Timeline Logging
    Timeline logs are appended and exported.
    No bugs found.

  • 🟡 Error Handling

    • The workflow checks for invalid chunk config (STEP <= 0).
    • Handles HTTP errors and rate limits robustly.
      No bugs found.
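
The multibyte concern about byte-based chunking noted above is easy to reproduce. The short sketch below (a hypothetical helper, not part of the workflow) slices the UTF-8 encoding of a string at fixed byte offsets, the way dd with bs=1 would, and reports whether every slice is still valid UTF-8 on its own.

```python
def byte_chunks_decode_cleanly(text: str, size: int) -> bool:
    """True if every `size`-byte slice of the UTF-8 encoding decodes on its own.

    Hypothetical helper used only to demonstrate the failure mode.
    """
    raw = text.encode("utf-8")
    for start in range(0, len(raw), size):
        try:
            raw[start:start + size].decode("utf-8")
        except UnicodeDecodeError:
            # A slice boundary fell inside a multibyte character.
            return False
    return True
```

Pure ASCII input always passes, while a string of two-byte characters such as "ééé" fails for any odd slice size, which is the corruption the review warns about.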

Best Practice Violations

  • 🔵 Shell Script Robustness

    • set -euo pipefail is used, which is good.
    • Timeline logging is clear and append-only.
    • Use of dd for chunking is not portable across platforms and may not handle large files efficiently or safely with multibyte text.
      Severity: 🟡 Warning
      Recommendation: Prefer Python for chunking, as in the previous implementation, to ensure correctness and portability.
  • 🔵 Model Configuration

    • Model IDs are now configurable via repository variables, which is a best practice for flexibility.
    • Catalog pre-check avoids wasted inference attempts.
      No violations found.
  • 🔵 Error Messaging

    • Error messages are clear and actionable.
    • Timeline logs are exported to workflow summary for debugging.
      No violations found.
  • 🔵 Versioning/Changelog

    • Changelog and version updates are clear and follow semantic versioning.
      No violations found.

Summary Table

| Issue | Severity | Recommendation |
| --- | --- | --- |
| dd chunking splits multibyte chars | 🟡 Warning | Use Python with UTF-8 handling for chunking |
| dd not portable for text chunking | 🟡 Warning | Prefer Python for chunking |
| No critical security bugs found | 🔵 Info | N/A |
| Model configuration and error handling improved | 🔵 Info | N/A |

Recommendations

  • Replace shell dd chunking with Python (UTF-8 safe):
    • The previous Python chunking logic was safer for text files. Shell dd can corrupt multibyte characters and is less portable.
  • Continue robust error handling and timeline logging.
  • No critical vulnerabilities or bugs found otherwise.

Example Python Chunking (UTF-8 Safe)

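import os
import pathlib
import sys

# Chunk size and overlap are read from the environment; the workflow is
# assumed to export CHUNK_SIZE_CHARS and CHUNK_OVERLAP_CHARS to this step.
CHUNK = int(os.environ["CHUNK_SIZE_CHARS"])
OVERLAP = int(os.environ["CHUNK_OVERLAP_CHARS"])
if CHUNK <= OVERLAP:
    print(f"Invalid chunk/overlap config: CHUNK={CHUNK} OVERLAP={OVERLAP}", file=sys.stderr)
    sys.exit(1)

# Slicing a str works on code points, so multibyte characters are never split.
data = pathlib.Path("diff_full.txt").read_text(encoding="utf-8")
out_dir = pathlib.Path(".")
start = 0
count = 0
while start < len(data):
    end = start + CHUNK
    count += 1
    out_dir.joinpath(f"diff_chunk_{count:02d}.txt").write_text(data[start:end], encoding="utf-8")
    if end >= len(data):
        break  # last chunk reached the end; avoids overlap-only trailing chunks
    start = end - OVERLAP
print(count)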

Overall, this diff is a solid improvement, but the chunking logic should be reverted to a Python-based approach for correctness and portability.

Chunk 02

Security Vulnerabilities

🔴 Critical

  • Symlink Handling in .specify Directory
    The new logic correctly refuses to overwrite a symlinked .specify directory when --force is used, preventing a potential security issue where files could be written outside the intended project directory.
    No critical vulnerabilities found in symlink handling.
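
A minimal sketch of the kind of guard described here, assuming the check is a plain symlink test on the `.specify` path. The helper name `can_force_overwrite` is hypothetical and the actual logic in `specify init` may differ.

```python
from pathlib import Path


def can_force_overwrite(specify_dir: Path) -> bool:
    """Return False when a forced overwrite would write through a symlink."""
    # is_symlink() inspects the link itself without following it,
    # so it also catches dangling links.
    return not specify_dir.is_symlink()
```

A caller would check this before any `--force` cleanup and abort with a clear message when it returns `False`, so nothing is written outside the project directory.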

🟡 Warning

  • Potential Directory Traversal
    The code checks for symlinks and skips operations, but if other parts of the code (not shown here) ever follow symlinks or allow user input to specify paths, there could be directory traversal risks.
    Mitigation: The current diff mitigates this for .specify, but review other file operations for similar risks.

  • Use of --skip-tls Option
    The --skip-tls option disables SSL/TLS verification, which is flagged as "not recommended".
    Risk: Users may inadvertently expose themselves to MITM attacks if used.
    Suggestion: Consider requiring explicit confirmation or warning before allowing this option.

Bugs

🟡 Warning

  • Symlink Detection Logic
    The detect_existing_specify_state function treats unreadable directories as "existing projects".
    Risk: If a directory is unreadable due to permissions, it may block legitimate operations.
    Suggestion: This is a reasonable fail-safe, but document this behavior for users.

  • Preserve Logic for .specify
    When preserve_specify is True, the code skips copying files into .specify.
    Risk: If agent-specific files outside .specify depend on .specify content, this may cause inconsistencies.
    Suggestion: Ensure downstream logic handles this mode gracefully.
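
The fail-safe behavior for unreadable directories can be sketched as follows. This is not the project's `detect_existing_specify_state`; the function `classify_specify_dir` and its four return labels are hypothetical and only illustrate the idea of treating a directory that cannot be listed as already populated.

```python
from pathlib import Path


def classify_specify_dir(specify_dir: Path) -> str:
    """Classify a .specify path as 'symlink', 'missing', 'empty', or 'existing'."""
    if specify_dir.is_symlink():
        return "symlink"
    if not specify_dir.is_dir():
        return "missing"
    try:
        has_entries = any(specify_dir.iterdir())
    except OSError:
        # Fail safe: a directory we cannot list is treated as populated.
        return "existing"
    return "existing" if has_entries else "empty"
```

Erring toward "existing" means a permissions problem can block an init that would have been safe, but it never causes content to be overwritten by mistake, which is why documenting the behavior is the suggested follow-up.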

🔵 Info

  • Test Coverage
    Tests for symlink handling are skipped if symlink support is unavailable.
    Note: This is appropriate, but ensure CI environments support symlinks for full coverage.

Best Practice Violations

🟡 Warning

  • Verbose Option Handling
    The code checks verbose and not tracker before printing.
    Suggestion: Consider centralizing logging/printing logic for maintainability.

  • Panel Messaging
    User-facing messages are clear, but consider adding logging for these events for auditability.

🔵 Info

  • Function Parameter Growth
    The download_and_extract_template function has many parameters.
    Suggestion: Consider refactoring to use a config object or dataclass for readability.

  • Type Annotations
    Good use of type annotations throughout.

  • Test Naming
    Test names are descriptive and follow conventions.
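
For the parameter-growth suggestion, one possible shape is a frozen dataclass that groups related flags. The class `TemplateOptions`, its `project_path` field, and the stand-in function `describe` are all hypothetical; the real signature of `download_and_extract_template` is not shown in this review, and only `preserve_specify`, `verbose`, and the `--skip-tls` option are mentioned in it.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TemplateOptions:
    """Hypothetical bundle of settings that would otherwise be separate parameters."""

    project_path: Path
    preserve_specify: bool = True
    verbose: bool = False
    skip_tls: bool = False


def describe(options: TemplateOptions) -> str:
    """Stand-in for the real function; it only reports what it would do."""
    mode = "preserve" if options.preserve_specify else "overwrite"
    return f"{mode} .specify in {options.project_path}"
```

With this layout, adding a new flag changes one class definition instead of every call site, and `dataclasses.replace` can derive variants for tests.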

Summary

  • Symlink overwrite protection is robust and mitigates a critical security risk.
  • No major bugs found; minor warnings around directory traversal and preserve logic.
  • Best practices mostly followed; consider refactoring parameter-heavy functions and improving logging.
  • Tests are thorough for new symlink logic.

Overall: This diff improves security and robustness. No critical vulnerabilities found.


🤖 AI Review · openai/gpt-4.1 · 10381 tokens · GitHub Models free tier · 0 premium requests

@github-actions

📝 AI PR Summary

Summary

Enhanced the AI review GitHub workflow to support A/B model selection, improved logging with a detailed review timeline, and added robust model catalog validation and retry logic.

Changes

  • Added environment variables for A/B testing mode (REVIEW_AB_MODE, REVIEW_AB_MODELS) and tenant-specific model overrides.
  • Implemented fetching and validation of models against the GitHub Models catalog.
  • Introduced A/B model selection based on PR number parity.
  • Added detailed timeline logging for review process steps, API calls, rate limits, retries, and errors.
  • Improved retry logic with backoff and handling of non-retryable HTTP errors.
  • Refined model selection to avoid duplicates and fallback gracefully.
  • Updated workflow to fail early if the required GitHub Models token is missing.
  • Minor fixes and enhancements in review output formatting and token usage logging.
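
The retry behavior summarized above (backoff plus an early exit on non-retryable HTTP errors) might look like this in Python. Everything here is an assumption for illustration: the workflow is a shell script, and the status set in `RETRYABLE_STATUS`, the function `call_with_retry`, and the doubling delays are not taken from it.

```python
import time
from typing import Callable, Optional, Tuple

RETRYABLE_STATUS = {429, 500, 502, 503, 504}  # assumed set, not taken from the workflow


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the body on HTTP 200, or None once retrying is pointless or exhausted."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUS:
            return None  # e.g. 401 or 404: retrying will not help
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    return None
```

Passing `sleep` as a parameter keeps the function testable without real delays; a production version would likely also honor any rate-limit headers the API returns.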

Impact

  • AI code review workflow (.github/workflows/ai-review.yml) now supports configurable A/B testing and better diagnostics.
  • Developers benefit from more reliable, transparent, and configurable AI reviews.
  • No functional changes to source code or existing tests, apart from a new test file for detection.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@github-actions

🔍 AI Code Review

Chunk 01

Security Review

🔴 Missing Input Validation: The REVIEW_AB_MODELS variable is not validated for potential command injection attacks. Although it's used in a controlled environment, it's still a good practice to validate user-input data.

🟡 Sensitive Information Exposure: The GH_MODELS_TOKEN is masked to prevent accidental log exposure. However, it's still possible to leak sensitive information through other means, such as debug logs or error messages.

Bug Review

🔵 Potential Division by Zero: In the review_with_chain function, there's a potential division by zero error when calculating the INDEX for A/B model selection. Although it's checked if REVIEW_AB_MODELS is not empty, it's still possible to have an empty AB_MODELS array after filtering out unavailable models.

🔵 Uninitialized Variable: The USED_MODEL variable is not initialized before being used in the review_with_chain function. Although it's assigned a value when the review is successful, it's still possible to use an uninitialized variable if the review fails.
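
The division-by-zero concern comes down to the order of filtering and indexing. A hedged Python sketch of the safe order follows; the function `pick_ab_model` and its arguments are hypothetical, and the workflow's own shell logic may be structured differently.

```python
from typing import List, Optional


def pick_ab_model(pr_number: int, ab_models: List[str], available: List[str]) -> Optional[str]:
    """Pick an A/B model from the PR number, or None if no candidate is usable."""
    # Filter first (dropping duplicates), so the modulo below sees the final list length.
    candidates = [m for m in dict.fromkeys(ab_models) if m in available]
    if not candidates:
        return None  # caller falls back to the default chain; avoids modulo by zero
    return candidates[pr_number % len(candidates)]
```

With two candidates this reduces to PR-number parity, and returning `None` for an empty list gives the caller an explicit value to check instead of an unset variable.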

Best Practice Review

🔵 Code Organization: The workflow file is quite long and complex. Consider breaking it down into smaller, more manageable files or using a more modular approach.

🔵 Error Handling: Although there are some error handling mechanisms in place, it's still possible to improve error handling and logging to make it more robust and informative.

🔵 Variable Naming: Some variable names, such as CHUNK_REQUIRED and CATALOG_OK, are not very descriptive. Consider using more descriptive names to improve code readability.

Recommendations

  1. Validate REVIEW_AB_MODELS to prevent potential command injection attacks.
  2. Improve error handling and logging to make it more robust and informative.
  3. Consider breaking down the workflow file into smaller, more manageable files or using a more modular approach.
  4. Use more descriptive variable names to improve code readability.
  5. Initialize variables before using them to prevent potential errors.
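
Recommendation 1 could be met with a strict allow-list pattern. The sketch below assumes that `REVIEW_AB_MODELS` is a comma-separated list and that model IDs have the form `publisher/name`; both are assumptions, and the names `MODEL_ID` and `parse_ab_models` are invented for the example.

```python
import re
from typing import List

# Assumed shape: "publisher/model-name" using letters, digits, dot, underscore, hyphen.
MODEL_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_ab_models(raw: str) -> List[str]:
    """Split a comma-separated list and reject anything that is not a plain model ID."""
    models = [item.strip() for item in raw.split(",") if item.strip()]
    bad = [m for m in models if not MODEL_ID.fullmatch(m)]
    if bad:
        raise ValueError(f"invalid model id(s): {bad}")
    return models
```

Rejecting the whole value when any entry fails keeps shell metacharacters such as `;` or `$(` out of later commands, rather than trying to escape them.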

The changes in the CHANGELOG.md file seem to be properly documented and follow standard changelog formatting. The updates to the ai-review.yml workflow file include several improvements, such as configurable model selection, A/B model testing, and improved logging and error handling. However, there are some areas that can be improved, as mentioned above.

Chunk 02

Security Review

  1. 🔴 Potential Symlink Attack: The code now handles symlinked .specify directories by skipping bootstrap mutations. However, it's crucial to ensure that the symlink is properly handled to avoid potential attacks. The current implementation checks if .specify is a symlink and skips modifications, which is a good practice.

  2. 🟡 Missing Input Validation: The detect_existing_specify_state function does not validate if the provided specify_dir is a valid path. Although it's used internally with a constructed path, it's still a good practice to validate inputs.

Bug Review

  1. 🔵 POSIX Dependency: The ensure_executable_scripts function is skipped on Windows due to the os.name == "nt" check. While this is expected behavior, it's worth noting that the function will not be executed on Windows.

  2. 🔵 Potential OSError: The detect_existing_specify_state function catches OSError when checking if a directory is empty. This is a good practice to handle potential OS-related errors.

Best Practice Review

  1. 🔵 Type Hints and Documentation: The code uses type hints and docstrings, which is a good practice for maintaining readability and understandability.

  2. 🟡 Test Coverage: The code has added tests for the new functionality, including tests for symlinked .specify directories. However, it's essential to ensure that the tests cover all possible scenarios.

  3. 🔵 Code Organization: The code is well-organized, and the new functionality is properly integrated into the existing codebase.

Additional Comments

  • The code changes seem to be well-structured and follow best practices.
  • The addition of tests for symlinked .specify directories is a good practice to ensure the code handles this scenario correctly.
  • It's worth considering adding more tests to cover different edge cases and ensure the code is robust.

Overall, the code changes appear to be well-done, and the added tests provide good coverage for the new functionality. However, it's always a good idea to review the code thoroughly and consider additional testing to ensure the code is robust and secure.

Chunk 03

Review of the Pull Request Diff

Security Vulnerabilities

  • 🔴 Critical: The code creates a bash script with the content echo ok and makes it executable. However, the script is initially created with permissions 0o644, which is not executable. Later, the ensure_executable_scripts function is called, which might change the permissions. Nevertheless, it's not clear what this function does. If it doesn't make the script executable, an attacker could potentially exploit this by replacing the script before it's executed. It's recommended to directly set the correct permissions when creating the script or verify the behavior of ensure_executable_scripts.
  • 🟡 Warning: The .specify symlink is created pointing to an external location (external_specify). This could potentially be a security risk if external_specify is not properly validated or if it's under the control of an attacker.

Bugs

  • 🟡 Warning: The script_path.write_text call does not handle potential exceptions that might occur during file writing. It's a good practice to handle potential exceptions.
  • 🟡 Warning: The ensure_executable_scripts function's behavior is not clear. If it's supposed to make scripts executable, the assertion after it checks for 0o644, which is not executable. This might indicate a bug or a misunderstanding.

Best Practice Violations

  • 🔵 Info: The code uses exist_ok=True when creating the parent directory of script_path. This is a good practice to avoid raising an exception if the directory already exists.
  • 🔵 Info: The code uses encoding="utf-8" when writing to script_path, which is a good practice for ensuring consistent encoding.
  • 🔵 Info: The uv.lock file has been updated to reference a newer version of specify-cli. It's not clear from the diff alone whether this update is necessary or if it introduces any compatibility issues.

Recommendations

  1. Review the behavior and implementation of ensure_executable_scripts to ensure it aligns with the expected permissions for the script.
  2. Validate the external_specify path to ensure it's not under the control of an attacker and is properly sanitized.
  3. Consider directly setting the correct permissions when creating the script to avoid potential security issues.
  4. Handle potential exceptions during file operations.
  5. Verify that the update to specify-cli version 0.0.52 is necessary and does not introduce compatibility issues.
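
On recommendations 1 and 3: if the intent is for a script to end up executable, one common POSIX approach is to add an execute bit wherever a read bit is set. The helper `make_executable` below is a hypothetical sketch and makes no claim about what `ensure_executable_scripts` actually does.

```python
import os
import stat
from pathlib import Path


def make_executable(script: Path) -> int:
    """Add execute bits wherever read bits are set and return the new mode."""
    mode = stat.S_IMODE(script.stat().st_mode)
    # Copy r bits (0o444) down to the matching x bits (0o111): 0o644 -> 0o755.
    new_mode = mode | ((mode & 0o444) >> 2)
    os.chmod(script, new_mode)
    return new_mode
```

A test that expects a script to stay at `0o644` would then be asserting that this step was deliberately skipped, for example because `.specify` is a symlink, which is worth stating in the test itself.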

🤖 AI Review · meta/llama-4-maverick-17b-128e-instruct-fp8 · 10770 tokens · GitHub Models free tier · 0 premium requests

@nsalvacao nsalvacao merged commit aee3a75 into main Feb 19, 2026
7 checks passed
@nsalvacao nsalvacao deleted the feat/72-smart-specify-detection-main branch February 19, 2026 20:50