feat(init,ci): preserve .specify by default and harden ai-review#140
feat(init,ci): preserve .specify by default and harden ai-review#140
Conversation
Adds smart detection for existing .specify state during init in current directory, preserves by default, and hardens symlink handling for --force.
Summary of ChangesHello @nsalvacao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
📝 AI PR SummarySummaryEnhance Changes
Impact
🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests |
There was a problem hiding this comment.
Code Review
The pull request introduces significant improvements to the specify init command, enhancing its behavior when re-initializing projects. It now preserves existing .specify/ content by default, adds symlink-aware detection to prevent accidental overwrites, and refines the --force option for intentional reinitialization. The changes also include updating documentation, bumping the version, and adding regression tests for the new detection logic. The code is well-structured and addresses a critical usability and safety concern for existing projects.
There was a problem hiding this comment.
Pull request overview
This PR implements smart detection to preserve existing .specify/ project state when running specify init in a directory that already has a spec-kit project. The implementation adds safeguards against accidental overwrites while providing an explicit --force option for intentional reinitialization. This change supersedes PR #72 (intake lane) with a clean implementation based on main.
Changes:
- Added
detect_existing_specify_state()helper with symlink-aware detection - Modified
initcommand to preserve.specify/by default when it contains existing content - Added explicit safety guard that refuses
--forceoverwrite of symlinked.specifydirectories - Updated CLI parameter descriptions and user-facing messages
- Added comprehensive unit tests for detection logic
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/specify_cli/__init__.py |
Added detect_existing_specify_state() helper, updated download_and_extract_template() signature, and integrated smart detection logic into init() command with clear user messaging |
tests/test_specify_detection.py |
Added comprehensive unit tests covering missing paths, empty directories, non-empty directories, and symlinks |
pyproject.toml |
Bumped version from 0.0.35 to 0.0.36 |
uv.lock |
Updated version lock to 0.0.36 |
README.md |
Updated documentation with examples of new preservation behavior and clarified --force flag purpose |
CHANGELOG.md |
Added entry in Unreleased section documenting the new feature |
|
@copilot |
|
@nsalvacao I've opened a new pull request, #141, to work on those changes. Once the pull request is ready, I'll request review from you. |
…GELOG [Unreleased] to [0.0.49] Co-authored-by: nsalvacao <181968287+nsalvacao@users.noreply.github.com>
…mment step Co-authored-by: nsalvacao <181968287+nsalvacao@users.noreply.github.com>
fix(ci): ai-review always fails on full-diff path — missing review.md
📝 AI PR SummarySummaryAdd smart detection to preserve existing Changes
Impact
🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests |
🔍 AI Code ReviewReview Summary1. Security Vulnerabilities
|
📝 AI PR SummarySummaryImproved AI review GitHub Actions workflow with detailed append-only timeline logging for better debugging and traceability. Added safe Changes
Impact
🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests |
🔍 AI Code ReviewFull diff reviewReview Summary1. Security Vulnerabilities🔴 Critical
🟡 Warning
2. Bugs🟡 Warning
🔵 Info
3. Best Practice Violations🟡 Warning
🔵 Info
Recommendations
Overall Assessment
No critical vulnerabilities found, but some warnings and best practice issues should be addressed before merging. 🤖 AI Review · openai/gpt-4.1 · 6703 tokens · GitHub Models free tier · 0 premium requests |
📝 AI PR SummarySummaryEnhanced the AI review GitHub Actions workflow with detailed timeline logging for requests, responses, retries, and chunking steps, improving traceability and debugging. Also updated the preferred long-context AI model and added new tests for Changes
Impact
🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests |
🔍 AI Code ReviewFull diff reviewReview of Pull Request DiffSecurity Vulnerabilities
Bugs
Best Practice Violations
Recommendations
Overall, the pull request introduces useful features, but it requires critical fixes for security and reliability. 🤖 AI Review · openai/gpt-4o · 8422 tokens · GitHub Models free tier · 0 premium requests |
📝 AI PR SummarySummaryEnhanced the AI review GitHub workflow with detailed timeline logging, model catalog validation, improved retry/backoff logic, and safer environment variable handling for model IDs. Changes
Impact
🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests |
🔍 AI Code ReviewChunk 01Review of Diff
|
| Issue | Severity | Recommendation |
|---|---|---|
dd chunking splits multibyte chars |
🟡 Warning | Use Python with UTF-8 handling for chunking |
dd not portable for text chunking |
🟡 Warning | Prefer Python for chunking |
| No critical security bugs found | 🔵 Info | N/A |
| Model configuration and error handling improved | 🔵 Info | N/A |
Recommendations
- Replace shell
ddchunking with Python (UTF-8 safe):- The previous Python chunking logic was safer for text files. Shell
ddcan corrupt multibyte characters and is less portable.
- The previous Python chunking logic was safer for text files. Shell
- Continue robust error handling and timeline logging.
- No critical vulnerabilities or bugs found otherwise.
Example Python Chunking (UTF-8 Safe)
import math, pathlib, sys
CHUNK=int("${CHUNK_SIZE_CHARS}")
OVERLAP=int("${CHUNK_OVERLAP_CHARS}")
data=pathlib.Path('diff_full.txt').read_text(encoding='utf-8')
out_dir=pathlib.Path('.')
if CHUNK <= OVERLAP:
print("Invalid chunk/overlap config", file=sys.stderr)
sys.exit(1)
start=0
idx=1
count=0
while start < len(data):
end = start + CHUNK
chunk = data[start:end]
out_dir.joinpath(f"diff_chunk_{idx:02d}.txt").write_text(chunk, encoding='utf-8')
start = end - OVERLAP
idx += 1
count += 1
print(count)Overall, this diff is a solid improvement, but the chunking logic should be reverted to a Python-based approach for correctness and portability.
Chunk 02
Security Vulnerabilities
🔴 Critical
- Symlink Handling in
.specifyDirectory
The new logic correctly refuses to overwrite a symlinked.specifydirectory when--forceis used, preventing a potential security issue where files could be written outside the intended project directory.
No critical vulnerabilities found in symlink handling.
🟡 Warning
-
Potential Directory Traversal
The code checks for symlinks and skips operations, but if other parts of the code (not shown here) ever follow symlinks or allow user input to specify paths, there could be directory traversal risks.
Mitigation: The current diff mitigates this for.specify, but review other file operations for similar risks. -
Use of
--skip-tlsOption
The--skip-tlsoption disables SSL/TLS verification, which is flagged as "not recommended".
Risk: Users may inadvertently expose themselves to MITM attacks if used.
Suggestion: Consider requiring explicit confirmation or warning before allowing this option.
Bugs
🟡 Warning
-
Symlink Detection Logic
Thedetect_existing_specify_statefunction treats unreadable directories as "existing projects".
Risk: If a directory is unreadable due to permissions, it may block legitimate operations.
Suggestion: This is a reasonable fail-safe, but document this behavior for users. -
Preserve Logic for
.specify
Whenpreserve_specifyisTrue, the code skips copying files into.specify.
Risk: If agent-specific files outside.specifydepend on.specifycontent, this may cause inconsistencies.
Suggestion: Ensure downstream logic handles this mode gracefully.
🔵 Info
- Test Coverage
Tests for symlink handling are skipped if symlink support is unavailable.
Note: This is appropriate, but ensure CI environments support symlinks for full coverage.
Best Practice Violations
🟡 Warning
-
Verbose Option Handling
The code checksverbose and not trackerbefore printing.
Suggestion: Consider centralizing logging/printing logic for maintainability. -
Panel Messaging
User-facing messages are clear, but consider adding logging for these events for auditability.
🔵 Info
-
Function Parameter Growth
Thedownload_and_extract_templatefunction has many parameters.
Suggestion: Consider refactoring to use a config object or dataclass for readability. -
Type Annotations
Good use of type annotations throughout. -
Test Naming
Test names are descriptive and follow conventions.
Summary
- Symlink overwrite protection is robust and mitigates a critical security risk.
- No major bugs found; minor warnings around directory traversal and preserve logic.
- Best practices mostly followed; consider refactoring parameter-heavy functions and improving logging.
- Tests are thorough for new symlink logic.
Overall: This diff improves security and robustness. No critical vulnerabilities found.
🤖 AI Review · openai/gpt-4.1 · 10381 tokens · GitHub Models free tier · 0 premium requests
📝 AI PR SummarySummaryEnhanced the AI review GitHub workflow to support A/B model selection, improved logging with a detailed review timeline, and added robust model catalog validation and retry logic. Changes
Impact
🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests |
🔍 AI Code ReviewChunk 01Security Review🔴 Missing Input Validation: The 🟡 Sensitive Information Exposure: The Bug Review🔵 Potential Division by Zero: In the 🔵 Uninitialized Variable: The Best Practice Review🔵 Code Organization: The workflow file is quite long and complex. Consider breaking it down into smaller, more manageable files or using a more modular approach. 🔵 Error Handling: Although there are some error handling mechanisms in place, it's still possible to improve error handling and logging to make it more robust and informative. 🔵 Variable Naming: Some variable names, such as Recommendations
The changes in the Chunk 02Security Review
Bug Review
Best Practice Review
Additional Comments
Overall, the code changes appear to be well-done, and the added tests provide good coverage for the new functionality. However, it's always a good idea to review the code thoroughly and consider additional testing to ensure the code is robust and secure. Chunk 03Review of the Pull Request DiffSecurity Vulnerabilities
Bugs
Best Practice Violations
Recommendations
🤖 AI Review · meta/llama-4-maverick-17b-128e-instruct-fp8 · 10770 tokens · GitHub Models free tier · 0 premium requests |
Summary
.specify/content by default when runningspecify initin an existing project--forceoverwrite for symlinked.specify--forcepath for intentional reinitialization.github/workflows/ai-review.ymlfor large diffs:review.mdcreation in full-diff mode before posting PR commentsreview_timeline.md) for model attempts, HTTP status, retries, rate-limit headers, and chunk progressCHANGELOG.mdwith the actual PR scope (init+ci)Credit
@DanielHashmi)main-based clean laneWhy this PR
This supersedes the conflicted intake lane in #72 with a clean branch based on
main, somainremains the source of truth and CI/workflow behavior is consistent.Validation
uv run pytest tests/test_specify_detection.py --tb=shortuv run pytest tests/ --tb=shortnpx markdownlint-cli2 README.md CHANGELOG.mdpython3 - <<'PY' ... yaml.safe_load('.github/workflows/ai-review.yml') ... PYuv run specify check