Add repository package discovery script#89
Conversation
Docker-based tool that queries pgEdge DNF and APT repositories, diffs discovered packages against catalog.json, and interactively walks through adding new packages to the catalog. Features: - Queries both pgedge and pgedge-noarch EL repos + DEB repo - Shell glob exclusion patterns for non-user-facing packages - PG-version detection (standalone vs versioned) - Meta-package membership from DEB deps (authoritative source) - Interactive walkthrough with category/name/description prompts - Self-test suite (5 checks, no Docker required) - Flags: --discover-only, --verbose, --el-only, --deb-only Run: ./scripts/discover-repo-packages.sh --discover-only Test: ./scripts/discover-repo-packages.sh --self-test
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
Deploying pgedge-docs with
|
| Latest commit: |
df6ba61
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://c3f13c5a.pgedge-docs.pages.dev |
| Branch Preview URL: | https://feat-repo-discovery-script.pgedge-docs.pages.dev |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
scripts/discover-repo-packages.sh (3)
1027-1027: Stale TODO marker — confirm intent before merging.
# Additional tests added in later tasksreads like a hand-off note from PR drafting. If no follow-up tests are planned for this PR, remove it; otherwise capture the work in a tracking issue so it doesn't linger.Would you like me to open a follow-up issue describing the remaining self-test coverage (e.g., catalog version-expansion, exclusion-glob edge cases) so this comment can be removed?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/discover-repo-packages.sh` at line 1027, The inline TODO comment "# Additional tests added in later tasks" is stale—either remove it or replace it with a tracked work item; confirm whether more tests will be added for self-test coverage (catalog version-expansion, exclusion-glob edge cases) and if not delete the line from scripts/discover-repo-packages.sh, otherwise create a follow-up issue and replace the comment with the issue number or URL so the intent is captured and won't linger; look for the exact string "# Additional tests added in later tasks" to locate and update the comment.
668-669: 💤 Low valueSC2188: use
: > fileinstead of a bare redirection.
> "$file"with no command is a quirk that works in bash but is flagged by shellcheck and isn't portable to other POSIX shells. Replace with the standard: > "$file"idiom in bothdiscover_elanddiscover_deb.♻️ Proposed fix (apply at both locations)
- > "$output_dir/rocky9-packages.txt" - > "$output_dir/rocky9-metapkg-deps.txt" + : > "$output_dir/rocky9-packages.txt" + : > "$output_dir/rocky9-metapkg-deps.txt"- > "$output_dir/debian12-packages.txt" - > "$output_dir/debian12-metapkg-deps.txt" + : > "$output_dir/debian12-packages.txt" + : > "$output_dir/debian12-metapkg-deps.txt"Also applies to: 717-718
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/discover-repo-packages.sh` around lines 668 - 669, The script uses bare redirections like > "$output_dir/rocky9-packages.txt" (and similarly for "$output_dir/rocky9-metapkg-deps.txt") inside the discover_el and discover_deb logic, which ShellCheck SC2188 flags as non-portable; replace each bare redirection with the portable no-op truncation form : > "$output_dir/rocky9-packages.txt" and : > "$output_dir/rocky9-metapkg-deps.txt" (and the analogous two occurrences around lines noted for discover_deb) so files are created/truncated portably.
458-458: ⚡ Quick winHardcoded
18will silently regress when a new PG major ships.
detect_pg_versionsalready enumerates("16" "17" "18")(lines 89, 160), the metapkgrepoquery/apt-cache dependscalls target…_18/…-18directly, and the{ver}expansion ininteractive_walkthroughuses18as the canonical resolution. The moment PG 19 lands, all three places will need to be updated in lockstep or membership detection will quietly miss the new variants.Centralize this — for example a single
DEFAULT_PG_VERSIONconstant near the top (sourced fromdefault_pg_versionin catalog.json if available), and reference it indiscover_el,discover_deb, andinteractive_walkthrough.Also applies to: 660-662, 709-711
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/discover-repo-packages.sh` at line 458, Replace the hardcoded "18" by introducing a single DEFAULT_PG_VERSION variable near the top (set it from catalog.json's default_pg_version if present, otherwise derive the latest entry from detect_pg_versions), then update usage sites: change the rpm_name expansion that sets check_name (where rpm_name//\{ver\}/18), the metapkg repoquery/apt-cache depends constructions in discover_el and discover_deb, and the {ver} resolution in interactive_walkthrough to use ${DEFAULT_PG_VERSION}; keep detect_pg_versions unchanged as the authoritative list but reference DEFAULT_PG_VERSION for canonical resolution so new PG majors don’t require multiple edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/discover-repo-packages.sh`:
- Around line 202-211: The pkg_in_metapkg function (and the same loose check
used in interactive_walkthrough for in_full/in_minimal classification) currently
treats any dependency containing $pkg as a match, causing false positives;
change the matching logic to only accept exact tokens or clearly delimited
variants (e.g., match "$dep" == "$pkg" or "$dep" == "${pkg}_*" or split dep
tokens and compare each token exactly) so versioned or suffixed names like
pkg_... are matched intentionally and not by arbitrary substring containment;
update pkg_in_metapkg and the corresponding checks in interactive_walkthrough to
use this stricter comparison.
- Around line 53-58: The load_exclusions loop currently uses line="${line// /}"
which removes all internal spaces and ignores tabs, breaking patterns with
leading tabs; change the trimming logic in load_exclusions so you only strip
leading and trailing whitespace (spaces and tabs) and preserve internal spaces
when populating EXCLUSION_PATTERNS. Replace the single global-space removal with
a two-step trim using POSIX parameter expansion (remove leading whitespace, then
trailing whitespace) before the empty-check and appending to EXCLUSION_PATTERNS;
keep the existing comment-stripping step (line="${line%%#*}") and the rest of
the loop unchanged so is_excluded can match correctly.
---
Nitpick comments:
In `@scripts/discover-repo-packages.sh`:
- Line 1027: The inline TODO comment "# Additional tests added in later tasks"
is stale—either remove it or replace it with a tracked work item; confirm
whether more tests will be added for self-test coverage (catalog
version-expansion, exclusion-glob edge cases) and if not delete the line from
scripts/discover-repo-packages.sh, otherwise create a follow-up issue and
replace the comment with the issue number or URL so the intent is captured and
won't linger; look for the exact string "# Additional tests added in later
tasks" to locate and update the comment.
- Around line 668-669: The script uses bare redirections like >
"$output_dir/rocky9-packages.txt" (and similarly for
"$output_dir/rocky9-metapkg-deps.txt") inside the discover_el and discover_deb
logic, which ShellCheck SC2188 flags as non-portable; replace each bare
redirection with the portable no-op truncation form : >
"$output_dir/rocky9-packages.txt" and : > "$output_dir/rocky9-metapkg-deps.txt"
(and the analogous two occurrences around lines noted for discover_deb) so files
are created/truncated portably.
- Line 458: Replace the hardcoded "18" by introducing a single
DEFAULT_PG_VERSION variable near the top (set it from catalog.json's
default_pg_version if present, otherwise derive the latest entry from
detect_pg_versions), then update usage sites: change the rpm_name expansion that
sets check_name (where rpm_name//\{ver\}/18), the metapkg repoquery/apt-cache
depends constructions in discover_el and discover_deb, and the {ver} resolution
in interactive_walkthrough to use ${DEFAULT_PG_VERSION}; keep detect_pg_versions
unchanged as the authoritative list but reference DEFAULT_PG_VERSION for
canonical resolution so new PG majors don’t require multiple edits.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: aac0f51a-6f47-41ac-9b44-7a05e09693fa
📒 Files selected for processing (10)
.gitignorescripts/discover-repo-packages.shscripts/excluded-packages.txtscripts/test-fixtures/catalog-test.jsonscripts/test-fixtures/deb-metapkg-deps.txtscripts/test-fixtures/deb-packages.txtscripts/test-fixtures/el-metapkg-deps.txtscripts/test-fixtures/el-packages.txtscripts/test-fixtures/excluded-test.txtscripts/test-fixtures/expected-output.json
| while IFS= read -r line; do | ||
| line="${line%%#*}" | ||
| line="${line// /}" | ||
| [[ -z "$line" ]] && continue | ||
| EXCLUSION_PATTERNS+=("$line") | ||
| done < "$file" |
There was a problem hiding this comment.
load_exclusions strips all internal spaces and leaves tabs untouched.
line="${line// /}" removes every space character anywhere in the pattern (not just leading/trailing) and does nothing about tabs. Any tab-indented entry in excluded-packages.txt keeps its leading \t, so is_excluded will never match (pgedge-radar ≠ \tpgedge-radar*). Use a proper trim that handles spaces and tabs on both ends only.
🛠 Proposed fix
while IFS= read -r line; do
line="${line%%#*}"
- line="${line// /}"
+ # Trim leading/trailing whitespace (spaces and tabs) only.
+ line="${line#"${line%%[![:space:]]*}"}"
+ line="${line%"${line##*[![:space:]]}"}"
[[ -z "$line" ]] && continue
EXCLUSION_PATTERNS+=("$line")
done < "$file"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| while IFS= read -r line; do | |
| line="${line%%#*}" | |
| line="${line// /}" | |
| [[ -z "$line" ]] && continue | |
| EXCLUSION_PATTERNS+=("$line") | |
| done < "$file" | |
| while IFS= read -r line; do | |
| line="${line%%#*}" | |
| # Trim leading/trailing whitespace (spaces and tabs) only. | |
| line="${line#"${line%%[![:space:]]*}"}" | |
| line="${line%"${line##*[![:space:]]}"}" | |
| [[ -z "$line" ]] && continue | |
| EXCLUSION_PATTERNS+=("$line") | |
| done < "$file" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/discover-repo-packages.sh` around lines 53 - 58, The load_exclusions
loop currently uses line="${line// /}" which removes all internal spaces and
ignores tabs, breaking patterns with leading tabs; change the trimming logic in
load_exclusions so you only strip leading and trailing whitespace (spaces and
tabs) and preserve internal spaces when populating EXCLUSION_PATTERNS. Replace
the single global-space removal with a two-step trim using POSIX parameter
expansion (remove leading whitespace, then trailing whitespace) before the
empty-check and appending to EXCLUSION_PATTERNS; keep the existing
comment-stripping step (line="${line%%#*}") and the rest of the loop unchanged
so is_excluded can match correctly.
| pkg_in_metapkg() { | ||
| local pkg="$1" | ||
| local -n arr="$2" | ||
| for dep in "${arr[@]}"; do | ||
| if [[ "$dep" == "$pkg" || "$dep" == *"$pkg"* ]]; then | ||
| return 0 | ||
| fi | ||
| done | ||
| return 1 | ||
| } |
There was a problem hiding this comment.
Substring fallback in pkg_in_metapkg produces false positives.
[[ "$dep" == *"$pkg"* ]] matches any dependency whose name contains $pkg as a substring — e.g. looking up pgedge-spock will match pgedge-spock50_18, and looking up pgedge-pg will match nearly everything. The same loose check is reused in interactive_walkthrough (lines 463–476) for in_full/in_minimal classification, so newly discovered packages can be silently mis-classified as included in full/minimal.
Consider matching exact tokens, or constrain the substring check to a well-defined prefix/suffix (e.g. "$pkg"_* for versioned variants).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/discover-repo-packages.sh` around lines 202 - 211, The pkg_in_metapkg
function (and the same loose check used in interactive_walkthrough for
in_full/in_minimal classification) currently treats any dependency containing
$pkg as a match, causing false positives; change the matching logic to only
accept exact tokens or clearly delimited variants (e.g., match "$dep" == "$pkg"
or "$dep" == "${pkg}_*" or split dep tokens and compare each token exactly) so
versioned or suffixed names like pkg_... are matched intentionally and not by
arbitrary substring containment; update pkg_in_metapkg and the corresponding
checks in interactive_walkthrough to use this stricter comparison.
Summary
catalog.jsonand reports NEW, MATCH, MISSING, EXCLUDEDUsage
Test plan
./scripts/discover-repo-packages.sh --self-test— all 5 checks pass./scripts/discover-repo-packages.sh --discover-only— exits 0, shows MATCH/EXCLUDED with no unexpected NEW./scripts/discover-repo-packages.sh --discover-only --verbose— excluded packages listed individuallyscripts/output/is gitignored and not committedSummary by CodeRabbit