[rust-guard] Rust Guard: Eliminate two avoidable heap allocations in hot paths #5667

@github-actions

🦀 Rust Guard Improvement Report

Improvement 1: Eliminate Vec<&str> in check_file_secrecy

Category: Performance
File(s): guards/github-guard/rust-guard/src/labels/tool_rules.rs
Effort: Small (< 15 min)
Risk: Low

Problem

check_file_secrecy allocates a Vec<&str> via .collect() on every get_file_contents call, even though segments is only used for two iterator operations that can be inlined:

  1. segments.iter().any(|seg| seg.starts_with(*pattern)) → can use path_lower.split('/').any(...)
  2. segments.last().copied().unwrap_or(path_lower.as_str()) → can use path_lower.rsplit('/').next().unwrap_or(&path_lower)

The unwrap_or fallback is also unreachable dead code: str::split always yields at least one element, so last() is always Some.
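The claim about str::split can be checked with the standard library alone. The sketch below is illustrative only: last_segment is a hypothetical helper, not a function from the guard, and the sample paths are made up.

```rust
/// Hypothetical helper: the final '/'-separated segment of `path`.
fn last_segment(path: &str) -> &str {
    // `rsplit` yields at least one item, so the fallback is never taken.
    path.rsplit('/').next().unwrap_or(path)
}

fn main() {
    // Even an empty string yields exactly one (empty) segment.
    assert_eq!("".split('/').count(), 1);
    assert_eq!("".rsplit('/').count(), 1);

    // `rsplit(..).next()` agrees with `split(..).last()` on every input,
    // including paths with a trailing separator.
    for path in ["", "readme.md", "config/.env", "a/b/c/id_rsa", "dir/"] {
        assert_eq!(path.split('/').last(), Some(last_segment(path)));
    }
}
```

Because both iterators always produce at least one item, the fallback passed to unwrap_or is never used with either split or rsplit.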

Before

fn check_file_secrecy(path: &str, ...) -> Vec<String> {
    let path_lower = path.to_lowercase();
    let segments: Vec<&str> = path_lower.split('/').collect();   // ← heap alloc

    for pattern in SENSITIVE_FILE_PATTERNS {
        if path_lower.ends_with(pattern) || segments.iter().any(|seg| seg.starts_with(*pattern)) {
            return policy_private_scope_label(owner, repo, repo_id, ctx);
        }
    }

    let filename = segments.last().copied().unwrap_or(path_lower.as_str()); // ← unwrap_or unreachable
    for keyword in SENSITIVE_FILE_KEYWORDS {
        if filename.contains(keyword) { ... }
    }
    ...
}

After

fn check_file_secrecy(path: &str, ...) -> Vec<String> {
    let path_lower = path.to_lowercase();

    for pattern in SENSITIVE_FILE_PATTERNS {
        if path_lower.ends_with(pattern) || path_lower.split('/').any(|seg| seg.starts_with(*pattern)) {
            return policy_private_scope_label(owner, repo, repo_id, ctx);
        }
    }

    let filename = path_lower.rsplit('/').next().unwrap_or(&path_lower);
    for keyword in SENSITIVE_FILE_KEYWORDS {
        if filename.contains(keyword) { ... }
    }
    ...
}

Why This Matters

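Every get_file_contents tool call flows through check_file_secrecy. The Vec<&str> allocation is unnecessary: iterating the split lazily is semantically identical, at the cost of re-scanning path_lower once per pattern instead of once per call. Note that the After version still carries an unwrap_or fallback. Because str::rsplit, like str::split, always yields at least one element, that fallback is just as unreachable as the original and serves only as a panic-free default.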
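To sanity-check the equivalence, here is a reduced sketch of both variants side by side. It is an assumption-laden simplification: the functions return bool instead of labels, and the pattern and keyword lists are hypothetical stand-ins for the real SENSITIVE_FILE_PATTERNS and SENSITIVE_FILE_KEYWORDS.

```rust
// Hypothetical stand-ins; the real lists live in the guard's source.
const SENSITIVE_FILE_PATTERNS: &[&str] = &[".env", "id_rsa"];
const SENSITIVE_FILE_KEYWORDS: &[&str] = &["secret", "credential"];

/// "Before" shape: collects the segments into a Vec on every call.
fn is_sensitive_collected(path: &str) -> bool {
    let path_lower = path.to_lowercase();
    let segments: Vec<&str> = path_lower.split('/').collect();
    for pattern in SENSITIVE_FILE_PATTERNS {
        if path_lower.ends_with(*pattern) || segments.iter().any(|seg| seg.starts_with(*pattern)) {
            return true;
        }
    }
    let filename = segments.last().copied().unwrap_or(path_lower.as_str());
    SENSITIVE_FILE_KEYWORDS.iter().any(|kw| filename.contains(*kw))
}

/// "After" shape: walks the split lazily, with no intermediate Vec.
fn is_sensitive_lazy(path: &str) -> bool {
    let path_lower = path.to_lowercase();
    for pattern in SENSITIVE_FILE_PATTERNS {
        if path_lower.ends_with(*pattern) || path_lower.split('/').any(|seg| seg.starts_with(*pattern)) {
            return true;
        }
    }
    let filename = path_lower.rsplit('/').next().unwrap_or(&path_lower);
    SENSITIVE_FILE_KEYWORDS.iter().any(|kw| filename.contains(*kw))
}

fn main() {
    let samples = [
        "config/.env",
        ".env.local/readme",
        "docs/readme.md",
        "src/My_Secret.txt",
        "secret/readme.md",
        "",
    ];
    for path in samples {
        // Both variants must agree on every input.
        assert_eq!(is_sensitive_collected(path), is_sensitive_lazy(path));
    }
}
```

The two functions differ only in how they reach the segments, so a property test over arbitrary paths would be a natural follow-up if stronger assurance is wanted.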


Improvement 2: Return Cow<'_, str> from infer_scope_for_baseline

Category: Performance
File(s): guards/github-guard/rust-guard/src/lib.rs
Effort: Small (< 15 min)
Risk: Low

Problem

infer_scope_for_baseline currently returns String. Its most common branch, !repo_id.is_empty(), calls repo_id.to_string(), which copies the borrowed repo_id (backed by the String that extract_repo_info already produced) into a fresh heap allocation. The function is called twice per label_resource / label_response dispatch (lines 669 and 869). Returning Cow<'_, str> lets the common path borrow repo_id instead of copying it; only the search_* query-extraction branch needs an owned String.

Before

fn infer_scope_for_baseline(tool_name: &str, tool_args: &Value, repo_id: &str) -> String {
    if !repo_id.is_empty() {
        return repo_id.to_string();  // ← clones repo_id every call
    }
    match tool_name {
        "dismiss_notification" | ... => scope_names::GITHUB.to_string(),  // ← alloc
        "search_code" | ... => {
            let (_, _, repo_from_query) = extract_repo_info_from_search_query(query);
            repo_from_query  // ← already owned, fine
        }
        _ => String::new(),  // ← no heap allocation (empty String), but still an owned value
    }
}

After

use std::borrow::Cow;

fn infer_scope_for_baseline<'a>(tool_name: &str, tool_args: &Value, repo_id: &'a str) -> Cow<'a, str> {
    if !repo_id.is_empty() {
        return Cow::Borrowed(repo_id);  // ← zero-cost borrow
    }
    match tool_name {
        "dismiss_notification" | ... => Cow::Borrowed(scope_names::GITHUB),
        "search_code" | ... => {
            let (_, _, repo_from_query) = extract_repo_info_from_search_query(query);
            Cow::Owned(repo_from_query)
        }
        _ => Cow::Borrowed(""),
    }
}

Call sites already take &baseline_scope / &*baseline_scope. Both forms work with Cow<str> through its Deref<Target = str> impl wherever a &str is expected, so no changes should be needed at the call sites.
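The call-site behaviour can be illustrated with a reduced sketch. Everything here is a simplified assumption: infer_scope drops the tool_args parameter, GITHUB_SCOPE stands in for scope_names::GITHUB, the search branch fabricates a placeholder string instead of parsing a query, and scope_len is a hypothetical consumer that takes &str.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for scope_names::GITHUB.
const GITHUB_SCOPE: &str = "github";

/// Simplified shape of the Cow-returning function (no tool_args, no query parsing).
fn infer_scope<'a>(tool_name: &str, repo_id: &'a str) -> Cow<'a, str> {
    if !repo_id.is_empty() {
        return Cow::Borrowed(repo_id); // common path: no copy
    }
    match tool_name {
        "dismiss_notification" => Cow::Borrowed(GITHUB_SCOPE),
        // Placeholder for the query-extraction branch, which yields an owned String.
        "search_code" => Cow::Owned(format!("{}/{}", "owner", "repo")),
        _ => Cow::Borrowed(""),
    }
}

/// Hypothetical call site that expects a plain &str.
fn scope_len(scope: &str) -> usize {
    scope.len()
}

fn main() {
    let repo_id = String::from("octo/hello");
    let scope = infer_scope("get_file_contents", &repo_id);

    // The common path borrows: the Cow points at the caller's buffer.
    assert!(matches!(scope, Cow::Borrowed(_)));
    assert_eq!(scope.as_ptr(), repo_id.as_ptr());

    // Both call-site spellings compile unchanged.
    assert_eq!(scope_len(&scope), 10); // &Cow<str> deref-coerces to &str
    assert_eq!(scope_len(&*scope), 10); // explicit deref

    // Only the search branch produces an owned value.
    assert!(matches!(infer_scope("search_code", ""), Cow::Owned(_)));
}
```

Deref coercion applies only where the expected type is already known to be &str. A call site that passes the value to a generic parameter may need the explicit &*baseline_scope or .as_ref() form.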

Why This Matters

infer_scope_for_baseline is called on every label_resource and label_response invocation. The repo_id.to_string() clone is pure overhead — the result is immediately used as a &str reference. The Cow return type expresses the ownership accurately and eliminates one String heap allocation in the common case.


Codebase Health Summary

  • Total Rust files: 10
  • Total lines: 13,775
  • Areas analyzed: lib.rs, tool_rules.rs, tools.rs, backend.rs, helpers.rs, mod.rs, response_paths.rs, response_items.rs, constants.rs
  • Areas with no further improvements: tools.rs (well-tested, constants already extracted), constants.rs (clean)

Generated by Rust Guard Improver • Run: §25853609445

  • expires on May 21, 2026, 9:58 AM UTC
