Conversation
- Add tree-sitter-c-sharp dependency to pyproject.toml
- Create CSharpAnalyzer in api/analyzers/csharp/ supporting: classes, interfaces, enums, structs, methods, constructors
- Register .cs extension in source_analyzer.py with LSP config
- Add C# test source file and unit test
- Update README to list C# as supported language

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
📝 Walkthrough

Adds comprehensive C# support: a new Tree-sitter–based CSharpAnalyzer.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User
    participant SourceAnalyzer as SourceAnalyzer
    participant CSharpAnalyzer as CSharpAnalyzer
    participant TreeSitter as TreeSitter
    participant LanguageServer as LanguageServer
    participant Graph as Graph
    User->>SourceAnalyzer: analyze_sources(path)
    SourceAnalyzer->>SourceAnalyzer: collect *.cs files
    SourceAnalyzer->>LanguageServer: init/start C# LS (MultilspyConfig)
    SourceAnalyzer->>CSharpAnalyzer: analyze(file)
    CSharpAnalyzer->>TreeSitter: parse file -> AST
    TreeSitter-->>CSharpAnalyzer: AST nodes
    CSharpAnalyzer->>CSharpAnalyzer: extract entities (types, methods, docs)
    CSharpAnalyzer->>LanguageServer: resolve_type / resolve_method
    LanguageServer-->>CSharpAnalyzer: resolved symbols
    CSharpAnalyzer->>SourceAnalyzer: enriched entities
    SourceAnalyzer->>Graph: create nodes & relationships
    Graph-->>User: analysis results available
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
```python
else:
    lsps[".py"] = NullLanguageServer()
with lsps[".java"].start_server(), lsps[".py"].start_server():
if any(path.rglob('*.cs')):
```
Check failure: Code scanning / CodeQL: "Uncontrolled data used in path expression" (High)

Copilot Autofix (AI, 11 days ago)
In general, the fix is to constrain and validate user‑supplied paths before using them to access the filesystem. A standard pattern is to define a trusted base directory (root for all analyses), normalize the user path with Path.resolve() (or os.path.realpath) and then check that the resulting absolute path is inside this base directory. If the check fails, reject the request. This prevents directory traversal and arbitrary path access while preserving existing functionality for legitimate subdirectories.
For this codebase, the best minimal fix (without changing the core analyzer logic) is:
- In the HTTP handlers where a `path` is received (e.g., `analyze_folder` in `tests/index.py` and any similar route in `api/index.py` if they exist), normalize the path to an absolute `Path`, and ensure it lies within a configured analysis root directory. The root could be taken from an environment variable such as `CODE_GRAPH_ROOT`, or default to the current working directory if not set.
- After validation, pass only the resolved, validated path string to `SourceAnalyzer.analyze_local_folder`. That way, inside `SourceAnalyzer`, all further use of `path` ultimately derives from a vetted path.
- Optionally, add a small helper function in the entrypoint file (`tests/index.py`) to perform this validation, re-using it for any other endpoints that may accept paths.
We do not need to change api/analyzers/source_analyzer.py itself, because it already calls path.resolve() in analyze_sources, and the primary issue is that its initial path argument may refer to an arbitrary location; constraining that at the boundary (the HTTP layer) is sufficient.
Concretely, in tests/index.py around the analyze_folder endpoint, we will:
- `Path` from `pathlib` is already imported.
- `os` is already imported.
- Add logic after reading `path` from the request:
  - Get a base directory from an environment variable, e.g. `ANALYSIS_ROOT = os.environ.get("CODE_GRAPH_ROOT", os.getcwd())` near the top of the file.
  - Resolve both the base and the requested path: `base_path = Path(ANALYSIS_ROOT).resolve()`, `requested_path = Path(path).resolve()`.
  - Check `requested_path.is_dir()` instead of `os.path.isdir(path)`.
  - Ensure `requested_path` is under `base_path`, e.g.: `try: requested_path.relative_to(base_path) except ValueError: # not under base`
  - If any of these checks fail, return a 400 error.
- Pass `str(requested_path)` (or `requested_path` converted to string) into `analyzer.analyze_local_folder`.
No new third‑party dependencies are required; we only use Python’s standard library (os, pathlib).
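The boundary check described above can be sketched as a small standalone helper. This is a minimal sketch, not code from the PR: the `CODE_GRAPH_ROOT` variable comes from the suggestion, while the helper name `validate_request_path` and its None-on-failure contract are illustrative assumptions.

```python
import os
from pathlib import Path
from typing import Optional

# Base directory for analysis; CODE_GRAPH_ROOT is the env var the suggestion
# proposes, defaulting to the current working directory.
ANALYSIS_ROOT = Path(os.environ.get("CODE_GRAPH_ROOT", os.getcwd())).resolve()

def validate_request_path(path: str) -> Optional[Path]:
    """Resolve a user-supplied path and ensure it stays inside ANALYSIS_ROOT.

    Returns the resolved Path on success, or None if the path cannot be
    resolved or escapes the analysis root (a caller would map None to a 400).
    Directory-existence checks are left to the caller.
    """
    try:
        requested = Path(path).resolve()
    except (OSError, RuntimeError):
        return None
    try:
        # relative_to raises ValueError when requested is not under the root.
        requested.relative_to(ANALYSIS_ROOT)
    except ValueError:
        return None
    return requested
```

A handler would call this once at the boundary and pass `str(validated)` onward, so everything downstream operates on a vetted path.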
```diff
@@ -17,6 +17,9 @@
                     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
 logger = logging.getLogger(__name__)
 
+# Base directory for analyzing local source code; defaults to current working directory.
+ANALYSIS_ROOT = Path(os.environ.get("CODE_GRAPH_ROOT", os.getcwd())).resolve()
+
 # Function to verify the token
 SECRET_TOKEN = os.getenv('SECRET_TOKEN')
 def verify_token(token):
@@ -346,24 +349,37 @@
         logging.error("'path' is missing from the request.")
         return jsonify({"status": "'path' is required."}), 400
 
+    try:
+        requested_path = Path(path).resolve()
+    except (OSError, RuntimeError) as e:
+        logging.error(f"Failed to resolve path '{path}': {e}")
+        return jsonify({"status": "Invalid path: could not be resolved"}), 400
+
     # Validate path exists and is a directory
-    if not os.path.isdir(path):
-        logging.error(f"Path '{path}' does not exist or is not a directory")
+    if not requested_path.is_dir():
+        logging.error(f"Path '{requested_path}' does not exist or is not a directory")
         return jsonify({"status": "Invalid path: must be an existing directory"}), 400
 
+    # Ensure the requested path is within the configured analysis root
+    try:
+        requested_path.relative_to(ANALYSIS_ROOT)
+    except ValueError:
+        logging.error(f"Path '{requested_path}' is outside of the allowed analysis root '{ANALYSIS_ROOT}'")
+        return jsonify({"status": "Invalid path: outside of allowed analysis root"}), 400
+
     # Validate ignore list contains valid paths
     if not isinstance(ignore, list):
         logging.error("'ignore' must be a list of paths")
         return jsonify({"status": "'ignore' must be a list of paths"}), 400
 
-    proj_name = Path(path).name
+    proj_name = requested_path.name
 
     # Initialize the graph with the provided project name
     g = Graph(proj_name)
 
     # Analyze source code within given folder
     analyzer = SourceAnalyzer()
-    analyzer.analyze_local_folder(path, g, ignore)
+    analyzer.analyze_local_folder(str(requested_path), g, ignore)
 
     # Return response
     response = {
```
```diff
 def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
-    files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
+    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
```
Check failure: Code scanning / CodeQL: "Uncontrolled data used in path expression" (High)

Copilot Autofix (AI, 11 days ago)
In general, the fix is to constrain user-controlled paths to a trusted root and validate them before using them to access the filesystem. The common pattern: define a base directory (e.g., a workspace root or repositories directory), normalize both the base and the requested path, ensure the requested path is within the base (e.g., via Path.is_relative_to or an equivalent), and only then perform operations such as rglob. If validation fails, the code should raise an error or return an HTTP 400/403.
In this codebase, the minimal, self-contained fix is to harden SourceAnalyzer.analyze_local_folder, which is the point where untrusted path: str enters the analyzer domain. We can normalize the user path with Path(path).resolve(), and then check that it is inside a configured allowed root directory. Since we must not change imports outside the snippet and should avoid assuming additional configuration wiring, a pragmatic approach is:
- Define an allowed base directory from an environment variable (e.g., `CODE_GRAPH_ALLOWED_ROOT`), defaulting to the current working directory (`Path.cwd()`), which is a safe and reasonable default.
- Resolve both the base directory and the requested directory.
- Check that `resolved_path` is under `base_dir` using `Path.is_relative_to` when available, or a `relative_to` try/except fallback for older Python versions.
- If the check fails, log an error and raise `ValueError` (which the caller can surface as an error response).
- Pass the validated `resolved_path` into `analyze_sources` instead of the raw `path` string.
This keeps existing functionality intact for callers that already provide paths within the working directory (or configured root), addresses all CodeQL variants that involve path.rglob, and centralizes path validation in one method.
Concretely, in api/analyzers/source_analyzer.py:
- Add an `import os` (a well-known standard library) to read an environment variable.
- In `analyze_local_folder`, before logging and calling `analyze_sources`, compute `base_dir = Path(os.environ.get("CODE_GRAPH_ALLOWED_ROOT", Path.cwd()))`, `resolved_base = base_dir.resolve()`, and `resolved_path = Path(path).resolve()`, and enforce that `resolved_path` is within `resolved_base`. If not, log and raise.
- Call `self.analyze_sources(resolved_path, ignore, g)` instead of passing `Path(path)`.
No changes are needed in analyze_sources itself, since it already calls path.resolve(); the new validation guarantees that path there is safe.
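The `is_relative_to` / `relative_to` fallback that this autofix relies on can be isolated into a tiny helper for clarity. A sketch for illustration only; the name `is_within` is an assumption, not from the PR:

```python
from pathlib import Path

def is_within(child: Path, base: Path) -> bool:
    """True if `child` resolves to a location inside (or equal to) `base`.

    Uses Path.is_relative_to on Python 3.9+, and falls back to catching the
    ValueError raised by Path.relative_to on older versions.
    """
    child = child.resolve()
    base = base.resolve()
    if hasattr(child, "is_relative_to"):  # Python 3.9+
        return child.is_relative_to(base)
    try:
        child.relative_to(base)  # raises ValueError when child is outside base
        return True
    except ValueError:
        return False
```

Resolving both sides first matters: it collapses `..` components and follows symlinks, so a crafted path like `root/a/../../etc` cannot pass the prefix check.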
```diff
@@ -1,6 +1,7 @@
 from contextlib import nullcontext
 from pathlib import Path
 from typing import Optional
+import os
 
 from api.entities.entity import Entity
 from api.entities.file import File
@@ -190,10 +191,38 @@
             ignore (List(str)): List of paths to skip
         """
 
-        logging.info(f"Analyzing local folder {path}")
+        # Determine the allowed root directory for analysis.
+        # Defaults to the current working directory if CODE_GRAPH_ALLOWED_ROOT is not set.
+        allowed_root_env = os.environ.get("CODE_GRAPH_ALLOWED_ROOT")
+        if allowed_root_env:
+            base_dir = Path(allowed_root_env)
+        else:
+            base_dir = Path.cwd()
+
+        resolved_base = base_dir.resolve()
+        resolved_path = Path(path).resolve()
+
+        # Ensure that the requested path is within the allowed root directory.
+        try:
+            # Python 3.9+: use is_relative_to if available
+            is_relative = getattr(resolved_path, "is_relative_to", None)
+            if callable(is_relative):
+                allowed = resolved_path.is_relative_to(resolved_base)
+            else:
+                # Fallback for older Python: use relative_to in a try/except.
+                resolved_path.relative_to(resolved_base)
+                allowed = True
+        except ValueError:
+            allowed = False
+
+        if not allowed:
+            logging.error(f"Requested path '{resolved_path}' is outside the allowed root '{resolved_base}'")
+            raise ValueError("Requested path is outside the allowed analysis root")
+
+        logging.info(f"Analyzing local folder {resolved_path}")
 
         # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        self.analyze_sources(resolved_path, ignore, g)
 
         logging.info("Done analyzing path")
```

```diff
 def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
-    files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
+    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
```
Check failure: Code scanning / CodeQL: "Uncontrolled data used in path expression" (High)

Copilot Autofix (AI, 11 days ago)
General approach: treat the incoming path as untrusted and restrict directory traversal to a configured “safe root” on the server. Normalize the user-supplied path relative to that root, ensure it stays inside the root after normalization, and then operate only on the safe resolved path. This mitigates directory traversal and arbitrary filesystem access while preserving functionality for intended directories within the allowed root.
Concrete fix in this codebase:
- In `SourceAnalyzer.analyze_local_folder`, convert the incoming `path` string to a `Path`, resolve it, and ensure it is within an allowed base directory. To avoid assumptions about the environment, a conservative default is to use the current working directory (`Path.cwd()`) as the root for allowed analysis (you can later tighten this by making it configurable). Use `Path.resolve()` and `Path.is_relative_to()` (Python 3.9+) to perform this check safely.
- If the provided directory is outside the allowed root, log an error and either raise an exception or return early. Since this is library-style code used from the Flask handlers, raising a `ValueError` is reasonable; the caller can catch it and turn it into an HTTP error if desired.
- Use the validated, resolved path when calling `analyze_sources` instead of the raw string.
These changes are all in api/analyzers/source_analyzer.py, around the analyze_local_folder method, plus a small helper to support Python versions without Path.is_relative_to (if needed). We do not need to modify imports; we already have Path and logging.
```diff
@@ -190,11 +190,33 @@
             ignore (List(str)): List of paths to skip
         """
 
-        logging.info(f"Analyzing local folder {path}")
+        # Define a safe root directory for analysis (current working directory by default)
+        safe_root = Path.cwd().resolve()
 
-        # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        # Resolve the user-provided path
+        requested_path = Path(path).resolve()
+
+        # Ensure the requested path is within the safe root directory
+        try:
+            # Python 3.9+ provides Path.is_relative_to
+            is_within_root = requested_path.is_relative_to(safe_root)
+        except AttributeError:
+            # Fallback for Python versions without is_relative_to
+            try:
+                requested_path.relative_to(safe_root)
+                is_within_root = True
+            except ValueError:
+                is_within_root = False
+
+        if not is_within_root:
+            logging.error(f"Refusing to analyze path outside of safe root. Requested: {requested_path}, Safe root: {safe_root}")
+            raise ValueError("Invalid path: outside of allowed analysis root")
+
+        logging.info(f"Analyzing local folder {requested_path}")
+
+        # Analyze source files using the validated, resolved path
+        self.analyze_sources(requested_path, ignore, g)
 
         logging.info("Done analyzing path")
 
     def analyze_local_repository(self, path: str, ignore: Optional[list[str]] = None) -> Graph:
```
```diff
 def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
-    files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
+    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
```
Check failure: Code scanning / CodeQL: "Uncontrolled data used in path expression" (High)

Copilot Autofix (AI, 11 days ago)
At a high level, the fix is to constrain and validate any user-provided filesystem path before using it with Path.resolve() or Path.rglob, by ensuring it is contained within a configured safe root directory. This typically involves: (1) defining a base directory (e.g., from configuration or environment), (2) resolving both the base and the requested path, and (3) verifying that the resolved requested path is a descendant of the base. If the check fails, the function should abort with an error instead of proceeding.
In this codebase, the main data-flow issue occurs where SourceAnalyzer.analyze_local_folder accepts an arbitrary path: str and passes Path(path) into analyze_sources, which calls path.resolve() and uses rglob. To fix this without changing existing functionality too much, we can normalize and validate the path inside analyze_local_folder before any filesystem operations: compute safe_root = Path.cwd().resolve() (or a more appropriate root if available) and target_path = Path(path).resolve(), then check that safe_root is a parent of target_path (e.g., target_path.is_relative_to(safe_root) in Python 3.9+, or a prefix comparison using relative_to). If the check fails, log an error and raise an exception, causing the caller (Flask endpoint) to return an error response instead of walking that path. We should then pass target_path (already resolved) into analyze_sources, which still calls path.resolve() but now on a value we’ve validated. Similarly, analyze_local_repository should validate its path argument in the same way before calling analyze_local_folder and Repository(path). Since we must not change imports beyond standard library, we’ll only rely on pathlib.Path and logging, both already imported.
Concretely:

- In `SourceAnalyzer.analyze_local_folder`, replace the direct use of `Path(path)` with:
  - Resolve and validate `path` against a safe root (here we'll use the current working directory as the base; that's a safe default without additional project context).
  - Log and raise `ValueError` if `path` is outside this root.
  - Pass the validated `Path` to `analyze_sources`.
- In `SourceAnalyzer.analyze_local_repository`, perform the same resolution/validation before calling `analyze_local_folder` and before constructing `Repository(path)`. Use the resolved safe path string when opening the repository.
- No new helper methods or imports are strictly necessary; we can implement the logic inline using `Path.resolve` and `Path.relative_to` with a `try/except` to simulate `is_relative_to` for broader compatibility.
This confines all subsequent filesystem traversals and repository access to a single safe root, addressing all alert variants that go through analyze_local_folder/analyze_sources and analyze_local_repository.
```diff
@@ -192,9 +192,19 @@
 
         logging.info(f"Analyzing local folder {path}")
 
-        # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        # Normalize and validate the provided path against a safe root (current working directory)
+        safe_root = Path.cwd().resolve()
+        target_path = Path(path).resolve()
+        try:
+            # This will raise a ValueError if target_path is not within safe_root
+            target_path.relative_to(safe_root)
+        except ValueError:
+            logging.error(f"Refusing to analyze path outside of safe root. safe_root={safe_root}, requested={target_path}")
+            raise ValueError("Invalid path: must be within the server's working directory")
+
+        # Analyze source files using the validated, normalized path
+        self.analyze_sources(target_path, ignore, g)
 
         logging.info("Done analyzing path")
 
     def analyze_local_repository(self, path: str, ignore: Optional[list[str]] = None) -> Graph:
@@ -210,10 +219,19 @@
         """
         from pygit2.repository import Repository
 
-        self.analyze_local_folder(path, ignore)
+        # Normalize and validate repository path against the same safe root
+        safe_root = Path.cwd().resolve()
+        repo_path = Path(path).resolve()
+        try:
+            repo_path.relative_to(safe_root)
+        except ValueError:
+            logging.error(f"Refusing to analyze repository outside of safe root. safe_root={safe_root}, requested={repo_path}")
+            raise ValueError("Invalid repository path: must be within the server's working directory")
+
+        self.analyze_local_folder(str(repo_path), ignore)
 
         # Save processed commit hash to the DB
-        repo = Repository(path)
+        repo = Repository(str(repo_path))
         head = repo.commit("HEAD")
         self.graph.set_graph_commit(head.short_id)
```
- Remove unused 'import os' from csharp/analyzer.py
- Replace 'from ...entities import *' with explicit imports
- Remove unused 'from pathlib import Path' from test_csharp_analyzer.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Actionable comments posted: 5
🧹 Nitpick comments (3)
api/analyzers/csharp/analyzer.py (1)
Lines 114-122: `else` branch in `resolve_method` is a no-op — dead code.

When `func_node` is neither `member_access_expression` nor `identifier`, the `else` branch reassigns `func_node` to `node.child_by_field_name('function')` — the exact value `func_node` was just set to on line 114. The assignment is redundant and the branch can be removed.

♻️ Proposed fix

```diff
 func_node = node.child_by_field_name('function')
 if func_node and func_node.type == 'member_access_expression':
     func_node = func_node.child_by_field_name('name')
 elif func_node and func_node.type == 'identifier':
     pass
-else:
-    func_node = node.child_by_field_name('function')
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/csharp/analyzer.py` around lines 114-122: in resolve_method, the else branch that reassigns func_node to node.child_by_field_name('function') is redundant because func_node was already set to that; remove that else branch and its assignment so the logic becomes: get func_node = node.child_by_field_name('function'), if its type is 'member_access_expression' replace it with its child 'name', elif it's 'identifier' leave it, otherwise do nothing; retain the subsequent "if func_node: node = func_node" behavior (references: resolve_method, func_node, node.child_by_field_name('function'), member_access_expression, identifier).

tests/test_csharp_analyzer.py (2)
Lines 52-60: Consider asserting the `IMPLEMENTS` relationship between `ConsoleLogger` and `ILogger`.

The test covers `DEFINES` and entity detection but omits verifying the core `IMPLEMENTS`/`EXTENDS` relationship resolution — the primary value of `add_symbols` and `resolve_symbol`. A CI-green test that doesn't cover this could mask LSP resolution failures silently.

✅ Suggested addition

```python
# Verify ConsoleLogger implements ILogger
q = "MATCH (c:Class {name: 'ConsoleLogger'})-[:IMPLEMENTS]->(i:Interface {name: 'ILogger'}) RETURN c, i LIMIT 1"
res = g._query(q).result_set
self.assertEqual(len(res), 1, "ConsoleLogger should implement ILogger")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_csharp_analyzer.py` around lines 52 - 60, Add an assertion that verifies the IMPLEMENTS relationship between ConsoleLogger and ILogger: run a Cypher query similar to the existing checks (e.g., use g._query with "MATCH (c:Class {name: 'ConsoleLogger'})-[:IMPLEMENTS]->(i:Interface {name: 'ILogger'}) RETURN c, i LIMIT 1") and assert the query returns the expected row (e.g., assert len(result_set) == 1 or that result_set[0] exists) so the test ensures resolve_symbol/add_symbols correctly records the implements relationship for ConsoleLogger -> ILogger.
Lines 8-24: Add teardown to clean up the "csharp" graph after the test.

`Graph("csharp")` is created but never cleaned up. If the underlying FalkorDB instance persists data across runs, a subsequent test execution will find pre-populated entities and the assertions may pass vacuously even if the analyzer is broken.

♻️ Suggested fix

```diff
 class Test_CSharp_Analyzer(unittest.TestCase):
+    def tearDown(self):
+        try:
+            Graph("csharp").delete()
+        except Exception:
+            pass
+
     def test_analyzer(self):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_csharp_analyzer.py` around lines 8 - 24, Add a tearDown to Test_CSharp_Analyzer that cleans up the "csharp" graph after the test: implement def tearDown(self): g = Graph("csharp"); call the Graph cleanup method provided by your Graph API (e.g., g.delete(), g.clear(), or g.delete_graph()) to remove all entities so the graph is empty between test runs; ensure the cleanup runs regardless of test outcome so analyzer.analyze_local_folder(path, g) does not leave persistent state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/csharp/analyzer.py`:
- Line 6: Replace the wildcard import in analyzer.py with explicit imports of
the entities you actually use (e.g., Entity, File, Symbol, Reference, Location —
list the exact class/variable names referenced in this module) to resolve Ruff
F403/F405; update the top import line from "from ...entities import *" to "from
...entities import Entity, File, Symbol, ..." matching the symbols used by
functions/methods in this file (refer to usages in analyzer.py such as any
constructor calls or type hints that reference Entity/File/Symbol) so static
analyzers can verify definitions.
- Line 50: Guard the call to node.child_by_field_name('name') before accessing
.text.decode('utf-8'): capture the result (e.g., name_node =
node.child_by_field_name('name')), return a safe default (None or empty string
consistent with the module) if name_node is None, otherwise decode
name_node.text.decode('utf-8'); update the site where the expression appears to
avoid AttributeError and keep behavior consistent with other analyzers.
- Around line 27-30: The current condition uses Path.glob(...) which returns
generators that are always truthy, causing subprocess.run(["dotnet", "restore"],
...) to run unconditionally; change the guard to actually test for matches (for
example use any(Path(path).glob("*.csproj")) or any(Path(path).glob("*.sln")) or
wrap the glob in list(...) and check truthiness) so dotnet restore only runs
when a .csproj or .sln is present; keep the existing temp_deps_cs directory
check and call to subprocess.run unchanged except for using the corrected
condition.
- Line 131: Update the return type annotation of resolve_symbol from "-> Entity"
to "-> list[Entity]" in api/analyzers/csharp/analyzer.py (the resolve_symbol
method) to match resolve_type and resolve_method and the usage in
Entity.resolved_symbol; also update the corresponding abstract declaration
AbstractAnalyzer.resolve_symbol to return list[Entity] so the signatures align
and mypy passes.
In `@api/analyzers/source_analyzer.py`:
- Around line 141-145: The rglob call uses the user-controlled variable path
directly, allowing path traversal; before any use of path.rglob (including the
C# block that creates SyncLanguageServer via SyncLanguageServer.create and
MultilspyConfig.from_dict), resolve and validate the incoming path in the entry
routines (e.g., analyze_local_folder/analyze_sources): convert to a Path and
call .resolve(), then ensure the resolved path is within an allowed base/root
(reject or normalize paths that escape the root, e.g., via ..), and only then
pass the safe resolved path to path.rglob and to SyncLanguageServer.create;
apply the same validation for the Java/Python blocks that also use path.rglob.
---
Duplicate comments:
In `@api/analyzers/source_analyzer.py`:
- Line 176: The rglob calls in analyze_sources (files =
list(path.rglob("*.java")) + list(path.rglob("*.py")) +
list(path.rglob("*.cs"))) use an unvalidated Path variable and trigger path
traversal findings; resolve and validate the incoming path once (e.g., call path
= path.resolve(strict=False) and ensure it is under an allowed root or repo root
/ whitelist) at the upstream site where path is created (the earlier code around
the path derivation) and then use the validated/resolved Path for the rglob
calls (resolved_path.rglob("*.java"), etc.) so all three rglob usages are fixed
together.
---
Nitpick comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 114-122: In resolve_method, the else branch that reassigns
func_node to node.child_by_field_name('function') is redundant because func_node
was already set to that; remove that else branch and its assignment so the logic
becomes: get func_node = node.child_by_field_name('function'), if its type is
'member_access_expression' replace it with its child 'name', elif it's
'identifier' leave it, otherwise do nothing; retain the subsequent "if
func_node: node = func_node" behavior (references: resolve_method, func_node,
node.child_by_field_name('function'), member_access_expression, identifier).
In `@tests/test_csharp_analyzer.py`:
- Around line 52-60: Add an assertion that verifies the IMPLEMENTS relationship
between ConsoleLogger and ILogger: run a Cypher query similar to the existing
checks (e.g., use g._query with "MATCH (c:Class {name:
'ConsoleLogger'})-[:IMPLEMENTS]->(i:Interface {name: 'ILogger'}) RETURN c, i
LIMIT 1") and assert the query returns the expected row (e.g., assert
len(result_set) == 1 or that result_set[0] exists) so the test ensures
resolve_symbol/add_symbols correctly records the implements relationship for
ConsoleLogger -> ILogger.
- Around line 8-24: Add a tearDown to Test_CSharp_Analyzer that cleans up the
"csharp" graph after the test: implement def tearDown(self): g =
Graph("csharp"); call the Graph cleanup method provided by your Graph API (e.g.,
g.delete(), g.clear(), or g.delete_graph()) to remove all entities so the graph
is empty between test runs; ensure the cleanup runs regardless of test outcome
so analyzer.analyze_local_folder(path, g) does not leave persistent state.
api/analyzers/csharp/analyzer.py
Outdated
```python
from pathlib import Path

from multilspy import SyncLanguageServer
from ...entities import *
```
🛠️ Refactor suggestion | 🟠 Major
Replace star import with explicit imports.
from ...entities import * prevents static analysis from verifying that Entity, File, Symbol, etc. are actually defined — exactly what Ruff F403/F405 is reporting. Other analyzers in the project use the same pattern, but this is the right place to break the habit for a new file.
♻️ Proposed fix
```diff
-from ...entities import *
+from ...entities.entity import Entity
+from ...entities.file import File
```

🧰 Tools
🪛 Ruff (0.15.1)
[error] 6-6: from ...entities import * used; unable to detect undefined names
(F403)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/analyzers/csharp/analyzer.py` at line 6, Replace the wildcard import in
analyzer.py with explicit imports of the entities you actually use (e.g.,
Entity, File, Symbol, Reference, Location — list the exact class/variable names
referenced in this module) to resolve Ruff F403/F405; update the top import line
from "from ...entities import *" to "from ...entities import Entity, File,
Symbol, ..." matching the symbols used by functions/methods in this file (refer
to usages in analyzer.py such as any constructor calls or type hints that
reference Entity/File/Symbol) so static analyzers can verify definitions.
api/analyzers/csharp/analyzer.py
Outdated
```python
        res.append(file.entities[method_dec])
        return res

    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
```
Return type annotation -> Entity is incorrect; should be -> list[Entity].
Both resolve_type and resolve_method return list[Entity], and Entity.resolved_symbol (in entity.py) iterates the result with for resolved_symbol in f(key, symbol.symbol) — confirming the list contract. The annotation mismatch will cause mypy errors and mislead callers. The same fix is needed in the abstract base class AbstractAnalyzer.resolve_symbol.
🐛 Proposed fix

```diff
-    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity:
+    def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]:
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> Entity: | |
| def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, key: str, symbol: Node) -> list[Entity]: |
🧰 Tools
🪛 Ruff (0.15.1)
[error] 131-131: File may be undefined, or defined from star imports
(F405)
[error] 131-131: Entity may be undefined, or defined from star imports
(F405)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/analyzers/csharp/analyzer.py` at line 131, Update the return type
annotation of resolve_symbol from "-> Entity" to "-> list[Entity]" in
api/analyzers/csharp/analyzer.py (the resolve_symbol method) to match
resolve_type and resolve_method and the usage in Entity.resolved_symbol; also
update the corresponding abstract declaration AbstractAnalyzer.resolve_symbol to
return list[Entity] so the signatures align and mypy passes.
```python
        if any(path.rglob('*.cs')):
            config = MultilspyConfig.from_dict({"code_language": "csharp"})
            lsps[".cs"] = SyncLanguageServer.create(config, logger, str(path))
        else:
            lsps[".cs"] = NullLanguageServer()
```
CodeQL: path traversal via user-controlled path in rglob.
path flows from the API request body into path.rglob('*.cs') without sanitization. A crafted path (e.g., containing ../) could cause the scanner to traverse outside the intended directory. CodeQL flagged this as a security failure. The pre-existing Java/Python blocks at lines 131 and 136 carry the same risk, but this new block adds another instance.
Resolve and validate path against an allowed base before using it in any rglob call. For example:
🔒 Suggested mitigation pattern
```python
# In analyze_local_folder or analyze_sources, resolve and validate path early:
resolved = Path(path).resolve()
# Optionally assert it's within an allowed root
```
# Optionally assert it's within an allowed root🧰 Tools
🪛 GitHub Check: CodeQL
[failure] 141-141: Uncontrolled data used in path expression
This path depends on a user-provided value.
This path depends on a user-provided value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/analyzers/source_analyzer.py` around lines 141 - 145, The rglob call uses
the user-controlled variable path directly, allowing path traversal; before any
use of path.rglob (including the C# block that creates SyncLanguageServer via
SyncLanguageServer.create and MultilspyConfig.from_dict), resolve and validate
the incoming path in the entry routines (e.g.,
analyze_local_folder/analyze_sources): convert to a Path and call .resolve(),
then ensure the resolved path is within an allowed base/root (reject or
normalize paths that escape the root, e.g., via ..), and only then pass the safe
resolved path to path.rglob and to SyncLanguageServer.create; apply the same
validation for the Java/Python blocks that also use path.rglob.
Actionable comments posted: 2
🧹 Nitpick comments (1)
api/analyzers/csharp/analyzer.py (1)
113-122: Dead `else` branch is a no-op — remove it.

The `else` block on line 119 re-assigns `func_node = node.child_by_field_name('function')`, which `func_node` already holds (set on line 114 and left unchanged by the `pass` on line 118). The branch is unreachable in any meaningful way.

♻️ Proposed fix

```diff
         func_node = node.child_by_field_name('function')
         if func_node and func_node.type == 'member_access_expression':
             func_node = func_node.child_by_field_name('name')
-        elif func_node and func_node.type == 'identifier':
-            pass
-        else:
-            func_node = node.child_by_field_name('function')
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/csharp/analyzer.py` around lines 113 - 122, The code handling invocation_expression has a dead else that reassigns func_node to node.child_by_field_name('function') even though func_node was already set, so remove the redundant else branch: in the block where node.type == 'invocation_expression' keep the initial func_node = node.child_by_field_name('function'), handle the member_access_expression case by replacing func_node with its 'name' child and keep the identifier branch (or no-op), then drop the final else clause entirely so func_node is not redundantly reassigned before the if func_node: node = func_node assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 53-59: get_entity_docstring currently only returns
node.prev_sibling.text which captures only the last line of multi-line "///" XML
doc blocks; change get_entity_docstring to walk backwards from node.prev_sibling
while siblings are comment nodes and contiguous (e.g., loop using prev_sibling),
collect each comment.text.decode('utf-8') into a list, then reverse or prepend
so the original order is preserved and join with newlines to return the full
docstring; keep the existing type checks and the ValueError for unknown types,
and ensure it still returns None when no preceding comment siblings exist.
- Around line 66-77: The current heuristic in the class/struct handling loop in
analyzer.py labels the first base_list entry as "base_class" even when it might
be an interface; update the block around the loop (where base_list_captures is
iterated) to: add a clear inline comment documenting this known false-positive,
and before calling entity.add_symbol("base_class", base_type) call
self.resolve_type(base_type) (or equivalent resolver used in this analyzer) and
check the resolved node's kind (e.g., 'class_declaration' or
'struct_declaration') — only then add "base_class"; otherwise fall back to
add_symbol("implement_interface", base_type). Keep the existing first-flag
behavior but gate the base_class assignment on the resolved type to avoid
spurious base_class edges.
---
Duplicate comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 47-51: get_entity_name currently assumes
node.child_by_field_name('name') is non-null and calls .text, risking a None
dereference; update get_entity_name to fetch name_node =
node.child_by_field_name('name'), check if name_node is truthy before accessing
name_node.text.decode('utf-8'), and if it's None either raise a clearer
ValueError mentioning the parent node type and maybe node.start_point or return
a safe fallback (e.g., '') so callers of get_entity_name won't dereference None;
ensure the change touches only get_entity_name and preserves the existing node
type checks for 'class_declaration', 'interface_declaration',
'enum_declaration', 'struct_declaration', 'method_declaration', and
'constructor_declaration'.
- Line 131: The return type annotation for resolve_symbol is incorrect: change
the signature of resolve_symbol (in api/analyzers/csharp/analyzer.py) from "->
Entity" to "-> list[Entity]" so it matches resolve_type and resolve_method and
what Entity.resolved_symbol expects; update any import/type aliases if needed to
ensure list[Entity] is recognized by mypy.
- Around line 26-30: The guard in add_dependencies is ineffective and
Path.glob() is misused: change the early-return to check
path.joinpath("temp_deps_cs").is_dir(), only run dotnet restore when project
files exist by using any(path.glob("*.csproj")) or any(path.glob("*.sln"))
(instead of treating the generator as truthy), and after a successful
subprocess.run(..., cwd=str(path)) create the temp_deps_cs directory (e.g.,
path.joinpath("temp_deps_cs").mkdir()) so subsequent calls return early; adjust
references in add_dependencies to use the path parameter's methods (path.glob,
path.joinpath) rather than Path(f"{path}").
---
Nitpick comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 113-122: The code handling invocation_expression has a dead else
that reassigns func_node to node.child_by_field_name('function') even though
func_node was already set, so remove the redundant else branch: in the block
where node.type == 'invocation_expression' keep the initial func_node =
node.child_by_field_name('function'), handle the member_access_expression case
by replacing func_node with its 'name' child and keep the identifier branch (or
no-op), then drop the final else clause entirely so func_node is not redundantly
reassigned before the if func_node: node = func_node assignment.
- Guard child_by_field_name('name') against None in get_entity_name
- Fix Path.glob() always-truthy bug in add_dependencies (use any())
- Fix resolve_symbol return type: -> list[Entity] in abstract class and all analyzers
- Remove redundant else branch in resolve_method
- Resolve path in analyze_sources to prevent traversal (path.resolve())
- Add setUp/tearDown to test class for graph cleanup
- Add IMPLEMENTS relationship assertion for ConsoleLogger -> ILogger
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This pull request adds comprehensive C# language support to the code analysis tool, enabling it to parse and analyze C# source files alongside existing Python and Java support.
Changes:
- Added tree-sitter-c-sharp dependency (v0.23.1) for C# parsing
- Implemented CSharpAnalyzer supporting classes, interfaces, enums, structs, methods, and constructors with inheritance/implementation tracking
- Registered .cs file extension with LSP integration for C# in the source analyzer
- Added unit tests and test source file demonstrating C# analysis capabilities
- Updated documentation to reflect C# as a supported language
Reviewed changes
Copilot reviewed 6 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Added tree-sitter-c-sharp dependency version constraint |
| uv.lock | Auto-generated lock file updates for C# dependencies (tree-sitter-c-sharp, rdflib, pyparsing) |
| api/analyzers/csharp/__init__.py | Empty init file for C# analyzer package |
| api/analyzers/csharp/analyzer.py | Complete C# analyzer implementation with tree-sitter query support, entity extraction, and LSP integration |
| api/analyzers/source_analyzer.py | Registered CSharpAnalyzer for .cs extension and added C# LSP configuration |
| tests/test_csharp_analyzer.py | Unit test validating C# entity detection and relationship tracking |
| tests/source_files/csharp/Program.cs | Test source file demonstrating C# features (interfaces, classes, constructors, methods, inheritance) |
| README.md | Updated supported languages list to include C# |
api/analyzers/csharp/analyzer.py
Outdated
```python
    def add_dependencies(self, path: Path, files: list[Path]):
        if Path(f"{path}/temp_deps_cs").is_dir():
            return
        if Path(f"{path}").glob("*.csproj") or Path(f"{path}").glob("*.sln"):
```
The condition Path(f"{path}").glob("*.csproj") or Path(f"{path}").glob("*.sln") always evaluates to a truthy generator object, even when there are no matching files. This should use any() to check if files exist: any(Path(path).glob("*.csproj")) or any(Path(path).glob("*.sln"))
```diff
-        if Path(f"{path}").glob("*.csproj") or Path(f"{path}").glob("*.sln"):
+        if any(Path(path).glob("*.csproj")) or any(Path(path).glob("*.sln")):
```
api/analyzers/csharp/analyzer.py
Outdated
```python
            elif func_node and func_node.type == 'identifier':
                pass
            else:
                func_node = node.child_by_field_name('function')
```
In the resolve_method function, when node.type is 'invocation_expression', the code extracts func_node and then has a branch that does nothing (lines 117-118: elif func_node and func_node.type == 'identifier': pass). This empty branch followed by an else clause that reassigns func_node to the same value (line 120) makes the logic confusing. Consider simplifying: if func_node.type == 'member_access_expression', extract the name field; otherwise use the function node as-is.
```diff
-            elif func_node and func_node.type == 'identifier':
-                pass
-            else:
-                func_node = node.child_by_field_name('function')
```
```python
    def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
        files = list(path.rglob("*.java")) + list(path.rglob("*.py"))
        path = path.resolve()
```
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI 11 days ago
In general, to fix uncontrolled path usage you either (a) fully trust the caller and mark it as such (not applicable here), or (b) restrict paths to a known safe root / allow‑list and normalize before use. Here, the safest backward‑compatible fix is to constrain the analyzed folder to live under a configured root directory (for example, an environment variable like CODE_GRAPH_ROOT or, if unset, the current working directory). We’ll normalize both the root and the requested path using Path.resolve() and then ensure the requested path is inside the root via Path.is_relative_to (or equivalent). If it isn’t, we raise an error before any filesystem traversal.
Concretely:
- In
api/analyzers/source_analyzer.py, modifyanalyze_local_folderso that:- It converts the incoming
path: strto aPathinstance and resolves it. - It also determines a safe root directory (e.g.,
CODE_GRAPH_ROOTorPath.cwd()), resolves that, and checks that the requested path is within this root. - If the check fails or the path is not an existing directory, raise a
ValueError(or similar) instead of analyzing. - Pass the already‑resolved
Pathintoanalyze_sourcesto avoid re‑resolving a tainted string.
- It converts the incoming
- Adjust
analyze_sourcesso that it does not re‑resolve an already normalizedPath(this keeps behavior consistent and avoids double‑normalize of a string we now validate earlier). - The Flask layer (
tests/index.py) already validatesos.path.isdir(path); we can keep that, relying on the deeper validation insideSourceAnalyzer. No changes are required there for the security fix.
This requires only standard library functionality (os, pathlib), so no extra project dependencies are needed.
```diff
@@ -17,6 +17,7 @@
 from multilspy.multilspy_logger import MultilspyLogger
 
 import logging
+import os
 # Configure logging
 logging.basicConfig(level=logging.DEBUG, format='%(filename)s - %(asctime)s - %(levelname)s - %(message)s')
@@ -173,7 +174,6 @@
         self.second_pass(graph, files, path)
 
     def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
-        path = path.resolve()
         files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
         # First pass analysis of the source code
         self.first_pass(path, files, ignore, graph)
@@ -192,9 +192,33 @@
 
         logging.info(f"Analyzing local folder {path}")
 
-        # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        # Determine the allowed root directory for analysis.
+        # Prefer the CODE_GRAPH_ROOT environment variable if set; otherwise default to the current working directory.
+        root_env = os.environ.get("CODE_GRAPH_ROOT")
+        if root_env:
+            root = Path(root_env).resolve()
+        else:
+            root = Path.cwd().resolve()
+
+        # Resolve the requested path and ensure it is within the allowed root.
+        target_path = Path(path).resolve()
+
+        try:
+            # Python 3.9+ has Path.is_relative_to; fall back to a prefix check if unavailable.
+            is_within_root = target_path.is_relative_to(root)  # type: ignore[attr-defined]
+        except AttributeError:
+            # Fallback for Python versions without is_relative_to
+            is_within_root = str(target_path).startswith(str(root) + os.sep) or target_path == root
+
+        if not is_within_root:
+            raise ValueError(f"Path '{target_path}' is outside of the allowed root '{root}'")
+
+        if not target_path.is_dir():
+            raise ValueError(f"Path '{target_path}' does not exist or is not a directory")
+
+        # Analyze source files using the validated, resolved path
+        self.analyze_sources(target_path, ignore, g)
 
         logging.info("Done analyzing path")
 
     def analyze_local_repository(self, path: str, ignore: Optional[list[str]] = None) -> Graph:
```
- Walk back through contiguous comment siblings to collect all XML doc comment lines instead of only the last one - Add clarifying comment about the base_list heuristic limitation (cannot distinguish base class from interface without semantic analysis) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
api/analyzers/csharp/analyzer.py (1)
20-24: Simplify `_captures` to align with Java and Python analyzers.

The CSharp analyzer currently uses `QueryCursor(query).captures(node)` via the `_captures` helper, while both `JavaAnalyzer` and `PythonAnalyzer` call `query.captures(node)` directly. These paths are functionally equivalent in current tree-sitter Python bindings. Consolidating to the direct pattern reduces maintenance surface and removes API-version sensitivity.

Proposed simplification

```diff
-    def _captures(self, pattern: str, node: Node) -> dict:
-        """Run a tree-sitter query and return captures dict."""
-        query = self.language.query(pattern)
-        cursor = QueryCursor(query)
-        return cursor.captures(node)
+    def _captures(self, pattern: str, node: Node) -> dict:
+        """Run a tree-sitter query and return captures dict."""
+        return self.language.query(pattern).captures(node)
```

Remove `QueryCursor` from the import on line 11, as it will no longer be used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/analyzers/csharp/analyzer.py` around lines 20 - 24, The _captures helper in CSharpAnalyzer should call the query's captures directly like the Java/Python analyzers: replace the body of _captures in analyzer.py to return self.language.query(pattern).captures(node) (i.e., remove use of QueryCursor), and remove the unused QueryCursor import at the top; keep the method signature def _captures(self, pattern: str, node: Node) -> dict and ensure other callers to _captures continue to work unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 26-30: The add_dependencies function never creates the sentinel
directory and never populates the files list; modify add_dependencies to run
subprocess.run(["dotnet","restore"], cwd=str(path), check=True) only when a
.csproj or .sln exists, and after a successful restore create the sentinel
Path(f"{path}/temp_deps_cs").mkdir(parents=True, exist_ok=True) so the
idempotency guard can short-circuit future calls; also populate the files
argument (e.g., extend files with Path(path).glob("**/*.cs") or other C# source
patterns) so C# source files are queued for analysis and the files parameter is
actually used.
---
Duplicate comments:
In `@api/analyzers/source_analyzer.py`:
- Around line 141-145: analyze_files currently passes the user-supplied Path
directly into first_pass/second_pass causing second_pass's use of
path.rglob('*.cs') to be vulnerable; update analyze_files to resolve the
incoming path with path = path.resolve() before calling first_pass and
second_pass, and additionally validate the resolved path is inside an allowed
analysis root (e.g., compare against a configurable ANALYSIS_ROOT or raise/abort
if not a subpath) so that rglob cannot operate on arbitrary filesystem
locations; ensure references include analyze_files, first_pass, second_pass, and
path.resolve() in your changes.
---
Nitpick comments:
In `@api/analyzers/csharp/analyzer.py`:
- Around line 20-24: The _captures helper in CSharpAnalyzer should call the
query's captures directly like the Java/Python analyzers: replace the body of
_captures in analyzer.py to return self.language.query(pattern).captures(node)
(i.e., remove use of QueryCursor), and remove the unused QueryCursor import at
the top; keep the method signature def _captures(self, pattern: str, node: Node)
-> dict and ensure other callers to _captures continue to work unchanged.
```python
    def add_dependencies(self, path: Path, files: list[Path]):
        if Path(f"{path}/temp_deps_cs").is_dir():
            return
        if any(Path(f"{path}").glob("*.csproj")) or any(Path(f"{path}").glob("*.sln")):
            subprocess.run(["dotnet", "restore"], cwd=str(path))
```
add_dependencies idempotency guard never fires — dotnet restore runs on every invocation.
Path(f"{path}/temp_deps_cs").is_dir() is checked on line 27, but nothing in this function (or anywhere else) creates the temp_deps_cs directory. The guard is therefore permanently False and dotnet restore will run on every call. This contrasts with the Python analyzer (checks venv/ which it creates) and the Java analyzer (creates temp_deps/ before scanning it).
Additionally, unlike Java (files.extend(...)) and Python (files.extend(...)), files is never populated here, so no C# dependency source files are queued for analysis. Ruff also flags the files parameter as unused (ARG002).
🐛 Proposed fix — create the sentinel directory after a successful restore
```diff
     def add_dependencies(self, path: Path, files: list[Path]):
         if Path(f"{path}/temp_deps_cs").is_dir():
             return
         if any(Path(f"{path}").glob("*.csproj")) or any(Path(f"{path}").glob("*.sln")):
-            subprocess.run(["dotnet", "restore"], cwd=str(path))
+            result = subprocess.run(["dotnet", "restore"], cwd=str(path))
+            if result.returncode == 0:
+                Path(f"{path}/temp_deps_cs").mkdir(exist_ok=True)
```
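The guard-then-create pattern behind this fix generalizes: check a sentinel, do the work, and create the sentinel only after success. A minimal runnable sketch, with a trivial command standing in for `dotnet restore` (the .NET CLI may not be installed), and the function name `restore_once` being illustrative:

```python
import subprocess
import sys
from pathlib import Path

def restore_once(path: Path, marker: str = "temp_deps_cs") -> bool:
    """Run a restore step at most once per directory, using a sentinel dir.

    Returns True if the restore ran (or already ran previously), False if
    there was nothing to restore. The command is a stand-in for `dotnet restore`.
    """
    sentinel = path / marker
    if sentinel.is_dir():
        return True  # a previous call already restored; short-circuit
    if any(path.glob("*.csproj")) or any(path.glob("*.sln")):
        result = subprocess.run([sys.executable, "-c", "pass"], cwd=str(path))
        if result.returncode == 0:
            sentinel.mkdir(exist_ok=True)  # create sentinel only after success
            return True
    return False
```

Creating the sentinel only on a zero exit code means a failed restore is retried on the next invocation instead of being silently skipped forever.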
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 26-26: Unused method argument: files
(ARG002)
[error] 30-30: Starting a process with a partial executable path
(S607)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/analyzers/csharp/analyzer.py` around lines 26 - 30, The add_dependencies
function never creates the sentinel directory and never populates the files
list; modify add_dependencies to run subprocess.run(["dotnet","restore"],
cwd=str(path), check=True) only when a .csproj or .sln exists, and after a
successful restore create the sentinel
Path(f"{path}/temp_deps_cs").mkdir(parents=True, exist_ok=True) so the
idempotency guard can short-circuit future calls; also populate the files
argument (e.g., extend files with Path(path).glob("**/*.cs") or other C# source
patterns) so C# source files are queued for analysis and the files parameter is
actually used.
Summary by CodeRabbit
New Features
Documentation
Tests
Refactor