Add support for JavaScript #59
base: main
Dead code and misleading variable names in `add_symbols` for class declarations.

Two issues:

- `superclass_node` is assigned but never used (confirmed by Ruff F841). Remove it.
- The variable is named `heritage` but it fetches the `'body'` field, which is misleading. It is also only used for the `None` check on line 53, a defensive guard that is unlikely to trigger (classes without bodies are syntactically invalid JS).

🔧 Proposed fix

     def add_symbols(self, entity: Entity) -> None:
         if entity.node.type == 'class_declaration':
    -        heritage = entity.node.child_by_field_name('body')
    -        if heritage is None:
    +        body = entity.node.child_by_field_name('body')
    +        if body is None:
                 return
    -        superclass_node = entity.node.child_by_field_name('name')
             # Check for `extends` clause via class_heritage
             for child in entity.node.children:

🧰 Tools
🪛 Ruff (0.15.1)
[error] 55-55: Local variable `superclass_node` is assigned to but never used. Remove assignment to unused variable `superclass_node` (F841)
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI 9 days ago
General approach: ensure that any user-supplied path is validated and constrained before it is used with filesystem APIs (`Path.resolve`, `Path.rglob`, `Repository(path)`). A common pattern is to define a safe root directory (from configuration or an environment variable), resolve both the root and the requested path, and then verify that the requested path is inside the root (using `.resolve()` and a prefix / ancestor check). If the check fails, reject the request.

Best fix with minimal behavior change:

In `api/analyzers/source_analyzer.py`, add a small helper method on `SourceAnalyzer` to validate and normalize incoming paths. The helper:

- Accepts a `str` or `Path`.
- Checks that the path is a directory for `analyze_local_folder`, and exists for `analyze_local_repository`.
- Raises `ValueError` (or `RuntimeError`) on violation.

Use this helper in:

- `analyze_local_folder`: instead of passing `Path(path)` directly, call the validator, then use the returned `Path` object for `analyze_sources`.
- `analyze_local_repository`: use the same validator to get a normalized, allowed repo path, then pass that to both `analyze_local_folder` and `Repository(...)`.

The endpoint in `tests/index.py` already checks `os.path.isdir(path)`, but that’s only used for tests. With the new validation in `SourceAnalyzer`, any other caller (such as `api/index.py` routes that eventually call `analyze_local_folder` / `analyze_local_repository`) also gets the protection.

We can implement the helper purely inside `SourceAnalyzer` using `Path.resolve` and `Path.is_relative_to` (Python 3.9+) or a `try: relative_to` fallback. No new third-party dependencies are needed; we’ll only add an `import os` in `api/analyzers/source_analyzer.py` if we choose to read an environment variable for the allowed root.

Concretely:

- Add `_normalize_and_validate_path(self, path_str: str, must_be_dir: bool = True) -> Path` above `analyze_local_folder`.
- In `analyze_local_folder`, call this helper and pass the returned `Path` to `analyze_sources` instead of constructing `Path(path)` directly.
- In `analyze_local_repository`, call the same helper with `must_be_dir=True`, then use the resolved `Path` both for `analyze_local_folder` and `Repository(str(resolved_path))`.

This keeps existing functionality (scanning arbitrary directories) but ensures paths are absolute, normalized, and (optionally) within a configured safe root; if a root is not configured, we still normalize and ensure the path is a directory before traversing.
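A minimal sketch of such a helper, written as a standalone function rather than a `SourceAnalyzer` method. The `allowed_root` parameter and the `_is_relative_to` shim are assumptions added for illustration; they are not taken from the project's code.

```python
from pathlib import Path
from typing import Optional, Union


def _is_relative_to(path: Path, root: Path) -> bool:
    """Containment check with a fallback for Python < 3.9 (hypothetical shim)."""
    if hasattr(path, "is_relative_to"):
        return path.is_relative_to(root)
    try:
        path.relative_to(root)  # raises ValueError when path is not under root
        return True
    except ValueError:
        return False


def normalize_and_validate_path(
    path_str: Union[str, Path],
    must_be_dir: bool = True,
    allowed_root: Optional[Path] = None,  # assumption: the root is optional
) -> Path:
    """Resolve a caller-supplied path and raise ValueError if a check fails."""
    # resolve() makes the path absolute and collapses ".." and symlinks
    resolved = Path(path_str).resolve()

    if must_be_dir and not resolved.is_dir():
        raise ValueError(f"not a directory: {resolved}")
    if not must_be_dir and not resolved.exists():
        raise ValueError(f"path does not exist: {resolved}")

    # Only enforce containment when a root has been configured
    if allowed_root is not None and not _is_relative_to(resolved, allowed_root.resolve()):
        raise ValueError(f"{resolved} is outside the allowed root")

    return resolved
```

Without `allowed_root` the function only normalizes and checks the path type, which matches the "root not configured" case described above.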
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI 9 days ago
General approach: constrain and validate user-provided paths before using them in filesystem operations. At minimum, decide on a safe root directory under which all analysis must occur, normalize the requested path, and ensure the normalized path is contained within that root. This also gives CodeQL a clear, recognizable mitigation pattern (normalize then prefix-check).

Best fix in this codebase without changing existing functionality more than necessary:

- In `SourceAnalyzer.analyze_local_folder`, convert the string `path` into a normalized `Path` object, reject non-absolute or non-directory paths, and (crucially) enforce that the path lies within a configurable root directory. Use `Path.resolve()` and `.relative_to()` to ensure containment.
- Pass the resulting `Path` object into `analyze_sources` instead of constructing a new `Path` from the raw string.
- Define the root inside `SourceAnalyzer` (e.g., an environment-variable-controlled root or default to the current working directory), so that we do not change external APIs but still restrict analysis to a subtree.
- Leave the rest of the logic (`rglob`, graph creation, etc.) unchanged.

Concretely, in `api/analyzers/source_analyzer.py`:

- Add an attribute (e.g. `self.root_dir`) in `SourceAnalyzer.__init__` to define the root directory from an environment variable like `CODE_GRAPH_ROOT_DIR` or default to the process working directory (`Path.cwd()`), and resolve it.
- In `analyze_local_folder`:
  - Convert `path` to `requested_path = Path(path).resolve()`.
  - Check that `requested_path` is a directory (`requested_path.is_dir()`).
  - Check that `requested_path` is inside `self.root_dir` using `requested_path.relative_to(self.root_dir)` in a `try` block; if it raises `ValueError`, log and raise an exception (or just log and return).
  - Call `self.analyze_sources(requested_path, ignore, g)` rather than recreating `Path(path)` inside.
- In `analyze_sources`, keep the existing `path = path.resolve()` and `rglob` usage; now the input has already been constrained to lie under a safe root, satisfying CodeQL’s recommendation while preserving the method’s behavior for internal callers.

This fix addresses all variants of the alert because every path originating from HTTP (`tests/index.py` or `api/index.py`) flows through `SourceAnalyzer.analyze_local_folder` and then into `analyze_sources`, which will now only operate within the intended root directory.
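The root-in-constructor variant could look roughly like the sketch below. `FolderAnalyzerSketch` is a hypothetical stand-in for `SourceAnalyzer`, the `root_dir` constructor argument is an assumption added to make the example easy to exercise, and `analyze_sources` is reduced to a placeholder that lists files.

```python
import logging
import os
from pathlib import Path
from typing import List, Optional

logger = logging.getLogger(__name__)


class FolderAnalyzerSketch:
    """Hypothetical stand-in for SourceAnalyzer; only path handling is shown."""

    def __init__(self, root_dir: Optional[str] = None) -> None:
        # Assumption: an explicit argument wins, then the CODE_GRAPH_ROOT_DIR
        # environment variable, then the process working directory.
        configured = root_dir or os.environ.get("CODE_GRAPH_ROOT_DIR")
        self.root_dir = (Path(configured) if configured else Path.cwd()).resolve()

    def analyze_local_folder(self, path: str) -> List[Path]:
        requested_path = Path(path).resolve()
        if not requested_path.is_dir():
            raise ValueError(f"not a directory: {requested_path}")
        try:
            # relative_to raises ValueError when the path is outside root_dir
            requested_path.relative_to(self.root_dir)
        except ValueError:
            logger.error("rejected path outside root: %s", requested_path)
            raise ValueError(f"{requested_path} is outside {self.root_dir}") from None
        # Hand the validated Path on instead of re-wrapping the raw string
        return self.analyze_sources(requested_path)

    def analyze_sources(self, path: Path) -> List[Path]:
        # Placeholder for the real traversal: list files under the validated path
        return sorted(p for p in path.rglob("*") if p.is_file())
```

Raising instead of logging and returning makes a rejected request visible to the caller; the comment above leaves both options open.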
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI 9 days ago
In general, to fix this kind of issue you must not let arbitrary user input select arbitrary filesystem roots. Instead, restrict paths to a safe base directory (or a fixed allow-list of roots) and/or treat the user-provided value only as a name within a controlled directory. This is done by (1) defining a safe root directory (for repositories or local folders), (2) constructing a candidate path by joining the root and the user input, (3) normalizing/resolving that path, and (4) verifying that the resolved path is still within the allowed root. If the check fails, return an error.

For this codebase, the best fix with minimal behavior change is:

- Add a helper to `SourceAnalyzer` that:
  - Takes a `path` string and a base directory `Path`.
  - Computes `resolved = (base_dir / path).resolve()`.
  - Checks `resolved.is_dir()` and that `resolved` is inside `base_dir` via `resolved.is_relative_to(base_dir)` (Python 3.9+) or a `try: resolved.relative_to(base_dir)` fallback.
- Call this helper in `analyze_local_folder` before calling `analyze_sources`. That way, every caller that passes a string path (including both `tests/index.py` and `api/index.py` flows) will be constrained to a configured base directory such as the current working directory or a specific environment-configurable root.
- Use the validated `Path` to call `analyze_sources`, so `path.rglob(...)` in `analyze_sources` always operates under the safe root.

Concretely, in `api/analyzers/source_analyzer.py`:

- Add `import os` (standard library) since we’ll read an optional env var for the base root.
- Add `_resolve_and_validate_path(self, path: str) -> Path` inside `SourceAnalyzer` before `analyze_local_folder`. It should:
  - Read the base root from `CODEGRAPH_BASE_DIR` if present, otherwise default to `Path.cwd()`.
  - Compute `base_root = base_root.resolve()`.
  - Compute `candidate = (base_root / path).resolve()`.
  - Check `candidate.is_dir()` and that it is inside `base_root`. If not, raise `ValueError`.
- Update `analyze_local_folder` to call this helper: replace `self.analyze_sources(Path(path), ignore, g)` with `safe_path = self._resolve_and_validate_path(path)` and then `self.analyze_sources(safe_path, ignore, g)`.

This keeps the public API of `SourceAnalyzer` unchanged while ensuring that all filesystem walks start from a safe, controlled base directory and no longer directly trust arbitrary absolute/relative paths from HTTP requests.
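The join-then-verify steps can be sketched as a standalone function (an assumption for illustration; the comment proposes a method on `SourceAnalyzer`). One detail worth noting: with `pathlib`, joining a base with an absolute path discards the base, so the containment check after `resolve()` is what actually rejects such input.

```python
import os
from pathlib import Path


def resolve_and_validate_path(path: str) -> Path:
    """Join user input onto a base root, resolve it, and verify containment."""
    # CODEGRAPH_BASE_DIR is the variable name used in the comment above;
    # fall back to the current working directory when it is unset or empty.
    base_root = Path(os.environ.get("CODEGRAPH_BASE_DIR") or Path.cwd()).resolve()

    # If `path` is absolute, `base_root / path` is just `path`, so the join
    # alone gives no protection. The check below handles that case too.
    candidate = (base_root / path).resolve()

    try:
        candidate.relative_to(base_root)  # ValueError if candidate escaped
    except ValueError:
        raise ValueError(f"{candidate} is outside {base_root}") from None

    if not candidate.is_dir():
        raise ValueError(f"not a directory: {candidate}")
    return candidate
```

Under this scheme a relative input such as `proj` is interpreted relative to the base root, while `../outside` or an absolute path elsewhere on disk raises `ValueError`.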
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI 9 days ago
General approach: constrain user-controlled paths to a safe root, and normalize them before use. The analyzer should only traverse directories inside a configured “workspace root” (for example, an environment variable like `CODE_GRAPH_WORKSPACE_ROOT` or the current working directory), and should reject inputs that escape that root. Normalization (via `Path.resolve()` / `os.path.realpath`) must be done before checking containment.

Best concrete fix with minimal behavior change:

- Add a helper to `SourceAnalyzer` that takes an input path string, resolves it to an absolute `Path`, and enforces that it lies under an allowed root directory.
- Read the allowed root from an environment variable (`CODE_GRAPH_WORKSPACE_ROOT`) if present, otherwise default to the current working directory (`Path.cwd()`), which is safe and requires no extra configuration.
- Call `Path.resolve()` on both the root and the user path.
- Check that `resolved_user_path == allowed_root` or `allowed_root in resolved_user_path.parents`. If not, log and raise a `ValueError`.
- Update `analyze_local_folder` to call this helper instead of blindly wrapping `path` with `Path(path)`. Pass the resulting safe `Path` into `analyze_sources`.
- Every caller of `analyze_local_folder` (from `tests/index.py` or `api/index.py`) inherits the same validation without further changes to those files.

All changes are limited to `api/analyzers/source_analyzer.py`. We’ll need:

- `import os` (a well-known standard lib) to read the environment variable.
- A new method, `SourceAnalyzer._resolve_and_validate_path(self, path: str) -> Path`.
- An update to `analyze_local_folder` to use that method and handle its result.
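To illustrate why normalization has to come before the containment check, here is a small sketch that contrasts the ancestor test from the list above with a raw string-prefix test. Both function names are hypothetical and exist only for this comparison.

```python
from pathlib import Path


def is_within_root(user_path: str, allowed_root: Path) -> bool:
    """Ancestor check on resolved paths, as described in the comment above."""
    root = allowed_root.resolve()
    resolved = Path(user_path).resolve()
    return resolved == root or root in resolved.parents


def naive_prefix_check(user_path: str, allowed_root: Path) -> bool:
    """Anti-pattern shown for contrast: compares raw strings, no normalization."""
    return str(user_path).startswith(str(allowed_root))
```

Given a root `ws`, the input `ws/../ws-evil` passes the naive check because its text starts with the root, and so does the sibling directory `ws-evil`. The resolved ancestor check rejects both.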