Skip to content

feat(graph): add MS Graph colon-syntax path lookup middleware#2688

Open
dschmidt wants to merge 7 commits into
opencloud-eu:mainfrom
dschmidt:feat/colon-path-middleware
Open

feat(graph): add MS Graph colon-syntax path lookup middleware#2688
dschmidt wants to merge 7 commits into
opencloud-eu:mainfrom
dschmidt:feat/colon-path-middleware

Conversation

@dschmidt
Copy link
Copy Markdown
Contributor

@dschmidt dschmidt commented Apr 27, 2026

Description

Adds a chi middleware that detects MS Graph colon-syntax URLs and rewrites them to the canonical /items/{itemID}/... form before chi performs route matching. Existing handlers, routes, and GetDriveAndItemIDParam stay unchanged.

Two URL shapes are recognized at both /v1.0 and /v1beta1:

/drives/{driveID}/root:/[:/][:]
/drives/{driveID}/items/{itemID}:/[:/][:]

Path resolution runs as the request user via CS3 Stat. Both NOT_FOUND and PERMISSION_DENIED collapse to a 404 response so existence isn't disclosed to unauthorized callers. URLs without colon syntax fast-path through with a single substring check. The original URL is stashed in request context under OriginalPathContextKey for downstream tracing/logging.

The middleware is registered as a top-level mux.Use so it runs before any route matching: chi middleware on a sub-router runs after the prefix is matched but cannot redirect to a different leaf route. Top-level middleware lets URL rewriting actually re-route the request.

Tests cover regex matching across versions, all rewrite variants (root/items anchored, with/without suffix, with/without trailing colon, deep paths), NOT_FOUND -> 404, PERMISSION_DENIED -> 404 (security: no existence disclosure), and original-URL preservation in request context.

Unfortunately chi itself is not able to match this kind of routes (gorilla/mux has a regex matcher that could be used).
So this is the least intrusive change I could come up with.

I haven't found a clean way to document this in OpenAPI/Libre Graph Spec - so I'm suggesting to just put it in the route comments there, if you accept this change

Related Issue

  • Fixes <issue_link>

Motivation and Context

It's annoying to implement path walking in a client and it's slower than a single internal cs3 path resolve call :)

How Has This Been Tested?

  • unit tests
  • manual curl tests

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

  • Code changes
  • Unit tests added
  • Acceptance tests added
  • Documentation added

Adds a chi middleware that detects MS Graph colon-syntax URLs and rewrites
them to the canonical /items/{itemID}/... form before chi performs route
matching. Existing handlers, routes, and GetDriveAndItemIDParam stay
unchanged.

Two URL shapes are recognized at both /v1.0 and /v1beta1:

  /drives/{driveID}/root:/<path>[:/<suffix>][:]
  /drives/{driveID}/items/{itemID}:/<relativePath>[:/<suffix>][:]

Path resolution runs as the request user via CS3 Stat. Both NOT_FOUND and
PERMISSION_DENIED collapse to a 404 response so existence isn't disclosed
to unauthorized callers. URLs without colon syntax fast-path through with
a single substring check. The original URL is stashed in request context
under OriginalPathContextKey for downstream tracing/logging.

The middleware is registered as a top-level mux.Use so it runs before any
route matching: chi middleware on a sub-router runs after the prefix is
matched but cannot redirect to a different leaf route. Top-level
middleware lets URL rewriting actually re-route the request.

Tests cover regex matching across versions, all rewrite variants
(root/items anchored, with/without suffix, with/without trailing colon,
deep paths), NOT_FOUND -> 404, PERMISSION_DENIED -> 404 (security: no
existence disclosure), and original-URL preservation in request context.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a top-level chi middleware to support Microsoft Graph “colon-syntax” drive path addressing by resolving the path via CS3 Stat and rewriting requests to the canonical /drives/{driveID}/items/{itemID}... form before route matching.

Changes:

  • Register ResolveGraphPath as a top-level mux middleware so URL rewriting happens before chi route matching.
  • Implement colon-syntax detection, CS3 Stat-based path resolution, request path rewrite, and original-path preservation in context.
  • Add unit tests covering rewrite variants and “no existence disclosure” behavior for PERMISSION_DENIED.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
services/graph/pkg/service/v0/service.go Registers the new path-lookup middleware early in the mux pipeline.
services/graph/pkg/middleware/path_lookup.go Implements colon-syntax matching, CS3 Stat path resolution, rewrite logic, and context preservation.
services/graph/pkg/middleware/path_lookup_test.go Adds unit tests for matching/rewriting and security-related status behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/graph/pkg/middleware/path_lookup.go Outdated
Comment thread services/graph/pkg/middleware/path_lookup.go Outdated
Comment thread services/graph/pkg/middleware/path_lookup_test.go
Comment thread services/graph/pkg/middleware/path_lookup.go Outdated
Comment thread services/graph/pkg/middleware/path_lookup.go Outdated
- Drop double-decoding of URL path components. r.URL.Path is already
  decoded by net/http; calling url.PathUnescape again would let crafted
  inputs like "%252F" become "/", changing path semantics. Path and
  anchor-id strings now go straight to utils.MakeRelativePath /
  storagespace.ParseID.
- Distinguish operational errors from "not found". Gateway selection
  failure, RPC transport errors, and unexpected CS3 status codes now
  surface as 500 (don't mask outages); only NOT_FOUND and PERMISSION_DENIED
  collapse to 404 (no existence disclosure). Implemented via a sentinel
  errPathNotFound + a 500 fall-through.
- Refactor the two regex-handling branches in rewriteColonPath to share a
  resolution path via a small colonMatch struct + matchInto/extract helpers.
  Removes the SonarCloud duplication finding without changing behavior.
- Update the docstring on ResolveGraphPath to reflect that the rewrite
  preserves the requested API version (/{version}/...) rather than
  hard-coding /v1beta1/...
- Test cleanup: register t.Cleanup to remove the per-subtest selector
  entry from pool's global selectors map after the subtest, and update
  the "unexpected status" case to expect 500 (matches the new error
  semantics).
@dschmidt dschmidt requested a review from Copilot April 27, 2026 21:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/graph/pkg/middleware/path_lookup.go
Comment thread services/graph/pkg/middleware/path_lookup_test.go
Two follow-up Copilot review nits on the colon-syntax middleware:

- Don't blank r.URL.RawPath after the rewrite. Graph.ServeHTTP sets
  RawPath = EscapedPath() as a workaround for chi's parameter-binding
  quirks with `$`/`!` in IDs (see go-chi/chi#641). Clearing RawPath
  for rewritten requests negates that workaround. Re-establish the
  same invariant after the rewrite.

- Tests previously used log.NewLogger(), which mutates global zerolog
  state. Switch to log.NopLogger() — order-independent, no global
  side effects.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/graph/pkg/middleware/path_lookup.go
Comment thread services/graph/pkg/middleware/path_lookup.go Outdated
Comment thread services/graph/pkg/middleware/path_lookup.go
Three more Copilot review nits on the colon-syntax middleware:

- CS3 Stat returning UNAUTHENTICATED now surfaces as HTTP 401 (was 500
  via the default-case fallback). Distinct sentinel + handler branch.
- Item-anchored form (/drives/{driveID}/items/{itemID}:/...) now
  validates that driveID's storage/space prefix matches the itemID's,
  short-circuiting to 400 InvalidRequest on mismatch instead of doing
  a CS3 Stat that would only fail at the handler layer.
- ParseID failures on the anchor id now surface as 400 (was 404). In
  practice this branch is defensive — storagespace.ParseID is lenient
  and only errors on empty input, which the regex already filters out
  — but the right semantic for malformed client input is 400, not 404.

Tests: added two new cases (UNAUTHENTICATED → 401, drive/item id
mismatch → 400). The "malformed drive id" case Copilot suggested
isn't reachable in practice given ParseID's leniency, so it's not
covered.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@dschmidt dschmidt force-pushed the feat/colon-path-middleware branch 2 times, most recently from d1bfb6d to c31b772 Compare May 6, 2026 10:52
Cover the rewrite shapes the middleware handles end-to-end against a
real OpenCloud server: root-anchored, item-anchored, deep paths,
trailing colon, and the "/<path>:/<suffix>" sub-route form. Also
assert that NOT_FOUND and PERMISSION_DENIED both collapse to 404.

The /permissions sub-route is registered only at /v1beta1, and the
v1beta1 GetDriveItem handler is share-jail-only, so the v1beta1
mount of the middleware is exercised through the permissions
scenario, since there is no other v1beta1 endpoint that works for
regular personal-drive items.
@dschmidt dschmidt force-pushed the feat/colon-path-middleware branch from c31b772 to adce092 Compare May 6, 2026 11:11
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 6, 2026

Quality Gate Failed Quality Gate failed

Failed conditions
6.1% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@micbar
Copy link
Copy Markdown
Member

micbar commented May 6, 2026

@dschmidt Thanks!

@ScharfViktor I asked @dschmidt to add some API tests. Please review 😄

Copy link
Copy Markdown
Contributor

@ScharfViktor ScharfViktor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look good 👍

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 12, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 47 complexity · 2 duplication

Metric Results
Complexity 47
Duplication 2

View in Codacy

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

Run reviewer

TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@codacy-production codacy-production Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a middleware to handle MS Graph's colon-syntax path lookups by performing internal CS3 Stat resolutions. Although the code is technically 'up to standards' according to Codacy, the review identifies significant risks: critical test scenarios for regex matching and error handling are missing, and there is no documentation for the new URL resolution behavior. The security requirement to collapse CS3 PERMISSION_DENIED errors into HTTP 404s must be verified to prevent sensitive path disclosure. Additionally, the high complexity (85) and presence of clones in services/graph/pkg/service/v0/service.go suggest potential maintainability issues in the service layer.

About this PR

  • Documentation was not provided for the new colon-syntax URL resolution behavior. This is a significant change in how paths are resolved and should be documented.

Test suggestions

  • Regex matching correctly identifies colon syntax for both v1.0 and v1beta1 versions
  • URL rewriting handles root-anchored paths with and without suffixes/trailing colons
  • URL rewriting handles item-anchored paths with relative paths
  • CS3 PERMISSION_DENIED results in an HTTP 404 response to prevent existence disclosure
  • CS3 NOT_FOUND results in an HTTP 404 response
  • Original path is correctly preserved in the request context for downstream use
  • URLs without colon syntax bypass the middleware logic quickly
Prompt proposal for missing tests
Consider implementing these tests if applicable:
1. Regex matching correctly identifies colon syntax for both v1.0 and v1beta1 versions
2. URL rewriting handles root-anchored paths with and without suffixes/trailing colons
3. URL rewriting handles item-anchored paths with relative paths
4. CS3 PERMISSION_DENIED results in an HTTP 404 response to prevent existence disclosure
5. CS3 NOT_FOUND results in an HTTP 404 response
6. Original path is correctly preserved in the request context for downstream use
7. URLs without colon syntax bypass the middleware logic quickly
Low confidence findings
  • The middleware is registered at a top-level. Ensure that it includes a fast-path for URLs not matching the colon syntax to avoid performance degradation for unrelated routes.

TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback

Copy link
Copy Markdown

@codacy-production codacy-production Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

The PR successfully implements the MS Graph colon-syntax path lookup middleware, fulfilling all specified acceptance criteria. Codacy results indicate the PR is up to standards. However, a discrepancy exists in the URL rewriting logic: while the implementation notes that special characters like '$' and '!' must be percent-encoded for chi route matching, the current use of r.URL.EscapedPath() does not perform this encoding. This mismatch should be resolved to ensure robust routing. Additionally, services/graph/pkg/service/v0/service.go shows a significant increase in complexity and code duplication that should be monitored for technical debt.

About this PR

  • High complexity (85) and new code clones (5) were detected in services/graph/pkg/service/v0/service.go. Ensure these changes do not introduce excessive technical debt or maintainability issues in the service implementation.

Test suggestions

  • Verify that non-colon URLs pass through the middleware without modification.
  • Verify root-anchored colon path (e.g., /root:/Documents) rewrites to the canonical items URL.
  • Verify item-anchored colon path (e.g., /items/{id}:/subfolder) rewrites to the canonical items URL.
  • Verify that item-anchored requests return 400 if the item ID does not belong to the specified drive ID.
  • Verify handling of optional trailing colons and suffixes like /children or /createLink.
  • Verify that PERMISSION_DENIED from storage returns 404 (Security: no existence disclosure).
  • Verify that UNAUTHENTICATED from storage returns 401.
  • Verify that internal/unexpected CS3 errors return 500.
  • Verify that the original URL is correctly stashed in the request context.

TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback

Comment thread services/graph/pkg/middleware/path_lookup.go
Codacy flagged the colon-path middleware comment claiming both `$`
and `!` need percent-encoding for chi's tree match, while the
implementation only calls r.URL.RawPath = r.URL.EscapedPath() which
does not encode either character per the suggestion's reading.

In practice EscapedPath does encode `!` as `%21` (only `?` is the
hardcoded escape, `!` is escaped because Go's net/url treats it as
needing encoding outside specific contexts). It leaves `$` literal,
which chi handles fine. chi.URLParam returns the encoded segment
verbatim, and downstream OpenCloud handlers (parseIDParam,
GetDriveAndItemIDParam) already PathUnescape before parsing IDs, so
the round-trip works end-to-end. The acceptance tests on this
branch already exercise this with real `$`/`!`-containing IDs.

Add a focused unit test that mounts the middleware behind chi,
sends a colon URL, and asserts the actual contract:
  - driveID (only `$`): chi.URLParam returns it literal
  - itemID (with `!`):  PathUnescape(chi.URLParam(...)) == original

Update the misleading comment so future readers (and reviewers) see
what the encoding actually does and which downstream contract it
relies on.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants